Zhehan Shi

    Transformer Architecture Tutorial


    Here is a list of good resources for understanding the transformer architecture.

    1. Distilled AI on Transformer

    2. Harvard Annotated Transformer

    3. Fast.ai Transformer Tutorial

    4. Dive into Deep Learning on Transformer

    5. The Little Book of Deep Learning

    6. Understanding Deep Learning

    Tags: Artificial Intelligence, Machine Learning

    Categories: Computer Science, Data Science

    Updated: April 25, 2023

