Intro

ChatGPT is to LLM as Kleenex is to tissue

What are LLMs?

Tokens

Demo - Manual LLM inference 

LLMs generate text

What LLMs predict

Vectors, intuitively

Word embeddings and nearest neighbors

Demo - Semantic meaning of word embeddings 

The architecture for a Large Language Model

How LLMs predict

Self-attention adds context

Demo - Adding "context" to a vector

MLP transforms

Demo - The necessity of non-linearities

How Transformers predict

Absolute positional encoding

Demo - Cons of absolute positional bias

Demo - Skip connections

Batch norm

RMS norm

How LLMs use position

Workshop feedback Q&A

How LLMs attend

Modern day transformer architectures

Modern LLM connection to papers

You can take the workshop from anywhere in the world, as long as you
have a computer and an internet connection. You also have the opportunity to ask the instructors questions live.

Live and remote

Learn at your own pace, whenever it's convenient for you. With no
rigid schedule to worry about, you can take the course on your own
terms.

Recorded

Learn by building while you learn the concepts.

Build

Join a [vibrant community](/discord) of other students who
are also learning with Fundamentals of transformers. Ask questions, get feedback and
collaborate with others to take your skills to the next level.

Community

In this workshop, we dive deep into Large Language Models (LLMs) to help you understand, build, and optimize their architecture for real-world applications. LLMs are transforming industries—from customer support to content creation—but understanding how these models work, let alone optimizing them, can be challenging. 

In this comprehensive 9-module series, we cover:

The technical essentials of LLMs, including autoregressive decoding, positional encoding, and multi-head attention
The entire LLM lifecycle, from pretraining on massive datasets to fine-tuning and instruction tuning for specialized tasks
Best practices for evaluating LLMs, identifying bottlenecks, and leveraging state-of-the-art architectures for efficiency and scalability

This workshop includes hours of in-depth instruction, hands-on coding exercises, and access to a community forum for support and discussions. You'll also gain exclusive access to source code templates, an expansive reference library, and downloadable materials for continued learning.

It's taught by [Alvin Wan](https://www.linkedin.com/in/alvinwan/), a Senior Research Scientist at Apple and a PhD student at UC Berkeley with international recognition for his impactful contributions in efficient AI and design. With his practical industry experience and research insights, you’ll be guided from fundamentals to advanced concepts with clarity and precision.

By the end of this workshop, you’ll not only understand how to create and optimize LLMs but also how to apply this knowledge across various applications in tech and business.

Everyone knows ChatGPT, but how do modern large language models actually work? The fundamentals start with the transformer. This workshop demystifies the transformer and walks from concept to code to show how it works. It combines intuition, code, and math, all with the intent of providing an end-to-end understanding of the fundamentals of large language models.

Fundamentals of transformers - Live Workshop

Miniaturized version of Hugging Face’s LLM inference utility

Interactive demos to understand word embeddings and model representation capacity

Visualization utilities for real-world LLMs, to understand self-attention

Intro to LLM Basics: Learn foundational concepts, including terminology like models, data, algorithms, and optimization.

Autoregressive Decoding: Grasp how LLMs predict words through conditional generation, supported by manual inference demos.

LLM Prediction Mechanism: Explore LLM architecture, with an intuitive look at vectors and word embeddings.

Semantic Meaning in Embeddings: See how word embeddings represent semantic meaning through nearest neighbors and vector demos.

Transformer Core Mechanics: Unpack the inner workings of a transformer layer, including self-attention and context addition.

Non-Linear Transformations: Discover why non-linearities are essential, supported by hands-on matrix multiplication and MLP demos.

Positional Encoding: Learn absolute and relative positional encoding techniques, plus normalization with Batch Norm and RMS Norm.

Differences between absolute and relative positional encoding

Attention Mechanisms: Delve into “forward-facing” and multi-head attention to understand attention values.

Advanced Attention: Study grouped-query attention and how sharing key/value heads across query heads reduces memory use at inference time.

Current Transformer Models: Analyze academic and modern transformer diagrams, identifying bottlenecks in today’s LLMs.

Build a Mini LLM Inference Tool: Create a simplified version of Hugging Face’s LLM utility to understand LLM operation.

Understand Word Embeddings: Develop interactive demos exploring word embeddings and how models represent words.

Visualize Self-Attention: Use visualization tools to understand the role of self-attention in language models.
