Fundamentals of transformers - Live Workshop
Everyone knows ChatGPT, but how do modern large language models actually work? The fundamentals start with the transformer. This workshop demystifies the transformer, walking from concept to code to show how it works. It combines intuition, code, and math, all with the goal of providing an end-to-end understanding of the fundamentals of large language models.
- 5.0 / 5 (1 rating)
Alvin Wan
Currently at OpenAI. Previously, he was a Senior Research Scientist at Apple, working on large language models for Apple Intelligence. Before that, he worked on Tesla Autopilot, and he earned his PhD at UC Berkeley, with 3,000+ citations and 800+ stars for his work.
01 Live and remote
You can take the workshop from anywhere in the world, as long as you have a computer and an internet connection. You also have the opportunity to ask the instructors questions live.
02 Recorded
Learn at your own pace, whenever it's convenient for you. With no rigid schedule to worry about, you can take the course on your own terms.
03 Build
Put the concepts into practice by building as you learn.
04 Community
Join a vibrant community of other students who are also learning with Fundamentals of transformers. Ask questions, get feedback and collaborate with others to take your skills to the next level.
What You Will Build In This Workshop
Intro to LLM Basics: Learn foundational concepts, including terminology like models, data, algorithms, and optimization.
Autoregressive Decoding: Grasp how LLMs predict words through conditional generation, supported by manual inference demos.
LLM Prediction Mechanism: Explore LLM architecture, with an intuitive look at vectors and word embeddings.
Semantic Meaning in Embeddings: See how word embeddings represent semantic meaning through nearest neighbors and vector demos.
Transformer Core Mechanics: Unpack the inner workings of a transformer layer, including self-attention and context addition.
Non-Linear Transformations: Discover why non-linearities are essential, supported by hands-on matrix multiplication and MLP demos.
Positional Encoding: Learn absolute and relative positional encoding techniques, plus RMS Norm for positional bias management.
Attention Mechanisms: Delve into “forward-facing” and multi-head attention to understand attention values.
Advanced Attention: Study grouped-query attention and its importance in handling large data.
Current Transformer Models: Analyze academic and modern transformer diagrams, identifying bottlenecks in today’s LLMs.
Build a Mini LLM Inference Tool: Create a simplified version of Hugging Face's LLM utility to understand LLM operation.
Understand Word Embeddings: Develop interactive demos exploring word embeddings and how models represent words.
Visualize Self-Attention: Use visualization tools to understand the role of self-attention in language models.
In this workshop, we dive deep into Large Language Models (LLMs) to help you understand, build, and optimize their architecture for real-world applications. LLMs are transforming industries—from customer support to content creation—but understanding how these models work, let alone optimizing them, can be challenging.
In this comprehensive 9-module series, we cover:
- The technical essentials of LLMs, including autoregressive decoding, positional encoding, and multi-head attention
- The entire LLM lifecycle, from pretraining on massive datasets to fine-tuning and instruction tuning for specialized tasks
- Best practices for evaluating LLMs, identifying bottlenecks, and leveraging state-of-the-art architectures for efficiency and scalability
This workshop includes hours of in-depth instruction, hands-on coding exercises, and access to a community forum for support and discussions. You'll also gain exclusive access to source code templates, an expansive reference library, and downloadable materials for continued learning.
It's taught by Alvin Wan, currently at OpenAI and previously a Senior Research Scientist at Apple, who earned his PhD at UC Berkeley and is internationally recognized for his contributions to efficient AI. With his practical industry experience and research insights, you'll be guided from fundamentals to advanced concepts with clarity and precision.
By the end of this workshop, you’ll not only understand how to create and optimize LLMs but also how to apply this knowledge across various applications in tech and business.
Workshop Syllabus and Content
What are LLMs?
Demystifying the terminology behind LLMs. ChatGPT is to LLM as Kleenex is to tissue. Models, data, algorithms, and optimization.
What LLMs predict
Introduction to Autoregressive Decoding. Conditional generation. Demo: Manual LLM inference
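The idea behind autoregressive decoding can be sketched in a few lines: predict one token conditioned on what came before, append it, and repeat. The bigram "model" below is a hand-written, hypothetical stand-in for a real LLM, purely to make the loop concrete; it is not the workshop's actual demo.

```python
# Toy next-token "model": probabilities of the next word given only the
# previous word (a hypothetical bigram table, standing in for a real LLM).
BIGRAM_PROBS = {
    "the":  {"cat": 0.6, "dog": 0.4},
    "cat":  {"sat": 0.7, "ran": 0.3},
    "sat":  {"down": 0.9, "<eos>": 0.1},
    "down": {"<eos>": 1.0},
}

def decode(prompt: str, max_tokens: int = 10) -> list[str]:
    tokens = prompt.split()
    for _ in range(max_tokens):
        dist = BIGRAM_PROBS.get(tokens[-1])
        if dist is None:
            break
        next_token = max(dist, key=dist.get)  # greedy: take the most likely token
        if next_token == "<eos>":
            break
        tokens.append(next_token)  # feed the prediction back in: autoregression
    return tokens

print(decode("the"))  # ['the', 'cat', 'sat', 'down']
```

The key property is conditional generation: each prediction depends on the sequence generated so far, which is exactly what the manual inference demo makes tangible.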
How LLMs predict
The architecture for a Large Language Model. Vectors, intuitively. Word embeddings. Nearest neighbors. Demo: Semantic meaning of word embeddings
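The nearest-neighbor idea above can be sketched with toy embeddings: words with similar meanings get vectors that point in similar directions, and cosine similarity finds a word's closest neighbor. The 3-d vectors below are hand-picked for illustration and are not real learned embeddings.

```python
import math

# Hypothetical 3-d "word embeddings", hand-chosen so that semantically
# similar words (cat/dog, car/truck) point in similar directions.
EMBEDDINGS = {
    "cat":   [0.90, 0.80, 0.10],
    "dog":   [0.85, 0.75, 0.20],
    "car":   [0.10, 0.20, 0.95],
    "truck": [0.15, 0.10, 0.90],
}

def cosine(u: list[float], v: list[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def nearest(word: str) -> str:
    """Return the nearest neighbor of `word` by cosine similarity."""
    query = EMBEDDINGS[word]
    others = {w: v for w, v in EMBEDDINGS.items() if w != word}
    return max(others, key=lambda w: cosine(query, others[w]))

print(nearest("cat"))  # dog
print(nearest("car"))  # truck
```

Real embeddings live in hundreds or thousands of dimensions, but the geometry is the same: semantic meaning shows up as direction, and nearest neighbors recover related words.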
How Transformers predict
The innards of a transformer layer. Self-attention adds context. Demo: Adding “context” to a vector. Matrix multiplies, intuitively. MLP transforms. Demo: The necessity of non-linearities
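The "self-attention adds context" idea can be sketched as follows. This is a simplified, illustrative single-head attention under strong assumptions: each token's embedding serves directly as its query, key, and value, with no learned projection matrices.

```python
import math

def softmax(xs: list[float]) -> list[float]:
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors: list[list[float]]) -> list[list[float]]:
    """Each position attends to every position; its output is a softmax-weighted
    mix of all value vectors, so each output vector carries context."""
    out = []
    for q in vectors:
        scores = [sum(a * b for a, b in zip(q, k)) for k in vectors]  # dot products
        weights = softmax(scores)  # how much context to pull from each position
        mixed = [sum(w * v[i] for w, v in zip(weights, vectors))
                 for i in range(len(q))]
        out.append(mixed)
    return out

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contextual = self_attention(tokens)  # each row is now a context-aware mixture
```

After attention, each vector is a blend of the whole sequence; the MLP (with its non-linearity) then transforms each position independently, which is why non-linearities are essential: stacking purely linear layers would collapse into a single matrix multiply.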
How LLMs attend
How to find the needle in the haystack. Absolute positional encoding. Demo: Cons of absolute positional bias. Relative positional encoding. Demo: Long-term Decay. RMS Norm.
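Two of the ideas in this module can be sketched concretely: sinusoidal absolute positional encoding (the scheme from "Attention Is All You Need"), and RMS Norm, which rescales a vector by its root-mean-square without subtracting the mean. This is an illustrative sketch, not the workshop's demo code.

```python
import math

def positional_encoding(pos: int, dim: int) -> list[float]:
    """Sinusoidal absolute positional encoding: even indices get sin, odd
    indices get cos, at geometrically spaced frequencies."""
    pe = []
    for i in range(dim):
        angle = pos / (10000 ** (2 * (i // 2) / dim))
        pe.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return pe

def rms_norm(x: list[float], eps: float = 1e-6) -> list[float]:
    """RMS Norm: divide by the root-mean-square of the vector (no mean
    subtraction, unlike LayerNorm)."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / rms for v in x]

print(positional_encoding(0, 4))  # [0.0, 1.0, 0.0, 1.0]
```

Absolute encodings like this assign each position a fixed vector, which is exactly the bias the demo critiques; relative schemes instead encode distances between positions, which is what gives rise to the long-term decay behavior studied in this module.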
Modern LLMs and their connection to papers
“Forward-facing” attention. Demo: Exploring attention values. Multi-Head Attention. Grouped-Query Attention
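Two of this module's ideas can be sketched in a few lines. First, a causal ("forward-facing") mask: position i may only attend to positions at or before i. Second, the head-grouping at the heart of grouped-query attention: several query heads share one key/value head, shrinking the KV cache. The head counts below are illustrative choices, not any particular model's configuration.

```python
def causal_mask(n: int) -> list[list[bool]]:
    """mask[i][j] is True iff position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]

def kv_head_for(query_head: int, n_query_heads: int, n_kv_heads: int) -> int:
    """Grouped-query attention: map a query head to its shared KV head."""
    group_size = n_query_heads // n_kv_heads
    return query_head // group_size

# With 8 query heads and 2 KV heads, heads 0-3 share KV head 0
# and heads 4-7 share KV head 1.
mapping = [kv_head_for(h, 8, 2) for h in range(8)]
print(mapping)  # [0, 0, 0, 0, 1, 1, 1, 1]
```

Multi-head attention is the special case where every query head has its own KV head; GQA trades some of that capacity for a much smaller KV cache, which is why it matters at scale.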
Subscribe for a Free Lesson
By subscribing to the newline newsletter, you will also receive weekly, hands-on tutorials and updates on upcoming courses in your inbox.
Meet the Workshop Instructor
Purchase the course today
Frequently Asked Questions
How is this workshop structured, and what topics does it cover?
This workshop introduces Large Language Models (LLMs) from foundational concepts to practical applications. Key topics include terminology, architecture (such as transformers and embeddings), ecosystem overview, market applications, developer tools, product integrations, and predictive mechanisms in LLMs. It also delves into advanced topics like autoregressive decoding, multi-head attention, and performance optimization techniques.
Is this workshop suitable for my skill level?
The course is designed for learners with a basic understanding of programming and machine learning concepts. However, it covers a range of levels. For beginners, fundamental concepts are introduced in detail, while advanced sections, such as transformer architecture and multi-query attention, offer in-depth insights. You can skip advanced sections and focus on introductory modules if you’re newer to the topic.
Will I get real-world examples and practical applications in this workshop?
Yes, the course emphasizes practical, real-world applications of LLMs, with use cases such as chatbots, coding assistants, and data analysis tools. For example, demonstrations of how embeddings and transformer layers work are provided to help bridge theoretical knowledge with application.
How frequently is the course content updated?
The course content is reviewed and updated regularly to keep pace with advances in LLM technologies and AI development practices. This includes updates to information about tools, libraries, and popular AI-centric platforms, ensuring relevance in a rapidly evolving field.
Does this workshop cover current AI tools and integrations?
Yes, we cover a broad spectrum of contemporary tools and integrations. This includes popular platforms like Apple Intelligence, Google AI, and tools for developers such as vector databases and fine-tuning frameworks, ensuring a comprehensive understanding of the LLM ecosystem.
How are complex concepts like self-attention and autoregressive decoding explained?
Complex topics are broken down through visualizations, interactive examples, and analogies. For example, self-attention and autoregressive decoding are explained step-by-step, using diagrams that “unroll” the process for easy understanding.
Will I be able to access this workshop on my mobile or tablet?
Yes, the course is optimized for access across multiple devices, including mobile, tablet, and desktop, ensuring flexibility to learn on the go.
Is there a certificate upon completion of the course?
Yes, you can get a certificate by sending us a message.
Can I ask questions during the course?
Yes, you can ask questions in the comments section of each lesson, and our team will respond as quickly as possible. You can also ask us questions anytime through the community driven Discord channel.
Can I download the course videos?
No, the course videos cannot be downloaded, but they can be accessed online at any time.
What is the price of the course?
The course is currently priced at $197 USD.
How is this workshop different from other content available on LLMs?
This workshop on Large Language Models (LLMs) stands out by delivering not only a foundational understanding of LLMs but also practical, real-world applications tailored to industry-specific challenges. We focus on case studies, interactive labs, and personalized support that go beyond theoretical knowledge, ensuring you're equipped to implement LLMs effectively in your own work environment. By the end of this workshop, you'll gain actionable insights and hands-on skills that are immediately applicable, setting you apart in a rapidly evolving tech landscape.