
Fundamentals of transformers - Live Workshop

Everyone knows ChatGPT, but how do modern large language models actually work? The fundamentals start with the transformer. This workshop demystifies the transformer, walking from concept to code to show how it works. It combines intuition, code, and math, all with the aim of providing an end-to-end understanding of the fundamentals of large language models.

  • 5.0 / 5 (1 rating)
  • Published
  • Updated
Workshop Instructor

Alvin Wan

Currently at OpenAI. Previously, he was a Senior Research Scientist at Apple, working on large language models for Apple Intelligence. Before that, he worked on Tesla Autopilot and earned his PhD at UC Berkeley, with 3,000+ citations and 800+ stars for his work.

How The Workshop Works

01 Live and remote

You can take the workshop from anywhere in the world, as long as you have a computer and an internet connection. You also have the opportunity to ask the instructors questions live.

02 Recorded

Learn at your own pace, whenever it's convenient for you. With no rigid schedule to worry about, you can take the course on your own terms.

03 Build

Learn by doing: you'll build working code as you learn each concept.

04 Community

Join a vibrant community of other students who are also learning with Fundamentals of transformers. Ask questions, get feedback, and collaborate with others to take your skills to the next level.

Workshop Preview

What You Will Build In This Workshop

Workshop Overview

What you will learn
  • Intro to LLM Basics: Learn foundational concepts, including terminology like models, data, algorithms, and optimization.

  • Autoregressive Decoding: Grasp how LLMs predict words through conditional generation, supported by manual inference demos.

  • LLM Prediction Mechanism: Explore LLM architecture, with an intuitive look at vectors and word embeddings.

  • Semantic Meaning in Embeddings: See how word embeddings represent semantic meaning through nearest neighbors and vector demos.

  • Transformer Core Mechanics: Unpack the inner workings of a transformer layer, including self-attention and context addition.

  • Non-Linear Transformations: Discover why non-linearities are essential, supported by hands-on matrix multiplication and MLP demos.

  • Positional Encoding: Learn absolute and relative positional encoding techniques, plus RMS Norm for positional bias management.

  • Absolute vs. Relative Encoding: Compare the trade-offs between absolute and relative positional encoding.

  • Attention Mechanisms: Delve into “forward-facing” and multi-head attention to understand attention values.

  • Advanced Attention: Study grouped-query attention and its importance in handling large data.

  • Current Transformer Models: Analyze academic and modern transformer diagrams, identifying bottlenecks in today’s LLMs.

  • Build a Mini LLM Inference Tool: Create a simplified version of Hugging Face's LLM utility to understand LLM operation.

  • Understand Word Embeddings: Develop interactive demos exploring word embeddings and how models represent words.

  • Visualize Self-Attention: Use visualization tools to understand the role of self-attention in language models.
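As a taste of the hands-on demos, here is a minimal sketch of the first step every LLM performs: turning text into token ids and embedding vectors. The four-word vocabulary and random embedding table below are made up for illustration; real models use learned subword tokenizers and trained embeddings.

```python
import numpy as np

# Hypothetical toy vocabulary; real LLMs use learned subword tokenizers.
vocab = {"the": 0, "cat": 1, "sat": 2, "down": 3}
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(len(vocab), 8))  # 8-dim embeddings

def embed(text):
    ids = [vocab[w] for w in text.split()]  # tokenize: words -> ids
    return ids, embedding_table[ids]        # lookup: ids -> vectors

ids, vectors = embed("the cat sat")
print(ids)            # [0, 1, 2]
print(vectors.shape)  # (3, 8)
```

Everything downstream in a transformer operates on these vectors rather than on raw text.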

In this workshop, we dive deep into Large Language Models (LLMs) to help you understand, build, and optimize their architecture for real-world applications. LLMs are transforming industries—from customer support to content creation—but understanding how these models work, let alone optimizing them, can be challenging.

In this comprehensive 9-module series, we cover:

  • The technical essentials of LLMs, including autoregressive decoding, positional encoding, and multi-head attention

  • The entire LLM lifecycle, from pretraining on massive datasets to fine-tuning and instruction tuning for specialized tasks

  • Best practices for evaluating LLMs, identifying bottlenecks, and leveraging state-of-the-art architectures for efficiency and scalability

This workshop includes hours of in-depth instruction, hands-on coding exercises, and access to a community forum for support and discussions. You'll also gain exclusive access to source code templates, an expansive reference library, and downloadable materials for continued learning.

It's taught by Alvin Wan, currently at OpenAI and previously a Senior Research Scientist at Apple, with a PhD from UC Berkeley and international recognition for his impactful contributions to efficient AI. With his practical industry experience and research insights, you'll be guided from fundamentals to advanced concepts with clarity and precision.

By the end of this workshop, you’ll not only understand how to create and optimize LLMs but also how to apply this knowledge across various applications in tech and business.

Our students work at

  • Salesforce · Intuit · Adobe · Disney · Heroku · AT&T · VMware · Microsoft · Amazon

Workshop Syllabus and Content

Section 1

What are LLMs?

Demystifying the terminology behind LLMs. ChatGPT is to LLM as Kleenex is to tissue. Model, Data, Algorithms, Optimization

Section 2

What LLMs predict

Introduction to Autoregressive Decoding. Conditional generation. Demo: Manual LLM inference
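The decoding loop this section walks through can be sketched in a few lines. Below, a hand-written bigram probability table stands in for a real transformer; the values are invented for illustration, but the feed-the-prediction-back-in loop is the same either way.

```python
import numpy as np

vocab = ["the", "cat", "sat", "down", "."]
# Hypothetical next-token probabilities (rows: current token, cols: next).
# A real LLM computes these probabilities with a transformer.
P = np.array([
    [0.0, 0.6, 0.1, 0.3, 0.0],  # after "the"
    [0.0, 0.0, 0.9, 0.0, 0.1],  # after "cat"
    [0.1, 0.0, 0.0, 0.8, 0.1],  # after "sat"
    [0.0, 0.0, 0.0, 0.0, 1.0],  # after "down"
    [0.0, 0.0, 0.0, 0.0, 1.0],  # after "."
])

tokens = [0]  # start from "the"
while vocab[tokens[-1]] != "." and len(tokens) < 10:
    next_id = int(np.argmax(P[tokens[-1]]))  # greedy: pick most likely token
    tokens.append(next_id)                   # feed the prediction back in

print(" ".join(vocab[t] for t in tokens))  # the cat sat down .
```

This is conditional generation in miniature: each prediction is conditioned on everything generated so far.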

Section 3

How LLMs predict

The architecture for a Large Language Model. Vectors, intuitively. Word embeddings. Nearest neighbors. Demo: Semantic meaning of word embeddings
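The nearest-neighbors idea from this section can be illustrated with tiny hand-picked vectors. These 3-dimensional embeddings are invented for the example; a real model learns embeddings with hundreds or thousands of dimensions.

```python
import numpy as np

# Hypothetical 3-d embeddings; real models learn these during training.
emb = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.7, 0.2]),
    "apple": np.array([0.1, 0.2, 0.9]),
    "pear":  np.array([0.2, 0.1, 0.8]),
}

def cosine(a, b):
    # Cosine similarity: 1.0 means same direction, 0.0 means unrelated.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def nearest(word):
    # Rank every other word by cosine similarity to `word`.
    others = [(w, cosine(emb[word], v)) for w, v in emb.items() if w != word]
    return max(others, key=lambda t: t[1])[0]

print(nearest("king"))   # queen
print(nearest("apple"))  # pear
```

Semantically related words end up near each other in the embedding space, which is exactly what the demo explores.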

Section 4

How Transformers predict

The innards of a transformer layer. Self-attention adds context. Demo: Adding “context” to a vector. Matrix multiplies, intuitively. MLP transforms. Demo: The necessity of non-linearities
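The two halves of the layer described above, self-attention followed by an MLP, can be sketched with random matrices standing in for learned weights (shapes and sizes are toy values for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                        # embedding width
x = rng.normal(size=(4, d))  # 4 token vectors

# Self-attention: each token mixes in context from the others.
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
Q, K, V = x @ Wq, x @ Wk, x @ Wv
scores = Q @ K.T / np.sqrt(d)  # pairwise similarity of queries and keys
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # softmax
attended = weights @ V         # context-weighted sum of value vectors

# MLP block: the non-linearity (ReLU here) is what lets the layer
# express more than a single matrix multiply could.
W1, W2 = rng.normal(size=(d, 4 * d)), rng.normal(size=(4 * d, d))
out = np.maximum(attended @ W1, 0) @ W2

print(weights.shape, out.shape)  # (4, 4) (4, 8)
```

Note that each row of `weights` sums to 1: every token's output is a weighted average of context, which is the "adding context to a vector" idea from the demo.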

Section 5

How LLMs attend

How to find the needle in the haystack. Absolute positional encoding. Demo: Cons of absolute positional bias. Relative positional encoding. Demo: Long-term Decay. RMS Norm.
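For a concrete picture, here is the classic sinusoidal absolute positional encoding alongside RMS Norm, each in a few lines of NumPy. This is an illustrative sketch, not the workshop's exact demo code.

```python
import numpy as np

def sinusoidal_pe(seq_len, d):
    # Absolute positional encoding: each position gets a unique
    # pattern of sines and cosines at geometrically spaced frequencies.
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d // 2)[None, :]
    angles = pos / (10000 ** (2 * i / d))
    pe = np.zeros((seq_len, d))
    pe[:, 0::2] = np.sin(angles)  # even dims: sine
    pe[:, 1::2] = np.cos(angles)  # odd dims: cosine
    return pe

def rms_norm(x, eps=1e-6):
    # RMS Norm rescales each vector by its root-mean-square magnitude.
    return x / np.sqrt((x ** 2).mean(axis=-1, keepdims=True) + eps)

pe = sinusoidal_pe(16, 8)
print(pe.shape)            # (16, 8)
print(rms_norm(pe).shape)  # (16, 8)
```

After RMS Norm, every vector has (approximately) unit root-mean-square magnitude, which keeps positional and token information on a comparable scale.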

Section 6

Connecting modern LLMs to papers

Connection to papers. “Forward-facing” attention. Demo: Exploring attention values. Multi-Head Attention. Grouped-Query Attention
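Grouped-query attention can be sketched by letting several query heads share each key/value head. The head counts and sizes below are toy values for illustration, and the causal mask restricts each token to attending over earlier positions.

```python
import numpy as np

rng = np.random.default_rng(0)
seq, head_dim = 4, 2
n_q_heads, n_kv_heads = 4, 2     # grouped-query: 4 Q heads share 2 KV heads
group = n_q_heads // n_kv_heads  # query heads per shared KV head

# Per-head projections (random stand-ins for learned weights).
Q = rng.normal(size=(n_q_heads, seq, head_dim))
K = rng.normal(size=(n_kv_heads, seq, head_dim))
V = rng.normal(size=(n_kv_heads, seq, head_dim))

mask = np.tril(np.ones((seq, seq)))  # causal mask: attend only to past tokens

outputs = []
for h in range(n_q_heads):
    kv = h // group  # which shared KV head this query head uses
    scores = Q[h] @ K[kv].T / np.sqrt(head_dim)
    scores = np.where(mask == 1, scores, -np.inf)  # hide future positions
    w = np.exp(scores) / np.exp(scores).sum(-1, keepdims=True)
    outputs.append(w @ V[kv])

out = np.concatenate(outputs, axis=-1)  # concatenate heads
print(out.shape)  # (4, 8)
```

Sharing KV heads shrinks the key/value cache, which is why grouped-query attention matters for serving large models.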

Subscribe for a Free Lesson

By subscribing to the newline newsletter, you will also receive weekly, hands-on tutorials and updates on upcoming courses in your inbox.

Meet the Workshop Instructor

Alvin Wan

Alvin Wan

👋 Hi, I’m Alvin Wan. I’m currently at OpenAI; previously, I was a Senior Research Scientist at Apple, and I earned my PhD at UC Berkeley specializing in efficient deep learning. My focus is on speeding up Large Language Models (LLMs) and advancing computer vision applications for self-driving cars and virtual reality. My work has earned international recognition for its design and social impact, and I’m here to share insights gained from the front lines of AI research. This workshop is your gateway to understanding, building, and optimizing LLMs, minus the jargon and complexity. (Oh, and my Mandarin? Just about on par with a kindergartener’s!) Join me to master LLMs in real-world applications.

Purchase the course today

One-Time Purchase

Fundamentals of transformers - Live Workshop

$197 (regularly $500; $303 off)
Fundamentals of transformers - Live Workshop
  • Discord Community Access
  • Full Transcripts
  • Money Back Guarantee
  • Lifetime Access

Frequently Asked Questions

How is this workshop structured, and what topics does it cover?

This workshop introduces Large Language Models (LLMs) from foundational concepts to practical applications. Key topics include terminology, architecture (such as transformers and embeddings), ecosystem overview, market applications, developer tools, product integrations, and predictive mechanisms in LLMs. It also delves into advanced topics like autoregressive decoding, multi-head attention, and performance optimization techniques.

Is this workshop suitable for my skill level?

The course is designed for learners with a basic understanding of programming and machine learning concepts. However, it covers a range of levels. For beginners, fundamental concepts are introduced in detail, while advanced sections, such as transformer architecture and multi-query attention, offer in-depth insights. You can skip advanced sections and focus on introductory modules if you’re newer to the topic.

Will I get real-world examples and practical applications in this workshop?

Yes, the course emphasizes practical, real-world applications of LLMs, with use cases such as chatbots, coding assistants, and data analysis tools. For example, demonstrations of how embeddings and transformer layers work are provided to help bridge theoretical knowledge with application.

How frequently is the course content updated?

The course content is reviewed and updated regularly to keep pace with advances in LLM technologies and AI development practices. This includes updates to information about tools, libraries, and popular AI-centric platforms, ensuring relevance in a rapidly evolving field.

Does this workshop cover current AI tools and integrations?

Yes, we cover a broad spectrum of contemporary tools and integrations. This includes popular platforms like Apple Intelligence, Google AI, and tools for developers such as vector databases and fine-tuning frameworks, ensuring a comprehensive understanding of the LLM ecosystem.

How are complex concepts like self-attention and autoregressive decoding explained?

Complex topics are broken down through visualizations, interactive examples, and analogies. For example, self-attention and autoregressive decoding are explained step-by-step, using diagrams that “unroll” the process for easy understanding.

Will I be able to access this workshop on my mobile or tablet?

Yes, the course is optimized for access across multiple devices, including mobile, tablet, and desktop, ensuring flexibility to learn on the go.

Is there a certificate upon completion of the course?

Yes, you can get a certificate by sending us a message.

Can I ask questions during the course?

Yes, you can ask questions in the comments section of each lesson, and our team will respond as quickly as possible. You can also ask us questions anytime through the community driven Discord channel.

Can I download the course videos?

No, the course videos cannot be downloaded, but they can be accessed online at any time.

What is the price of the course?

The course is currently priced at $197 USD.

How is this workshop different from other content available on LLMs?

This workshop on Large Language Models (LLMs) stands out by delivering not only a foundational understanding of LLMs but also practical, real-world applications tailored to industry-specific challenges. We focus on case studies, interactive labs, and personalized support that go beyond theoretical knowledge, ensuring you’re equipped to implement LLMs effectively in your own work environment. By the end of this workshop, you'll gain actionable insights and hands-on skills that are immediately applicable, setting you apart in a rapidly evolving tech landscape.

Fundamentals of transformers - Live Workshop

$197 (regularly $500)