Lesson: Advanced RAG & Retrieval Methods

- Analyze case studies on production-grade RAG systems and tools like Relari and Evidently
- Understand common RAG bottlenecks and solutions: chunking, reranking, retriever+generator coordination
- Compare embedding models (small vs. large) and reranking strategies
- Evaluate real-world RAG outputs using recall, MRR, and qualitative techniques
- Learn how RAG design changes based on use case (enterprise Q&A, citation engines, document summaries)
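A minimal sketch of the two retrieval metrics named above, recall@k and MRR, computed over hypothetical ranked results (the document IDs and relevance sets are illustrative, not from the case studies):

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of relevant documents that appear in the top-k results."""
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)

def mrr(queries):
    """Mean reciprocal rank over (ranked_ids, relevant_ids) pairs."""
    total = 0.0
    for ranked_ids, relevant_ids in queries:
        for rank, doc_id in enumerate(ranked_ids, start=1):
            if doc_id in relevant_ids:
                total += 1.0 / rank
                break
    return total / len(queries)

# Toy eval set: two queries with known relevant documents.
queries = [
    (["d3", "d1", "d7"], {"d1"}),  # relevant doc at rank 2 -> RR = 0.5
    (["d2", "d5", "d9"], {"d2"}),  # relevant doc at rank 1 -> RR = 1.0
]
print(recall_at_k(["d3", "d1", "d7"], {"d1"}, k=2))  # 1.0
print(mrr(queries))                                  # 0.75
```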
Lesson: Full Transformer Architecture (From Scratch)

- Connect all core transformer components: embeddings, attention, feedforward, normalization
- Implement skip connections and positional encodings manually
- Use sanity checks and test loss to debug your model assembly
- Observe transformer behavior on structured prompts and simple sequences
- Compare transformer predictions vs. earlier trigram or FFN models to appreciate context depth
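A sketch of how the components can be wired together: one pre-norm transformer block in PyTorch with skip connections around attention and the FFN, plus the kind of shape sanity check the lesson recommends. Dimensions are illustrative:

```python
import torch
import torch.nn as nn

class TransformerBlock(nn.Module):
    """One pre-norm transformer block: attention + FFN, each wrapped in a skip connection."""
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x):
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h)
        x = x + attn_out                # skip connection around attention
        x = x + self.ffn(self.ln2(x))   # skip connection around FFN
        return x

# Sanity check: shapes should pass through unchanged.
x = torch.randn(2, 10, 64)              # (batch, sequence, d_model)
print(TransformerBlock()(x).shape)      # torch.Size([2, 10, 64])
```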
Lesson: Multimodal Finetuning (Mini Project 6)

- Understand what CLIP is and how contrastive learning aligns image/text modalities
- Fine-tune CLIP for classification (e.g., pizza types) or regression (e.g., solar prediction)
- Add heads on top of CLIP embeddings for specific downstream tasks
- Compare zero-shot performance vs. fine-tuned model accuracy
- Apply domain-specific LoRA tuning to vision/text encoders
- Explore regression/classification heads, cosine similarity scoring, and decision layers
- Learn how diffusion models extend CLIP-like embeddings for text-to-image and video generation
- Understand how video generation differs through temporal modeling and spatiotemporal coherence
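One way the "add heads on top of CLIP embeddings" step can look in code: a sketch assuming the Hugging Face `transformers` checkpoint `openai/clip-vit-base-patch32`. Freezing the backbone and the head size are illustrative choices, not the project's exact setup:

```python
import torch
import torch.nn as nn
from transformers import CLIPModel

class CLIPClassifier(nn.Module):
    """Frozen CLIP vision encoder with a small trainable classification head."""
    def __init__(self, n_classes=4):
        super().__init__()
        self.clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
        for p in self.clip.parameters():
            p.requires_grad = False     # freeze the backbone; only the head trains
        self.head = nn.Linear(self.clip.config.projection_dim, n_classes)

    def forward(self, pixel_values):
        # pixel_values come from CLIPProcessor; swap the Linear head for a
        # single-output regressor to get the regression variant.
        emb = self.clip.get_image_features(pixel_values=pixel_values)
        return self.head(emb)
```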
Lesson: Feedforward Networks & Loss-Centric Training

- Understand the role of linear + nonlinear layers in neural networks
- Explore how MLPs refine outputs after self-attention in transformers
- Learn the structure of FFNs (e.g., two-layer projection + activation like ReLU/SwiGLU)
- Implement your own FFN in PyTorch with real training/evaluation
- Compare activation functions: ReLU, GELU, SwiGLU
- Understand how dropout prevents co-adaptation and improves generalization
- Learn the role of LayerNorm, positional encoding, and skip connections
- Build intuition for how transformers encode depth, context, and structure into layers
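A minimal FFN of the shape described above (two-layer projection, activation, dropout), as a PyTorch sketch with arbitrary dimensions:

```python
import torch
import torch.nn as nn

class FeedForward(nn.Module):
    """Two-layer projection with a nonlinearity and dropout, as used in transformer blocks."""
    def __init__(self, d_model=64, d_hidden=256, p_drop=0.1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, d_hidden),   # expand
            nn.GELU(),                      # swap for ReLU or SwiGLU to compare activations
            nn.Dropout(p_drop),             # prevents co-adaptation
            nn.Linear(d_hidden, d_model),   # project back
        )

    def forward(self, x):
        return self.net(x)

x = torch.randn(8, 64)
print(FeedForward()(x).shape)  # torch.Size([8, 64])
```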
Lesson: Instructional Finetuning with LoRA (Mini Project 5)

- Understand the difference between fine-tuning and instruction fine-tuning (IFT)
- Learn when to apply fine-tuning vs. IFT vs. RAG based on domain, style, or output needs
- Explore lightweight tuning methods like LoRA, BitFit, and prompt tuning
- Build instruction-tuned systems for outputs like JSON, tone, formatting, or domain tasks
- Apply fine-tuning to real case studies: HTML generation, resume scoring, financial tasks
- Use Hugging Face PEFT tools to train and evaluate LoRA-tuned models
- Understand tokenizer compatibility, loss choices, and runtime hardware considerations
- Compare instruction-following performance of base vs. IFT models with real examples
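A sketch of the Hugging Face PEFT setup for LoRA, using GPT-2 as a stand-in base model; the rank, alpha, and target modules are illustrative hyperparameters, not the course's exact configuration:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Stand-in base model; any causal LM from the Hub follows the same pattern.
model = AutoModelForCausalLM.from_pretrained("gpt2")

config = LoraConfig(
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling factor for the LoRA updates
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of the base model
```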
Lesson: Building Self-Attention Layers

- Understand the motivation for attention: limitations of fixed-window n-gram models
- Explore how word meaning changes with context using static vs. contextual embeddings (e.g., the "bank" problem)
- Learn the mechanics of self-attention: Query, Key, Value, dot products, and weighted sums
- Manually compute attention scores and visualize how softmax creates probabilistic context focus
- Implement self-attention layers in PyTorch using toy examples and evaluate outputs
- Visualize attention heatmaps using real LLMs to interpret which words the model attends to
- Compare loss curves of self-attention models vs. trigram models and observe learning dynamics
- Understand how embeddings evolve through transformer layers and extract them using GPT-2
- Build both single-head and multi-head transformer models; compare their predictions and training performance
- Implement a Mixture-of-Experts (MoE) attention model and observe gating behavior on different inputs
- Evaluate self-attention vs. MoE vs. n-gram models on fluency, generalization, and loss curves
- Run a meta-evaluation across all models to compare generation quality and training stability
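The core mechanics in a few lines: a single-head self-attention sketch with hand-rolled Query/Key/Value projections on toy tensors; weights and sizes are illustrative:

```python
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention by hand: Q/K/V projections,
    scaled dot products, softmax weights, weighted sum of values."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.transpose(-2, -1) / (k.shape[-1] ** 0.5)
    weights = F.softmax(scores, dim=-1)   # each row sums to 1
    return weights @ v, weights

# Toy example: 4 tokens with 8-dimensional embeddings.
torch.manual_seed(0)
x = torch.randn(4, 8)
w_q, w_k, w_v = (torch.randn(8, 8) for _ in range(3))
out, weights = self_attention(x, w_q, w_k, w_v)
print(out.shape, weights.sum(dim=-1))  # torch.Size([4, 8]) and rows of 1.0
```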
Lesson: Triplet Loss Embedding Finetuning for Search & Ranking (Mini Project 4)

- Adapt embeddings with triplet loss: anchor, positive, and negative examples pull related items together and push unrelated ones apart
- Apply tuned embeddings to user-to-music matching and e-commerce search and ranking use cases
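A minimal sketch of the triplet objective using PyTorch's built-in `TripletMarginLoss`; the user/track framing in the comments is an assumed example, not the project's actual data:

```python
import torch
import torch.nn as nn

# Anchor: a user embedding; positive: a track they played; negative: one they skipped.
# Batch size and embedding dimension are illustrative.
anchor, positive, negative = (torch.randn(16, 32) for _ in range(3))

loss_fn = nn.TripletMarginLoss(margin=1.0)
loss = loss_fn(anchor, positive, negative)
print(loss)  # training drives positives closer to the anchor than negatives, by the margin
```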
Lesson: N-Gram Language Models (Mini Project 3)

- Understand what n-grams are and how they model language with simple probabilities
- Implement bigram and trigram extraction using sliding windows over character sequences
- Construct frequency dictionaries and normalize them into probability matrices
- Sample random text using bigram and trigram models to generate synthetic sequences
- Evaluate model quality using entropy, character diversity, and negative log likelihood (NLL)
- One-hot encode inputs and build PyTorch models for bigram and trigram neural networks
- Train models with cross-entropy loss and monitor training dynamics
- Compare classical vs. neural models in terms of coherence, prediction accuracy, and generalization
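A compact sketch of the classical pipeline: sliding-window bigram counts, normalization to probabilities, and sampling. The toy corpus is a placeholder:

```python
from collections import defaultdict
import random

text = "hello world, hello there"   # toy corpus; swap in real data

# Count bigrams with a sliding window over characters.
counts = defaultdict(lambda: defaultdict(int))
for a, b in zip(text, text[1:]):
    counts[a][b] += 1

# Normalize each row of counts into P(next_char | current_char).
probs = {}
for a, nexts in counts.items():
    total = sum(nexts.values())
    probs[a] = {b: n / total for b, n in nexts.items()}

def sample(start, length=20):
    """Generate text by repeatedly drawing the next character from the bigram model."""
    out = [start]
    for _ in range(length):
        nexts = probs.get(out[-1])
        if not nexts:
            break
        out.append(random.choices(list(nexts), weights=list(nexts.values()))[0])
    return "".join(out)

print(sample("h"))
```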
Lesson: RAG & Retrieval Techniques (Mini Project 2)

- Understand the full RAG pipeline: pre-retrieval, retrieval, and post-retrieval stages
- Learn the difference between term-based and embedding-based retrieval methods (e.g., TF-IDF and BM25 vs. vector search)
- Explore vector databases, chunking, and query optimization techniques like HyDE, reranking, and filtering
- Use contrastive learning and cosine similarity to map queries and documents into shared vector spaces
- Practice retrieval evaluation using `recall@k`, `precision@k`, and `MRR`
- Generate synthetic data using LLMs (Instructor, Pydantic) for local eval scenarios
- Implement baseline vector search pipelines using LanceDB and OpenAI embeddings (3-small, 3-large)
- Apply rerankers and statistically validate results with bootstrapping and t-tests to build intuition around eval reliability
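The project's baseline pipeline uses LanceDB and OpenAI embeddings; as a dependency-free sketch of the same idea, here is cosine-similarity top-k retrieval over random stand-in vectors:

```python
import numpy as np

def cosine_top_k(query, doc_matrix, k=3):
    """Rank documents by cosine similarity to the query vector."""
    q = query / np.linalg.norm(query)
    d = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
    sims = d @ q                    # cosine similarity of each doc to the query
    top = np.argsort(-sims)[:k]     # indices of the k most similar docs
    return top, sims[top]

rng = np.random.default_rng(0)
docs = rng.normal(size=(100, 64))   # 100 stand-in document embeddings
query = rng.normal(size=64)         # stand-in query embedding
ids, scores = cosine_top_k(query, docs)
print(ids, scores)
```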
Lesson: Multimodal Embeddings (CLIP)

- Understand how CLIP learns joint image-text representations using contrastive learning
- Run your first CLIP similarity queries and interpret the shared embedding space
- Practice prompt engineering with images and see how wording shifts retrieval results
- Build retrieval systems: text-to-image and image-to-image using cosine similarity
- Experiment with visual vector arithmetic: apply analogies to embeddings
- Explore advanced tasks like visual question answering (VQA) and image captioning
- Compare multimodal architectures (CLIP, ViLT, ViT-GPT2) and how each fuses modalities
- Learn how modality-specific encoders (image/audio) integrate into transformer models
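A first CLIP similarity query, using the standard `transformers` zero-shot pattern; `photo.jpg` and the candidate captions are placeholders:

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("photo.jpg")  # placeholder path
texts = ["a photo of a dog", "a photo of a cat", "a diagram"]

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image holds image-text similarity scores; softmax turns them
# into probabilities over the candidate captions.
probs = outputs.logits_per_image.softmax(dim=-1)
print(dict(zip(texts, probs[0].tolist())))
```

Rewording a caption ("a photo of a dog" vs. "a blurry picture of a puppy") shifts these probabilities, which is the prompt-engineering effect the lesson explores.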
Lesson: Tokens, Embeddings & Modalities — Foundations of Understanding Text, Image, and Audio

- Understand the journey from raw text → tokens → token IDs → embeddings
- Compare word-based, BPE, and advanced tokenizers (LLaMA, GPT-2, T5)
- Analyze how good/bad tokenization affects loss, inference time, and semantic meaning
- Learn how embedding vectors represent meaning and change with context
- Explore and manipulate Word2Vec-style word embeddings through vector math and dot product similarity
- Apply tokenization and embedding logic to multimodal models (CLIP, ViLT, ViT-GPT2)
- Conduct retrieval and classification tasks using image and audio embeddings (CLIP, Wav2Vec2)
- Discuss emerging architectures like Byte Latent Transformers and their implications
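A quick way to see tokenizer differences in practice: the same sentence through GPT-2's BPE and T5's SentencePiece tokenizers (both public Hub checkpoints; the sentence is arbitrary):

```python
from transformers import AutoTokenizer

text = "Tokenization affects loss and inference time."
for name in ["gpt2", "t5-small"]:
    tok = AutoTokenizer.from_pretrained(name)
    pieces = tok.tokenize(text)
    # Fewer tokens for the same text generally means cheaper, faster inference.
    print(f"{name}: {len(pieces)} tokens -> {pieces}")
```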
Lesson: Prompt Engineering — From Structure to Evaluation (Mini Project 1)

- Learn foundational prompt styles: vague vs. specific, structured formatting, XML tagging
- Practice prompt design for controlled output: enforcing strict JSON formats with Pydantic
- Discover failure modes and label incorrect LLM behavior (e.g., hallucinations, format issues)
- Build early evaluators to measure LLM output quality and rule-following
- Write your first "LLM-as-a-judge" prompts to automate pass/fail decisions
- Iterate prompts based on analysis-feedback loops and evaluator results
- Explore advanced prompting techniques: multi-turn, rubric-based human alignment, and A/B testing
- Experiment with `dspy` for signature-based structured prompting and validation
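A sketch of the "strict JSON with Pydantic" idea, assuming Pydantic v2's `model_validate_json`; the `Answer` schema and raw string are hypothetical stand-ins for a real prompt contract and LLM response:

```python
from pydantic import BaseModel, ValidationError

class Answer(BaseModel):
    """Hypothetical schema: the strict JSON shape the prompt asks the LLM to emit."""
    verdict: str
    confidence: float

raw = '{"verdict": "pass", "confidence": 0.92}'  # stand-in for an LLM response

try:
    parsed = Answer.model_validate_json(raw)
    print(parsed.verdict, parsed.confidence)
except ValidationError as e:
    # A format failure becomes a labeled error instead of a silently bad output,
    # which is exactly what an early evaluator needs to count rule violations.
    print("LLM broke the schema:", e)
```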