Workshop Feedback Q&A
Get the project source code below, and follow along with the lesson material.
Download Project Source Code
To set up the project on your local machine, please follow the directions provided in the README.md file. If you run into any issues running the project source code, feel free to reach out to the author in the course's Discord channel.
[00:00 - 00:13] Okay, got it. How are you guys doing? Do you feel like you're getting enough value from it? Feel free to unmute yourself and talk; you don't have to just clap.
[00:14 - 00:23] Yeah. Okay. All right, I'm going to go into a little bit of a PowerPoint presentation.
[00:24 - 00:38] Oh, and this part is supposed to be interactive, so it's not like a lecture.
[00:39 - 01:00] So feel free to unmute yourself and ask live questions. We surveyed over 120 people to understand what the biggest pain points were. Some of the biggest ones were: not being able to understand the lingo, not being able to judge what's good or not,
[01:01 - 01:59] not being able to apply AI, not being able to understand the internals of AI, not being able to build a foundation model yourself, and not being able to build on top of AI. So I'm curious: after two hours of conversation with Alvin, where do you feel like you've made progress? Feel free to unmute yourself and just talk. I'll go first if nobody wants to. It helps. I do feel like I have a better understanding of what's happening internally, but the one thing I've been dying to learn is how the data needs to be prepared for these things, like training the foundation model itself.
[02:00 - 02:23] And then potentially to augment it using RAG; my understanding is you can also make use of vector databases layered on top of a foundation model. The reason I'm curious is a little selfish: I've been a data guy for a long time, supporting machine learning engineers.
[02:24 - 03:05] So I have a deep understanding of how to prepare data that might go into, for example, speech or audio event detection, where you have audio that gets force-aligned, then gets converted into mel bins, and then the training occurs. So I'm really curious: how would I help support fine-tuning LLMs from a data perspective? What does that look like? The inputs are not just big text files, right? And I've seen some references that show repetitive sequences of inputs, so it'd be like: "I", "I am", "I am cool". Right?
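To make the RAG and vector-database idea mentioned above concrete, here is a minimal sketch. It assumes a toy hashed bag-of-words embedding in place of a real embedding model, and an in-memory list in place of a real vector database; the names are illustrative and not part of the course code.

# Illustrative RAG sketch: embed documents, retrieve the nearest ones
# to a query, and prepend them to the prompt. Toy embedding only.
import numpy as np

DIM = 256

def embed(text: str) -> np.ndarray:
    """Toy embedding: hash each token into a fixed-size vector."""
    vec = np.zeros(DIM)
    for token in text.lower().split():
        vec[hash(token) % DIM] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

# "Vector database": just a list of (text, vector) pairs here.
documents = [
    "Transformers predict the next token autoregressively.",
    "Mel spectrograms are a common input representation for audio models.",
    "RAG retrieves relevant passages and adds them to the prompt.",
]
index = [(doc, embed(doc)) for doc in documents]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents whose embeddings are closest to the query."""
    q = embed(query)
    scored = sorted(index, key=lambda pair: -float(q @ pair[1]))
    return [doc for doc, _ in scored[:k]]

query = "How does retrieval augment a language model?"
context = "\n".join(retrieve(query))
prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
print(prompt)  # this prompt would then be sent to the LLM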
[03:06 - 03:31] Yeah, that's a great question. My understanding is that there are already a lot of prepared text datasets out there. Alvin, do you want to go into how to prepare the text for pre-training? Yeah, it's a good question. We can tackle that at the end.
[03:32 - 04:07] Basically, because of how the transformer predicts, you actually don't need to do anything too special. You did mention repetitive text, but that will take a little bit more explaining; you already have some of the core concepts needed to explain it. The only preparation you really need is, for example, stripping out random formatting, or pieces or strings that don't contribute to the meaning of the text. So there's not much to prepare in terms of the format of the text. But we can talk about that at the end too. I think it's a great question, and thanks for sharing.
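To make that concrete, here is a minimal sketch, under simplified assumptions (a whitespace tokenizer stands in for a real subword tokenizer such as BPE), of how plain text is typically turned into causal language-model training examples: light cleanup, tokenization, then fixed-length windows whose targets are the inputs shifted by one token, so the repeated "I / I am / I am cool" prefixes never have to be written out explicitly.

# Illustrative pretraining data prep (simplified, not the course's pipeline).
import re

def clean(text: str) -> str:
    """Strip formatting that doesn't contribute to meaning."""
    text = re.sub(r"<[^>]+>", " ", text)      # drop HTML-like tags
    text = re.sub(r"\s+", " ", text).strip()  # collapse whitespace
    return text

def tokenize(text: str) -> list[int]:
    """Toy tokenizer: map each whitespace token to an integer id."""
    vocab: dict[str, int] = {}
    return [vocab.setdefault(tok, len(vocab)) for tok in text.split()]

def make_examples(ids: list[int], block_size: int = 8):
    """Slice one long token stream into (input, target) training pairs.
    The target is just the input shifted by one position, so every
    prefix ("I", "I am", "I am cool", ...) is trained on implicitly."""
    examples = []
    for start in range(0, len(ids) - block_size, block_size):
        chunk = ids[start : start + block_size + 1]
        examples.append((chunk[:-1], chunk[1:]))
    return examples

raw = "<p>I am cool.</p>  <p>The   transformer predicts the next token.</p>"
ids = tokenize(clean(raw))
for inputs, targets in make_examples(ids, block_size=4):
    print(inputs, "->", targets)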
[04:08 - 04:19] Appreciate it. Thank you. Anyone else? How do you guys feel about being able to understand everything related to AI?
[04:20 - 04:39] The punctuation and formatting are really mysterious to me. I mean, Alvin, you've been doing fantastic. You obviously have deep mastery and great exposition skills, and I'm not just saying that because we both got our PhDs at Berkeley.
[04:40 - 04:55] Anyway, you just said that some of the formatting gets stripped out. So I'm wondering how the formatting that remains ends up getting preserved in this autoregressive model.
[04:56 - 05:49] I always suspected it, but what was really helpful in the presentation so far is seeing that there's an outer loop and an inner loop; it's not just a one-pass deep neural network thing. It's really an app. That's something that bothers me, especially in the press: an LLM gets called a model. No, that's really not what it is. It's an app, and it's got a bunch of tricks and a bunch of loops, and some of it is inner products and some of it is matrix algebra. There's a lot more finesse and human contribution to it than the press would lead you to believe.
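As an illustration of the outer-loop/inner-loop point (a toy sketch with random weights, not any real system): the inner step is one forward pass of matrix algebra that scores the next token, and the outer loop is the application logic that appends the chosen token and feeds the growing sequence back in.

# Sketch of the two loops (hypothetical toy model, random weights).
import numpy as np

rng = np.random.default_rng(0)
VOCAB, DIM = 50, 16
embedding = rng.normal(size=(VOCAB, DIM))
output_proj = rng.normal(size=(DIM, VOCAB))

def forward(token_ids: list[int]) -> np.ndarray:
    """Inner step: one pass of matrix algebra over the whole prefix,
    returning scores (logits) for the next token."""
    hidden = embedding[token_ids].mean(axis=0)  # stand-in for attention layers
    return hidden @ output_proj

def generate(prompt_ids: list[int], max_new_tokens: int = 5) -> list[int]:
    """Outer loop: the 'app' part. Repeatedly run the model, pick a token,
    append it, and feed the longer sequence back in."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = forward(ids)             # inner step: forward pass
        next_id = int(np.argmax(logits))  # greedy decoding for simplicity
        ids.append(next_id)               # outer loop: grow the sequence
    return ids

print(generate([3, 7, 11]))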
[05:50 - 07:30] What helped me a little bit was reading a post on GPT by the Wolfram founder. Yeah, I read that one. That was fantastic, because he's basically a genius and he said he doesn't understand how it works. I thought that was just a really awesome post. Yeah, what was really valuable is that in one particular section he basically said: this is just the recipe for how they figured it out, it doesn't work any other way, and he doesn't know why. It's like cooking: there are certain steps where you don't know why they work, but you still do them, and it tastes good in the end. Yeah. But there are transformer-less approaches now that are yielding decent output. Transformers are definitely not the be-all and end-all. It's something having to do with producing a codec for a huge corpus, and I can believe that works a lot better than the automatic programming people are trying to get out of it, because there are only a finite number of ways to express the knowledge that humanity has, whereas apps and code are much more vast, more creative, or more unique in a way. So I'm not surprised that the LLMs aren't hitting home runs with that.