Intro to AI bootcamp
Get the project source code below, and follow along with the lesson material.
Download Project Source Code
To set up the project on your local machine, please follow the directions provided in the README.md file. If you run into any issues running the project source code, feel free to reach out to the author in the course's Discord channel.
[00:00 - 00:04] Yeah, that makes sense. Yeah, I just want to continue a little bit.
[00:05 - 00:16] Part of the inspiration for the live course is that we realized that in our survey, people actually wanted more of a community. They wanted live interaction with the instructor.
[00:17 - 00:23] And so we constructed this live workshop in that context. And if you want to go fast, go alone.
[00:24 - 00:31] And if you want to go far, go together. And so I think you guys saw this in the emails I've sent over.
[00:32 - 00:36] But these are the OpenAI salaries. So I'm just curious about the group.
[00:37 - 00:44] Are you guys interested in AI because you want to build AI startup? Do you want to work at an AI company?
[00:45 - 00:49] Do you want to do it for your own side projects? I'm just curious about your relative goals.
[00:50 - 01:05] Oh, sure. So I work in AI, and I plan to continue having a primary position working for an organization.
[01:06 - 01:08] But I'm also building a platform. I'm trying to build a multi-modality.
[01:09 - 01:29] I'm calling it an omni-modality data platform in support of machine learning. So that's why I'm trying to understand more about the tooling that I will need to basically provide data curation, data set allocation for training LLMs.
[01:30 - 01:33] That makes sense. Thank you.
[01:34 - 01:38] Anyone else? I like to think of myself as a lifelong learner.
[01:39 - 01:49] As I said in the beginning, I'm brand new to AI. And where I work, I work for a privately held insurance company.
[01:50 - 02:13] And we've been told that we need to venture more into AI and programming from an insurance standpoint, the claims side of things. And it would probably be more on the quoting side, the policy side: how much is this policy worth?
[02:14 - 02:33] And what are the predictions that AI can show an actuarial person, or someone that's responsible for writing and proposing the business? How can I help from an IT perspective in that regard?
[02:34 - 02:38] But again, it's brand new to me. So you are doing a great job, Alvin.
[02:39 - 02:44] I like the live lecture aspect of it. Is some of it over my head?
[02:45 - 02:51] Yeah. But I haven't learned machine learning AI like I need to.
[02:52 - 03:06] And so one of my questions would be, what is the best platform or training platform? And maybe Zao, you can help me with this for getting an introduction into AI and ML.
[03:07 - 03:28] I certainly don't want to hold back anybody; people like Maya can have vast experience with AI. I would say from my experience, with generative AI, I think it's helpful to have a machine learning background, certainly.
[03:29 - 03:36] But I think generative AI is a different beast. And so the way I think about it is there's three layers.
[03:37 - 03:49] One is the foundational model layer. The second is programming AI, which is fine-tuning, RAG, and dealing with all aspects of working with the model.
[03:50 - 04:18] And then the third aspect is AI agents, which is a later stage: composing different AI APIs and tools, and then using the LLM either to reason or to basically control the outputs for these other things. So I think, basically, in order to get productive as soon as possible with AI without going through a lot of introductory material... actually, I want to get into that.
[04:19 - 04:30] So I can actually get into that a little bit later, like maybe five minutes later, where I can go into both aspects. Thank you for providing that context.
[04:31 - 04:39] You may have seen in the email, these are the software engineering salaries at OpenAI. L3 is the entry level.
[04:40 - 04:43] It's not a junior engineer. It's a professional engineer.
[04:44 - 04:52] And the total compensation is pretty substantial. And these are the OpenAI data science roles.
[04:53 - 05:01] These are the people that work on foundational models. It ranges from $750K to $885K.
[05:02 - 05:22] Not only is that the opportunity -- so for a lot of you guys, you have full-time positions, but there's also an opportunity to build AI projects. And so I'm highlighting a couple of different AI projects that people have built: they're full-time engineers, and they've built these things on the side.
[05:23 - 05:34] So this person basically built a $3,000-a-month AI coloring application. This guy has a $500,000-a-year AI video editing startup.
[05:35 - 05:47] And this guy created a $5,000-a-month meme generator using AI. And there's a $20,000-a-month AI headshot application.
[05:48 - 06:00] And then there's a $15,000-a-month voice transcription service. Because transformers are such a fundamental tool: basically, it's multimodality.
[06:01 - 06:04] It's voice. It's graphics, like Midjourney.
[06:05 - 06:17] You have video, like Runway and HeyGen, where you can create lifelike avatars. So the fundamental architecture of transformers is fairly revolutionary.
[06:18 - 06:33] And you may have seen this in the email again, where I mentioned the two phases of technology shifts. Generally speaking, in the very beginning, you have this frenzy, where there's a pure infrastructure installation phase.
[06:34 - 06:44] So in the dot-com boom and crash, you had the installation of network infrastructure, like Cisco switches and routers. And you had fiber optics.
[06:45 - 06:51] And everything was basically installed. But at that time, there was also the application phase, which happened at the same time.
[06:52 - 06:54] Google started around then. Amazon started around then.
[06:55 - 07:05] But as you get into the application deployment phase, you get into a phase where you have even more applications. So Facebook came in that era.
[07:06 - 07:07] The iPhone came into that era. You have mobile.
[07:08 - 07:12] You have social. These were all about different applications of the internet.
[07:13 - 07:28] As you basically see, you have multiple phases: water, steam, electric power. So there's actually a quite predictable kind of pattern where technology and finance basically combine into a speculative boom, like with the internet.
[07:29 - 07:39] Like, for example, I think NVIDIA right now is the largest tech company in the world. I think as of this week, it surpassed Apple with a $3 trillion market cap.
[07:40 - 07:48] And conceivably, it could basically be a $5 trillion market cap company. And then AI seems to be following the same pattern as well.
[07:49 - 08:00] So last year was basically the beginning. And we're in that phase right now where people are trying to figure out the infrastructure as well as the AI application.
[08:01 - 08:13] So I'm curious: do you guys want to apply all this knowledge to AI? How do you guys think about this?
[08:14 - 08:23] When I mention a lot of the AI transformations and a lot of the salaries, what's your initial take? Well, this is Julius.
[08:24 - 08:28] Yeah, the salary is nice. Also, the challenges it presents are pretty interesting.
[08:29 - 08:43] I joined out of intellectual curiosity as well; I just wanted to know how they work out the context, which was a great presentation, by the way. And that was it for me.
[08:44 - 08:48] What do you mean, "work out the context"? So you have a word.
[08:49 - 09:08] You used the query in context for a certain word, and other words for the keys. So you have the word "wind", W-I-N-D, which could be an adjective or a noun or even a verb, right, but it's the context in which it's used.
[09:09 - 09:18] That's how you determine the context, essentially, of that word or that key. Yeah.
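The "wind" example above can be sketched with toy numbers. In scaled dot-product attention, each word's query is compared against every word's key, and the resulting weights mix the value vectors, so the surrounding words pull an ambiguous word toward the right sense. The 4-dimensional embeddings below are made up for illustration; they are not from any real model.

```python
# Toy sketch of how attention disambiguates a word from its context.
# (Hypothetical 4-dim embeddings, not from the lesson's actual model.)
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                             # query-key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)   # softmax per row
    return weights @ V, weights

# Sentence: "the wind blows" -- each word gets a made-up embedding.
emb = {
    "the":   np.array([0.1, 0.0, 0.2, 0.0]),
    "wind":  np.array([0.9, 0.5, 0.0, 0.3]),
    "blows": np.array([0.8, 0.4, 0.1, 0.2]),
}
X = np.stack([emb["the"], emb["wind"], emb["blows"]])

# In a real transformer, Q, K, V come from learned projections of X;
# here we use X directly to keep the sketch minimal.
out, weights = attention(X, X, X)

# The row for "wind" attends most to itself and then to "blows":
# the verb's representation pulls "wind" toward the weather sense
# rather than "wind a clock".
print(weights[1])   # attention weights for the word "wind"
print(out[1])       # context-mixed representation of "wind"
```

In a trained model the learned projections make these weights far more discriminating, but the mechanism is exactly this: the context words determine how the ambiguous word's representation is re-mixed.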
[09:19 - 09:30] Yeah, I agree. Yeah, it's one of the things that Alvin spent a lot of time on, and we went back and forth on the best way to display it; he went through multiple phases.
[09:31 - 09:39] Like, originally it was a diagram, then he went to a whiteboard, and then the current version required multiple iterations. Thank you, Julius.
[09:40 - 09:41] I'll go ahead. Yeah.
[09:42 - 10:04] This is my-- again, just one thing I didn't mention before is I personally played a supporting role in machine learning for a long time, now, over 10 years. And most recently, I even managed a team of machine learning engineers working on speech and AED, and then also some computer vision at the end.
[10:05 - 10:12] But I've never been the machine learning guy whose hands-on actually modeling. And also, we're talking about production models here.
[10:13 - 10:23] Like, my joke is anybody can train a Hello World model. And so how realistic is it to say, I'm a guy that, unfortunately, I'm a dark horse.
[10:24 - 10:31] I did not go to college, and I take college classes still. But I don't have that strong mathematics statistics background.
[10:32 - 10:40] Is it realistic for me to think that I can get hands-on and really wrap my head around the modeling process? Yeah, absolutely.
[10:41 - 10:54] I think, not a doubt. What'll happen is, based on how comfortable you are with certain concepts, maybe some things will take longer to really sink in for you to be comfortable with and understand really deeply.
[10:55 - 11:07] But to get started playing with even production models, to fine-tune one for some use case and deploy it, I think the barrier to entry is not very high. And this fact is more important.
[11:08 - 11:26] And it's true of all of you in this room: being a good engineer matters more than knowing all of these topics from the ground up. And I think especially if you're starting with simple things, like linear regression and perceptrons, it's pretty straightforward compared to the real wild advanced stuff.
[11:27 - 11:29] Yeah. Yeah.
[11:30 - 11:42] Oh, go ahead. I was going to say, the funny thing is, whether it's a linear model, like linear regression, or a large language model, I don't think either will actually be more difficult for you to deploy.
[11:43 - 11:58] If anything, the large language model is probably just a little bit more finicky to deal with, because it's so large and you're moving gigabytes and gigabytes of data and parameters around; that's probably the only challenge. I don't think conceptually it's going to be any more difficult for you to work with and to deploy.
[11:59 - 12:08] It may take longer to comprehend like fully, but even today I'm still learning more things about what I'm actually working with. And I think it's fine.
[12:09 - 12:40] If I could just ask one more clarifying question, because I'm curious. Again, having worked with ML engineers, and always having several PhDs and VP-level leaders helping to guide the more creative modeling, when it gets down to the papers we're reading and influencing the machine learning engineers to help them try to find different solutions: I've heard a lot, obviously, about these various layers and architectures.
[12:41 - 12:57] And I'm a little confused, because I'm familiar with convolutional neural networks and recurrent neural networks, and my understanding was that GANs were the architecture for generative solutions. But now I'm hearing about transformers.
[12:58 - 13:13] I was under the impression that a transformer is another architecture as well, right? But I guess I'm confused about where something becomes a layer versus an architecture, because there are perceptrons and feed-forward neural networks.
[13:14 - 13:24] But meanwhile, I saw feed-forward mentioned in this particular case, almost as if it's a layer or some operation. Yeah, that's a good question.
[13:25 - 13:38] And for the most part, I don't actually think the distinction is well defined. So take something like a neural network: the whole large language model could be a neural network, or some subset of it could be a neural network.
[13:39 - 13:53] The funny thing is, it was only, I don't know, three or four months ago that we were building something internally. And we had roughly the same debate over what to call something: was this a network, was this a layer, was this just an operation?
[13:54 - 14:10] I don't think that's well defined. And so what matters-- So just to jump in, then, I think Ken had it right when he said, oh, it's not necessarily that it's this purist neural network solution; it really is an application of sorts.
[14:10 - 14:11] Yeah. OK.
[14:11 - 14:12] Absolutely. Absolutely.
[14:13 - 14:21] Yeah. And so really what matters when you're thinking about, like convolutions versus transformers, you could compare those two and look at, OK, these are two different styles of doing inference.
[14:22 - 14:31] But for example, if you look at an RNN versus a transformer, their main difference is not in what kinds of operations are used. The main difference is how they do prediction.
[14:32 - 14:43] Transformers keep one representation per input token. An RNN has one representation for the entire input: the whole input, no matter how large, is summarized into a single hidden state.
[14:44 - 14:47] And that's the main bottleneck for RNNs. So depending on what it is, you're right.
[14:48 - 14:53] Some are like paradigms of doing prediction. And then others are like actual operations.
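The bottleneck difference just described can be sketched with toy shapes. The weights and dimensions below are made up (this is not a trained model); the point is only the shapes: an RNN funnels the whole sequence through one fixed-size hidden state, while self-attention keeps one output per token and lets every token look at every other one directly.

```python
# Toy contrast of the two prediction paradigms (made-up weights and dims;
# this sketches the shapes, not a trained model).
import numpy as np

rng = np.random.default_rng(0)
seq_len, d = 5, 8                  # 5 input tokens, 8-dim representations
X = rng.normal(size=(seq_len, d))  # token embeddings

# RNN: the whole input is squeezed through one recurrent hidden state.
W_h, W_x = rng.normal(size=(d, d)), rng.normal(size=(d, d))
h = np.zeros(d)
for x in X:
    h = np.tanh(h @ W_h + x @ W_x)  # h must summarize everything seen so far
rnn_summary = h                      # shape (d,): one vector for the entire input

# Self-attention: every token keeps its own representation and can look
# directly at every other token -- no single fixed-size bottleneck.
scores = X @ X.T / np.sqrt(d)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights = weights / weights.sum(axis=-1, keepdims=True)
transformer_out = weights @ X        # shape (seq_len, d): one vector per token

print(rnn_summary.shape)      # the RNN bottleneck: a single vector
print(transformer_out.shape)  # per-token outputs
```

That single fixed-size vector is why long inputs degrade RNNs: everything has to fit through it, whereas the attention output grows with the sequence length.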
[14:54 - 14:59] And I guess, to be fair, there are actually four different categories for organizing all this knowledge. We can talk about this later.
[15:00 - 15:04] And so I can help sift through that madness a little bit at the end. Cool.
[15:05 - 15:26] Yeah. And the other thing is that I'm self-taught in machine learning, and so I went through what people recommended, which is: I took a bunch of courses, I went into the theory, I went through a bunch of textbooks, and then I did the same thing for deep learning around 2016, 2017.
[15:27 - 15:45] And then for generative LLMs, my observation is that a lot of the theory I learned is somewhat helpful, but not entirely helpful, basically. And for someone that's basically trying to learn the stack (and this is partially how we designed this), we're trying to design this so that it's at a code level.
[15:46 - 16:08] So you're not going into understanding, like, calculus and gradient descent and all the traditional machine learning things, because it doesn't necessarily apply at a day-to-day level when you're actually trying to build a foundational model, or you're trying to apply a foundational model, basically. And so that's the overall context for this.
[16:09 - 16:14] So-- Sounds great. Yeah.
[16:15 - 16:20] Yeah. And then-- so all of you guys are here, because you guys are curious about AI.
[16:21 - 16:26] Is it OK if I challenge you guys a bit? Yeah.
[16:27 - 16:32] Yeah. Some of you guys are working in machine learning.
[16:33 - 16:43] And so the question is: there's a generative AI kind of trend. Arguably, if you look at the dot-com kind of boom and bust, that lasted nearly 30 years, and it's still going.
[16:44 - 16:53] So we're in year one of the generative AI kind of madness. So the question is, can you hop onto the AI bandwagon?
[16:54 - 17:07] And so there are three kind of key questions. Like, what's stopping you from working at an AI company and achieving those salaries, if you desire to work at an AI company?
[17:08 - 17:19] What's stopping you from building an AI company? So as you see, what I mentioned back there, a lot of those projects were actually built by engineers on the side.
[17:20 - 17:33] And at the very beginning of the dot-com kind of boom, people were being paid a million dollars to build websites. Websites are fairly trivial nowadays, and there are all sorts of templates and resources.
[17:34 - 17:40] But in the very beginning, the knowledge was quite scarce. And what you could produce was quite surprising.
[17:41 - 17:53] And the other question is, what's stopping you from becoming an AI expert consulting with others? And so this can basically be you working in an insurance company, being the AI guy at your company.
[17:54 - 18:04] There's a variety of ways to apply this. So as we mentioned, when we surveyed people, we wanted to understand the biggest pain points.
[18:05 - 18:27] And a lot of you guys, like Maya and Ken, mentioned that you have some context on it, but you don't necessarily have time to go back and get your bachelor's and PhD and other things. You need to basically be able to practically build what it is.
[18:28 - 18:33] Be able to understand the concepts. Be able to make judgments on the concepts.
[18:34 - 18:39] Use and apply AI. And be able to build in the right direction.
[18:40 - 19:05] So the reason why I mentioned this is that, for the first time, Newline is launching a live cohort with Alvin, and we're focused on end-to-end building. You can think of it as a zero-to-hero course. So the idea is to take you from zero to hero and be able to go through a lot more things in detail.
[19:06 - 19:19] The live workshop is focused on transformers, whereas this course is focused on the entire lifecycle of a model. So you'll be able to build a hello world with LLM inferencing.
[19:20 - 19:25] Understand the entire lifecycle of a model. Be able to build a transformer-based model.
[19:26 - 19:31] Then once you build one from the ground up, then you're able to adapt it. So what does that mean?
[19:32 - 19:35] There's fine tuning. There's instructional fine tuning.
[19:36 - 19:40] There's compressing it. And then adapting it for search and private data.
[19:41 - 19:59] Adapting it for search and private data is something called retrieval-augmented generation. And then, at the very end, tying everything together and using LlamaIndex to adapt it to your own application.
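As a rough sketch of the retrieval-augmented generation idea just mentioned: retrieve the most relevant piece of private data for a query, then prepend it to the prompt before calling the language model. Word overlap stands in here for real embedding similarity, and the documents and query are hypothetical; a production system would use an embedding model and a vector store (for example via LlamaIndex).

```python
# Minimal retrieval-augmented generation (RAG) sketch.
# Word overlap stands in for real embedding similarity; a production
# system would use an embedding model and a vector store, then an LLM.
documents = [
    "Claims are processed within 30 days of filing.",
    "Policy premiums depend on the actuarial risk model.",
    "Our office is closed on public holidays.",
]

def similarity(query: str, doc: str) -> float:
    """Toy cosine-style similarity based on shared words."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / (len(q) ** 0.5 * len(d) ** 0.5)

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k documents most relevant to the query."""
    ranked = sorted(documents, key=lambda doc: similarity(query, doc), reverse=True)
    return ranked[:k]

query = "How are policy premiums calculated?"
context = retrieve(query)[0]

# Augment: prepend the retrieved private data to the prompt before
# sending it to the language model.
prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
print(prompt)
```

The key design point is that the model itself never has to be retrained on the private data: retrieval injects it at inference time, which is why RAG is the usual first step for adapting an LLM to company documents.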
[20:00 - 20:12] And then as part of this cohort, we're throwing in a FANG machine learning cheat sheet. So Alvin has been at Facebook, Tesla, and then has interviewed at all the major companies.
[20:13 - 20:19] So the machine learning and generative AI cheat sheet is designed for FANG interviews, should you go down that road.
[20:20 - 20:38] There's a technical review of your project with Alvin, and a business review of the project with me. And then we're going to be doing more of the workshops that are more at the conceptual level, focused on different things, and then you get one additional live workshop with Alvin.
[20:39 - 20:56] And the guarantee for this is that we will help you complete your desired project at the end of this. So not only will you build this; obviously, we're not going to help you build a startup, but we will help you build
[20:57 - 21:16] the beginning foundations of whatever it is that you desire to do. So whether it's building the prototype and foundation for generative AI, like what Maya is basically doing, or helping with AI at the insurance company, whatever it is, we're going to help you complete your project.
[21:17 - 21:27] How is this different from a bootcamp with a capstone project? Is it-- Yeah, this effectively is the bootcamp with a capstone project.
[21:28 - 21:37] And next year, we're going to turn it into a full bootcamp. And so for right now, basically, it's primarily focused on a live cohort.
[21:38 - 21:48] It is practically an AI engineer bootcamp, to answer your question. And the value of this is, you've seen the salaries for being an AI engineer.
[21:49 - 21:57] So should you go that way, you should be able to increase your salary at least $50,000 a year. That's kind of a modest increase.
[21:58 - 22:05] And equated over 10 years, that's $500,000. There's the cheat sheet on generative AI interviews with FANGs.
[22:06 - 22:14] Should you get a FANG job, that's worth more than $50K a year. You'll get a complete course on end-to-end streaming with LangChain.
[22:15 - 22:30] A course that we launched that actually provides a template that you can use for any startup or anything that you want to do. You'll be able to speak the lingo and be able to consult, whether internally at your company or outside of it.
[22:31 - 22:39] And then you'll be able to build the foundation of an AI company. All of this, we think, is actually very valuable at the very beginning of the AI boom.
[22:40 - 22:51] And then, not only that, you'll get personalized attention from Alvin and me, both on the technical side as well as on the business side. And so we think this is worth $3 million in value.
[22:52 - 23:06] And then because we're launching, this is going to be a $20,000 course next year, designed to bring AI engineers up to speed. But because we're launching this right now, it's only going to be $5,000 over three months.
[23:07 - 23:19] There's also a payment plan, which is $1,600 per month. And if you decide to pay it all at once within the next 24 hours, you can get it for $4,000.
[23:20 - 23:24] I'm not sure if you guys have any kind of questions about this. Feel free to ask any questions.
[23:25 - 23:42] And so part of the reason why we launched this was that we heard these concerns about not being able to get up to speed as an AI engineer, and wanting to participate in AI engineering, yet not wanting to basically start from the very beginning.
[23:43 - 23:49] So this is what I mentioned. I think it was Jeff's question, basically, on where do you get started?
[23:50 - 24:00] And I can address that for you as well. I took a bunch of courses that were in very disparate places to go from zero to hero on machine learning.
[24:01 - 24:11] But practically, I was self-studying for over a year, self-studying a lot of these concepts. And so we decided to produce this and focus on building.
[24:12 - 24:17] And this is effectively a zero to hero course. >> Okay.