A Glimpse into our Group Coaching Calls 3

  • [00:00 - 00:01] - Where are you? - Go ahead, Michael, how's it going? - How's your project coming along?

    [00:02 - 00:10] - It's okay. Right now I'm working on just building it as, like, a basic RAG question-and-answer: give it a textbook and be able to query the content from it. - Yeah.

    [00:11 - 00:13] - Slow but steady. - Can you say that one more time, Terry? - Slow but steady.

    [00:14 - 00:20] - Slow but steady, yeah. Hey, Diven. - Hey, so how are you guys doing with your project and the bus road project?

    [00:21 - 00:23] - James, you wanna go first? You can. - Yeah.

    [00:24 - 00:41] - I've been working a little bit on getting it to just a basic state of RAG, where I can upload a text and ask questions about it. And then I've been doing a bit with ChatGPT, like describing the project as I'm trying to build it and then working with it.

    [00:42 - 00:56] I think trying to get it to give me a roadmap to take it from, like, a RAG app to a fine-tuned app and then to a fine-tuned model. And then trying to get a picture of the steps I need to actually build an application on top of it.

    [00:57 - 01:13] So I'm not so much doing the coding of it right now as I am trying to get a clear picture of, yeah, I guess the milestones along the way to something working. - Yeah, so were you in the last project coaching call, like the group coaching call?

    [01:14 - 01:17] - Last week's or? - Last week's. - I think so, yeah.

    [01:18 - 01:30] - Yeah, I uploaded the recorded video under the RAG section. So that will be very useful for you. Basically it lays out the systematic approach for how to start building out your RAG application.

    [01:31 - 01:38] And the mini project too, it's, like, specifically based on that. So maybe what you can do there is build it for yourself.

    [01:39 - 01:51] So you can replace the dataset with your dataset instead. That way you can basically build a simple RAG application, complete the mini project, but at the same time you are hitting your milestone on your personal project.
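
A minimal sketch of the basic RAG loop discussed here, assuming the OpenAI Python client; the file name, chunking, and model names are illustrative, not the course's exact setup:

```python
import numpy as np
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

# Chunk the textbook naively (by paragraph) and index the chunks.
chunks = [c for c in open("textbook.txt").read().split("\n\n") if c.strip()]
index = embed(chunks)

def answer(question: str, k: int = 3) -> str:
    # Retrieve the k most similar chunks, then answer from that context only.
    q = embed([question])[0]
    sims = index @ q / (np.linalg.norm(index, axis=1) * np.linalg.norm(q))
    context = "\n---\n".join(chunks[i] for i in np.argsort(sims)[-k:])
    out = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer only from the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return out.choices[0].message.content
```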

    [01:52 - 02:01] - But yeah, I'll get into the mini project, I can take the time for it. And I think we've got the one-on-one coaching call later on today to get into more specifics about whether I'm on the right track or not.

    [02:02 - 02:12] - Yes, thanks. - Thanks. - Yeah, so I continue to iterate with one of the main customers I'm working with.

    [02:13 - 02:17] And there's always, like, a different format of data coming in. I think that's my challenge.

    [02:18 - 02:28] It's never consistent. So the thing I started, like, the last couple of days: I spent a lot of time looking at LangGraph and how to build an agent using that framework.

    [02:29 - 02:41] And actually, I went through their whole, they call it the LangGraph Academy. It was a really useful tutorial just to show how their framework works with human in the loop, judging, memory, and all that other kind of stuff.

    [02:42 - 02:46] And so I was really thinking about that. The customer has given me a new assignment.

    [02:47 - 03:08] And so today I was just starting to build the tools in LangGraph that I need to read these spreadsheets, and then probably evaluate the headings in the spreadsheets to interpret what they likely are, 'cause there's a domain model that's very specific to what I'm dealing with. But it's just that every one of these sheets has a column heading that's slightly different.

    [03:09 - 03:22] When they send one over to me, or a PDF that is obviously in a different format, there's just no uniformity. So I think right now I wanna get the agent flow to work in a reasoning fashion, I guess is the best way to put it.

    [03:23 - 03:27] Hey, I've got this document. Okay, I understand that's your list of products you wanna do.

    [03:28 - 03:43] Okay, I can see it's not just one brand. Let's figure out if I gotta crawl the brands' websites, or do you have any other spreadsheets to give me. That is what I'm envisioning, if I can get the agent to think that way. That's what I'm working on right now.
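
A sketch of that tool-calling agent flow, following the pattern taught in LangGraph Academy; the spreadsheet tool, prompts, and model name here are illustrative assumptions:

```python
import pandas as pd
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, MessagesState, START
from langgraph.prebuilt import ToolNode, tools_condition

@tool
def read_spreadsheet(path: str) -> str:
    """Return a client spreadsheet's column headings plus one sample row."""
    df = pd.read_excel(path)
    return f"columns={list(df.columns)}\nfirst_row={df.iloc[0].to_dict()}"

tools = [read_spreadsheet]
llm = ChatOpenAI(model="gpt-4o-mini").bind_tools(tools)  # illustrative model

def assistant(state: MessagesState):
    # The model reasons about the request and decides whether to call a tool.
    return {"messages": [llm.invoke(state["messages"])]}

builder = StateGraph(MessagesState)
builder.add_node("assistant", assistant)
builder.add_node("tools", ToolNode(tools))
builder.add_edge(START, "assistant")
builder.add_conditional_edges("assistant", tools_condition)  # tool call -> tools, else end
builder.add_edge("tools", "assistant")  # loop back so the agent can keep reasoning
graph = builder.compile()

result = graph.invoke(
    {"messages": [("user", "Map the headings in products.xlsx to our domain model.")]}
)
```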

    [03:44 - 03:53] - Yeah, so I published, like, the agent architecture lecture. Maybe that will be useful for building out your AI agent application.

    [03:54 - 04:13] So there are different types of agent architectures, and we are gonna be going over the more updated ones in the upcoming lectures. But in the meantime, you can also refer to that video and see how the different types of architecture work for specific use cases.

    [04:14 - 04:19] - Yeah. Yeah, what you published, is that in one of the course sections, or is it somewhere else?

    [04:20 - 04:23] Yes, it's in the courses section. Okay, okay.

    [04:24 - 04:40] - I also just published the document parser, but I don't know if that will be useful for you. - Yeah, what I've been surprised by is, at least with the main client I'm working with today, they don't tend to like the PDFs and stuff that the brands give them.

    [04:41 - 04:55] I don't know why, sometimes it looks like good information, but I think they just use the Excel files or go to the website to look at it. I think between crawling and reading those Excel files and then asking them to write new versions, there's a path forward.

    [04:56 - 05:01] I'm just trying to make it as non-hard-coded as possible. Yes.

    [05:02 - 05:36] 'Cause that to me is the tricky part of this. If every time they have a job to do something it requires a change to the code, it's not gonna work, it's gonna be too much. - Yes, for the first iteration there are always lots of errors, but slowly, eventually... What you mentioned, like the hard-coded stuff, it always starts in that direction, but then slowly, the more use cases we have, the more flexibility we add to the prompt. But at the same time, you have to make sure the flexibility does not hurt the accuracy of the existing parsing of data.

    [05:37 - 05:44] It should be, it will be something in that direction. - Have either of you used LangGraph?

    [05:45 - 05:47] It's an agent framework? Yes.

    [05:48 - 05:58] So there are multiple agent frameworks. One of the primary ones that I use is CrewAI, and then Phidata.

    [05:59 - 06:02] - Phi data? - Phidata, P-H-I.

    [06:03 - 06:09] Okay, Phidata, okay. I haven't seen that one.

    [06:10 - 06:20] - I looked at the CrewAI one, and I don't know why, I just gravitated maybe more towards the LangGraph one. - So both of them have advantages and disadvantages.

    [06:21 - 06:38] The LangGraph one goes more on the foundational side, and you will be able to go thoroughly into the technical depth of what you want to add in LangGraph. While CrewAI is more on the orchestration side.

    [06:39 - 06:47] So for example, you know, multi-agent. It's easier to orchestrate a multi-agent architecture to make it work for your pipeline.

    [06:48 - 07:14] So CrewAI is more on the usability side. - Yeah, one thing I liked, and I don't know, James, if you would be interested in this, but I'm gonna share my screen because I've been working through it. Let me show this. Visually, what I liked is you build your graph (they call it creating a graph), and then you can actually see the workflow inside of it visually, which I thought was really helpful in a methodical way.
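
Continuing the sketch above, the compiled graph can be printed for the visual workflow view mentioned here, and LangSmith tracing is switched on with environment variables; env var names follow the LangChain docs (newer releases also accept LANGSMITH_* equivalents):

```python
import os

# LangSmith tracing captures call history, tool inputs/outputs, and per-step
# latency (the "where did the 7.5 seconds go?" view).
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "<your LangSmith key>"

# `graph` is the compiled StateGraph from the earlier sketch.
print(graph.get_graph().draw_mermaid())  # Mermaid view of the nodes and edges
```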

    [07:15 - 07:33] And then they have this thing called LangSmith, which actually shows you the call history and what goes in and out. And today, I mean, I had a simple tool I built, which was just like, hey, given a product, maybe determine the brand and the brand's URL, 'cause I need to crawl.

    [07:34 - 07:46] - Yeah. - I can look at the runs and I can see, hey, it's taking 7.5 seconds, but where's the slowdown? Is it in the assistant that's trying to figure out what you're trying to do and pick the tool, or the tool call? Not too bad.

    [07:47 - 07:53] I don't know, at least it gives me insight into where the bottlenecks might be in the process. So I found that really helpful.

    [07:54 - 07:58] And then they have that studio tool, which gives you a visual way. I don't know if I have that.

    [07:59 - 08:09] And I think I killed it, but it'll show you how you can stop the flow and interrupt it along the way. I don't know, it's been enlightening for me to work that way.

    [08:10 - 08:27] It's been a little bit easier. - Yes, yes, that's why. Like you mentioned with printing the graph, you can basically go into as much technical depth as possible in the framework that you are using to easily run the app.

    [08:28 - 08:41] - CrewAI is, like, for the people who want to build AI products faster, but at the same time don't want to worry about carrying the heavy lifting on the technical side. - Yeah.

    [08:42 - 08:52] - I want to ask a question, like, how does that work? So when you pass, like, the product name, how is it gonna recognize what you are able to look into? Is it, like, solely based on the SEO?

    [08:53 - 09:02] - I don't know if I understood your question. Say it one more time. - So you mentioned that you are passing the product name and then it's gonna look for the URL, right?

    [09:03 - 09:17] So is it like a Google search or? - Like, boy, I'll share my screen again so you can just see the flow 'cause I literally just started building this morning 'cause all my other Python notebooks had been, like I said, very hard coded.

    [09:18 - 09:23] Here's a spreadsheet, do this to that. So the brand tool I created was just this.

    [09:24 - 09:39] Here's the prompt, determine the brand of a product based on its name and optionally vendor ID or style ID. So given the product name and the vendor style you supplied, determine the brand and the URL for the brand website, focus on official brand websites, not retailers.

    [09:40 - 09:47] And I also put like a confidence interval in it too. And so, like, where's my, oh yeah, here's the class.

    [09:48 - 09:56] So I just defined a name, a URL, and a confidence score for it. And then I'll skip over that one for now.

    [09:57 - 10:03] So then down here, I was experimenting with different models. I'm still using that DeepSeek chat one.

    [10:04 - 10:13] And then I could pass in the tools, like binding the tools to the LLM. And then I create the assistant and I just say: you're an expert merchandiser, you have these tools at your disposal.

    [10:14 - 10:26] So you can determine the brand of a product based on its name and vendor style ID or given a URL to like a Google Drive file and you can read in an Excel. And so that's how it does it.

    [10:27 - 10:42] And then I create the graph. And so really the assistant has to recognize whether I'm asking it to get the brand's website or not, or, with my next tool, that I'm gonna give it a file to read in, which would have a list of products.
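
A hedged sketch of the brand tool's structured result, with the name, URL, and confidence fields described on screen; the field names and model here are assumptions (the actual setup uses a DeepSeek chat model):

```python
from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI

class BrandInfo(BaseModel):
    """Structured result: the name, URL, and confidence score shown on screen."""
    brand_name: str
    brand_url: str = Field(description="Official brand website, not a retailer")
    confidence: float = Field(ge=0.0, le=1.0)

llm = ChatOpenAI(model="gpt-4o-mini")  # stand-in for the DeepSeek chat model
brand_llm = llm.with_structured_output(BrandInfo)

guess = brand_llm.invoke(
    "Determine the brand of a product based on its name and optionally a vendor or "
    "style ID, plus the URL for the brand website. Focus on official brand "
    "websites, not retailers.\nProduct: Brooks Glycerin 22 running shoe, women's."
)
print(guess.brand_name, guess.brand_url, guess.confidence)
```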

    [10:43 - 10:54] And then if you look at the sheet I have, this input, the name, it's pretty obvious. I guess as a human looking at it, it's the Brooks Glycerin 22 running shoe, women's, right?

    [10:55 - 11:27] But I also pass in the style ID, 'cause when I've done some experiments with Firecrawl to do the search, to try to find the brand's page, if I pass the style ID in, it's more accurate at putting the brand's website at the top. But I think what I'm gonna have to do when I do that search to find the brand's URL is give the results to an LLM, 'cause there's a set of results that come back from Firecrawl, like five or six, and say, make sure to pick out the URL to the product page.

    [11:28 - 11:40] 'Cause what I found when I did this before, the client's feedback was like, hey, sometimes you've got a URL to, like, Amazon, and they don't want the Amazon content. So I gotta stop that.

    [11:41 - 11:42] Does that make sense? - Yes.

    [11:43 - 11:54] - And the other thing they've given me, like, this is an example of Patagonia jackets. And so they have specifications for the jackets.

    [11:55 - 12:05] And the reason they repeat is 'cause they're just features that have to be plumbed in. And so he's basically asking me, "Can you fill in all the values for this?"

    [12:06 - 12:15] And again, I think he's not giving me any vendor input this time. So I'm gonna have to crawl Patagonia's website to pull down what I can.

    [12:16 - 12:26] And then hopefully assign the specifications. This one will be a little more interesting because he also gave me a set of values.

    [12:27 - 12:38] So it's like a dictionary of things that can be used. Though I think I'm gonna have to create a table and load this in and treat it like memory in a way.

    [12:39 - 12:59] - Yeah. - So that when I say go look at the website, you also gotta go look at this table and figure out if there's already a value in there that you can reuse, 'cause they're trying to keep standardization. What I've been trying to figure out in my head is: do I use something like LanceDB and start putting this stuff into it, or just use a regular database?

    [13:00 - 13:20] Because, and maybe this is a question for you, and maybe James has thought about this, but these specs, they want them to match exactly. What I'm worried about is that their version of, I don't know what you're gonna call it, manufacturer's warranty of 12 years, might look different than what Patagonia puts on there, like "12 yr".

    [13:21 - 13:28] So I'm gonna have to have like a fuzzy match to get it to pick the right thing. Do you know what I mean?

    [13:29 - 13:58] - Yes. So for, I mean, for data: the data right now, the one that you showed, can be parsed in the context itself. But if the data gets longer, you would have to use LanceDB (not necessarily that specific DB), and you can just use a function called upsert, where you can insert the data without converting it into vectors.

    [13:59 - 14:08] And basically it serves the data back to you. And then you can check what retrieval works best for your use case.

    [14:09 - 14:33] So there are different types of retrievers: the traditional one, where it matches the words, and the one with embeddings. But if you use embeddings, then you're gonna have to store this data as embeddings in the vector DB. So you can use the embeddings, or structured storage, or both.

    [14:34 - 14:49] So the match can be better with traditional retrieval, which is basically word matching, plus the embeddings, such that it's not focused on the word itself. It's focused more on the context.

    [14:50 - 15:00] So what you mentioned earlier, like some of them might have one year in different style, but it says one year. So the context is like one year.

    [15:01 - 15:17] So for that, basically, you would have to use embeddings to get highly accurate results. - That's kind of what I was thinking, 'cause in their documentation they talked about the hybrid approach, where you have both the traditional keyword and then the vector.
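
A sketch of that upsert-plus-hybrid-retrieval idea in LanceDB, assuming a recent lancedb release and its OpenAI embedding registry; table and field names are illustrative, and merge/auto-embedding behavior can vary by version:

```python
import lancedb
from lancedb.embeddings import get_registry
from lancedb.pydantic import LanceModel, Vector

embedder = get_registry().get("openai").create(name="text-embedding-3-small")

class Spec(LanceModel):
    key: str                                    # e.g. "warranty"
    value: str = embedder.SourceField()         # text that gets embedded automatically
    vector: Vector(embedder.ndims()) = embedder.VectorField()

db = lancedb.connect("data/specs")
tbl = db.create_table("specs", schema=Spec, mode="overwrite")
tbl.add([
    {"key": "warranty", "value": "manufacturer's warranty: 12 years"},
    {"key": "fabric", "value": "100% recycled polyester shell"},
])

# Upsert-style merge on the key column (the "upsert" mentioned above).
tbl.merge_insert("key").when_matched_update_all() \
   .when_not_matched_insert_all().execute(
       [{"key": "warranty", "value": "12 yr manufacturer warranty"}])

# Hybrid = keyword (FTS) + vector search, so "12 yr" can fuzzily match "12 years".
tbl.create_fts_index("value")
hits = tbl.search("manufacturers warranty of 12 years",
                  query_type="hybrid").limit(3).to_list()
```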

    [15:18 - 15:34] And I was thinking that was gonna be my only way to find a likely match, where it wasn't just 100% exact, however it was worded. My thought process was, in that agent, it's gonna have to have some kind of config or setup stage.

    [15:35 - 15:38] Do you have any tables or reference tables you wanna set up, and how are they used?

    [15:39 - 15:49] You load them in, maybe you have to create some kind of memory table that says, hey, this is what this is used for. And then what I was trying to figure out is can I have like dynamic tools?

    [15:50 - 16:04] So you might have a generic read-or-update LanceDB tool based on a table. Do you need to tell the agent when to use that table to do a reference or a lookup?

    [16:05 - 16:06] Do you know what I mean? - Yes.

    [16:07 - 16:22] - So that's basically something similar to what I have on the user end, where you have an option of tags. And these tags basically include those things: different tags for different things.

    [16:23 - 16:32] And then basically these tags define the location of what to retrieve from. So for example, this is sort of what you are mentioning.

    [16:33 - 17:07] So if you are dealing with PDFs in future, maybe you can go over the document processing. I have a document processing one, but the thing that you're mentioning is basically this one, where the feature is still in the development stage. But if you see over here, it has the tag on the user side.

    [17:08 - 17:22] I'm going to provide, like, flexibility on removing this tag. And then basically if you have another tag where it says, like, lecture, then it should basically take the lecture location.

    [17:23 - 17:28] And then you can basically ask questions. So it will be very specific with that.

    [17:29 - 17:37] So that's basically what you can do on the user end. - So, tags or labels that you assign to direct it on where to go search.

    [17:38 - 17:51] - Yes, exactly. - Maybe one more question for you. I haven't really used LanceDB other than just walking through their tutorial, but do you create, like, concepts of foreign keys with it too, or should you?

    [17:52 - 18:01] I think my question is: with a lot of search databases, you generally have big, long, wide tables. And in my case, a product can have any number of specifications, right?

    [18:02 - 18:21] So if I was going to do that in Postgres, I'd probably put a table for specifications and have a foreign key back to the product. In this case, I'm probably not going to have more than 10 or 12 features and 10 or 12 specifications. And I'm just wondering, should I keep them all in one wide table or split it?

    [18:22 - 18:38] And then again, just not having used LanceDB a lot, I don't know what the capabilities are at this point on that design. - So these DBs usually work with namespaces: different namespaces, pairs, and different things.

    [18:39 - 18:53] So it does not hold tables and stuff; what it holds is the architecture itself. So what I'm using for the AI chat is basically turbopuffer.

    [18:54 - 19:03] And if you see, I'm already testing with 12.3 DB queries. What I did over here was, in my use case, I have these different communities.

    [19:04 - 19:13] So these different communities have different IDs. So I use those specific IDs to map out each namespace.

    [19:14 - 19:26] And then, for example, threads: for all these threads, I have a specific namespace for the threads. I have a specific namespace for the events.

    [19:27 - 19:33] I have a specific namespace for the members, and all sorts of other things. - Is that like a table, if I were to compare it?

    [19:34 - 19:36] Or not? - More of a database.

    [19:37 - 19:41] - Okay. - More of a database. And then, so basically, these are the DBs.

    [19:42 - 19:52] So I cannot, like, access and see what's going on in there. But think of these as, let me see if I can show you the architecture inside one of the namespaces.

    [19:53 - 19:58] So each namespace has a specific architecture. And these architectures can be created on your own.

    [19:59 - 20:07] It doesn't have to be like the tables in all those traditional databases. It's more about how you want to retrieve the data.

    [20:08 - 20:15] So this is the architecture. Each of the architectures is different for each of the namespaces.

    [20:16 - 20:30] The main one that I'm using right now, like recently, the API that I created is called lessons. And the vector is basically the attached content from the lesson converted to embeddings.

    [20:31 - 20:38] And it has the metadata. So the metadata here are lesson description, objective, content snippets.

    [20:39 - 20:43] Or it can be more precise. So this is generated by AI.

    [20:44 - 20:50] But slowly, I realized it had the lesson ID. It had the title.

    [20:51 - 21:02] And then the meta title from the web page. And then to retrieve the specific thing what I'm using is basically just the lesson ID.

    [21:03 - 21:16] Because whenever you go over to one of these courses, you will always have the lesson ID in the URL. And I'm using that on the client side; it automatically gets attached in the AI chat.

    [21:17 - 21:28] So this is, in a similar fashion, what you can use with your own architecture. And basically the embeddings are gonna be the things that you want to contextually capture.

    [21:29 - 21:48] The other structure, the metadata, in order to retrieve those specific things, is gonna be all the parameters that should be stored in the metadata. And then you can basically use hybrid retrieval to retrieve a specific thing.

    [21:49 - 21:59] If you want to make the retrieval rigid, then you would have to just use, for example, in my use case, the lesson ID. In your use case, it can be a unique identifier.
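
A sketch of this namespace-per-entity pattern; the calls follow an older turbopuffer Python client from memory, so treat the exact signatures as assumptions and check the current docs. embed() stands in for your embedding function:

```python
import turbopuffer as tpuf  # signatures assumed from the older client; verify against current docs

# One namespace per entity type, mirroring the communities/threads/lessons split above.
lessons = tpuf.Namespace("lessons")

# Vector = embedded lesson content; metadata = every parameter needed to retrieve it again.
lessons.upsert(
    ids=["lsn_123"],
    vectors=[embed("lesson content...")],  # embed() is your own embedding helper (assumed)
    attributes={
        "lesson_id": ["lsn_123"],
        "title": ["Intro to RAG"],
        "description": ["Systematic approach to building a RAG app"],
    },
)

# Rigid retrieval: filter on the unique identifier the client already has
# (here, the lesson ID that appears in the course URL).
results = lessons.query(
    vector=embed("user question"),
    top_k=5,
    filters=["lesson_id", "Eq", "lsn_123"],
    include_attributes=["title", "description"],
)
```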

    [22:00 - 22:07] In that specific namespace. - You're using turbopuffer to do this, right?

    [22:08 - 22:17] - Yes. Basically, it gives you the ability to directly scale your AI application. - LanceDB and turbopuffer are very similar, correct?

    [22:18 - 22:35] - Yes, LanceDB, on the other hand... So, LanceDB is like a more updated version of turbopuffer. There are some use cases where your AI application will have lots of data at some point.

    [22:36 - 22:49] And then you are just tired of data ingestion, because data ingestion requires continuous iteration. And what they did was they introduced the multimodal data DB.

    [22:50 - 22:56] So you don't have to worry about anything. You just convert those files into multi-vector data.

    [22:57 - 23:06] It takes time. And it basically stores things without you worrying about how to ingest that data on the specific platform.

    [23:07 - 23:10] You don't have to worry about metadata. You don't have to worry about any structure.

    [23:11 - 23:25] You just take those files, convert them into multi-vector data using a model called ColPali, or it can be any multi-vector model from Hugging Face. And that basically uses the DB from LanceDB.

    [23:26 - 23:35] And it stores all these things in multi-vector form. Now, the downside of this is time: it is time-consuming, it's slow.

    [23:36 - 23:56] But at the same time, you are not worrying about storing in a specific structure; you are just throwing in the documents without worrying about the structure. - That's more like a data lake or something, where you're just taking whatever you've got, video, images, text, PDFs, and it's all getting ingested and put into something that's searchable.

    [23:57 - 24:08] - Yes. - I guess in my case, I'm trying to think about whether I could apply that. It comes back to what I was saying earlier: I've got spreadsheets.

    [24:09 - 24:16] I've got spreadsheets from the brands, I've got PDFs from the brands. There are all these other docs; I guess that could all get ingested.

    [24:17 - 24:33] But if I'm in that cycle then where I get a list of products and they say, "Hey, I want to go update these," then I have to start doing searches on LanceDB to say, "Do you have anything that relates to this product?" Then I have to figure out: are the results worth using?

    [24:34 - 24:38] Is that fair? - I think the multi vector one cannot be edited.

    [24:39 - 24:56] Let me see, because multi-vector RAG is still, like, ongoing research, and it doesn't have the same accuracy as the ones that we were discussing, turbopuffer and the traditional ones. - Okay.

    [24:57 - 25:34] - Yeah, the other thing that I thought was interesting with LanceDB, that I could probably use, though there wouldn't be as much human in the loop, and maybe that's a good thing: they have this thing called user-defined functions, UDFs, on the columns. For example, if I created a table for products, and I had columns for specifications or features, I could create a function, I think, that basically says any time I import or add a product, the function gets invoked to fill out the data in any of those columns.

    [25:35 - 26:03] - Yes. - It will scale it for you, like it can do this at scale, assuming I use their enterprise plan or whatever it is they can sell me. But then, I don't know, maybe that would be the easiest of all, assuming the model and the outputs are good. I'm just thinking I would probably have to have more checks, or LLM as a judge, to make sure something's good versus just relying on, like, a one-shot.

    [26:04 - 26:12] - Yes. So I don't know, I didn't know if you've used that feature, but I thought that was kind of interesting. - Yes.

    [26:13 - 26:56] All right, now, I don't think I have any use cases where I use that specific feature, but for function calling and tool calling, I have used these things. For the SEO bot, one of our applications, it's basically calling web searches whenever it lacks data for the writing, and then it continues the writing based on those web searches and creates an SEO-optimized article for you, like that. So it's multi-hop reasoning, plus writing, plus function calling at the same time, whenever there is a lack of data while writing that specific section.

    [26:57 - 27:17] - No, that's a good example. So this is what we have, let me show. So, it mentioned only Qdrant has the ability to edit the multi-vector data, and I don't know if it's still a viable solution for building an AI application.

    [27:18 - 27:29] - Okay. - So this is the best example of multi-agent. And in the back, I and one of my other teammates, we both built this bot in just one month.

    [27:30 - 27:56] And what we have in here is this interface: the user adds the website, for example, it can be your website. What it does is it generates, or extracts with the help of Firecrawl, all these different settings, like the website name, type of platform, the summary, blog theme, all these different settings.

    [27:57 - 28:19] In order for us to gather more contextual information on what SEO work should be done. And then when you click on add headlines, it triggers an agent that basically generates the headline based on, not updated settings, but the website settings.

    [28:20 - 28:29] Now, it's not solely based on these website settings. What happens in the backend is like add headlines has multiple agents.

    [28:30 - 28:43] So what it does is it takes these settings and constructs this taxonomy based on the settings. So these are all the taxonomies that it is automatically generating.

    [28:44 - 28:59] So this is one of these and that's doing that. Now, once it's done generating the taxonomy, what it will do is it will trigger another agent, where it will get the, so far I have also set a pointer.

    [29:00 - 29:11] So for example, the pointer might be somewhere here. Now, that other agent is basically using this taxonomy to retrieve high-quality SEO keywords.

    [29:12 - 29:29] And then basically this agent saves these high-quality SEO keywords in our personal database, which is isolated from the rest. And then we use those optimized SEO keywords to generate the headline.

    [29:30 - 29:41] - So the taxonomy, though, was it auto-created, or did you guys put in the taxonomy and the pieces of topics that you care about? - It's created by an agent.

    [29:42 - 29:54] So now everything is created by an agent, these too. That's the reason I wanted to explain meta-agent architectures to you, because slowly it gets complex.

    [29:55 - 30:05] So what users have to do over here is just add a website; they don't have to do anything else, they just add a website. And then there's Firecrawl, from our earlier conversation.

    [30:06 - 30:11] I'm using that in the backend. It extracts all the contextual information, which is basically the website settings.

    [30:12 - 30:21] And it saves it in the database, on the backend. - Yeah, so effectively it's learning about your business by looking at your website.

    [30:22 - 30:32] - Yes. And then, the first time, it automatically triggers this add headline agent. But now we already have the headline.

    [30:33 - 30:51] So once the add headline agent is triggered, it takes these website settings, or website context. On top of that, we have additional context, and an AI agent generates the taxonomy based on the context.

    [30:52 - 31:14] And then once it is done, it's gonna go back and tell the orchestration agent that it's done building the content taxonomy. So now it triggers another AI agent, which is basically mapping this taxonomy to high-quality SEO keywords.

    [31:15 - 31:25] And then it's saving all of these high-quality SEO keywords in our separate database. Because those are, like, valuable keywords.

    [31:26 - 31:37] And now I have another agent, but right now let's stay on the flow. So once we have this SEO keyword data on the backend, it triggers headline generation.

    [31:38 - 32:04] Based on the taxonomy, plus website context, plus the high-quality SEO keywords that are stored on the backend. Now, what happened was the mapping of high-quality SEO keywords was many times misaligned.

    [32:05 - 32:27] So for example, in the first iteration, what it was doing was basically taking "artificial intelligence", and the traditional retriever from the high-quality SEO data was retrieving the wrong keywords.

    [32:28 - 32:44] So what I had to do was add another agent that verifies against the content taxonomy and the website taxonomy. Basically, LLM as judge, to filter out keywords so that only the ones mapped to the specific content taxonomy and the website settings remain.

    [32:45 - 32:51] And then it triggers the headline. - And this, that sort of agent framework, you built it in CrewAI?

    [32:52 - 33:09] - Yes. I started with CrewAI plus Phidata. But then I realized maybe I should just create my own AI agent architecture instead of relying on frameworks, because basically, at the end, it's all Python coding.
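
A minimal sketch of that hand-rolled, framework-free pipeline (taxonomy, keywords, judge, headline), assuming the OpenAI Python client; the prompts and function names are illustrative, not the production bot's:

```python
from openai import OpenAI

client = OpenAI()

def llm(prompt: str) -> str:
    # Each "agent" below is just this helper plus its own prompt and post-processing.
    out = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model
        messages=[{"role": "user", "content": prompt}],
    )
    return out.choices[0].message.content

def taxonomy_agent(settings: str) -> str:
    return llm(f"Build a content taxonomy for this website:\n{settings}")

def keyword_agent(taxonomy: str) -> list[str]:
    return llm(f"List high-quality SEO keywords, one per line, for:\n{taxonomy}").splitlines()

def judge_agent(keywords: list[str], taxonomy: str, settings: str) -> list[str]:
    # LLM-as-judge pass that drops keywords not mapped to the taxonomy/settings.
    kept = llm(
        "Return only the keywords, one per line, that match this taxonomy and these "
        f"website settings.\nTaxonomy:\n{taxonomy}\nSettings:\n{settings}\n"
        "Keywords:\n" + "\n".join(keywords)
    )
    return kept.splitlines()

def headline_agent(taxonomy: str, keywords: list[str], settings: str) -> str:
    return llm(
        f"Write one headline.\nTaxonomy:\n{taxonomy}\n"
        f"Keywords:\n{keywords}\nSettings:\n{settings}"
    )

def orchestrate(settings: str) -> str:
    # Plain sequential orchestration: taxonomy -> keywords -> judge -> headline.
    taxonomy = taxonomy_agent(settings)
    keywords = judge_agent(keyword_agent(taxonomy), taxonomy, settings)
    return headline_agent(taxonomy, keywords, settings)
```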

    [33:10 - 33:19] So you can basically invoke functions and tool calls and all sorts of things without using those frameworks. - Those jobs though, they run for a long time, right?

    [33:20 - 33:27] You're not waiting at the prompt. Was there no framework you used to kick those off or to monitor them?

    [33:28 - 33:34] 'Cause I was assuming they went to a queue, and something has to pick them up and process them. Is that a fair statement?

    [33:35 - 33:48] - I was always just trying to understand, because it's a long-running process, and that was something I was thinking about with mine. If I've got 100 products I want to attribute, nobody wants to sit around and answer 100 questions during that process.

    [33:49 - 34:03] So I've got to get it running in the background, and then maybe pull up some that don't work and figure out how to have some human interaction, or just report back: hey, these 10 records out of the 100 didn't process successfully, and, yes, they need your review.

    [34:04 - 34:17] So I was just trying to understand, like, how you deploy this. And again, that's the reason I was leaning towards LangGraph, because of this whole deployment framework, where it takes some of that headache out of how you run it day to day.

    [34:18 - 34:33] - How I'm running these things: so I'm running basically on Trigger.dev, and I have conditional triggers on each file. It's solely based on the LLM input.

    [34:34 - 34:42] If the LLM input is basically some specific number. So for example, zero or one.

    [34:43 - 35:01] Then basically it triggers another agent. If it's not that specific input, then the architecture basically tells the agent to restart the process again and change the variable on the restarted process.

    [35:02 - 35:18] The variables over here, for my use case, are basically temperature and all the other variables that I have, until it reaches a certain point of accuracy. So at the end I'm having an LLM as judge where, for example: okay, so now this looks good.

    [35:19 - 35:29] So now it should return one, because it's at about 87% accuracy. And then basically it's returning one, and then the orchestration will take it from there.
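
A sketch of that conditional-trigger loop: the judge returns "1" to pass the work along, and anything else restarts the process with a changed variable. Model, prompts, and thresholds are illustrative:

```python
from openai import OpenAI

client = OpenAI()

def generate(task: str, temperature: float) -> str:
    out = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model
        temperature=temperature,
        messages=[{"role": "user", "content": task}],
    )
    return out.choices[0].message.content

def judge(draft: str) -> str:
    # LLM as judge: reply "1" if the draft clears the accuracy bar, else "0".
    out = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Reply only 1 (pass) or 0 (fail):\n" + draft}],
    )
    return out.choices[0].message.content.strip()

def run_until_accurate(task: str, max_attempts: int = 5) -> str:
    temperature = 0.7  # the variable the restart path changes on each attempt
    for _ in range(max_attempts):
        draft = generate(task, temperature)
        if judge(draft) == "1":
            return draft  # a "1" hands off to the next agent in the pipeline
        temperature = max(0.1, temperature - 0.2)  # change the variable, restart
    raise RuntimeError("accuracy threshold never reached")
```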

    [35:30 - 35:45] - Okay, thank you. - Yeah, but it's much easier if we use, like, a framework. The reason I did not use a framework was that it was giving me errors, because I have my own deep research agent.

    [35:46 - 36:05] So that deep research agent, I wanted to build it multi-hop plus RL-based, to mimic, in a similar fashion, how the deep research module articulates your answers at OpenAI or Gemini. So I was trying to do that.

    [36:06 - 36:15] And at that time when I was building it, I saw there were not many resources. And when I was building this deep research agent in LangGraph, I was having some problems.

    [36:16 - 36:28] And for some reason, at the end I was just receiving output that was basically a simple web search plus an error. So I did not want that.

    [36:29 - 36:40] So I had to remove all the frameworks. But I still have frameworks on some of the agent architectures, basically the headline one, because it's working, while on the article one I removed the framework completely.

    [36:41 - 36:46] - Okay, okay, thank you. - Hey, Bashir. - Hey, Gippin, how are you?

    [36:47 - 37:01] - I'm doing great, how are you? - Good. So last time we had a call, the one thing that we talked about was that, because my rendering pipeline works with a binary format...

    [37:02 - 37:08] We talked about me having to move that to something that's human readable. - Yes.

    [37:09 - 37:33] - I've been working on that right now. And you can imagine that by going from a binary format to something that's human readable, the data grows in size, or at least gets a lot more bloated. So I've had to do things to try to compress the data, remove redundancy, and things like that.

    [37:34 - 37:43] It's going fairly well. I guess one question that I had was, first, things like indentation in, like, XML files or in JSON files.

    [37:44 - 37:54] I'm trying both at the same time to see what I wanna work with. Does that make any difference from the standpoint of, for example, training?

    [37:55 - 38:18] I'm just wondering: if I make my training data, or some of my data, easier to read and train with that, but in production I want something that's a little bit more concise, will there be a discrepancy in meaning or understanding by my models with things like indentation and spaces?

    [38:19 - 38:27] Like, right now I have some nice variable names. I'm just thinking: how can I make all this text much lighter?

    [38:28 - 38:43] 'Cause right now, on my first pass, I had a binary file that was about a meg. When I was starting to convert to JSON files, with all the information I needed, it was like 50 megs, right?

    [38:44 - 39:09] So I've done a lot of things, and now I'm down to having something that, let's say, is 500K looking all pretty, and I can read through it. But if I get rid of all the spacing, that can get me down to a 100K text file, and if I go in and make some of the names maybe not as pretty, I can bring that down further, right?

    [39:10 - 39:31] I don't know if you have any feedback or any thoughts with respect to: at which point do I lose efficiency in trying to make it smaller? With respect to the amount of text, with respect to things like spacing, with respect to trying to get my model to eventually derive meaning from all the data that it will see.

    [39:32 - 39:43] - Yeah, so make sure the formatting and the spacing and the things are right. That should be, that is like the backbone of what it will generate.

    [39:44 - 39:56] So the things that you have while fine-tuning are gonna be the things that you have in the output. Big problem: if you have "apple, comma, apple" without spaces, it's gonna have that.

    [39:57 - 40:06] On the output, it's gonna have "apple, comma, banana" or other correlated things. Spaces will not be there.

    [40:07 - 40:26] So make sure the format you pick for the rendering is the most precise one, but at the same time, that you are not keeping a bunch of unneeded details, like the one you mentioned earlier, where you had 50 MB of one JSON file. Now you are down to a few hundred KB, if I'm right.
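
For the spacing question, the same record can be serialized both ways with the standard library; the fine-tuned model will tend to reproduce whichever form it was trained on, so pick one and stay consistent:

```python
import json

record = {"frames": [{"t": 0, "pos": [0.0, 1.5], "color": [255, 255, 255]}]}

pretty = json.dumps(record, indent=2)                # readable, but heavy on whitespace
compact = json.dumps(record, separators=(",", ":"))  # identical content, far fewer characters

print(len(pretty), len(compact))  # fine-tune on one format and use that same format in production
```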

    [40:27 - 40:58] - So maybe, can you show me your screen, like the difference? Because I'm still curious. - I'm working from a different machine right now, so I guess I could... I was gonna send you the stuff anyway, so I could share the files with you. But I have had to customize my serialization system just to remove redundancy, and, I guess, for going from binary to text.

    [40:59 - 41:11] There are things that in binary I never cared about. Because I want things to be generated pretty quickly and close to real time, I would much rather be dealing with 500K of data than 50 MB.

    [41:12 - 41:31] - Yes. And for example, if you try to reduce further down to 100 KB, you have to make sure you're not losing any details. - So there is a trade-off in what you mentioned, and it's one of the very important concerns; people building a fine-tuned model have this problem.

    [41:32 - 41:47] So imagine you want to use a smaller model, a 7-billion-parameter model. And you are using, for example, a 500 KB JSON file for fine-tuning.

    [41:48 - 42:01] Now, the 7-billion-parameter model will be capable of understanding to some extent. It's not going to understand to every extent, because it's a 7-billion-parameter model.

    [42:02 - 42:11] But now, if you reduce the JSON file down to 100 KB or 50 KB, it will have a complete understanding of what needs to be done at the time of fine-tuning.

    [42:12 - 42:26] So then it's a trade-off. Now, if you want to use, like, 500 KB, then you would have to basically go up, because the 7-billion-parameter model, we already trained it and it's not working great after data cleaning and all that stuff.

    [42:27 - 42:32] We have to jump up and find, like, a 33-billion-parameter model. Try again.

    [42:33 - 42:54] And once we find the right match... I think the 33-billion-parameter model basically works on average; it works for everyone. But if you want, like, small models, where you don't need as many GPU resources, try to make your dataset as small as possible.

    [42:55 - 43:09] But at the same time, you have to make sure you are not losing valuable information. Now, what you are doing right now is converting this binary data into a JSON file.

    [43:10 - 43:18] You can basically create a file or 10 JSON files and basically generate like synthetic data out of it. - Okay.

    [43:19 - 43:39] - And then basically use the synthetic data in order for you to check: is this working great, or should I go with a larger dataset and a larger model? - So, like, I tested some of the things that I thought were going to be pretty big.

    [43:40 - 43:46] And I know we talked about having to make, like, small examples, where I just draw a canvas. And those things are going to be small.

    [43:47 - 44:06] I'm going to have small examples, but ultimately where I'm trying to get to is those, I would say, on average 200... - Yeah, so... - ...kilobyte files. - It will basically need, like, a higher-parameter model.

    [44:07 - 44:15] So don't waste your time... well, you can always try it. But usually, this is the pattern I see.

    [44:16 - 44:26] A girl from a previous cohort was fine-tuning a Mistral 7-billion-parameter model. And what happened was, at the stage of fine-tuning, she was having good results.

    [44:27 - 44:42] But then when she served the model and saw the logs, the model was only generating in four specific directions, okay. And it was working great on those four specific directions.

    [44:43 - 44:49] So what I told her was: maybe there is something wrong; when you are done generating the fine-tuned model, maybe something is wrong on your prompt side.

    [44:50 - 44:56] So edit the prompt side to include like, more direction of generating things. And it struggled.

    [44:57 - 44:58] It didn't work. So this is what happens.

    [44:59 - 45:07] She had a larger dataset. This is what happens when you have a larger dataset but you want to use, like, a smaller model.

    [45:08 - 45:17] So you might just brute-force it and check if it works. - So what size do you think?

    [45:18 - 45:24] I mean, your intuition right now, what size? Number of parameters you think I might end up around?

    [45:25 - 45:27] Obviously I'm going to try. What do you think?

    [45:28 - 45:34] - I think the 33-billion-parameter one is the one where you should start. - 33 billion. Okay.

    [45:35 - 45:48] - That's if you want to go the efficient route; people have different requirements. So some people will not care about the GPU consumption or anything.

    [45:49 - 45:58] They care about accuracy. Then basically you would just go with higher models and use a specific generation technique, basically the beam search generation technique.

    [45:59 - 46:08] I forgot, it's in unit one. And that one takes a longer time, but at the same time, it produces higher accuracy.

    [46:09 - 46:18] But the drawback is it needs a higher amount of time, a higher amount of GPU resources and stuff. We will have to eventually figure out those things once we are on that track.

    [46:19 - 46:25] Okay. So I'm still working on getting my schema down; I think it's coming along pretty well.

    [46:26 - 46:29] I'm going to drop off, but thank you. Appreciate it deeply.

    [46:30 - 46:31] Okay. Okay, let's see.

    [46:32 - 46:33] Thank you, James and Bishina. Thanks everyone.

    [46:34 - 46:46] Thank you. - Yeah, what I was talking about, the text generation techniques: greedy search, beam search, and sampling from the top.

    [46:47 - 46:52] Those are usually the ones that are used for text generation. - Okay.

    [46:53 - 46:56] From the LLM. I'll put it in the chat.

    [46:57 - 47:10] Beam search is the one that takes the longer amount of time on generation, but at the same time, it's higher accuracy. While greedy search... so, all these different searches have different text-generation patterns.

    [47:11 - 47:13] Okay. And these are basically used in all LLM models.
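
Those three strategies map onto Hugging Face transformers' generate flags like this; the checkpoint is illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "mistralai/Mistral-7B-Instruct-v0.2"  # illustrative checkpoint
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)
ids = tok("The manufacturer's warranty is", return_tensors="pt").input_ids

greedy = model.generate(ids, max_new_tokens=20)             # greedy search: fast, deterministic
beam = model.generate(ids, max_new_tokens=20, num_beams=5)  # beam search: slower, often more accurate
sample = model.generate(ids, max_new_tokens=20,
                        do_sample=True, top_k=50, top_p=0.95)  # sampling from the top tokens

print(tok.decode(beam[0], skip_special_tokens=True))
```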

    [47:14 - 47:19] - So at what point does that come in? Before fine-tuning the model? - No, after fine-tuning the model.

    [47:20 - 47:24] Okay. And so, like, you would recommend I try all of those and see how they perform?

    [47:25 - 47:28] - Yes. - I think I'm going to work a little bit more on this.

    [47:29 - 47:33] Maybe I can give you a sample of my file. Okay.

    [47:34 - 47:36] By the weekend. Okay.

    [47:37 - 47:54] Of what one of my files looks like and what I'm trying to do. I have been working in both JSON and XML just to see what it would look like and if it made a difference, but in the end, it seems pretty much the same.

    [47:55 - 47:59] - So, LLMs are better at generating the JSON format. - Okay.

    [48:00 - 48:09] - Okay, so I will probably go with that. - Especially the Mistral models; the small Mistral models are optimized for generating JSON format.

    [48:10 - 48:23] - Okay. So again, just to get intuition: in the binary format, sometimes I have floating points, I have integers for things, colors are represented by bytes, RGB, and things like that.

    [48:24 - 48:42] It's interesting to see, in my JSON, like, the position of a text: at frame 1 the position is here, and at frame 200 the position of the text is over there. It's numbers, and I'm going to have lots of examples.

    [48:43 - 49:03] I don't know how it works out when it gets down to numbers, floating-point numbers and things like that. Like, what's your sense of how well an LLM is able to figure out the meaning of numbers and deal with them?

    [49:04 - 49:18] - I don't think they get the meaning of numbers; they will mimic the numbers. So fine-tuning is mimicking the numbers, okay? But if we want...

    [49:19 - 49:43] If we want to attach meanings to the numbers, then basically you would have to fine-tune plus instruction fine-tune, yes. So for example, if you throw 0.1 into the fine-tuning, it doesn't have the sense of range of what to generate.

    [49:44 - 50:10] So you would have to define the range such that it can capture the conceptual meaning of those things. Basically, you don't have to add, like, a specific instruction on each data point; you can define the range of these attributes and then maybe instruction fine-tune it with the range of the data, okay? So the numbers are only generated within that range.
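
One way to carry those ranges: a hypothetical instruction-tuning record where the instruction itself states the valid ranges, so the numbers carry meaning instead of just being mimicked. Field names and values here are illustrative:

```python
# Hypothetical instruction-tuning record: the instruction defines the valid ranges.
example = {
    "instruction": (
        "Set the caption position and color. x and y are floats in [0.0, 1.0], "
        "fractions of the canvas; colors are RGB bytes, each in [0, 255]."
    ),
    "input": "Move the caption to the upper-right corner and make it white.",
    "output": '{"pos": [0.95, 0.05], "color": [255, 255, 255]}',
}
```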

    [50:11 - 50:12] Okay. Thank you.

    [50:13 - 50:23] At the end of our call, I had this idea. So I have my rendering samples, and I can generate videos from those rendering samples.

    [50:24 - 50:51] And I wanted to basically take the output videos and feed them to something else, such as GPT, Gemini, something else. I think we mentioned, on the call, the video, where I would feed that to those models just to see what they see, and use that data as input during my training or my fine-tuning.

    [50:52 - 51:14] So the question I've got for you is: because I am manipulating videos, and manipulating images in videos, or I might even have some text. If I feed, for example, my output video to Gemini, and my input, let's say, is a white square that's manipulated and things happen...

    [51:15 - 51:47] So Gemini might tell me, oh, there is a white square that's moving from the top of the screen to the bottom of the screen, something like that. But if I give it a photo of a dog and something happened, the corner changed, or it changed from color to black and white, how do I get the data generated to focus on the transformation and not on the content?

    [51:48 - 52:09] - So if you are using a video model, there is a high chance it will start generating its own thing, because that's what diffusion models are designed to do. But you can try and get the description of the difference between the original and the edited video.

    [52:10 - 52:22] Basically, what you are mentioning is JSON prompting. The technique that you are mentioning is basically JSON prompting plus the picture or the video.

    [52:23 - 52:32] So that's a technique to use. You can test it out with, like, Veo or the Gemini video model, and feel out how it's working.

    [52:33 - 52:49] There is a huge chance it will start generating things on its own. But you would have to look into JSON prompting; JSON prompting for video is a completely different world.
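
A hypothetical JSON prompt in that spirit, pinning the model to report transformations rather than content; the schema is an illustration, not a documented format for Gemini or any other model:

```python
import json

# Hypothetical JSON prompt: constrain the model to describe edits, not subject matter.
diff_request = {
    "task": "describe_transformation",
    "instructions": (
        "Compare the original and edited clips. Report ONLY the transformations "
        "(motion, color grading, crop, speed). Do not describe the subject content."
    ),
    "output_schema": {
        "edits": [{"type": "str", "start_s": "float", "end_s": "float", "params": "dict"}]
    },
}
prompt = json.dumps(diff_request, indent=2)  # sent alongside the two video inputs
```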

    [52:50 - 52:59] But people are making good money just by using that. - That was just an idea that I had at the end of our call last time.

    [53:00 - 53:04] I guess I don't want that to take me off track of like my main thing. But it's interesting.

    [53:05 - 53:06] Oh, yes. Okay.

    [53:07 - 53:07] Thank you. I have lots of things to think about.