A Glimpse into our Group Coaching Calls 2
[00:00 - 00:05] Hello, everyone. Did you guys have a chance to look over the mini projects?
[00:06 - 00:11] I'm actually looking at the first mini project as I get the chance. But yeah, so I'm doing that now.
[00:12 - 00:14] OK, that's cool. Yeah. Hey, Craig. Yes.
[00:15 - 00:40] So the purpose of our project coaching is basically to ask any question about a notebook or mini project, or anything relating to your own personal project. I'm here to give you detailed guidance on how you should proceed and how to overcome any errors.
[00:41 - 00:55] So, has anyone encountered any confusion or any errors along the way, whether in the mini projects, the notebooks, or your personal projects? I do have something about my personal project.
[00:56 - 00:58] Yes. OK, so the one we handled one-on-one.
[00:59 - 01:12] Yes, for this, I sent you the diagram. I just want to make sure it went through, whether that's what it's supposed to look like, or if you need more from it.
[01:13 - 01:28] We can chat about that through private message. So I do have a small personal project that I was working on, and I'm done with it.
[01:29 - 01:36] I'm pretty happy with the output. And I think my question is, I feel like I need more testing.
[01:37 - 01:49] And I'm trying to find a better way to test it, because it's an image output, and I was only able to test with about 20 images.
[01:50 - 02:13] And I do have some images from online, but I just want to make sure there's a good testing strategy for a bigger number of images. Usually for the images, you can just choose ones that cover the cases you are targeting.
[02:14 - 02:36] And then basically you can choose around 50 or 100 images to test on. I would suggest 50, because when you publish this application, what you do is integrate it with Braintrust, which is a logging-centric system for your AI application.
[02:37 - 02:48] Yeah. You record a thumbs up or thumbs down whenever you have a final output, so that you can keep track of things when you publish your application,
[02:49 - 02:57] and whether the user is liking what you are showing; it can be the table, it can be the value. So thumbs up or thumbs down.
[02:58 - 03:15] And once you have that logging system set up, then basically you can... So for example, let's say you have a cluster of people who are giving thumbs down for specific things.
[03:16 - 03:46] So now you look at that respective data, at what they are giving thumbs down for, and then you try to refine it only in that specific direction. So the thumbs up and thumbs down provide valuable feedback when you are refining the AI application, while testing just has to make sure that you have enough diversity to publish the application.
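A minimal sketch of that thumbs-up/thumbs-down idea, assuming a hypothetical log_feedback helper that appends events to a local file; this is not the actual Braintrust SDK, just the shape of the feedback loop being described.

```python
# Sketch: record one feedback event per final output so failing clusters can be reviewed later.
# log_feedback and the file path are hypothetical, not Braintrust's API.
import json, time

def log_feedback(output_id: str, thumbs_up: bool, note: str = "", path: str = "feedback.jsonl"):
    event = {"output_id": output_id, "thumbs_up": thumbs_up, "note": note, "ts": time.time()}
    with open(path, "a") as f:
        f.write(json.dumps(event) + "\n")

# Example: the user disliked the passport image produced for output "img_0042".
log_feedback("img_0042", thumbs_up=False, note="background not fully removed")
```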
[03:47 - 03:55] So are you done with your personal project, or? Yeah, I mean, I have two different ones.
[03:56 - 04:04] I have three different personal projects, but this one was a lot more straightforward. So I'm done with it, and once the QA is done, I want to put it out there.
[04:05 - 04:13] And what prompting techniques did you use? Like, did you use CoT or any advanced prompting techniques?
[04:14 - 04:24] So the prompting technique is, what is that, the multi one? No, actually, I think it's the...
[04:25 - 04:30] Multi-step reasoning or few-shot? Let me check what I have for it.
[04:31 - 04:39] I had that page here. Not reasoning; it's a very straightforward prompt, just specifying things:
[04:40 - 04:51] remove the background, resize, remove certain aspects in the image, and add the missing parts.
[04:52 - 05:06] So basically it's a passport image generator. You can input any type of image, and I got the idea from how,
[05:07 - 05:16] when people take a selfie, it doesn't go as far back as it should, like a passport image requires.
[05:17 - 05:25] So you have to take the selfie with either a really long arm or with, what is that, a selfie stick; otherwise it's not going to meet the...
[05:26 - 05:40] What is that? Those passport specifications. So after removing the background, it's going to generate the rest of the body parts, and it cannot look plastic.
[05:41 - 05:46] It has to keep the realistic face. The skin texture, right?
[05:47 - 05:53] Yeah, the skin texture and no accessories, like no glasses, no earrings. Yeah.
[05:54 - 06:04] So, yeah, I'm doing that part, and I just want to make sure that it's consistent when I have like 50 different types of pictures.
[06:05 - 06:20] Yeah, I got confused with your other project, which is basically the one tracking nutrition. Yeah, that's the one that's in the private message, because I don't want to bring that in here; it's just too much stuff.
[06:21 - 06:26] Yes, because I'm going to get confused while I'm talking. So, yeah, sounds good.
[06:27 - 06:49] Yeah, for the passport one, at some point you might have to specify the requirements for the end user, where they have to give a full picture of their face. Maybe it should not be a side-face picture; show examples of bad photos.
[06:50 - 06:57] Right. Yeah. And examples, because you can only fix it to a certain extent, but then...
[06:58 - 07:06] But then you will have to set requirements for the end user in order to make it useful for them. That's fair.
[07:07 - 07:13] Yes. Yeah, or you can even grab, you know, landmarks on the face, which is not hard,
[07:14 - 07:23] and do it in real time. And then you can position the person in a frame and measure where the eyes are, the nose, and whatnot.
[07:24 - 07:30] So you can estimate whether they're doing something like this, and when the features are facing straight on, you can capture it.
[07:31 - 07:39] And that way you can make sure everybody's facing the camera, basically. Yeah, that is pretty simple to do, actually.
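A minimal sketch of the frontal-face check being suggested here, using OpenCV's bundled Haar cascade; the file name and the centering threshold are assumptions for illustration, not part of the actual project.

```python
# Sketch: reject photos where there isn't exactly one roughly centered face (OpenCV Haar cascade).
import cv2

cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def looks_frontal(image_path: str) -> bool:
    img = cv2.imread(image_path)                                   # BGR image
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) != 1:                                            # no face, or more than one
        return False
    x, y, w, h = faces[0]
    face_center = x + w / 2
    return abs(face_center - img.shape[1] / 2) < 0.15 * img.shape[1]  # roughly centered horizontally

print(looks_frontal("selfie.jpg"))                                 # hypothetical input file
```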
[07:40 - 07:41] Yeah. So that's what I have.
[07:42 - 07:54] We can move on to the next question. So I would recommend everyone to start with the mini projects and start booking calls with us.
[07:55 - 08:04] So for your personal project, these are some steps that you would be using. Basically, you would have to make a copy of this document.
[08:05 - 08:14] It's under onboarding, step nine. Then make a copy of the document and try to add as many ideas
[08:15 - 08:27] as you have. And then book a time with Zao or me, and we'll basically discuss in detail what is viable within the bootcamp.
[08:28 - 08:41] And from there, we will go over the different steps of completing your personal project. And for the mini project, I have posted a mini project which is basically a RAG.
[08:42 - 08:47] It's a simplistic RAG. And over here, you would have to follow...
[08:48 - 08:57] You would just have to follow the instructions. Basically, you use any one of the PDFs from the dataset, or you can use this PDF.
[08:58 - 09:10] And then you would have to chunk with different combinations of parameters. And then basically you use any
[09:11 - 09:19] embedding model from OpenAI. So there are three embedding models on OpenAI.
[09:20 - 09:30] Basically, you can call these three embedding models through the API to convert the text to embeddings, and then further
[09:31 - 09:44] use the combination of these embedding models plus the combinations of chunking parameters to build different namespaces, one for each combination.
[09:45 - 09:55] And then you generate a synthetic dataset for each namespace, where you do sequential chunk retrieval.
[09:56 - 10:11] Basically, you retrieve the chunks one by one, assign the chunk ID, and then generate the synthetic data, the questions, for that specific chunk. And then you basically use that synthetic dataset to evaluate your RAG.
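A minimal sketch of that sequential question-generation step, assuming the OpenAI Python SDK; the prompt wording and model are illustrative, not the course's exact notebook.

```python
# Sketch: one synthetic question per chunk, with the chunk ID kept as the gold label.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def make_synthetic_questions(chunks: list[str]) -> list[dict]:
    dataset = []
    for chunk_id, chunk in enumerate(chunks):
        resp = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user",
                       "content": f"Write one question that can only be answered from this text:\n\n{chunk}"}],
        )
        dataset.append({"question": resp.choices[0].message.content,
                        "relevant_chunk_id": chunk_id})   # gold label used later for recall/MRR
    return dataset
```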
[10:12 - 10:21] So I have mentioned all the details in here. And you can also visit the Notion doc if it's hard for you to read.
[10:22 - 10:31] So this is the Notion document, and basically you would just have to follow the instructions that I mentioned. There are some things that are optional.
[10:32 - 10:44] Basically, the things that are optional are under bonus ideas. And I have also marked some things as optional in the text itself.
[10:45 - 10:53] So if you are not familiar with those things, you can just skip them. We are going to cover them anyway in the advanced part, right?
[10:54 - 11:07] But in the meantime, did you guys have any questions about mini project one? Like, has anyone completed mini project one from this session?
[11:08 - 11:19] So Julia has completed the mini project for the synthetic dataset. Yeah, but I think I skipped one of the steps that I couldn't clearly understand.
[11:20 - 11:29] I think I skipped the manual labeling. I only went through all of it with LLM-as-judge.
[11:30 - 11:43] Even after it finds the issue in the question and response, I then ask it to do a second pass.
[11:44 - 11:47] Yes. I found that my
[11:48 - 12:08] prompts are too text-heavy, yeah. And for that reason, I think I can improve by doing more model-based prompts compared to what I did.
[12:09 - 12:21] But I don't have a schema-based one here; I decided to do that later. I want to focus on the second project.
[12:22 - 12:27] Yes. Yeah. So basically that's one of the important things.
[12:28 - 12:36] That's where the refinement comes from. The reason is, once you have the LLM-as-judge, you have already completed like 90% of your project.
[12:37 - 12:49] And at this stage, what you do is basically the same, but you act as the judge and you label, similar to the LLM, in the same fashion.
[12:50 - 13:15] And then you find the gap between those labels, where you might have one judgment on some labels and the LLM will have a different judgment on those labels. And then you try to refine the prompt based on these gaps, so that it mimics the same behavior that you have while evaluating this synthetic data,
[13:16 - 13:23] in order for you to automate the labeling process. Can you give more explanation about the second pass?
[13:24 - 13:48] I did the second pass, and I was trying to tell the LLM that this is the question with the response, and these are the metrics that you need to improve on. I wasn't sure what exactly a second pass means. I did it in ChatGPT, but is there a special extra technique that you operate with on the second pass, please?
[13:49 - 14:12] So in the second pass, when you mention improving on those things, what you do is not just give the model the label and, in the prompt itself, ask it to improve in that specific area. What you do is refine your existing prompt for labeling each of these labels.
[14:13 - 14:35] So for example, let's say you have a misjudgment on incomplete answers, where the LLM is saying success for an incomplete answer while we are judging it as failure. Then what you do is define an explicit prompt where you give a few few-shot examples of what incomplete answers look like.
[14:36 - 15:00] And then you try to improve in a similar fashion for the other things. So for example, here, unrealistic tools. If the LLM judge is marking failure for unrealistic tools and you are marking it as success, then what you do is lower the strictness of labeling for that specific label,
[15:01 - 15:13] in order to make the LLM judge more aligned with your labels. Okay, it looks like I did that part wrong.
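A minimal sketch of what refining the judge prompt with few-shot examples can look like; the label name and the examples are placeholders, and in practice the examples come from the cases where your manual labels and the LLM's labels disagreed.

```python
# Sketch: a judge prompt for one label ("incomplete_answer") with few-shot examples baked in.
JUDGE_PROMPT = """You are grading a response on the label `incomplete_answer`.
Mark FAIL if the response skips steps the question explicitly asks about.

Example 1:
Question: How do I replace a light switch safely?
Response: Just unscrew the old switch and put in the new one.
Label: FAIL  (no mention of turning off the breaker or testing for voltage)

Example 2:
Question: How do I replace a light switch safely?
Response: Turn off the breaker, verify with a voltage tester, then swap the switch and restore power.
Label: PASS

Now grade this case:
Question: {question}
Response: {response}
Label:"""

prompt = JUDGE_PROMPT.format(question="How do I reset the router?", response="Unplug it.")
```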
[15:14 - 15:25] Okay. You know, what happened to me there is I couldn't really create examples that would give me unrealistic tools or safety violations, for some reason.
[15:26 - 15:50] It always gives reasonable responses, you know. So, even though you told me to, I didn't do the human part first, but I tried. I even put in a more vague prompt the second time, to see if it would start giving me less technical or less thought-out procedures, but it still did really well.
[15:51 - 16:05] So I didn't really know how to actually come up with bad answers or something. Somehow I wasn't even thinking of intentionally telling it to give me answers that are not really accurate,
[16:06 - 16:13] or something wrong going on that is going to bring your entire pipeline down. Yeah, but I didn't think that was actually the right thing to do.
[16:14 - 16:42] So honestly, yeah, what you do in this case is basically create your own benchmark. So we have these different types of metrics. And in this case, let's say there is a failure, or a success, on unrealistic tools, and that unrealistic tool is basically a fork or a knife.
[16:43 - 17:08] And the LLM judge is marking it as failure, or maybe success; then what you do is label it the other way yourself and find the gap. And then what you do is refine the prompt in that direction, if that makes sense.
[17:09 - 17:27] So for example, there are a few examples that you would define as unrealistic tools, which the LLM judge passes on the first pass. But for you, these are unrealistic tools. So what you do now is you have to go... I see what you mean.
[17:28 - 17:35] Yes, I can maybe classify something like a multimeter for voltage as an unrealistic tool, because I just don't know how to use it. And then you will...
[17:36 - 18:01] Okay, that's exactly it, yes. Then similarly for a safety violation, say water and electricity. For example, the LLM might pass it on safety violation if it's just producing some safety text.
[18:02 - 18:27] But when you see water and electricity, you want text that explicitly says, hey, you have to wear gloves in order to be isolated from the electric shock. So then you mark that as failure, and then you go back and check your text: okay, these are the reasons I'm marking this.
[18:28 - 18:46] This is the reason I'm marking these as safety violations. So now you define the prompt in that direction. So the main purpose of this assignment, or this project, is basically to get you familiar with generating a synthetic dataset, as we are going to generate a synthetic dataset in a similar fashion for RAG.
[18:47 - 19:04] What we do in RAG is basically retrieve sequential chunks, and then we use those chunks, converting the embeddings back into text. And then we use that incomplete text to create questions. It can be like three questions per chunk.
[19:05 - 19:30] And then we have a gold label, and the gold label is basically the answer. And then we use that synthetic dataset to evaluate the RAG system. So this is how you would do the mini project for RAG. So basically...
[19:31 - 19:56] So the reason I included all the tools over here is just to get you familiar with them. It's also on the thread, to get you familiar with all these tools. And you start from parsing the PDF. Now, parsing the PDF can be done with pdfplumber, or with multiple other libraries.
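A minimal sketch of the parsing step with pdfplumber; the file name is a placeholder.

```python
# Sketch: extract the raw text from one PDF with pdfplumber.
import pdfplumber

with pdfplumber.open("manual.pdf") as pdf:                      # placeholder file name
    text = "\n".join(page.extract_text() or "" for page in pdf.pages)

print(len(text), "characters extracted")
```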
[19:57 - 20:13] Then what you do is basically have these different sets of parameters for chunking. So this is one set of parameters, this is another set, and similarly another set. And then we have three embedding models over here.
[20:14 - 20:28] And then we use one embedding model to go through and create a vector DB for each of these parameter sets, and each will have a different namespace.
[20:29 - 20:57] So imagine... yes, go ahead, Julia. I have a question; I was asking before, about one part of the document. In the first line of the tools section, it says instructor, and the purpose says generate lessons. Should it maybe be generate questions, not lessons? Just the first line of the tools.
[20:58 - 21:06] The name is instructor, and the purpose says generate lessons. Is it on the thread?
[21:07 - 21:19] I'm looking at your Notion right now, by the way; it's much easier to read. It says generate lessons there.
[21:20 - 21:28] Yeah, it's a typo. It should be questions. Okay. Thank you. Yeah, I edited the thread.
[21:29 - 21:40] Basically, then you have this combination of parameters. Imagine you have four combinations of parameters and three embedding models.
[21:41 - 21:55] How many namespaces, or vector DBs, will you have with this combination? Basically, you have three parameter
[21:56 - 22:14] combinations, three parameter combos, and three embedding models that you will be using for this project. So how many namespaces would you need?
[22:15 - 22:27] That is nine, right? Because you'll have one for each set of parameters with each... Yes, each model. Yes, exactly. So you will have nine
[22:28 - 22:41] namespaces. If you have four combinations of parameters, then you have 12 namespaces, where you have each combination with each embedding model. And then those namespaces are assigned to each combination.
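A minimal sketch of how those combinations turn into namespaces; the chunk sizes and model names below are illustrative values, not the assignment's required settings.

```python
# Sketch: enumerate (chunking parameters x embedding model) and give each combination a namespace.
from itertools import product

chunk_params = [
    {"chunk_size": 256, "overlap": 50},
    {"chunk_size": 512, "overlap": 50},
    {"chunk_size": 512, "overlap": 100},
    {"chunk_size": 1024, "overlap": 100},
]
embedding_models = ["text-embedding-3-small", "text-embedding-3-large", "text-embedding-ada-002"]

namespaces = [
    f"{model}_cs{p['chunk_size']}_ov{p['overlap']}"
    for p, model in product(chunk_params, embedding_models)
]
print(len(namespaces))  # 4 parameter sets x 3 models = 12 namespaces
```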
[22:42 - 23:02] Now, once you have stored this data, what you do is use all these different namespaces. And then you start with the first namespace, and you start with sequential retrieval of chunks.
[23:03 - 23:09] And you save them. And then you create synthetic questions
[23:10 - 23:24] and save those synthetic questions. So, for example, imagine you have an incomplete piece of text in chunk ID one, the first chunk.
[23:25 - 23:42] Now what you do is ask an LLM to generate a question for that specific chunk. Similarly for the second chunk, similarly for the third chunk, and you also save the chunk ID assigned with each of the synthetic
[23:43 - 23:55] questions that we are generating. Once we have the dataset, what we do now is run an evaluation,
[23:56 - 24:32] and the retriever will basically retrieve the chunk IDs, the top five chunk IDs that match the question. And then your job here is to find the metric scores, which are basically recall, precision, and MRR. You can also use other retrieval metrics like MAP or NDCG, but the main metrics are basically recall, precision, and MRR.
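A minimal from-scratch sketch of recall@k and MRR over the synthetic questions; he mentions a library can compute these for you, and since it isn't named here, this just spells out the definitions.

```python
# Sketch: recall@k and MRR from the gold chunk ID and the retriever's ranked chunk IDs.
def recall_at_k(results: list[dict], k: int) -> float:
    hits = sum(1 for r in results if r["gold_chunk_id"] in r["retrieved_ids"][:k])
    return hits / len(results)

def mrr(results: list[dict]) -> float:
    total = 0.0
    for r in results:
        if r["gold_chunk_id"] in r["retrieved_ids"]:
            rank = r["retrieved_ids"].index(r["gold_chunk_id"]) + 1   # 1-based rank of the gold chunk
            total += 1.0 / rank
    return total / len(results)

results = [
    {"gold_chunk_id": 3, "retrieved_ids": [3, 7, 1, 9, 2]},  # gold chunk at rank 1
    {"gold_chunk_id": 5, "retrieved_ids": [2, 8, 5, 1, 6]},  # gold chunk at rank 3
]
print(recall_at_k(results, 1), recall_at_k(results, 5), mrr(results))  # 0.5 1.0 ~0.667
```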
[24:33 - 24:41] And then you will have scores similar to these. I also show an average time.
[24:42 - 24:53] Now we have different scores for the different combinations of parameters with each embedding model. Here, I have used only two embedding models, basically small and large.
[24:54 - 24:58] And then I have more combinations of parameters. Now, yes?
[24:59 - 25:07] Is that a data frame that you used there for the visualization of that table, or is it just a manually created table? Is this a manually...
[25:08 - 25:16] It's ASCII. Basically, what I do is I save those results in a
[25:17 - 25:21] JSON file, here somewhere. Yeah.
[25:22 - 25:34] And then I just use ChatGPT or any other model, and I say: create me an ASCII table from my project one results.
[25:35 - 25:45] And then basically, this one is the more detailed one. This evaluation is more detailed, because it shows you recall at 1, 3, 5, and 10.
[25:46 - 26:07] While here, it just shows you at 5, which might be misleading, because when you present precision at 5, it will always have lower precision, since it's considering the top five chunks. But what you do is just look at the MRR and recall.
[26:08 - 26:17] And then you find the best combination of parameters that is working for you. In my case, it was text-embedding-3-small with pdfplumber,
[26:18 - 26:31] with a chunk size of 256 and an overlap of 50. And then, what you do is, so this is the retrieval part.
[26:32 - 26:46] And then what you do is basically create a pipeline, a pipeline where, if a user is asking something,
[26:47 - 27:01] what it does is retrieve from the vector DB; the retriever retrieves from the vector DB. And then you have an LLM inference, which is another Python file, where you attach the retrieved context from the vector DB.
[27:02 - 27:18] And in here you use a prompt, which basically says: this is the user question. You will have something like Q, and it can be a parameter called question, which is the literal question asked by the user.
[27:19 - 27:29] And another will be context, which is basically the context that is being retrieved by the retriever. So this will be in the prompt itself.
[27:30 - 27:45] And then it will basically generate the answer for the end user, a complete text answer for the end user. Now what you do here is measure this complete generated text answer with faithfulness or relevance and ground-truth metrics.
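A minimal sketch of that generation step: the user's question and the retrieved context go into one prompt. The prompt wording and model choice are illustrative, not the course's exact Python file.

```python
# Sketch: generate the final answer from the user question plus the retrieved chunks.
from openai import OpenAI

client = OpenAI()

def answer(question: str, retrieved_chunks: list[str]) -> str:
    context = "\n\n".join(retrieved_chunks)
    prompt = (
        "Answer the user's question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```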
[27:46 - 27:59] Basically, you can use faithfulness and relevance as the measures; you don't need to use ground truth. And then see if you are getting a decent score.
[28:00 - 28:16] If you are not getting a decent score, then can anyone guess where the problem is, at what step we have the problem, if we are not getting the right score at this stage, the generation stage?
[28:17 - 28:30] So we got the right score from here, because we already started with the evaluation-centric approach, which is basically how we got the right score here. But somehow, we are not getting a good score at the generation stage.
[28:31 - 28:57] So what would you guess is the problematic thing, if we did not have a decent score in faithfulness or relevance? Could you elaborate more on your question? Initially you mentioned that we did well on evaluation.
[28:58 - 29:13] Is that correct? Yes, we did well on the evaluation at this stage, which is basically the retrieval metrics, because we already have a bunch of results for different combinations,
[29:14 - 29:34] similar to these, where we have found the best result. Now we are using that best combination in the generation stage, but at the generation stage we are evaluating it with the faithfulness score and relevance score.
[29:35 - 29:48] But at this point, we are not getting a good score for faithfulness and relevance. So what might be the possible reason why we are not getting a good score?
[29:49 - 29:52] I'd say it's the prompts that you're using. Yes.
[29:53 - 29:54] System prompts. Yes.
[29:55 - 30:10] Yes, exactly. So the prompt that we are using in the generation phase with the LLM might be problematic, or it can be the LLM itself that is problematic, not capturing the engineered prompt that we are using.
[30:11 - 30:21] So for example, one time I was building a RAG with a Llama model. Don't ever use a Llama model, because it's the worst model you will ever have.
[30:22 - 30:30] I got bad results at this stage. At this stage, I was not getting good results.
[30:31 - 30:42] So if you are building a RAG for the first time, what you will do is start again from zero, thinking there is something you are doing wrong in here.
[30:43 - 31:10] And I eventually figured out an approach to make everything systematic, such that once you have an error in any of the stages, you just go back to the previous stage in order to solve that error. So the reason this seems more structured and more aligned with the evaluation is because it comes from my experience.
[31:11 - 31:43] Whenever you try to build a RAG from first intuition, you will basically scramble everything up, because you will just use your intuition to pick the combination of different parameters, and you will keep starting again from zero. And it will just bring you to nothing.
[31:44 - 31:57] So what I did was create a structured approach on how to start. Any questions? When you say use any PDF in the description,
[31:58 - 32:02] does that refer to any PDF from that dataset, or just any PDF? Okay. Yes.
[32:03 - 32:09] Any PDF from the dataset, because the dataset is too large. Otherwise it's going to take a huge amount of time to complete the mini project.
[32:10 - 32:13] Okay. Just doing it with one document is fine?
[32:14 - 32:15] Yeah. Oh, okay.
[32:16 - 32:22] Okay. Yeah.
[32:23 - 32:38] And then what I basically do is, instead of having a lot of different parameters at the stage of chunking and overlapping, I just evaluate with the time and with the max characters.
[32:39 - 32:54] So basically, this gives you the context window. Imagine you are building a real application with the RAG technique. You would want lower cost on each inference call.
[32:55 - 33:18] If you had an on-premise LLM model, then basically it would not matter, but it still matters to some extent, because a GPU inference call with a larger context window is going to affect the performance. But to keep the cost lower, what you do is basically choose something like GPT-3.5 Turbo.
[33:19 - 33:26] Now, GPT-3.5 Turbo has a lower context window. So what you do is...
[33:27 - 33:39] Now, we are selecting GPT-3.5 from the very beginning because I don't want to go for expensive inference calls. So what you do is basically evaluate these different combinations of parameters,
[33:40 - 34:01] not while feeding them directly into the vector DB, but at the stage where you calculate this with the time. And then, with the help of max characters, max words, and total chunk count, you would basically find that these are some parameters that seem viable considering GPT-3.5 Turbo.
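A minimal sketch of that parameter-screening step: chunk the text with a given size and overlap, then report max characters, max words, and total chunk count, so each combination can be sanity-checked against the model's context window. The simple character-based splitter is an assumption for illustration.

```python
# Sketch: character-based chunking plus the stats used to compare parameter sets.
def chunk_text(text: str, chunk_size: int, overlap: int) -> list[str]:
    step = chunk_size - overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]

def chunk_stats(text: str, chunk_size: int, overlap: int) -> dict:
    chunks = chunk_text(text, chunk_size, overlap)
    return {
        "chunk_size": chunk_size,
        "overlap": overlap,
        "total_chunks": len(chunks),
        "max_characters": max(len(c) for c in chunks),
        "max_words": max(len(c.split()) for c in chunks),
    }

for size, overlap in [(256, 50), (512, 50), (1024, 100)]:
    print(chunk_stats(text, size, overlap))   # `text` is the parsed PDF text from earlier
```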
[34:02 - 34:18] And then basically you choose those different sets of parameters for our experimentation, to find out which is the best one with regard to recall and MRR.
[34:19 - 34:31] And basically, in that table, for the MRR, how many ranks did you use? So what MRR basically is:
[34:32 - 34:56] MRR is a mean reciprocal rank metric; it gives you the mean of the reciprocal ranks across the retrieval results, while recall and precision over here are basically with respect to the k that you are asking for. And I did 1, 3, 5, and 10.
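For reference, the standard definition of mean reciprocal rank over a question set Q, where rank_i is the position at which the gold chunk for question i shows up (this is the textbook formula, not something specific to the course materials):

```latex
\mathrm{MRR} = \frac{1}{|Q|} \sum_{i=1}^{|Q|} \frac{1}{\mathrm{rank}_i}
```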
[34:57 - 35:08] Now, for example, let's say we're here. At first sight, you would just choose this one, the first one, because it has the highest score, right?
[35:09 - 35:20] But for me, it's not just the score, it's also the time. It's taking like 4.470. So I will choose the best...
[35:21 - 35:37] I'll choose the best numbers with regard to the average time as well. So in this case, it can be this one, or I can reduce the time drastically and then choose between these two.
[35:38 - 35:58] Now I go back to the detailed parameter evaluation. And then I see, over here, okay, so I choose the 71, and we can see it is actually retrieving things within the top 10 chunks.
[35:59 - 36:14] So we can push the score to 85 by using a re-ranker. Now, the re-ranker, I won't go into it; it will be covered in the advanced RAG lessons. But what it does is push the golden chunk from 10 to 1, or maybe, I mean, sorry, 3.
[36:15 - 36:38] And then basically you will have lower time and, at the same time, decent accuracy, compared to a longer time and a higher k. So over here, I'm using recall at 1, 3, 5, and 10. So how it basically works is:
[36:39 - 36:51] imagine your chunk ID is being retrieved within your top k. In this case, our top k is basically 10, k equal to 10.
[36:52 - 37:11] And what we're doing is retrieving based on the question; it's retrieving the 10 chunks at the same time. And then, 60% of the time, we are getting the gold label in the first chunk,
[37:12 - 37:23] 70% of the time we are getting the gold label in the second or third chunk, and then 80% of the time within the first five,
[37:24 - 37:36] and similarly 85% for the first 10. So in this case, what I do is I use the Cohere
[37:37 - 37:48] reranking model on the top five, because that five percent does not matter much. So to maintain the precision, I use the top five.
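A minimal sketch of that reranking step: retrieve a generous top 10, let a reranker reorder them, and keep the top 5. This assumes the Cohere rerank endpoint; the model name and response fields should be checked against the current SDK docs.

```python
# Sketch: rerank the top-10 retrieved chunks and keep the best 5 (Cohere rerank endpoint, assumed API).
import cohere

co = cohere.Client("YOUR_COHERE_API_KEY")          # placeholder key

def rerank_top5(question: str, top10_chunks: list[str]) -> list[str]:
    response = co.rerank(
        model="rerank-english-v3.0",               # assumed model name, check current docs
        query=question,
        documents=top10_chunks,
        top_n=5,
    )
    # Each result carries the index of the original chunk and a relevance score.
    return [top10_chunks[r.index] for r in response.results]
```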
[37:49 - 38:04] And then basically it will increase the accuracy to 80%, which is basically the precision to 80%. So basically you will ask the question, you'll get back chunks, right?
[38:05 - 38:09] Okay. Like a list of chunks.
[38:10 - 38:24] Yes, you will get the chunks. Basically, you will ask the question, you'll get back chunks, right?
[38:25 - 38:27] Imagine it as incomplete paragraphs. Yeah.
[38:28 - 38:46] And basically now, you don't show these incomplete paragraphs to the end user. What you do is use these incomplete paragraphs in another Python file, with a prompt engineering technique, to generate a full answer to the user's question.
[38:47 - 39:15] And in order to find the measure of recall, whether the actual answer, the correct answer, is within the top one, three, five, whatever, is that manual, or do you use an LLM for that also? Just to understand these metrics. No, for the recall, when you say... I mean, I'm asking a question, I get three chunks back, and I want to know if the real answer is within that, right? Is that what the measure is?
[39:16 - 39:22] Yes. So these metrics are being measured, but not with the LLM. Okay.
[39:23 - 39:36] Yeah, it's measured with the chunk ID, which is basically from the synthetic data, which is this one. So it checks the chunk ID against what is retrieved from the retriever.
[39:37 - 39:47] But do you actually have to read that chunk and understand whether that's the actual answer or not? Or do you just need to compare whether it's the actual ID that is retrieved?
[39:48 - 40:01] You would just have to place it as a placeholder in the recall metric. Like, I think the library... it's a library.
[40:02 - 40:13] Maybe it will be more clear in the notebook, because we have that in ours. I understood your question: basically, how should we calculate those MRR or recall metrics?
[40:14 - 40:35] Yeah. Because one thing I'm not seeing yet is, if you give the chunk ID to generate the question, then when you generate, why would it not give you that chunk ID back in the answer? I mean, isn't that something that you're actually giving it? Isn't that chunk ID an input parameter of the synthetic data?
[40:36 - 40:42] I did not understand your question. In that case you were showing there, can you go back to that JSON?
[40:43 - 40:51] Yes. The one that had the question and the chunk ID.
[40:52 - 41:01] The chunk IDs. Yeah, this one. That relevant chunk ID, is that what comes back, or is it something you give it?
[41:02 - 41:12] So at this stage, the question is being generated by the LLM. But what we are doing is basically going sequentially through the vector DB.
[41:13 - 41:31] We are not using any retriever. Okay. Imagine it as a pointer. This pointer basically extracts the chunk ID, and at the same time, it generates the question based on that text.
[41:32 - 41:55] So we are not using this chunk ID to fetch from the vector database; we are using it to calculate the metrics. Yes, exactly. After the database gives a response, this is our ID to verify that the question is related to this chunk.
[41:56 - 42:11] Yes. So at this point, we are going sequentially and generating questions from the chunks. And at the same time, we are also storing the chunk ID, or the position, from where the questions are being generated.
[42:12 - 42:26] And then, at the time of evaluation, we use just the question; but on the metrics side, we try to calculate the metrics with the help of the label, or the chunk ID:
[42:27 - 42:37] whether this chunk ID is present in the top 10 metrics, sorry, the top 10 retrieved chunk IDs from the retriever.
[42:38 - 42:55] Yeah. Okay. Yeah. I guess that was the question: if you are actually creating the question from the chunk, why wouldn't it spit that chunk ID back when you get the answer, because the question is based on that chunk. You know what I mean? Yes. Yes. Basically, see...
[42:56 - 43:09] Imagine it like you have different paragraphs in different locations, and you are generating questions for each paragraph. Yeah. But at the same time, you are also noting the paragraph number.
[43:10 - 43:33] So question one is generated from paragraph one, similarly question two from two, three from three, question four from four. Now you ask the question to the retriever. I see. Now the retriever gets you back the chunk IDs, but it does not know the real chunk ID; we know the real chunk ID.
[43:34 - 43:46] Yeah. Nice. So these are just used to calculate the metrics. And that is where, you know, the chunk size matters, because if it's very repetitive text or whatever, you may start screwing up and mixing paragraphs and stuff like that.
[43:47 - 44:08] Yeah. Exactly. That's why you go with the combination of different parameters with different namespaces. And at the same time, we are not creating these questions directly from the PDF, because then we would not know the chunk ID.
[44:09 - 44:33] And at the same time, we would not have the embeddings of that specific chunk, because when we generate questions directly from the PDF, we won't know which questions are at which chunk positions. So the embeddings of those chunks are basically not present, and neither do we have the location to calculate the metrics.
[44:34 - 45:03] Sorry, I have a follow-up question. Yeah. When computing the recall, for example, or the precision: after retrieving the specific chunk ID, we then generate a synthetic answer and start checking,
[45:04 - 45:36] right. If this answer is close to the chunk ID retrieved initially, how do we score it in terms of the position this chunk is in, like in the top five, for example? If it's at the top, if it's top one and it matches our relevant chunk ID, how different would the score be versus if it's at top five?
[45:37 - 46:04] Is this just a binary thing, it's in the top five or not? I'm just curious if we pretty much have some level of... I mean, how would you do that? I'm a little bit confused, because if it's binary, we can count that as a pass or fail; that would be the easiest way.
[46:05 - 46:14] But because we are talking about a rank of top five or top 10, maybe, how can we assign a more suitable score?
[46:15 - 46:27] Basically, are you asking how we can check if it's being retrieved in the top five or top 10? Yeah. Yeah.
[46:28 - 46:38] That's where the k comes from. So, the k is very important in RAG. It defines the number of chunks that are being retrieved from the retriever.
[46:39 - 46:57] At the same time, it is defining the... so for example, if you are retrieving 10 chunk IDs for one question, then you would calculate the score with regard to recall at 1, 3, 5, and 10, because you are already retrieving 10 chunk IDs.
[46:58 - 47:25] And then basically you use a library that calculates the recall for each of the k values. So for example, 60% of the time, the gold label, or the original chunk ID, is present at position one.
[47:26 - 47:43] And then over here, 70% of the time it's in the top three positions. Similarly, for the top five positions it's 80% of the time, and for the top 10 it's 85% of the time.
[47:44 - 47:52] And when you say a percent of the time, it's because you have evaluated all the questions you generated and found it, or not, within those different windows.
[47:53 - 48:16] Yes. And then basically, so in here, I have a total... To make it very simple, I will publish these templates of the mini projects for everyone. But I want everyone to first complete it on their own, so that I can see how everyone is thinking and I can correct the thinking pattern.
[48:17 - 48:31] But in here, I have used 20 questions, a 20-question example, and I am generating all these different metrics with those 20 examples. Did that answer your question? So, over those 20 questions...
[48:32 - 48:40] You start generating synthetic data, right? So this is the synthetic data. Okay.
[48:41 - 49:03] So for those 20 questions, or those 20 queries, how many times... I mean, how did you start building the recall and precision and get that distribution of answers per
[49:04 - 49:13] k equal to one, three, five, and 10? Yes. So that's basically with the PDF.
[49:14 - 49:23] So I think I used the retrieval method over here. And then... I will publish a notebook,
[49:24 - 49:30] it's much more clear. Basically, you just use a library similar to this.
[49:31 - 49:43] And then you pass the metric, namely recall, and then you pass the retrieved chunk IDs and then, comma, your golden chunk ID.
[49:44 - 49:56] And it will basically calculate things for you. But if you wanted to do it manually, you just go to the retriever, you ask the question, and then you get chunk IDs back, right?
[49:57 - 50:06] And then, for a manual check... Yes. And then you can just check: same at position one, yes or no; same at position three, yes or no; same at position five, and so forth.
[50:07 - 50:08] Right. Yes.
[50:09 - 50:16] Exactly. So basically, if you're going for the manual approach, then for the first question
[50:17 - 50:32] you would have to look at the 10 chunks that are being retrieved from the retriever. And then you manually check whether it's in position one, as was mentioned, and check the
[50:33 - 50:43] gold label: whether it's at one, at two, at three, and wherever it is. You do the same for all 20.
[50:44 - 50:53] And then you have to calculate with the formula for recall at 1, 3, and so on. Yeah, I see.
[50:54 - 51:07] I think Maria missed adding the RAG notebook. I can only see the parser notebook exercise on the student side.
[51:08 - 51:12] So I will add it. Where is that?
[51:13 - 51:21] Yeah, maybe Maria missed adding the notebook. So basically, we had extra...
[51:22 - 51:30] Basically, in the... yes, go on. In the lesson, you mean in the RAG lesson, or where exactly in the portal? Yes, in the RAG lesson.
[51:31 - 51:32] Okay. Yes.
[51:33 - 51:39] We had two things in the RAG lesson. One was specific to parsing and scanning the PDFs and photos and stuff.
[51:40 - 51:47] And then another one was specific to RAG. So, yeah, it's a prerequisite.
[51:48 - 51:53] The one that is there helps you to parse the data, convert it, and all that. Yes.
[51:54 - 51:59] But there is no RAG one. So I will include the RAG notebook so that it's much
[52:00 - 52:03] easier for everyone to follow. Yeah.
[52:04 - 52:09] Honestly, when you posted the mini project, I was going back to the lesson, trying to find something like that, and I couldn't find it.
[52:10 - 52:11] I said, okay. Yeah.
[52:12 - 52:19] Always post a thread. I'm continuously looking at all the threads.
[52:20 - 52:28] So if you find anything, post it in the threads, and I will try to solve it. No, you said mini projects,
[52:29 - 52:33] and it started increasing in complexity. I thought, okay, it already happened.
[52:34 - 52:35] Yeah. Yes.
[52:36 - 52:40] Yeah, I also had some edits in here, because I had included some of the reranking stuff.
[52:41 - 52:49] But I thought it would be overwhelming for people in the current cohort. So I removed the reranking.
[52:50 - 52:59] Can you guys see if it's published? Is it under courses? It's under a module.
[53:00 - 53:04] Yeah. And in the lecture notes, they are there now; I can see them.
[53:05 - 53:06] Okay. Yeah.
[53:07 - 53:10] Yeah. Like I mentioned, I will add a new notebook exercise so that it's much clearer.
[53:11 - 53:12] Yeah. For everyone.
[53:13 - 53:20] I also posted a thread. Oh, sorry, it's under exercises now, not the lecture notes.
[53:21 - 53:22] No. Yes.
[53:23 - 53:28] It's under courses. Courses and, uh, yeah, RAG.
[53:29 - 53:30] Okay. Lecture notes.
[53:31 - 53:35] Okay. And do we book one-on-ones to go through these things,
[53:36 - 53:42] or are the one-on-ones just for the main project? The one-on-ones are just for your personal projects.
[53:43 - 53:54] Well, this session, the current session that we are having, the one in the morning and, sorry, in the afternoon, right now,
[53:55 - 54:05] those are the ones where you can ask anything, like with any of the coding, and it can be a personal project.
[54:06 - 54:18] So, when do you expect us to start working on the personal project? I had a session, and we identified a list of projects.
[54:19 - 54:23] And I picked one.
[54:24 - 54:32] So should I start working now, and how will I start working? Should I wait for the next session to plan?
[54:33 - 54:39] Uh, yes. So, in the next session,
[54:40 - 54:47] what we are expecting is basically that we show you the roadmap
[54:48 - 54:53] of things. So in the next session, what we are expecting is basically that you...
[54:54 - 54:58] Once you have a defined direction and problem... Yeah.
[54:59 - 55:06] That we already did in your use case, where we also mentioned the techniques that we're using in each of those different projects.
[55:07 - 55:14] Once you've decided on the project, what you do is plan out the workflow in detail for your specific project.
[55:15 - 55:21] And then you book a call with us, and we are going to review that workflow and then suggest
[55:22 - 55:32] the AI techniques that you will be using in the different pieces. And then basically, this might need one or two iterations,
[55:33 - 55:41] one or two iterations or one or two bookings. And then finally, once this is set, considering
[55:42 - 55:46] you have planned out each of the phases for your personal project, then you start coding.
[55:47 - 55:54] You start coding each of the phases, and then you evaluate each of the phases with the metrics that we have defined.
[55:55 - 56:03] And then you combine all these pieces, or components, to have a complete AI application running. Okay.
[56:04 - 56:10] So it means that we can book multiple calls with you? Yes.
[56:11 - 56:13] Okay. You can book multiple calls with me
[56:14 - 56:23] once we are at this stage, where you have a workflow. Okay.
[56:24 - 56:36] The workflow is basically... it doesn't have to be fancy or anything. It just has to be something like a flow where I can see how you are thinking, and
[56:37 - 56:46] I will be giving some feedback on what should be done, on editing the workflow. And then basically we are going to add the techniques to that workflow.
[56:47 - 56:55] And then, once that is in place, we will divide these things into different phases.
[56:56 - 57:05] So, if you are building a RAG application, the first phase can be outputting the different text,
[57:06 - 57:09] figures, and tables in a structured manner.
[57:10 - 57:18] And then you evaluate it with the things that we mentioned. And once you are successful, we go on to the second phase, third phase, and fourth phase.
[57:19 - 57:22] And then you will have an AI application. Do you guys have any questions?
[57:23 - 57:38] One thing: I'm just checking right now the principles of chunking, and I see the table. There are suggestions in terms of chunk size and chunking overlap as well.
[57:39 - 57:42] Yes. In terms of the chunking overlap,
[57:43 - 57:49] is it influencing how we deal with...
[57:50 - 57:57] truncation and padding? Do we need to... it might be.
[57:58 - 58:08] In the end, I don't think it will cause any issue. The reason is, the main reason for what we are doing with the chunking,
[58:09 - 58:23] with the chunking overlap, is basically that we are just repeating information on both ends
[58:24 - 58:31] of that specific chunk. So we don't need it then, right?
[58:32 - 58:37] Yes. Okay.
[58:38 - 58:42] Okay. About OpenRouter.
[58:43 - 58:46] Yes. And I don't...
[58:47 - 58:56] Maybe I'm wrong, but, you know, I'm not able to see whether they have embedding models.
[58:57 - 59:20] No, they don't, correct. Okay, I'm planning to use GPT-4.1 nano.
[59:21 - 59:33] That's, I think, different. The embedding models are different. Yeah, yeah, I know those are different, but I just want to use this LLM, and it needs to work...
[59:34 - 59:48] It needs to be compatible with the embedding model, right? So if I'm using an OpenAI model, I pretty much need to use the OpenAI embedding models as well?
[59:49 - 01:00:05] No. The reason for that is, okay, so we are not doing text generation there. The only point where we are using embedding models is basically converting text to embeddings, which are basically a bunch of numbers, and then storing them in the vector DB.
[01:00:06 - 01:00:18] And then basically we retrieve those embeddings with the help of the retriever, and again we use these embedding models to get from that bunch of numbers back to the text.
[01:00:19 - 01:00:30] Okay. So the purpose of the embedding model over here is just to store those numbers in the vector DB.
[01:00:31 - 01:00:52] But the embedding model here affects the contextual meaning of the paragraphs. So the large embedding model might trim down some of the things, might trim down some of the details from the PDF,
[01:00:53 - 01:01:07] while some embedding models might be very keen on every detail. So that's the purpose of the embedding model here, while what you mentioned is basically at the text generation side.
[01:01:08 - 01:01:21] So when you're building a foundational LLM model, for example your own ChatGPT model, that's the case where you have to be very keen on the embedding model.
[01:01:22 - 01:01:40] But, so you're using the embedding model for RAG, and after that you're sending it to a vector database. Does it need to be compatible, does the vector database need to support that embedding model?
[01:01:41 - 01:02:01] In most cases, I don't think you need specifically compatible embedding models, because when you hear vector DBs, that means they are compatible with all the embedding models. The reason for that is that vector DBs are basically something that can store a bunch of numbers,
[01:02:02 - 01:02:08] and embedding models basically generate a bunch of numbers from the text. Quick question: can I use Pinecone for this exercise?
[01:02:09 - 01:02:19] Yes, you can use the Pinecone vector DB. You can use a local vector DB, which is basically FAISS.
[01:02:20 - 01:02:45] You can use LanceDB, which is basically one of the good vector DBs, one of the most recent ones. In here, we are going to have one exercise where you are going to be building a multi-vector vector DB. So these support multi-vector.
[01:02:46 - 01:03:05] The multi-vector support is mostly on the online side, but on the local side, LanceDB supports it. So for the vector DB platform, you can go with any vector DB. Yes, go ahead, Kevin.
[01:03:06 - 01:03:40] Yeah, I think I have this misconception of thinking that if I use, say, the Ada embedding model from OpenAI, and then dump the embeddings into a Pinecone vector database, for example, and then retrieve the embeddings, then I pretty much need to keep using the same embedding model for everything afterwards.
[01:03:41 - 01:03:56] You see, with embedding models, after you retrieve those embeddings from Pinecone, you have to convert back to text. In order for you to convert back to the original text, you would have to use the same embedding model.
[01:03:57 - 01:04:04] The same embedding model, right? Okay. Yes. But then you are feeding the text to an LLM model.
[01:04:05 - 01:04:24] Okay, so I can use any number of embedding models to dump those numerical representations into the vector database. But at retrieval time, right, I need to make sure that I'm using the right embedding model to get the text back, right?
[01:04:25 - 01:04:39] Yes, exactly. Okay, okay. But in general, those vector databases support any kind of numerical representation, regardless of what embedding model I use.
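A minimal sketch of that point: index and query with the same embedding model, and keep the original chunk text as metadata next to each vector so nothing has to be decoded from the numbers. This assumes the OpenAI and Pinecone Python SDKs; the index name, namespace, and key are placeholders.

```python
# Sketch: same embedding model at indexing time and at query time; chunk text stored as metadata.
from openai import OpenAI
from pinecone import Pinecone

oai = OpenAI()
pc = Pinecone(api_key="YOUR_PINECONE_API_KEY")       # placeholder key
index = pc.Index("mini-project")                      # placeholder index name

EMBED_MODEL = "text-embedding-3-small"                # must match between indexing and querying

def embed(texts: list[str]) -> list[list[float]]:
    resp = oai.embeddings.create(model=EMBED_MODEL, input=texts)
    return [d.embedding for d in resp.data]

# Indexing: one vector per chunk, original text kept as metadata.
chunks = ["chunk 0 text ...", "chunk 1 text ..."]
vectors = [(str(i), vec, {"text": chunks[i]}) for i, vec in enumerate(embed(chunks))]
index.upsert(vectors=vectors, namespace="small_cs256_ov50")

# Querying: embed the question with the SAME model, then read the stored text from metadata.
query_vec = embed(["What does chunk 0 talk about?"])[0]
hits = index.query(vector=query_vec, top_k=5, namespace="small_cs256_ov50", include_metadata=True)
```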
[01:04:40 - 01:05:06] Okay. The only thing that I experience while using different vector DBs is the retrieval speed and sometimes garbage values. It's very rare, but sometimes what happens is, let's say you have a very keen embedding model, which focuses on detailed things in each text.
[01:05:07 - 01:05:21] So at this point, imagine those embedding models are going to generate numbers with a lot of decimal places for specific text. And what some of the vector DBs do is round off those decimal places.
[01:05:22 - 01:05:39] And that's why you will get garbage values. So it's very rare, but it does happen. And so I have a follow-up question. Why don't I just store the numerical representations locally as
[01:05:40 - 01:05:49] a .something file, versus uploading these numerical representations into a third-party vector database?
[01:05:50 - 01:06:00] Then you wouldn't have any retriever to retrieve it from the question; you would have to build a retriever.
[01:06:01 - 01:06:14] Okay. So let's say you store these numbers; then basically what you are asking is, why can't I build my own vector DB? That's your question. That's the right way to put it.
[01:06:15 - 01:06:34] Basically, if you store all these numbers locally, then you would have to build a retriever over that, like a huge dictionary, right? Yes. And that's basically building a vector DB on your local machine. Yeah.
[01:06:35 - 01:06:55] I would take that into account as well, because when doing this, I saw that you pretty much explained the tradeoff between latency and accuracy, right? Yes. And for you, you explained that you care more about the time,
[01:06:56 - 01:07:05] how long it takes to pretty much... Yes. Retrieve the information, right? Yes. Okay. Okay. I'm just going to take that into account.
[01:07:06 - 01:07:29] Yes. If you want to use local storage, a local vector DB, just pip install LanceDB. Lance. Yes. LanceDB. L-A-N-C-E DB. Oh, yeah.
[01:07:30 - 01:07:40] And is there a local option for that? Or is it a cloud thing? Yes. Yes. There is a local option for LanceDB.
[01:07:41 - 01:07:59] Let me see, their site takes me straight into signing up for their cloud option. So this is the one. Yeah. So you can just do pip install lancedb. And then basically, you can use the LanceDB library to create a local vector DB.
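A minimal sketch of the local LanceDB option; the folder, table name, and vectors are placeholders, and the calls should be checked against the current LanceDB docs.

```python
# Sketch: a local, file-backed vector DB with LanceDB (pip install lancedb).
import lancedb

db = lancedb.connect("./lancedb_data")                # local folder, no cloud account needed

# Each row stores the vector plus the original chunk text and ID.
rows = [
    {"vector": [0.1, 0.2, 0.3], "chunk_id": 0, "text": "chunk 0 text ..."},
    {"vector": [0.2, 0.1, 0.4], "chunk_id": 1, "text": "chunk 1 text ..."},
]
table = db.create_table("mini_project_chunks", data=rows)

# Query with a vector of the same dimensionality; returns the nearest rows.
results = table.search([0.1, 0.2, 0.25]).limit(2).to_pandas()
print(results[["chunk_id", "text"]])
```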
[01:08:00 - 01:08:15] But in terms of scaling this project, for example, right? So let's say that we want to deploy it and make it production-ready. What is, like...
[01:08:16 - 01:08:36] From the architecture perspective, what would be your recommendation: pay for a third-party vector database, or just handle it locally? And, I mean, are you asking this question for scale? Yeah.
[01:08:37 - 01:08:51] No, for the mini project to be production-ready; let's say that we have thousands of PDF files. Then you would have to go with the online API option.
[01:08:52 - 01:09:04] Because what happens is, that's the reason I mentioned only one PDF from the dataset. What happened was, I had a program that started parsing through these PDFs,
[01:09:05 - 01:09:16] and there were many PDFs in the dataset. And then I started storing these PDFs, and when I woke up, my computer started lagging; it was very slow.
[01:09:17 - 01:09:25] The reason was, I was storing everything in the local vector DB. So that's why I mentioned just using one PDF for the mini project.
[01:09:26 - 01:09:28] One PDF. Okay. Yes.
[01:09:29 - 01:09:47] Okay. Yeah, I think that helps you pretty much decouple the memory consumption from processing all of these. It's going to occupy your memory, and then at some point, to some extent, it's going to also occupy your hard drive.
[01:09:48 - 01:10:07] And then you would have to manually delete those files from the library. Do you know, do these databases scale horizontally? Like, can you maybe do something where you use this on premises, and then you take care of managing the servers they run on, and start scaling it or whatever?
[01:10:08 - 01:10:17] Like, you know what I'm saying, I mean, is it possible to horizontally scale a Lance, or a vector DB, or are they... So, do you mean...
[01:10:18 - 01:10:19] Yes. Yeah.
[01:10:20 - 01:10:34] Um, instead of building my own sort of vector DB, I would like to build my own Lance setup, if you will, by using LanceDB on the open-source side, or whatever license it asks for.
[01:10:35 - 01:10:46] But I'm thinking at some point I'll have to just start scaling it up, adding more servers horizontally. Um, do you know if these are
[01:10:47 - 01:10:56] built and meant for that sort of thing, or whether they do any sort of internal magic for that to happen? Yes, I mean, LanceDB is basically very nice.
[01:10:57 - 01:11:06] So basically you are asking if LanceDB is scalable if you start extending your hardware. Uh, yeah, adding more servers on the side, basically.
[01:11:07 - 01:11:08] Yes, it's scalable. Yes.
[01:11:09 - 01:11:19] Okay. Yeah, but the question is whether that's easier than actually signing up for the cloud, or just having your auto-scaling sort of group and then adding more instances as you,
[01:11:20 - 01:11:23] as you start growing, based on some sort of metric. Yes.
[01:11:24 - 01:11:32] So that's why I use Turbopuffer for the cloud, because Turbopuffer is one of the best vector DBs
[01:11:33 - 01:11:42] that you can get at that price point. At the same time, it's basically being used by customers;
[01:11:43 - 01:11:53] it's being used by all the vibe-coding software agencies. Most people are using Turbopuffer because of the price point.
[01:11:54 - 01:12:01] So, as you can see, I queried this many of them. Now, if you do the same
[01:12:02 - 01:12:09] locally, you can imagine the cost of the hardware, right? Right.
[01:12:10 - 01:12:11] Yes. Yeah.
[01:12:12 - 01:12:16] It's not even possible. But why did you recommend Lance instead of this one?
[01:12:17 - 01:12:18] Just minutes ago.
[01:12:19 - 01:12:25] This one is paid; for the experimentation, it would only be extra spend.
[01:12:26 - 01:12:31] You can just use LanceDB on your machine. Okay, so you can just download it and use it locally.
[01:12:32 - 01:12:33] Okay. Yes.
[01:12:34 - 01:12:40] And do you know if, underneath, it's based on any of those, or is it just their own thing?
[01:12:41 - 01:12:48] I did not... you mean underneath? I mean, this is a cloud service that underneath has an actual vector database.
[01:12:49 - 01:12:56] Is it, you know, their own, or are they just grabbing one and making it cloud-enabled?
[01:12:57 - 01:13:07] I think, on the backend, what they are using is, they have efficient cloud engineers, and they are kind of managing it on AWS,
[01:13:08 - 01:13:14] if I'm assuming it correctly. They have good cloud engineers to make it efficient.
[01:13:15 - 01:13:18] Yeah, because that's the thing. I mean, if you know AWS well enough, maybe you don't need this.
[01:13:19 - 01:13:25] Just go straight to AWS. But then it's going to increase the complexity, because now you will need the cloud skills to maintain all that.
[01:13:26 - 01:13:27] Yeah. Yes.
[01:13:28 - 01:13:31] Now you will need the cloud engineering skills to maintain those things. Yeah.
[01:13:32 - 01:13:45] Of course, to get started it may not be a thing, but when you're seriously going into production, scaling these up, maybe cutting costs... Yeah.
[01:13:46 - 01:13:58] Yeah. And because AWS provides all the services on the cloud, whatever you can think of.
[01:13:59 - 01:14:08] Basically, you can build these on AWS. And then they are just making it user-friendly and efficient,
[01:14:09 - 01:14:15] so that it can be used faster by the user. You don't have to go through all of that; you can just basically go through their docs.
[01:14:16 - 01:14:26] And we don't need to deploy anything or do anything. We just go through the docs and start storing vectors, metadata, and all sorts of other things in here.
[01:14:27 - 01:14:41] But do they use an AWS vector database, or did they build their own and just deploy it on EC2 or in AWS? I think it's EC2.
[01:14:42 - 01:14:51] It's not like the literal AWS vector DB. Because right now they are at scale,
[01:14:52 - 01:15:05] so it might be a combination of AWS and their own hardware, because these platforms ramped up very quickly and they did not have enough time to scale up their hardware.
[01:15:06 - 01:15:18] So what they're doing now is serving these things from AWS, but then I think they are going to start investing in hardware and move off the cloud to some extent at some point.
