Retrieval 1 - Steps to Perform RAG
As we saw at the link given below, LLMs are trained on large amounts of data.
https://gaillms.blogspot.com/2024/01/training-llm-dataset.html

But this data does not include proprietary data or data that is not freely available on the internet. When we give a prompt to an LLM, it replies based on the data on which it was trained. The problem arises when we want answers about data that the LLM was not trained on. Two options are generally considered for this purpose:

1. Model Fine Tuning
2. Retrieval Augmented Generation (RAG)

Based on our requirements, we can decide between the two. Our topic of discussion is RAG.

Retrieval
This means retrieving external data such as PDF files, HTML pages, videos, etc.

Generation
This refers to generating output using the external data that was retrieved.

The steps used in RAG are as follows:

Why do we need RAG?
Consider a PDF document of 1000 pages about my company. This is the context within which we pose questions to the LLM. That is, the prompt expects answers based on the document. For...