I'm Developing an In-House RAG Solution for Retrieving Company Data from Diverse Documents
At a glance
The community member is working on an in-house Retrieval Augmented Generation (RAG) solution to retrieve chunks of data from a diverse set of documents, including annual reports, earnings call transcripts, expert call transcripts, and equity research reports. They are seeking recommendations for the most precise RAG solution that can handle this diverse data set.
In the comments, another community member suggests starting with normal retrieval, then trying a hybrid approach, and finally adding a reranker to see if it improves the results. They note that there is no definitive way, and the best approach will depend on the specific data.
Hey guys! I've been working on my in-house RAG solution, which should retrieve chunks from a diverse set of data that I have collected about a single company. The data consists of annual reports (past 5 years), earnings call transcripts (past 5 years), expert call transcripts, and some equity research reports. Does anyone know of the most precise RAG solution for retrieving from such a diverse set of documents? Thank you and happy new year!
You can start with normal retrieval, then try hybrid retrieval, and then check whether adding a reranker improves the results. There is no definitive way: what works for one person may not work for others, so it will be up to you to experiment and see which approach suits your data.
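To make the three steps easier to compare side by side, here is a minimal sketch in plain Python. It rests on several assumptions that are not from the thread: "normal retrieval" is read as vector similarity search, the bag-of-words cosine in `vector_rank` is only a stand-in for a real embedding model, `keyword_rank` is a simple TF-IDF score standing in for BM25, the two rankings are merged with reciprocal rank fusion (one common way to do hybrid search, not the only one), and `reranker` is a hypothetical callable you would supply yourself. All function names are illustrative.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_rank(query, chunks):
    """Rank chunk indices by a simple TF-IDF score (stand-in for BM25)."""
    docs = [Counter(tokenize(c)) for c in chunks]
    n = len(docs)
    scores = []
    for doc in docs:
        score = 0.0
        for term in set(tokenize(query)):
            df = sum(1 for other in docs if term in other)
            if df:
                score += doc[term] * math.log(1 + n / df)
        scores.append(score)
    return sorted(range(n), key=lambda i: -scores[i])


def vector_rank(query, chunks):
    """Rank chunk indices by cosine similarity of bag-of-words vectors.

    Assumption: in a real system this would use embeddings instead.
    """
    def cosine(a, b):
        dot = sum(a[t] * b[t] for t in a)
        norm_a = math.sqrt(sum(v * v for v in a.values()))
        norm_b = math.sqrt(sum(v * v for v in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    q = Counter(tokenize(query))
    scores = [cosine(q, Counter(tokenize(c))) for c in chunks]
    return sorted(range(len(chunks)), key=lambda i: -scores[i])


def rrf(rankings, k=60):
    """Reciprocal rank fusion: sum 1 / (k + rank) over all rankings."""
    fused = Counter()
    for ranking in rankings:
        for rank, idx in enumerate(ranking, start=1):
            fused[idx] += 1.0 / (k + rank)
    return sorted(fused, key=lambda i: -fused[i])


def retrieve(query, chunks, mode="hybrid", top_k=3, reranker=None):
    """Return the top_k chunks for one of: 'vector', 'keyword', 'hybrid'."""
    if mode == "vector":
        order = vector_rank(query, chunks)
    elif mode == "keyword":
        order = keyword_rank(query, chunks)
    elif mode == "hybrid":
        order = rrf([vector_rank(query, chunks), keyword_rank(query, chunks)])
    else:
        raise ValueError(f"unknown mode: {mode}")
    results = [chunks[i] for i in order[:top_k]]
    # Optional second stage: reranker(query, results) returns a reordered list.
    if reranker is not None:
        results = reranker(query, results)
    return results
```

Running the same set of test questions through `mode="vector"`, then `mode="hybrid"`, then hybrid with a reranker, and checking which setup returns the chunks you expect, is one way to do the trial-and-error comparison described above on your own reports and transcripts.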