
Updated 3 months ago

Hey all I’m looking to do RAG on my db

Hey all, I’m looking to do RAG on my DB records, where the columns are strings, floats, and ints. How should I think about doing this?

5 comments

lucastonon

Finding specific ints with semantic search might be challenging.
Take a look at how they process tables in this SEC demo:
https://docs.llamaindex.ai/en/latest/examples/query_engine/sec_tables/tesla_10q_table.html#try-table-comparisons

lucastonon

One thing I've been experimenting with, and it's worked well, is asking GPT to summarize the tables (columns + rows).
I usually store the text summary and add the extracted table (as CSV) to the metadata, then feed that table (along with the summary) to the LLM after retrieval.
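
A minimal sketch of that storage layout, using only the standard library. The summary string stands in for whatever GPT returns, and the names `build_record` and `build_prompt` are made up for illustration; they are not part of LlamaIndex or any other library.

```python
import csv
import io


def table_to_csv(columns, rows):
    """Serialize a table to a CSV string so it can live in metadata."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return buf.getvalue()


def build_record(summary, columns, rows):
    """Only `text` (the summary) is meant to be embedded; the CSV rides along."""
    return {"text": summary, "metadata": {"table_csv": table_to_csv(columns, rows)}}


def build_prompt(question, record):
    """After retrieval, give the LLM both the summary and the raw table."""
    return (
        f"Question: {question}\n\n"
        f"Table summary:\n{record['text']}\n\n"
        f"Full table (CSV):\n{record['metadata']['table_csv']}"
    )


# Hypothetical example data; the summary would normally come from the LLM.
record = build_record(
    "Quarterly revenue in millions.",
    ["quarter", "revenue"],
    [["Q1", 10.5], ["Q2", 12.0]],
)
prompt = build_prompt("Which quarter had higher revenue?", record)
```

Keeping the CSV out of the embedded text means retrieval is driven by the summary, while the exact numbers still reach the model at answer time.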

lucastonon

In your case, you might want to embed a summary or sample of the data (with the schema), maybe the first N rows.
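
Something like this could produce the text to embed. It's a rough sketch that assumes a SQLite database and a trusted table name (the name is interpolated into the query, so don't pass user input); `describe_table` and the `products` table are made up for the example.

```python
import sqlite3


def describe_table(conn, table, n_rows=3):
    """Return the schema plus the first n_rows rows as plain text."""
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    cols = conn.execute(f"PRAGMA table_info({table})").fetchall()
    schema = ", ".join(f"{c[1]} {c[2]}" for c in cols)
    rows = conn.execute(f"SELECT * FROM {table} LIMIT ?", (n_rows,)).fetchall()
    lines = [f"Table {table} ({schema})", f"First {len(rows)} rows:"]
    for row in rows:
        lines.append(", ".join(f"{c[1]}={v!r}" for c, v in zip(cols, row)))
    return "\n".join(lines)


# Hypothetical data for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, price REAL, stock INTEGER)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [("widget", 9.5, 12), ("gadget", 19.99, 0), ("doohickey", 4.25, 40), ("gizmo", 7.0, 3)],
)
text = describe_table(conn, "products", n_rows=3)
```

The resulting `text` is what you'd pass to your embedding model, one description per table.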

lucastonon

If you're dealing with a large table from your DB where you'd like to retrieve specific numbers, it would be better to use a SQL agent.
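
To illustrate why SQL fits here: once the agent has produced a query, exact numbers come straight from the database instead of from similarity search. This is only a sketch of the execution step, assuming SQLite; `run_select`, the `orders` table, and the `generated_sql` string are all hypothetical, and the SELECT-only check is a simple guard, not a full security measure.

```python
import sqlite3


def run_select(conn, sql, params=()):
    """Run a single SELECT statement and return all rows."""
    statement = sql.strip().rstrip(";")
    if not statement.lower().startswith("select") or ";" in statement:
        raise ValueError("only a single SELECT statement is allowed")
    return conn.execute(statement, params).fetchall()


# Hypothetical data for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "acme", 120.5), (2, "acme", 79.5), (3, "globex", 40.0)],
)

# Stand-in for the SQL an agent might generate from a natural-language question.
generated_sql = "SELECT SUM(total) FROM orders WHERE customer = ?"
result = run_select(conn, generated_sql, ("acme",))
```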

chirag

Thanks @lucastonon. I’m trying to build a fuzzy matching process where I have internal data and need to match it against external data. I’m building a RAG pipeline that embeds each row of the internal database using the columns I need for matching. Then I embed a row with the similar columns from the external data, retrieve the 5 most similar internal rows, and use the LLM to reason about which one is the closest/best match.
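
The pipeline described above might be sketched roughly like this. Big assumption: `embed` here is a character-trigram count vector, used only as a stand-in so the example runs without an embedding model; in practice you'd swap in a real one. All function names and the sample rows are hypothetical.

```python
import math
from collections import Counter


def row_to_text(row, columns):
    """Flatten the matching columns of a row into one string."""
    return " | ".join(f"{col}: {row[col]}" for col in columns)


def embed(text, n=3):
    """Stand-in embedding: counts of character trigrams (NOT a real model)."""
    s = f" {text.lower()} "
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(external_row, internal_rows, columns, k=5):
    """Return the k internal rows most similar to the external row."""
    query = embed(row_to_text(external_row, columns))
    scored = [(cosine(query, embed(row_to_text(r, columns))), r) for r in internal_rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_match_prompt(external_row, candidates, columns):
    """Prompt asking the LLM to pick the best candidate (the LLM call is omitted)."""
    lines = ["External record:", row_to_text(external_row, columns), "", "Candidates:"]
    for i, (score, row) in enumerate(candidates, start=1):
        lines.append(f"{i}. {row_to_text(row, columns)} (similarity {score:.2f})")
    lines.append("")
    lines.append("Which candidate is the best match? Answer with its number, or 'none'.")
    return "\n".join(lines)


columns = ["name", "city", "revenue"]
internal = [
    {"name": "Acme Corporation", "city": "Berlin", "revenue": 1200000.0},
    {"name": "Globex Inc", "city": "Paris", "revenue": 300000.0},
    {"name": "Initech LLC", "city": "Austin", "revenue": 50000.0},
]
external = {"name": "ACME Corp.", "city": "Berlin", "revenue": 1195000.0}

candidates = top_k(external, internal, columns, k=5)
prompt = build_match_prompt(external, candidates, columns)
```

Given the earlier point about ints being hard for semantic search, it may be worth comparing the numeric columns directly (for example, relative difference in `revenue`) and passing that to the LLM alongside the text similarity.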
