Updated 2 months ago

excel

Hello, can anyone help me with this?
How can I create embeddings from Excel files and store them in a vector DB for information retrieval?
6 comments
Try this: https://llamahub.ai/l/file-pandas_excel

You can also parse the Excel file yourself and create Document objects.

doc = Document(text="text from a row?")
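A minimal sketch of the row-per-document idea, assuming the rows have already been read into dicts (for example via `pandas.read_excel(path).to_dict("records")`). The helper names `row_to_text` and `rows_to_texts` and the sample columns are made up for illustration; each resulting string could then be used as the `text` of a `Document`, as in the line above.

```python
from typing import Any, Iterable


def _is_empty(val: Any) -> bool:
    """True for None, empty strings, and NaN (NaN is the only value != itself)."""
    return val is None or val == "" or val != val


def row_to_text(row: dict[str, Any]) -> str:
    """Render one spreadsheet row as 'column: value' pairs, skipping empty cells."""
    return ", ".join(f"{col}: {val}" for col, val in row.items() if not _is_empty(val))


def rows_to_texts(rows: Iterable[dict[str, Any]]) -> list[str]:
    """One text per row; each text could become the body of one document."""
    return [row_to_text(r) for r in rows]


# Hypothetical sample rows standing in for a parsed sheet.
rows = [
    {"region": "North", "quarter": "Q1", "revenue": 1200},
    {"region": "South", "quarter": "Q1", "revenue": None},
]
texts = rows_to_texts(rows)
```

Keeping the column name next to each value gives the embedding model some context for otherwise bare numbers.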

Although if the Excel sheet is highly numeric, you might be more interested in using a pandas query engine, or converting it to SQLite and using SQL.

https://gpt-index.readthedocs.io/en/latest/examples/query_engine/pandas_query_engine.html

https://gpt-index.readthedocs.io/en/latest/end_to_end_tutorials/structured_data/sql_guide.html
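For the SQLite route, a rough sketch using only the standard library might look like this. The table name, columns, and rows are hypothetical; with pandas, `DataFrame.to_sql` can do the load step in one call.

```python
import sqlite3

# Hypothetical rows, as they might come out of a numeric spreadsheet.
rows = [
    ("North", "Q1", 1200.0),
    ("North", "Q2", 1350.0),
    ("South", "Q1", 900.0),
]

conn = sqlite3.connect(":memory:")  # pass a file path instead to persist the data
conn.execute("CREATE TABLE sales (region TEXT, quarter TEXT, revenue REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
conn.commit()

# Aggregations are computed exactly by SQL over every row in the table.
totals = conn.execute(
    "SELECT region, SUM(revenue) FROM sales GROUP BY region ORDER BY region"
).fetchall()
```

Sums, averages, and filters over numeric columns are the kind of question SQL answers precisely, which is why it tends to suit numeric sheets better than similarity search.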
When querying, PandasQueryEngine calls the LLM for an answer. Does it pass the complete data frame in the prompt when calling the LLM?
When I set verbose to True, I couldn't see the prompt. How can I check it?
Okay, but it answers correctly for data anywhere in the Excel sheet. Are any intermediate calls made to get this?

But my main issue is that I want to embed an Excel sheet that contains highly numeric data. PandasQueryEngine is working fine, but I need to achieve the same with embeddings. Is there a way?
No intermediate calls. It just looks at the df.head() output and the user query, generates some Python code that is evaluated, and then returns the output 🙂
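To illustrate that flow, here is a toy sketch. It is not the library's actual implementation: it uses plain lists of dicts instead of a dataframe, and `fake_llm`, `build_prompt`, and `answer` are hypothetical names, with `fake_llm` standing in for a real model call.

```python
from typing import Callable

# Hypothetical data standing in for the dataframe.
rows = [
    {"region": "North", "revenue": 1200},
    {"region": "South", "revenue": 900},
    {"region": "East", "revenue": 400},
]


def build_prompt(rows: list[dict], query: str, n: int = 5) -> str:
    """Only the first n rows (the 'head') go into the prompt, not the full table."""
    preview = "\n".join(str(r) for r in rows[:n])
    return (
        f"First rows of `rows`:\n{preview}\n"
        f"Question: {query}\n"
        "Reply with a single Python expression over `rows`."
    )


def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call; always returns a canned expression."""
    return "sum(r['revenue'] for r in rows)"


def answer(rows: list[dict], query: str, llm: Callable[[str], str]) -> object:
    code = llm(build_prompt(rows, query))
    # The generated code is evaluated against the FULL data, which is why the
    # result can depend on rows the model never saw in the prompt.
    # Evaluating model output is unsafe outside a sandbox; this is a demo only.
    allowed = {"sum": sum, "len": len, "max": max, "min": min}
    return eval(code, {"__builtins__": allowed, "rows": rows})


result = answer(rows, "What is the total revenue?", fake_llm)
```

The model only needs the column layout to write the code; the computation itself happens locally over all rows.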

If you need embeddings, I suggest moving the dataframe to a SQLite database and using the SQL + vector query engine, where the vector engine is just each row embedded. https://github.com/jerryjliu/llama_index/blob/main/docs/examples/query_engine/SQLJoinQueryEngine.ipynb
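As a rough picture of the vector half ("each row embedded"), the sketch below uses a bag-of-words counter as a stand-in for a real embedding model. `toy_embed`, `cosine`, `top_row`, and the sample rows are all assumptions for illustration, not part of any library.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Bag-of-words stand-in for a real embedding model."""
    return Counter(text.lower().replace(",", " ").replace(":", " ").split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical rows, each already rendered as one text.
row_texts = [
    "product: laptop, notes: lightweight model for travel",
    "product: desktop, notes: tower case with large fans",
    "product: monitor, notes: curved ultrawide display",
]
index = [(text, toy_embed(text)) for text in row_texts]  # one vector per row


def top_row(query: str) -> str:
    """Return the row text most similar to the query."""
    q = toy_embed(query)
    return max(index, key=lambda item: cosine(q, item[1]))[0]
```

Similarity search like this works best on descriptive text columns; exact numeric questions are usually better left to the SQL side.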