
Updated 2 months ago

Hi all, I'm using a DatabaseReader to

Hi all! I'm using a DatabaseReader to read a simple table and convert it to docs for further indexing. However, when I look at the table, I don't see the field (column) names from the database. Is this expected behavior? The model sometimes acts weird, responding that it does not have the data when it's clearly in the table. I suspect this might be the reason. I'm attaching a screenshot of the docs with fictional data. Thanks!
Attachment
docs.png
4 comments
Yeah, that's expected. Since it's just running a SQL query, I'm not 100% sure there's an easy way to get the column names.
https://github.com/emptycrown/llama-hub/blob/374fb7a7f4a99aca72d028e74459df99779886cc/llama_hub/database/base.py#L86

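But you could just add the column names to the metadata to help with this:
# "mytable" is a placeholder; use your own table name
documents = loader.load_data("SELECT mycol1, mycol2 FROM mytable")
for doc in documents:
    # Record which columns the values in each row came from
    doc.metadata['columns'] = "mycol1, mycol2"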
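If hard-coding the column list is not practical, one option is to read the names from the database driver yourself. The sketch below is only an illustration using Python's built-in sqlite3 module, not part of DatabaseReader; the `sales` table, its data, and the helper name `rows_as_labeled_text` are all made up for the example.

```python
import sqlite3


def rows_as_labeled_text(conn, query):
    """Run `query` and return one 'column: value' string per row."""
    cursor = conn.execute(query)
    # cursor.description has one entry per result column; item 0 is the name.
    columns = [col[0] for col in cursor.description]
    return [
        ", ".join(f"{name}: {value}" for name, value in zip(columns, row))
        for row in cursor.fetchall()
    ]


# Hypothetical table and fictional data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (quarter TEXT, revenue INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", [("Q1", 100), ("Q2", 150)])

texts = rows_as_labeled_text(
    conn, "SELECT quarter, revenue FROM sales ORDER BY quarter"
)
print(texts)
```

Each string carries its own column labels, so text built this way could be wrapped into documents without the model having to guess what the values mean.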
Thanks, I will try that. Would you know why the model only has part of the data from the table? For example, in a chat, the model responds that it does not have data for Q1 and Q4, even though it is in the database (I just use SELECT * FROM table and then load the result into the index).
It sounds like you are using a vector index, right?

Since each document is a row and the default top k is 2, it only retrieves 2 rows from your index to answer each query.

This is probably not what you intended.
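To make the effect concrete, here is a toy sketch of top-k retrieval over one-document-per-row data. It is not how LlamaIndex scores documents (a real vector index compares embeddings); the word-overlap `score` function and the four fictional rows are assumptions made purely for illustration.

```python
import re


def score(query, text):
    """Toy similarity: number of lowercase words shared by query and text."""
    words = lambda s: set(re.findall(r"\w+", s.lower()))
    return len(words(query) & words(text))


def retrieve(query, docs, top_k=2):
    """Return only the top_k best-scoring docs; the rest never reach the model."""
    ranked = sorted(docs, key=lambda d: score(query, d), reverse=True)
    return ranked[:top_k]


# Fictional rows, one document per row.
docs = [
    "quarter: Q1, revenue: 100",
    "quarter: Q2, revenue: 150",
    "quarter: Q3, revenue: 120",
    "quarter: Q4, revenue: 180",
]

hits = retrieve("revenue for each quarter", docs)
print(len(hits), "of", len(docs), "rows reach the model")
# prints: 2 of 4 rows reach the model
```

With a top k of 2, two of the four quarters are simply absent from the context, which matches the "no data for Q1 and Q4" symptom. In LlamaIndex the value can be raised with `index.as_query_engine(similarity_top_k=...)`, although that only postpones the problem for larger tables.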

The ideal approach is to use text2sql rather than indexing rows of data, especially for highly numerical data.
https://gpt-index.readthedocs.io/en/stable/end_to_end_tutorials/structured_data/sql_guide.html
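As a rough illustration of why SQL suits numerical questions better, the sketch below runs the kind of query a text-to-SQL step might generate. The SQL string is hand-written here (no LLM or LlamaIndex call is involved), and the `sales` table and its numbers are fictional.

```python
import sqlite3

# Hypothetical table with fictional data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (quarter TEXT, revenue INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("Q1", 100), ("Q2", 150), ("Q3", 120), ("Q4", 180)],
)

# The sort of SQL that might be generated for the question
# "What was the total revenue for the year?" (assumed, written by hand).
generated_sql = "SELECT SUM(revenue) FROM sales"
total = conn.execute(generated_sql).fetchone()[0]
print(total)  # 550
```

The database aggregates over every row, so the answer does not depend on which rows a retriever happened to return.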

If you feel you still need similarity search on rows, you can use the SQL join query engine to combine the two approaches:
https://gpt-index.readthedocs.io/en/stable/examples/query_engine/SQLJoinQueryEngine.html
That makes sense! I didn't realize that it only retrieves 2 rows at a time. Thanks a lot for the clarification.