I would like to run a query on my index. However, before running the query I do not know whether the index has information relevant to it. If it doesn't, I would like to fetch data from a database, add that data to the index, and then run the query. Here is how it might look:
Plain Text
query = 'xyz'
if index capable of answering the query:
    #run the query on index
else:
    #fetch data
    #add the data to the index
    #run the query by index

Has anyone done something like this?
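The branch-and-backfill flow in the pseudocode above can be sketched in plain Python. Here a dict stands in for the index, and `can_answer`, `fetch_data` and `add_to_index` are hypothetical helpers, not LlamaIndex calls; with a real vector index you would check retrieval scores and call `index.insert()` on new documents instead.

```python
# Runnable sketch of the check-then-backfill flow, with a plain dict as
# a stand-in for the index. All helper names here are illustrative.

def can_answer(index, query):
    # Stand-in check; a real version might compare similarity scores.
    return query in index

def fetch_data(query):
    # Stand-in for a database lookup.
    return f"data about {query}"

def add_to_index(index, query, data):
    index[query] = data

def run_query(index, query):
    return index.get(query, "no answer")

index = {"abc": "data about abc"}
query = "xyz"

if not can_answer(index, query):
    # Backfill the index before querying.
    add_to_index(index, query, fetch_data(query))

answer = run_query(index, query)
print(answer)  # data about xyz
```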
Please stand by @Rouzbeh as I search the knowledge base. We are experiencing especially high traffic right now and might take longer than usual. :wave:
----------------------
Yes, you can achieve this by using LlamaIndex with an SQL database. First, you need to build an SQL index using the GPTSQLStructStoreIndex.from_documents() method, as shown in this source (https://gpt-index.readthedocs.io/en/latest/examples/query_engine/SQLRouterQueryEngine.html):

Plain Text
sql_database = SQLDatabase(engine, include_tables=["city_stats"])

sql_index = GPTSQLStructStoreIndex.from_documents(
    [], 
    sql_database=sql_database, 
    table_name="city_stats",
)


Then, you can run a query on the index using the query() method. If the index is not capable of answering the query, you can fetch the data from the database, add it to the index, and then run the query again. Here's an example of how you can achieve this:

Plain Text
query = "xyz"

# Check if the index is capable of answering the query
response = index.query(query, mode="default")

if response.response:  # The index was able to answer the query
    print(response)
else:
    # Fetch data from the database
    # Add the data to the index (e.g. via index.insert())
    # Run the query on the index again
    response = index.query(query, mode="default")
    print(response)


Keep in mind that this is a general example, and you might need to adapt it to your specific use case and data fetching process.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
I think you could run a sort of "test query" and check the similarity of the source nodes. If the similarity is too low, then add the data to the index (this assumes you are using a vector index though)

query_engine = index.as_query_engine(response_mode="no_text")

Then when you query, you can check the similarity of each source node

for source in response.source_nodes:
    print(source.score)
Just an idea though, not sure how well it will work lol
With the default embedding model, anything below 0.77 similarity is probably not helpful. It might take some testing though
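The threshold check suggested above could look something like this; `index_can_answer` is a hypothetical helper name, and the 0.77 cutoff is the rule-of-thumb value from the comment above, which will need tuning for your data. In LlamaIndex the scores would come from `response.source_nodes` after querying with `response_mode="no_text"`.

```python
# Sketch of a similarity-threshold check over retrieved node scores.
DEFAULT_THRESHOLD = 0.77  # rule of thumb for the default embedding model

def index_can_answer(scores, threshold=DEFAULT_THRESHOLD):
    """Return True if at least one retrieved node is similar enough."""
    return bool(scores) and max(scores) >= threshold

# Example scores like those printed from source.score:
print(index_can_answer([0.83, 0.82]))  # True  (high similarity)
print(index_can_answer([0.61, 0.55]))  # False (too low -> backfill)
print(index_can_answer([]))            # False (nothing retrieved)
```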
Thanks for the tip @Logan M I'll try this out 🤩
@Logan M I tried it and the response I get is
Plain Text
I'm sorry, I cannot answer the question as the context information provided is ...
 source_nodes: [
    { node: [Object], score: 0.8328059527937333 },
    { node: [Object], score: 0.8255972873586926 }
  ]

The scores seem a bit high, but the response clearly says it cannot answer the question. I need to figure out a programmatic way to infer from the response that it cannot answer the question
Basically, it is confident that it doesn't know the answer
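One crude programmatic heuristic for this situation: even when scores are high, the LLM may still refuse, so scan the response text for refusal phrases. The phrase list and function name below are assumptions that would need tuning to your model's typical wording; it's a sketch, not a robust classifier.

```python
# Heuristic refusal detector for LLM responses. The phrase list is an
# assumption based on the refusal seen above and will need tuning.
REFUSAL_PHRASES = (
    "cannot answer",
    "can't answer",
    "context information provided",
    "not enough information",
)

def looks_like_refusal(response_text):
    text = response_text.lower()
    return any(phrase in text for phrase in REFUSAL_PHRASES)

print(looks_like_refusal(
    "I'm sorry, I cannot answer the question as the context ..."
))  # True
print(looks_like_refusal("Paris is the capital of France."))  # False
```

If a refusal is detected, you could fall back to the fetch-and-insert branch even when the similarity scores looked acceptable.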
Yea, seems like the problem is a little more tricky then 🤔