Which type of index are you using? mode="embedding"
is only needed for list and tree indexes.
You can try increasing the top k as well: response = index.query(..., similarity_top_k=3)
Also, you can check the response object to see which nodes were used to create the answer: response.source_nodes
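To make the idea concrete, here's a tiny pure-Python sketch of what similarity_top_k does conceptually (this is not LlamaIndex's actual implementation, just an illustration: score each stored node's embedding against the query embedding and keep the k best; the node texts and 2-d embeddings are made up):

```python
import math

def cosine_similarity(a, b):
    # Standard cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k_nodes(query_embedding, nodes, k=3):
    """nodes is a list of (text, embedding) pairs; return the k most similar."""
    scored = [(cosine_similarity(query_embedding, emb), text)
              for text, emb in nodes]
    scored.sort(reverse=True)  # highest similarity first
    return scored[:k]

# Toy "index" of three nodes with fake 2-d embeddings.
nodes = [
    ("installation guide", [0.9, 0.1]),
    ("api reference",      [0.2, 0.8]),
    ("changelog",          [0.5, 0.5]),
]
print(top_k_nodes([1.0, 0.0], nodes, k=2))
```

Raising k just widens that cut, so more candidate nodes reach the LLM, which is why it can rescue answers the top-1 match misses.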
At the end of the day though, it's up to the LLM to figure out the final answer. Even if the correct answer is in the source nodes, they aren't always the smartest lol. But usually it should be working
this is really helpful, thanks @Logan M . Currently I'm using a simple vector index, I simply don't know what's the best index for my case. I want to index around 15 markdown documents and use the chat to provide answers about these documents. It seems a list index is more "precise" for documentation, but queries are more expensive since it needs to iterate the whole list for every query, right?
Exactly! A vector index should work fine though for that case as well, you might just need a higher top_k. You can also set response_mode="compact"
to speed up response times in the query call. (It will stuff as much as it can into each LLM call, rather than one call per node)
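A rough sketch of the difference between the default one-call-per-node refine and compact mode: compact greedily packs node texts into as few prompts as fit under a context budget. The packing logic and the 1000-character budget here are illustrative only, not LlamaIndex's internals:

```python
def pack_compact(node_texts, budget=1000):
    """Greedily stuff node texts into prompts of at most `budget` characters."""
    prompts, current = [], ""
    for text in node_texts:
        # Start a new prompt when the next node would overflow the budget.
        if current and len(current) + len(text) > budget:
            prompts.append(current)
            current = ""
        current += text
    if current:
        prompts.append(current)
    return prompts

# Five retrieved nodes of ~400 characters each.
nodes = ["a" * 400, "b" * 400, "c" * 400, "d" * 400, "e" * 400]
# Default refine mode would make one LLM call per node: 5 calls.
# Compact packs two 400-char nodes per 1000-char prompt: 3 calls.
print(len(pack_compact(nodes)))
```

Fewer LLM calls is where the speedup comes from, especially as similarity_top_k grows.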
A tree index might also be interesting to try, but it's a little expensive to build as it uses the LLM during index construction (the build cost would be similar to one list index query). But the queries will be more efficient than a list index
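A back-of-envelope sketch of that query-cost difference, measured in LLM calls, under the simplifying assumption that a list query touches every node while a tree query descends one level at a time (the branching factor of 10 is an assumption for illustration):

```python
import math

def list_query_calls(num_nodes):
    # A list index refines the answer across every node: one call per node.
    return num_nodes

def tree_query_calls(num_nodes, branching=10):
    # A tree query descends level by level, narrowing by `branching` each
    # time, so the call count grows roughly logarithmically in num_nodes.
    calls = 0
    while num_nodes > 1:
        num_nodes = math.ceil(num_nodes / branching)
        calls += 1
    return max(1, calls)

print(list_query_calls(100))  # 100 calls for a 100-node list index
print(tree_query_calls(100))  # 2 calls for the same data in a tree
```

That's the trade-off in a nutshell: pay the LLM cost once at build time, then every query is cheap.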
I'll try those adjustments! thanks a lot!
similarity_top_k=3 did the trick! now answers are sooo good! thanks a lot @Logan M
@Logan M I noticed that answers are really good locally but not on prod. Do you think this is related to CPU power? The very same question, super precise and simple, gets answered locally but not on the server. Although in both cases it's now using the documents, that's a huge progress
ok, in prod it improved by increasing the similarity_top_k to 5 and also I changed the response mode to compact. But I can see the right answer in node 4 (out of 5) with lower similarity than 1, 2 and 3, even though the exact words of my question are in number 4.
All this happens only on the server, locally the similarity scores make sense
Huh, that's super strange
Your index has the exact same data on the server and locally? Created with the same settings?
created with the exact same script, I've checked the beginning of the index and it looks the same. To be sure I'll use the local index on the server and see. Besides that, only hardware and OS are different
I will also try gpt3.5-turbo model and see how it goes
Yea, if the problem continues, maybe gather up a solid example and post an issue on the repo. To me, it should be working if everything is the exact same