
Updated 6 months ago

Retrieval

At a glance

The post describes a retrieval issue: with a PDF of only a few pages, the target document appears in the top 20 results, but with a 100-page PDF it is returned only after 100 items. The comments note that retrieval in LlamaIndex ranks content by similarity to the query, so it should not matter if the required content sits after 100 items in the dataset. One community member suggests setting similarity_top_k to 20 to return the top 20 results for the query. Another community member mentions the need to optimize the query or apply query transformation so that the target answer appears in the top 20 results. The community members also discuss how to use retrieval and a query engine together, and how to fetch the retrieved nodes used to generate the response.

If I use just a few pages, I can get the result in the top 20, but when the PDF has 100 pages, the target doc is returned only after 100 items.
10 comments
Retrieval in LlamaIndex uses a similarity algorithm to find the content in your dataset that is nearest to your query.

So it should not be an issue even if the required content sits after 100 items.
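To make the point above concrete, here is a minimal sketch of similarity-based top-k ranking in plain Python. It is not LlamaIndex's implementation; the function names and the toy 2-D vectors are hypothetical stand-ins for real embeddings. It shows that a chunk's rank depends on its similarity score, not on where it is stored.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k):
    """Return the indices of the k chunks most similar to the query, best first."""
    scored = [
        (cosine_similarity(query_vec, vec), idx)
        for idx, vec in enumerate(chunk_vecs)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


# Toy 2-D "embeddings" (hypothetical values, not real model output).
query = [1.0, 0.0]
chunks = [
    [0.0, 1.0],  # chunk 0: unrelated
    [0.5, 0.5],  # chunk 1: partially related
    [0.1, 0.9],  # chunk 2: mostly unrelated
    [0.9, 0.1],  # chunk 3: the target, stored last
]

ranked = top_k(query, chunks, k=2)
print(ranked)  # the target (index 3) ranks first despite being stored last
```

The flip side is that every extra chunk is another competitor: in a 100-page PDF, many chunks may score higher than the target for a loosely worded query, which pushes the target further down the ranking.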
Yes, you are correct.
Is there any doc about optimizing retrieval results? My expectation is the top 20.
If you want the top 20 results for your query, just set the top-k value to 20:

query_engine = index.as_query_engine(similarity_top_k=20)
OK, I think I have to optimize my query or do query transformation so the target answer shows up in the top 20.
How do I implement retrieval and a query engine together?
If you define your query_engine and run a query, then along with the LLM response you also get the retrieved nodes used to generate the response.
response = query_engine.query("ask your query here")
# Retrieved nodes can be fetched as
print(response.source_nodes)
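To check where the target chunk actually lands, it can help to print each retrieved node with its rank and score rather than the raw list. The helper below is a sketch: `describe_source_nodes` is a hypothetical name, and it assumes each item in `response.source_nodes` exposes `.score` and `.node.get_content()`, as LlamaIndex's `NodeWithScore` does to my knowledge. A stand-in response object is used so the example runs without an index or an LLM.

```python
from types import SimpleNamespace


def describe_source_nodes(response, max_chars=80):
    """Return one 'rank. score=... text' line per retrieved node."""
    lines = []
    for rank, node_with_score in enumerate(response.source_nodes, start=1):
        text = node_with_score.node.get_content()
        score = node_with_score.score
        # Some retrievers may not report a score at all.
        score_text = "n/a" if score is None else f"{score:.3f}"
        lines.append(f"{rank}. score={score_text} {text[:max_chars]}")
    return lines


# Stand-in for a real query response (hypothetical data).
fake_response = SimpleNamespace(
    source_nodes=[
        SimpleNamespace(
            score=0.91,
            node=SimpleNamespace(get_content=lambda: "Chunk about the target topic"),
        ),
        SimpleNamespace(
            score=0.42,
            node=SimpleNamespace(get_content=lambda: "Less relevant chunk"),
        ),
    ]
)

for line in describe_source_nodes(fake_response):
    print(line)
```

With a real engine, passing the object returned by `query_engine.query(...)` in place of `fake_response` should work the same way, under the assumption stated above.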
I’m using the create-llama package and have edited index.py for local LLM and embedding usage.
How many arguments does query_engine take? Can the top-k argument be included along with tree_summarize?
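On combining the two, a minimal sketch, assuming `as_query_engine` forwards its keyword arguments to the retriever and the response synthesizer (which is how I understand LlamaIndex to behave): `similarity_top_k` and `response_mode="tree_summarize"` can be passed in the same call. `build_query_engine` and `RecordingIndex` are hypothetical names; the stand-in index only records what it receives so the example runs without LlamaIndex installed.

```python
def build_query_engine(index, top_k=20):
    """Build a query engine that retrieves top_k nodes and tree-summarizes them.

    Assumes `index` is a LlamaIndex index object exposing as_query_engine().
    """
    return index.as_query_engine(
        similarity_top_k=top_k,          # how many nodes the retriever returns
        response_mode="tree_summarize",  # how the retrieved nodes are combined
    )


class RecordingIndex:
    """Hypothetical stand-in that only records the keyword arguments it receives."""

    def as_query_engine(self, **kwargs):
        return kwargs


print(build_query_engine(RecordingIndex(), top_k=20))
```

With a real index, the returned object would be a query engine, and `query_engine.query(...)` would then summarize over the 20 retrieved nodes.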