Why does index always work on the last


Why does index always work on the last index?


Can you elaborate on your query?

Here is a made-up example (the real data is obscured). When I searched for the author of "Wonders of the Unknown", the first retrieved result was: (Exploring the Unknown: book title. Author: Beebe; published in English in 1923 under the title "Wonders of the Unknown"). The tenth retrieved result was: (Frontiers of Knowledge: The Scientific Exploration Program is a joint project of international research institutions. The goal of the program is to push the boundaries of science and unravel the mysteries of the natural world). The tenth result is irrelevant to the question, yet the generated answer falsely claims that the authors are international research organizations.

kkrieo

query_engine = index.as_query_engine(similarity_top_k=10)
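One way to see what `similarity_top_k=10` actually returns is to list each retrieved node with its similarity score before looking at the generated answer. The sketch below assumes the response exposes `source_nodes`, each with a `score` and a `node.get_content()` method, as LlamaIndex responses do. `summarize_sources`, `_fake_node`, and the stand-in data with their scores are hypothetical, made up for illustration.

```python
from types import SimpleNamespace


def summarize_sources(source_nodes, preview_chars=60):
    """Return (rank, score, text preview) for each retrieved node, in retrieval order."""
    rows = []
    for rank, item in enumerate(source_nodes, start=1):
        text = item.node.get_content()
        rows.append((rank, item.score, text[:preview_chars]))
    return rows


def _fake_node(text, score):
    # Stand-in for a retrieved node with a score (illustrative only, not real data).
    return SimpleNamespace(score=score, node=SimpleNamespace(get_content=lambda: text))


fake_sources = [
    _fake_node("Exploring the Unknown: book title. Author Beebe ...", 0.86),
    _fake_node("Frontiers of Knowledge: The Scientific Exploration Program ...", 0.41),
]

for rank, score, preview in summarize_sources(fake_sources):
    print(f"{rank:>2}  score={score:.2f}  {preview}")
```

With a real query, the same helper could be called as `summarize_sources(query_engine.query("...").source_nodes)` to check whether the tenth node really scores much lower than the first.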

WWhiteFang_Jr

Okay, so while answering it is pulling in irrelevant source nodes?

You can try the SimilarityPostprocessor to set a threshold value, which drops retrieved nodes whose similarity score falls below that threshold.


# Note: in llama_index 0.10 and later the import path is llama_index.core.postprocessor
from llama_index.postprocessor import SimilarityPostprocessor

query_engine = index.as_query_engine(
    similarity_top_k=10,
    node_postprocessors=[SimilarityPostprocessor(similarity_cutoff=0.7)],
)
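To make the effect of the cutoff concrete, here is a minimal plain-Python sketch of threshold filtering. It is not the library's implementation: `apply_similarity_cutoff` is a hypothetical helper, the scores are invented, and keeping nodes whose score equals the cutoff is an assumption.

```python
def apply_similarity_cutoff(scored_nodes, cutoff):
    """Keep only (text, score) pairs whose score is at least `cutoff`."""
    return [(text, score) for text, score in scored_nodes if score >= cutoff]


# Invented scores for the two results described in the question.
retrieved = [
    ("Exploring the Unknown ...", 0.86),
    ("Frontiers of Knowledge ...", 0.41),
]

kept = apply_similarity_cutoff(retrieved, cutoff=0.7)
print(kept)
```

Under these made-up scores the low-scoring passage never reaches the LLM, so it cannot influence the final answer. If the cutoff is set too high, every node may be filtered out, so it is worth checking real scores before picking a value.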

WWhiteFang_Jr

https://docs.llamaindex.ai/en/stable/module_guides/querying/node_postprocessors/node_postprocessors.html#similaritypostprocessor

kkrieo

Thank you. This question is still not resolved, but my other questions were answered well.
