Retrieving relevant documents from llama index with chromadb vector store

At a glance

A community member is building a Retrieval-Augmented Generation (RAG) system with LlamaIndex, using ChromaDB as the vector store. The source nodes returned with a response include documents that contain nothing related to the query. Another member suggests a node postprocessor, SimilarityPostprocessor, which drops nodes whose similarity score falls below a set threshold. The original poster doubts this will help, because many of their genuine questions also score below 70. They ask for alternatives, and the thread ends without one.

ppaapi

Hi, I am working on RAG with llama index and chromadb as the vector store. While querying, I am trying to retrieve the document used to answer the query. The issue with using response.source_nodes is that even for a query like the one in the image, it still returns documents where the topic is not mentioned at all, so I am not sure how to fix this. Is there an alternative that would work for my requirement?

Attachment

7 comments

WWhiteFang_Jr

Does your document contain info about vector embedding?

ppaapi

Nope, it does not. So I am wondering: if there is no info about vector embedding, then why is it showing the document in the node?

WWhiteFang_Jr

It may have caught something similar in the given node, since your query is matched against the docs using cosine similarity.
You can use node_postprocessors like https://docs.llamaindex.ai/en/stable/module_guides/querying/node_postprocessors/node_postprocessors/#similaritypostprocessor

This eliminates nodes below the set threshold value, so anything retrieved with a score below the threshold is removed.
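
To make the threshold idea concrete, here is a minimal pure-Python sketch of what a similarity cutoff does. It does not call LlamaIndex. The two-dimensional vectors, the node texts, and the helper names cosine_similarity and filter_by_cutoff are made up for illustration; real embeddings have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def filter_by_cutoff(scored_nodes, similarity_cutoff):
    """Keep only (text, score) pairs whose score reaches the cutoff."""
    return [(text, score) for text, score in scored_nodes
            if score >= similarity_cutoff]


# Hypothetical toy embeddings, not real model output.
query_vec = [1.0, 0.0]
node_vecs = {
    "node A": [1.0, 0.0],  # same direction as the query
    "node B": [0.6, 0.8],  # partially aligned with the query
    "node C": [0.0, 1.0],  # orthogonal to the query
}

scored = [(text, cosine_similarity(query_vec, vec))
          for text, vec in node_vecs.items()]
kept = filter_by_cutoff(scored, similarity_cutoff=0.7)
kept_texts = [text for text, _ in kept]
```

In LlamaIndex itself, the equivalent is to pass SimilarityPostprocessor(similarity_cutoff=0.7) in the node_postprocessors list of the query engine, as described on the linked docs page. The retriever always returns its top-k nodes no matter how weak the match is, which is why even a query like "hi" comes back with source nodes unless a cutoff is applied.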

ppaapi

Ahh okay, because the same would come up even if my query is "hi", so I think using these node_postprocessors should help.

WWhiteFang_Jr

Yea this should help

ppaapi

I don't think this is helpful tbh, because many of my actual questions that come from the doc have a similarity score less than 70, so I am not really sure this will help.
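
One possible direction, which nobody in the thread proposed, is to pick the cutoff from observed scores instead of assuming 0.70. The sketch below assumes scores are on a 0 to 1 scale, so "70" corresponds to 0.70. The score lists are invented, and suggest_cutoff is a hypothetical helper, not a LlamaIndex function. In practice the scores could be collected from the score attribute of each entry in response.source_nodes, for a handful of questions known to be answerable from the docs and for a handful of off-topic queries such as "hi".

```python
def suggest_cutoff(relevant_scores, off_topic_scores):
    """Return a cutoff halfway between the two groups, or None if they overlap."""
    lowest_relevant = min(relevant_scores)
    highest_off_topic = max(off_topic_scores)
    if highest_off_topic >= lowest_relevant:
        # No single threshold separates the groups cleanly.
        return None
    return (lowest_relevant + highest_off_topic) / 2


# Invented example scores, for illustration only.
relevant_scores = [0.62, 0.58, 0.66]   # questions the docs can answer
off_topic_scores = [0.41, 0.35]        # queries like "hi"

cutoff = suggest_cutoff(relevant_scores, off_topic_scores)

# If an off-topic query scores as high as a real one, no cutoff is suggested.
overlapping = suggest_cutoff([0.62, 0.40], [0.41, 0.35])
```

If the two groups separate, the resulting value can be used as similarity_cutoff in place of 0.70. If they overlap, a score threshold alone probably cannot distinguish the two kinds of query, and something beyond a cutoff would be needed.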

ppaapi

do you have any other way?
