Retrieving Relevant Metadata and Sentence-Level Context for Responses
At a glance
The community member is using VectorStoreIndex and wants to know how to get information about which sentences or paragraphs in the context the answer is generated from, not just the source node score. They believe they can achieve this by customizing the prompt and inserting extra metadata information associated with each node. However, they are unsure if they can do this at a high level or if they need to build a response synthesis from scratch.
The comments suggest using fuzzy matching, for example with the thefuzz library, to find the sentences or paragraphs closest to the generated answer. Another approach mentioned is to embed each sentence and pick the one whose cosine distance to the provided answer is smallest. The community members also discuss using per-sentence BM25 as a faster alternative.
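To make the fuzzy-matching and per-sentence BM25 suggestions concrete, here is a minimal sketch, not anything taken from the thread. It uses only the Python standard library: `difflib.SequenceMatcher` stands in for thefuzz, the BM25 scorer is a small hand-written variant (Lucene-style IDF, assumed defaults `k1=1.5`, `b=0.75`), and the helper names `split_sentences`, `fuzzy_rank` and `bm25_rank` are made up for illustration. The node text is assumed to be available as a plain string.

```python
import math
import re
from collections import Counter
from difflib import SequenceMatcher


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def fuzzy_rank(answer, sentences):
    """Rank sentences by character-level similarity to the answer.

    difflib is used here as a stdlib stand-in for thefuzz.
    """
    scored = [
        (SequenceMatcher(None, answer.lower(), s.lower()).ratio(), s)
        for s in sentences
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def bm25_rank(answer, sentences, k1=1.5, b=0.75):
    """Rank sentences with BM25, treating each sentence as a document
    and the generated answer as the query."""
    docs = [tokenize(s) for s in sentences]
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n if n else 0.0
    # Document frequency: in how many sentences each term appears.
    df = Counter(term for d in docs for term in set(d))
    query_terms = set(tokenize(answer))
    scored = []
    for sentence, doc in zip(sentences, docs):
        tf = Counter(doc)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scored.append((score, sentence))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


node_text = (
    "LlamaIndex builds a vector index over your documents. "
    "Each node stores a chunk of text and its metadata. "
    "The capital of France is Paris."
)
answer = "Paris is the capital of France."
sentences = split_sentences(node_text)
print(fuzzy_rank(answer, sentences)[0][1])
print(bm25_rank(answer, sentences)[0][1])
```

Both rankers return `(score, sentence)` pairs sorted best first, so the same post-processing step can sit behind either one. BM25 only looks at shared tokens, which is why it is cheap, but it will miss paraphrases that share no words with the answer.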
I am using VectorStoreIndex. When I generate a response, I want to find out where in a node the answer was generated from. Checking the source_node score is not enough: I want to know which sentences or paragraphs of the context the answer was pulled from. I believe I can get this by customizing the prompt and inserting extra metadata associated with each node, then adding an instruction to the prompt along the lines of "pull the relevant metadata and the sentences the answer was generated from". I'm not sure whether I can do this at a high level or whether I need to build the response synthesis from scratch. Any help on this?
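One way the prompt idea could look, sketched independently of any framework: tag every sentence of every retrieved node with an ID, put the node metadata next to it, ask the model to list the tags it used, and map those tags back to sentences afterwards. Everything here is an assumption for illustration: the node dictionaries, the tag format `node_id.sentence_number`, the prompt wording, and the helper names. How the resulting prompt is wired into a VectorStoreIndex query is deliberately left out, and the model may still cite tags inaccurately, so the output is a hint rather than a guarantee.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def build_cited_context(nodes):
    """Render nodes as tagged sentences and return (context, tag -> sentence).

    `nodes` is a hypothetical list of dicts with 'id', 'metadata' and 'text'.
    """
    lines, lookup = [], {}
    for node in nodes:
        meta = ", ".join(f"{k}={v}" for k, v in sorted(node["metadata"].items()))
        lines.append(f"Source {node['id']} ({meta}):")
        for i, sentence in enumerate(split_sentences(node["text"]), start=1):
            tag = f"{node['id']}.{i}"
            lookup[tag] = sentence
            lines.append(f"[{tag}] {sentence}")
    return "\n".join(lines), lookup


# Hypothetical prompt wording; adjust to taste.
PROMPT = (
    "Context information is below. Each sentence is prefixed with a tag.\n"
    "{context}\n"
    "Answer the question using only the context. After the answer, list the "
    "tags of the sentences you relied on, e.g. 'Sources: [n1.2]'.\n"
    "Question: {question}\n"
    "Answer: "
)


def extract_cited_sentences(llm_output, lookup):
    """Map tags found in the model output back to the original sentences,
    ignoring tags that were never issued."""
    tags = re.findall(r"\[([^\[\]\s]+)\]", llm_output)
    return [(t, lookup[t]) for t in dict.fromkeys(tags) if t in lookup]


nodes = [
    {
        "id": "n1",
        "metadata": {"file": "guide.md", "page": 3},
        "text": "Nodes store text chunks. Each node carries metadata.",
    }
]
context, lookup = build_cited_context(nodes)
prompt = PROMPT.format(context=context, question="Where is metadata kept?")
print(prompt)
```

After the LLM call, `extract_cited_sentences(response_text, lookup)` returns the cited `(tag, sentence)` pairs, and the tag prefix identifies the node whose metadata applies.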
@Logan M I am finding that my first pass with the above technique, using Levenshtein distance, is yielding mediocre results. *** w kapa.ai/phorm.ai, getting the reference URL is fairly basic, but when I look at search results, Google is able to provide the precise paragraph that best fits an answer. Still not sure how to do that easily. I imagine I can get the embeddings of each sentence and find the one whose cosine distance to the provided answer is smallest.
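The embedding idea could be sketched roughly as follows. This is an illustration under assumptions, not a tested recipe: `toy_embed` is a placeholder that hashes words into a fixed-size count vector so the example runs with the standard library alone, and it would be swapped for a real embedding model (any callable that maps a string to a list of floats). The helper names are hypothetical. Ranking by highest cosine similarity is equivalent to ranking by smallest cosine distance.

```python
import math
import re
import zlib


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def toy_embed(text, dim=256):
    """Placeholder embedding: hashed bag-of-words counts.

    Only captures word overlap; replace with a real embedding model
    to match paraphrases.
    """
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        vec[zlib.crc32(token.encode("utf-8")) % dim] += 1.0
    return vec


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_matching_sentences(answer, node_text, embed=toy_embed, top_k=1):
    """Return the top_k (similarity, sentence) pairs for the answer."""
    answer_vec = embed(answer)
    scored = [
        (cosine_similarity(answer_vec, embed(s)), s)
        for s in split_sentences(node_text)
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]


node_text = (
    "LlamaIndex builds a vector index over your documents. "
    "Each node stores a chunk of text and its metadata. "
    "The capital of France is Paris."
)
best = best_matching_sentences("Paris is the capital of France.", node_text)
print(best[0][1])
```

With a real model, sentence embeddings for each node can be computed once and cached, so only the answer needs embedding at query time.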