Questions

I needed some explanation of a few concepts in the LlamaIndex documentation:

  1. https://gpt-index.readthedocs.io/en/latest/end_to_end_tutorials/dev_practices/production_rag.html - what do you mean by "Decoupling chunks used for retrieval vs. chunks used for synthesis"? The synthesis happens only on the retrieved chunks, right? How can they be decoupled?
  2. https://gpt-index.readthedocs.io/en/latest/examples/retrievers/auto_vs_recursive_retriever.html - how is VectorStoreInfo working here? I can see that metadata is set for every node at the top of the article. How does it get connected to the vector_store_info property?
4 comments
  1. Decoupling just means you take the retrieved nodes and do something to them before sending them to the LLM (i.e., a node postprocessor). Sentence-window retrieval is a good example of this: retrieve individual sentences, then replace each with its surrounding context (see the sketch after this list).
  2. VectorStoreInfo is being used in the auto retriever. It is used to format a prompt that helps the LLM write query settings.
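For concreteness, here is a minimal sketch of the sentence-window pattern from point 1. The `./data` directory is a placeholder, and import paths may differ across llama_index versions:

```python
from llama_index import ServiceContext, SimpleDirectoryReader, VectorStoreIndex
from llama_index.indices.postprocessor import MetadataReplacementPostProcessor
from llama_index.node_parser import SentenceWindowNodeParser

# Split documents into single-sentence nodes; each node also stores the
# surrounding sentences in its metadata under the "window" key.
node_parser = SentenceWindowNodeParser.from_defaults(
    window_size=3,
    window_metadata_key="window",
    original_text_metadata_key="original_text",
)

documents = SimpleDirectoryReader("./data").load_data()  # placeholder path
service_context = ServiceContext.from_defaults(node_parser=node_parser)
index = VectorStoreIndex.from_documents(documents, service_context=service_context)

# Retrieval embeds and matches the small single-sentence chunks, but the
# postprocessor swaps each retrieved node's text for its wider window
# before the LLM ever sees it -- that swap is the "decoupling".
query_engine = index.as_query_engine(
    similarity_top_k=2,
    node_postprocessors=[
        MetadataReplacementPostProcessor(target_metadata_key="window")
    ],
)
print(query_engine.query("What does the document say about X?"))
```

The point of the split: small sentence embeddings match queries precisely, while the larger window gives the LLM enough context to synthesize from, which is why the retrieval chunks and synthesis chunks are kept separate.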
How is it helping produce refined answers to queries, as mentioned in the doc? I am actually trying to understand what's happening in the backend a little more; a deeper explanation than "it is used to format a prompt that helps the LLM write query settings" πŸ™‚ πŸ₯Ί
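For a rough picture of the backend, here is a minimal sketch of how vector_store_info feeds the auto retriever. The schema below is illustrative (loosely modeled on the city-articles style of example rather than copied from the notebook), `index` is assumed to be an existing VectorStoreIndex whose nodes carry matching metadata, and import paths may differ across llama_index versions:

```python
from llama_index.indices.vector_store.retrievers import VectorIndexAutoRetriever
from llama_index.vector_stores.types import MetadataInfo, VectorStoreInfo

# A plain-language description of the collection and its metadata fields.
# This text is interpolated into a prompt, so the LLM knows which filter
# keys exist and what values make sense for them.
vector_store_info = VectorStoreInfo(
    content_info="short articles about different cities",
    metadata_info=[
        MetadataInfo(
            name="title",
            type="str",
            description="the name of the city the article is about",
        ),
    ],
)

retriever = VectorIndexAutoRetriever(index, vector_store_info=vector_store_info)

# Under the hood, the LLM reads the schema plus the user question and
# returns structured "query settings", roughly:
#   {"query": "sports teams", "filters": [{"key": "title", "value": "Boston"}]}
# The rewritten query string is embedded as usual, while the filters are
# applied as exact metadata filters in the vector store, so similarity
# search only runs over nodes whose metadata matches.
nodes = retriever.retrieve("Tell me about the sports teams in Boston")
```

That narrowing is what "refines" the answers: instead of semantic search over the whole collection, synthesis only ever sees chunks that already satisfy the inferred metadata filters.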