Find answers from the community

Updated last year

In a typical RAG pipeline, is there a

At a glance
In a typical RAG pipeline, is there a dedicated node responsible for evaluating whether augmentation/external knowledge is needed in the first place? If yes, what's it called?
Typically, if I start a conversation with a chatbot with 'Hi', I don't need it to go look at my vector store for relevant chunks. Or rather maybe the search happens anyways, but I don't need it to include any additional context into its input prompt in order to answer me.
The closest thing I found is a hybrid search with an intersection logic as described here: https://docs.llamaindex.ai/en/stable/examples/query_engine/CustomRetrievers.html
L
T
24 comments
Typically a router or agent makes this decision
Can you point me to resources where the decision logic is described?
Typically, its like a list of options. For openai. this means a list of tools
and the tools api lets the LLM pick a tool to route to
I can point to source code if you wanted
The actual usage of routers is described in a few places in the docs
Source code would be perfect πŸ™‚
Thanks a lot! So I skimmed through the code and from my understanding, either you can re-embed a pre-selection of chunks (choices) using a given encoder and basically re-rank them and pick the top k based on a query, or you can do the same using an LLM that you pass the choices and the query to. Is that right?
actually, its not embedding, its raw llm prompting.

The prompt bascically asks the LLM "Hey, given this query and these choices, what should I pick?"
oh haha that one is specific to embeddings yes
forgot we had that
Alright so I think I understand how it works thanks πŸ™‚
But still in all these cases a choice is picked
What about when no additional context is needed, like when I greet the chatbot?
Yea so the above is used in very structured RAG pipelines

In a chat scenario, that would be an agent.

Its slightly similar actually -- the llm sees the chat history and a list of tools, and decides to either call a tool or respond directly
Oh ok so it's always up to the agent to decide whether to resort to retrieval or not

I was thinking about a simpler approach where the sim score of retrieved chunks would need to be above a certain threshold for them to qualify for augmentation, but I haven't tried it yet. Was wondering if this was tried before
What's the difference between a router and a selector?
a router uses a selector lol
Just a higher level abstraction
Thanks again for your help πŸ™‚
Add a reply
Sign up and join the conversation on Discord