Hello, I want to implement a chatbot. It will be a QA chatbot. There are quite a few retrievers in the docs below.

What I want is a retriever that gives very accurate answers. For a QA chatbot, which retriever is the most popular?


https://docs.llamaindex.ai/en/stable/module_guides/querying/retriever/retrievers.html
This is what I'm using
Python
# imports below assume a recent llama-index version (llama_index.core namespace)
from llama_index.core import get_response_synthesizer
from llama_index.core.postprocessor import MetadataReplacementPostProcessor
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.retrievers import VectorIndexRetriever

# configure retriever over an existing index
retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=10,
)

# configure response synthesizer
response_synthesizer = get_response_synthesizer()

# assemble query engine
query_engine = RetrieverQueryEngine(
    retriever=retriever,
    response_synthesizer=response_synthesizer,
    node_postprocessors=[
        MetadataReplacementPostProcessor(target_metadata_key="window")
    ],
)
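Once that's assembled you can sanity-check it end to end; a minimal usage sketch (the question string is just a placeholder):
Python
# run one question through the retriever + synthesizer pipeline
response = query_engine.query("What does the product manual say about returns?")
print(response)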
My experience so far is that it depends on your data and other parameters like the embedding model, chunking strategy, window size, LLM...
The best thing is to get something working e2e and then iterate to make it better
I didn't quite understand your answer. The company wants me to implement several retrievers, compare them, and use the one that gives the best results.
Can you please select some of the retrievers listed in the docs link above?
The point I'm trying to make is that whatever list someone gives you, it might not be right for your data or problem. You have to experiment to find what works for your problem. The "best" for one person is not going to be the "best" for another.

I'd get Langfuse set up early, so you can enable user feedback to know what is "best" for the company. This will also help you collect good & bad examples to make the solution even better. No out-of-the-box retriever is going to be "best" on its own; it really requires the feedback loop
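Wiring that in is mostly a callback; a minimal sketch, assuming the Langfuse LlamaIndex callback integration (the langfuse package) and that your Langfuse keys are set as environment variables:
Python
from llama_index.core import Settings
from llama_index.core.callbacks import CallbackManager
from langfuse.llama_index import LlamaIndexCallbackHandler  # pip install langfuse

# reads LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY / LANGFUSE_HOST from the environment
langfuse_handler = LlamaIndexCallbackHandler()
Settings.callback_manager = CallbackManager([langfuse_handler])

# from here on, query_engine.query(...) calls are traced in Langfuse,
# and user feedback can be attached to those traces as scores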
The retriever is just one part of the pipeline too. You will need to experiment with many parts of the overall system to produce an acceptable solution
Arguably the data & chunking are more important than the retriever
If those aren't good, the best retriever is not going to help
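For example, the MetadataReplacementPostProcessor(target_metadata_key="window") in the snippet above is typically paired with sentence-window chunking; a minimal sketch, assuming your documents are already loaded into a `documents` list:
Python
from llama_index.core import VectorStoreIndex
from llama_index.core.node_parser import SentenceWindowNodeParser

# index single sentences, but keep a window of surrounding sentences in
# metadata so the postprocessor can swap the wider context back in at query time
node_parser = SentenceWindowNodeParser.from_defaults(
    window_size=3,
    window_metadata_key="window",
    original_text_metadata_key="original_text",
)

nodes = node_parser.get_nodes_from_documents(documents)
index = VectorStoreIndex(nodes)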
Yeah... so which retriever is most often used in QA chatbots?
The right retriever is going to depend on the user's question
  1. Are they asking for a summary? You might want to look at tree_summarize mode.
  2. Are they looking for a specific piece of information? The default is probably the way to go.
  3. Are they looking to ask the same question against many documents? accumulate mode might be appropriate.
There is not a good generic answer to your question
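Switching between those modes is a single argument on the response synthesizer, so it's cheap to try; a small sketch reusing the retriever from above (the mode names are the built-in llama-index response modes):
Python
from llama_index.core import get_response_synthesizer
from llama_index.core.query_engine import RetrieverQueryEngine

# pick the response mode per question type:
# "compact" (default), "tree_summarize", "accumulate", ...
response_synthesizer = get_response_synthesizer(response_mode="tree_summarize")

query_engine = RetrieverQueryEngine(
    retriever=retriever,
    response_synthesizer=response_synthesizer,
)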
llama-index makes it easy to swap the retriever, so it's cheap to try out different ones and see what works. My suggestion is to follow an example, get the e2e working, then try out different options to find what works best
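A minimal comparison loop along those lines, assuming the index was built with the default in-memory storage and using BM25 as a second candidate (BM25 ships as the separate llama-index-retrievers-bm25 package; the questions are placeholders):
Python
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.retrievers.bm25 import BM25Retriever  # pip install llama-index-retrievers-bm25

candidates = {
    "vector": VectorIndexRetriever(index=index, similarity_top_k=5),
    "bm25": BM25Retriever.from_defaults(docstore=index.docstore, similarity_top_k=5),
}

questions = ["..."]  # replace with your own evaluation questions

for name, retriever in candidates.items():
    for q in questions:
        nodes = retriever.retrieve(q)
        print(name, q, [n.score for n in nodes])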
I just went through this process; now I'm at QueryPipeline because things like the default prompt aren't great with the LLM I'm using (chat-bison), though they worked fine for mistral in my earlier experiments...
https://docs.llamaindex.ai/en/stable/module_guides/querying/pipeline/root.html
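For reference, the smallest useful QueryPipeline is just a prompt chained to an LLM, which is handy for experimenting with model-specific prompt wording; a minimal sketch assuming `llm` is an already-configured llama-index LLM (the prompt text is only an illustration):
Python
from llama_index.core import PromptTemplate
from llama_index.core.query_pipeline import QueryPipeline

# a custom prompt whose wording can be tuned per model
prompt_tmpl = PromptTemplate("Rewrite the following question so it is self-contained: {question}")

# chain: fill the prompt, then call the LLM
p = QueryPipeline(chain=[prompt_tmpl, llm], verbose=True)
output = p.run(question="what about the second option?")
print(output)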