Hi guys,

I have already built a solution that creates custom AI chatbots from your own data. It uses gpt-3.5-turbo with RAG, fetching context via OpenAI embeddings and cosine-similarity matching.
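For reference, here is a minimal sketch of the cosine-similarity matching step described above, assuming the document and query embeddings have already been computed (e.g. with OpenAI's embeddings API). The vectors below are toy placeholders, and `retrieve` is a hypothetical helper name, not part of any library:

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query_emb, doc_embs, docs, top_k=3):
    # Rank documents by similarity to the query and return the top_k.
    scores = [cosine_similarity(query_emb, d) for d in doc_embs]
    ranked = sorted(zip(scores, docs), key=lambda p: p[0], reverse=True)
    return [doc for _, doc in ranked[:top_k]]
```

The retrieved chunks are then stuffed into the gpt-3.5-turbo prompt as context.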

I am running into issues when the user asks a follow-up to a previous question: running RAG on just the new question isn't sufficient, because it won't fetch the required context.

I already tried rephrasing the question based on conversation history, but it slows down responses, and it's difficult to decide which questions to rephrase and which to leave alone.

I went through the source code of chat llama_index and couldn't find anything different; from what I can see, follow-up questions aren't handled at all.

How will RAG work for this scenario?
Q: What is an Apple Watch?
A: bla bla
Q: What are its features?

I have tried chatbase.co and it handles this pretty well, but I don't know how they do it. Can someone please help me with this?
13 comments
Rephrasing (or relying on the LLM to write a new query) is really the only way.

You could retrieve based on the entire chat history, but that's problematic if the user changes topics.
From an engineering perspective, I really don't know of a better option.
OK, can I embed the last 10 messages and still get the right context?
Or maybe I can figure out some NLP technique or attention mechanism to get the right context with RAG.
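One way to try the "embed the last 10 messages" idea is to concatenate a sliding window of recent turns into a single string and embed that instead of the bare question. A sketch, with `build_retrieval_query` as a hypothetical helper (and with the caveat raised above: a mid-conversation topic change will still pollute this query):

```python
def build_retrieval_query(history, n=10):
    # history: list of (role, text) tuples, oldest first.
    # Join the last n messages into one string to embed for retrieval.
    recent = history[-n:]
    return "\n".join(f"{role}: {text}" for role, text in recent)
```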
I'm not aware of any fancier techniques yet.

Like, picture this chat history

Plain Text
user: What is an apple watch?
A: bla bla
user: What is a samsung fold 4?
A: bla bla
user: What colors does it come in?


How do I retrieve the right context at every step? Without an LLM rephrasing the question, it's nearly impossible 😅
I don't know what magic chatbase does. I've been stuck on this problem for a month now; I read the source code of multiple libraries and couldn't find anything.
Chatbase retrieves context for every query, so they don't use an agent; even for "hi" it retrieves context.
But it works so well for follow-up questions, for at least the last 5-6 messages.
If the LLM has to rephrase, I couldn't figure out how to decide when to do it.
@Logan M Can function calling help in this?
I think it's mostly prompt engineering for deciding when to rephrase.

i.e. "Repeat the user's message as the query, or rephrase it based on chat history if needed"
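That instruction can be folded into a single condense-question prompt, so the LLM itself decides whether to rephrase or repeat. A sketch of building such a prompt (the template wording and the `condense_prompt` helper are assumptions, not from any particular library; the returned string would then be sent to the chat model):

```python
def condense_prompt(history, question):
    # history: list of (role, text) tuples, oldest first.
    # Ask the LLM to rewrite a follow-up as a standalone question,
    # or repeat it verbatim if it already stands alone.
    transcript = "\n".join(f"{role}: {text}" for role, text in history)
    return (
        "Given the conversation below, rewrite the final user message as a "
        "standalone question. If it is already standalone, repeat it verbatim.\n\n"
        f"{transcript}\n\nFinal user message: {question}\nStandalone question:"
    )
```

The rewritten question is what gets embedded for retrieval, which is the extra LLM round-trip that makes this approach slower.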
Yeah. The other thing you can do is feed previously used sources forward in the conversation, or keep some sort of running summary that helps. But I think your best bet is to use an LLM to write a new query based on the history.