I'm assuming you are using GPT-3.5 hey?
This is a prompt engineering problem. The response mentions "context" because the prompt templates themselves refer to the context.
GPT-3.5 is pretty bad at following more complex instructions, so it may take some fiddling around.
You can set the templates like this:
index.as_query_engine(text_qa_template=my_qa_template, refine_template=my_refine_template)
And you can use the existing templates as a reference for how to create your own
Here's the QA template (for chat models, this gets automatically transformed into a single human message):
https://github.com/jerryjliu/llama_index/blob/18d2ecbefcf5811f3a8b367931a5f1c28f6c2ac6/llama_index/prompts/default_prompts.py#L98
Here's the refine template (specifically for chat models):
https://github.com/jerryjliu/llama_index/blob/18d2ecbefcf5811f3a8b367931a5f1c28f6c2ac6/llama_index/prompts/chat_prompts.py#L12
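Putting it together, here's a rough sketch of what custom templates could look like. The template strings must keep the placeholders LlamaIndex fills in at query time (`{context_str}` and `{query_str}` for QA; `{query_str}`, `{existing_answer}`, and `{context_msg}` for refine). The wrapper class name varies across llama_index versions (e.g. `Prompt` in older releases, `PromptTemplate` in newer ones), so check what your installed version exports:

```python
# Custom QA template: must keep the {context_str} and {query_str}
# placeholders, which LlamaIndex substitutes at query time.
my_qa_template_str = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, "
    "answer the question: {query_str}\n"
)

# Refine template: receives the running answer plus a new chunk of context.
my_refine_template_str = (
    "The original question is: {query_str}\n"
    "We have an existing answer: {existing_answer}\n"
    "We have more context below.\n"
    "---------------------\n"
    "{context_msg}\n"
    "---------------------\n"
    "Refine the existing answer with the new context if it helps; "
    "otherwise repeat the existing answer.\n"
)

# Wiring it up (class name depends on your llama_index version):
# from llama_index.prompts import Prompt
# my_qa_template = Prompt(my_qa_template_str)
# my_refine_template = Prompt(my_refine_template_str)
# query_engine = index.as_query_engine(
#     text_qa_template=my_qa_template,
#     refine_template=my_refine_template,
# )
```

Keeping the instructions short and explicit, like the defaults do, tends to help GPT-3.5 stay on track.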