What method does LlamaIndex Chat use in the background to provide answers? I tested it on documents and it provides very good answers. I'm using the Python query engine with 4 chunks and I get different responses on the same document.
It's doing API calls to OpenAI by default, using GPT-3.5. LLMs aren't deterministic, though, so you'll see some variance in responses; accumulating chat history with the chat engine will also cause variance in answers.
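If you want to reduce that variance, one thing you can do is set the LLM temperature to 0 so sampling is greedy. A minimal sketch, assuming llama_index >= 0.10-style imports; the `./data` path and the query string are placeholders:

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.llms.openai import OpenAI

# temperature=0 makes sampling greedy, which reduces (but does not
# fully eliminate) run-to-run variance in answers
Settings.llm = OpenAI(model="gpt-3.5-turbo", temperature=0)

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

# similarity_top_k=4 matches the "4 chunks" setup mentioned above
query_engine = index.as_query_engine(similarity_top_k=4)
print(query_engine.query("What does the document say about X?"))
```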
What is the difference between a chat engine and a chatbot implementation? If I need to use it for document question answering, with memory so I can refine or re-ask something on top of a provided answer, what is recommended?
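For that refine/re-ask pattern over documents, a chat engine with `condense_question` mode is one option: it rewrites each follow-up into a standalone question using the conversation history before querying the index. A hedged sketch, reusing the `index` built above; the questions are illustrative:

```python
chat_engine = index.as_chat_engine(chat_mode="condense_question", verbose=True)

response = chat_engine.chat("What does the contract say about termination?")

# The follow-up plus the chat history get condensed into a single
# standalone question before the index is queried again
followup = chat_engine.chat("Can you expand on the notice period part?")
```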
I also have some use cases where I have 2-3 documents, each with a different role, and I want to query each document with different questions and compose the answer.
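One way to do that is a `SubQuestionQueryEngine` with one query-engine tool per document: it splits the question into sub-questions, routes each to the right document, then composes a single answer from the sub-answers. A hedged sketch; the file names, tool names, and descriptions are made up for illustration:

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.core.query_engine import SubQuestionQueryEngine

# one index (and query engine) per document role
contract_index = VectorStoreIndex.from_documents(
    SimpleDirectoryReader(input_files=["contract.pdf"]).load_data()
)
policy_index = VectorStoreIndex.from_documents(
    SimpleDirectoryReader(input_files=["policy.pdf"]).load_data()
)

tools = [
    QueryEngineTool(
        query_engine=contract_index.as_query_engine(),
        metadata=ToolMetadata(name="contract", description="The signed contract"),
    ),
    QueryEngineTool(
        query_engine=policy_index.as_query_engine(),
        metadata=ToolMetadata(name="policy", description="The company policy"),
    ),
]

engine = SubQuestionQueryEngine.from_defaults(query_engine_tools=tools)
print(engine.query("How does the contract's termination clause compare to the policy?"))
```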
Thanks @Teemu, I have another question: if I keep every question and LLM response in a database, how can I load those messages into memory for use in the chat, and how can I control how many messages from the history I send?
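One way to do this is to rebuild the history as `ChatMessage` objects and hand them to a `ChatMemoryBuffer`, whose `token_limit` caps how much history is sent per turn. A hedged sketch: `load_rows_from_db()` is a hypothetical stand-in for your own database code, and the row shape is assumed:

```python
from llama_index.core.llms import ChatMessage, MessageRole
from llama_index.core.memory import ChatMemoryBuffer

# Suppose each DB row is ("user" | "assistant", text)
rows = load_rows_from_db()  # hypothetical helper; replace with your DB query

history = [
    ChatMessage(
        role=MessageRole.USER if role == "user" else MessageRole.ASSISTANT,
        content=text,
    )
    for role, text in rows
]

# token_limit controls how much history goes to the LLM each turn:
# the buffer keeps only the most recent messages that fit under it
memory = ChatMemoryBuffer.from_defaults(chat_history=history, token_limit=1500)

chat_engine = index.as_chat_engine(chat_mode="context", memory=memory)
```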