i'm working on a RAG app using LlamaIndex / llama.cpp server / pgvector / Flask / React + TypeScript chat UI
i've got a prototype working that:
- uses a custom VectorDBRetriever to retrieve the first batch of nodes with scores
- builds a response_synthesizer with a custom text_qa_template prompt
- combines them in a RetrieverQueryEngine that uses SentenceTransformerRerank
- and finally generates the response via CondenseQuestionChatEngine.
problem: i can't figure out how to roll in chat history so that the app functions like a true chatbot.
are there any documents / tutorials / code bases that would be a good reference to help me figure out this last piece?
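
to make "rolling in chat history" concrete, here's a library-free sketch of the behavior i'm after: keep the turns per session, then fold the recent ones plus the new message into a single "rewrite as a standalone question" prompt before it goes to the query engine. everything here is hypothetical (the `_histories` store, `MAX_TURNS`, both function names, the prompt wording); none of it is LlamaIndex API, it's just my understanding of the condense-question idea.

```python
from collections import defaultdict

# hypothetical in-memory store: session_id -> list of (role, text) turns
_histories = defaultdict(list)

# assumption: only the most recent entries are kept so the prompt stays small
MAX_TURNS = 6


def build_condense_prompt(session_id: str, new_message: str) -> str:
    """Fold recent turns plus the new message into one standalone-question prompt."""
    recent = _histories[session_id][-MAX_TURNS:]
    transcript = "\n".join(f"{role}: {text}" for role, text in recent)
    return (
        "Given the conversation below, rewrite the follow-up "
        "as a standalone question.\n\n"
        f"Conversation:\n{transcript}\n\n"
        f"Follow-up: {new_message}\n"
        "Standalone question:"
    )


def record_turn(session_id: str, user_text: str, assistant_text: str) -> None:
    """Append one user/assistant exchange to the session's history."""
    _histories[session_id].append(("user", user_text))
    _histories[session_id].append(("assistant", assistant_text))
```

the idea would be that the flask endpoint calls `build_condense_prompt` with a session id sent by the react client, runs the result through the LLM and then the existing query engine, and calls `record_turn` once the answer comes back. what i can't work out is where this state is supposed to live when using CondenseQuestionChatEngine.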