I want to build a chatbot that can understand and answer users' questions.
There are two kinds of questions:
- Pre-set standard questions. These are important, carefully selected questions that were frequently asked in the past, and we have prepared standard answers for them (in the form of videos, articles, etc.). When a user's question matches one of these, the chatbot is expected to return the matched standard question; a downstream module will then fetch the standard answer and return it to the user. The chatbot should never try to answer a matched standard question by itself.
- Other questions. If a user's question doesn't match any of the standard questions, the chatbot should fall back and answer it through GPT-3.5 (the intended routing is sketched after this list).
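
To show the routing I have in mind, here is a minimal sketch in plain Python. `match_standard_question` and `ask_gpt35` are hypothetical stubs standing in for the real matching module and the GPT-3.5 call; they are not from my actual code:

```python
from typing import Optional

def match_standard_question(user_question: str) -> Optional[str]:
    """Hypothetical stub: match against the standard question library."""
    raise NotImplementedError

def ask_gpt35(user_question: str) -> str:
    """Hypothetical stub: answer via GPT-3.5."""
    raise NotImplementedError

def route(user_question: str) -> dict:
    matched = match_standard_question(user_question)
    if matched is not None:
        # Return only the matched standard question; a downstream module
        # fetches the prepared answer (video, article, etc.) for the user.
        return {"type": "standard_question", "match": matched}
    # No match: fall back to GPT-3.5.
    return {"type": "fallback", "answer": ask_gpt35(user_question)}
```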
I'm trying to customize a LlamaIndex chat engine to do this. The chat engine consists of a query engine (which matches a query against the standard question library), an LLM (GPT-3.5), and a system prompt. It can already evaluate whether a question is a standard question, which is good. The problem is that it then forwards the question to the LLM anyway and returns the LLM's final answer, which costs extra time and produces a useless result. How can I stop this and have it return just the matched standard question when a match is found? The relevant code snippet is attached (the full code is in
https://github.com/pengxiaoo/llama-index-fastapi/blob/main/app/llama_index_server/index_server.py).
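
To make the desired short-circuit concrete, here is a rough sketch of the behavior I'm after, written against a plain retriever rather than my actual chat engine. The similarity cutoff value is a placeholder, and the import paths are for llama_index 0.9.x (0.10+ moved them under `llama_index.core`):

```python
# Sketch of the desired short-circuit: return the matched standard question
# directly and only call GPT-3.5 when nothing in the library matches.
from llama_index import Document, VectorStoreIndex
from llama_index.llms import OpenAI

# Placeholder threshold; the right value would need tuning.
SIMILARITY_CUTOFF = 0.85

# Toy standard question library; the real one is loaded from prepared data.
standard_questions = [
    "How do I reset my password?",
    "What are your business hours?",
]
index = VectorStoreIndex.from_documents([Document(text=q) for q in standard_questions])
retriever = index.as_retriever(similarity_top_k=1)
llm = OpenAI(model="gpt-3.5-turbo")  # requires OPENAI_API_KEY in the environment

def answer(user_question: str) -> str:
    nodes = retriever.retrieve(user_question)
    # If the best match clears the cutoff, return the matched standard question
    # itself and never touch the LLM; a downstream module serves the real answer.
    if nodes and nodes[0].score is not None and nodes[0].score >= SIMILARITY_CUTOFF:
        return nodes[0].node.get_content()
    # No match: fall back to GPT-3.5.
    return llm.complete(user_question).text
```

Is there a supported way to get this early-return behavior inside the chat engine itself, rather than bypassing it with a retriever like this?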