I am currently trying to build a chatbot for our website using LlamaIndex and ChatGPT. The chatbot draws on around 50 documents, each 1-2 pages long, containing tutorials and other information from our site. The answers I'm getting are great, but performance is slow: on average it takes 15-20 seconds to get an answer, which is not practical for our use case.
I have tried using Optimizers, as suggested in the documentation, but haven't seen much improvement. Currently, I am using GPTSimpleVectorIndex and haven't tested other index types yet.
I am pretty new to this and would like to know whether these response times are expected or whether they could be improved, e.g., by building the indices more efficiently or setting different parameters. Basically, I'm looking for any suggestions on how to improve the bot's performance so that it can answer more quickly.
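
In case it helps narrow things down, here is a minimal sketch of how the 15-20 seconds could be split into stages (for example, retrieval versus the LLM call). The helper `time_stages` and the two placeholder functions are hypothetical names made up for illustration; they are not LlamaIndex or OpenAI calls and would need to be replaced with the real ones.

```python
import time
from statistics import mean
from typing import Callable, Dict, List


def time_stages(stages: Dict[str, Callable[[], object]], runs: int = 3) -> Dict[str, float]:
    """Return the mean wall-clock seconds per stage over `runs` repetitions."""
    samples: Dict[str, List[float]] = {name: [] for name in stages}
    for _ in range(runs):
        for name, fn in stages.items():
            start = time.perf_counter()
            fn()  # run the stage once and record how long it took
            samples[name].append(time.perf_counter() - start)
    return {name: mean(values) for name, values in samples.items()}


# Hypothetical stand-ins: swap these for the real retrieval and LLM calls.
def fake_retrieval() -> None:
    time.sleep(0.01)


def fake_llm_call() -> None:
    time.sleep(0.05)


if __name__ == "__main__":
    report = time_stages({"retrieval": fake_retrieval, "llm_call": fake_llm_call})
    for name, seconds in report.items():
        print(f"{name}: {seconds:.3f}s")
```

If most of the time turns out to be in the LLM call rather than in retrieval, that would suggest the index type matters less than the prompt size or the model being called, but that is only a guess until it is measured.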