The community member is enjoying using llama-index for their project and has successfully deployed it as a chatbot. They now want to experiment with improving the model's responses to various questions and, if possible, reducing the cost, currently around 7 cents per query, which they find too expensive to run publicly. They ask whether anyone has experience iterating through different parameterizations to improve model performance or reduce cost, as they are working with a substantial custom corpus (25MB) of somewhat high complexity.
In the comments, another community member suggests that the original poster should take a look at the query optimizer, providing a link to the relevant documentation. The original poster responds that they find this interesting and will take a look.
Another community member asks if they are adding another conversation with OpenAI to request a cut in the answer cost, but the original poster responds with "Sorry?", indicating they do not understand the comment.
There is no explicitly marked answer in the comments.
Hey -- I'm enjoying using llama-index very much for my project, and have it successfully deployed backing a chatbot. Now, I'm hoping to experiment and improve the model's responses to various questions, and if possible reduce the cost, as it's about 7c a query at the moment, which is too expensive for me to run publicly. Does anyone have experience with iterating through different parameterizations to improve model performance or reduce cost? I'm working with a substantial custom corpus (25MB) of somewhat high complexity.
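As a back-of-the-envelope illustration of where a ~7c-per-query figure can come from, the sketch below multiplies tokens per call by a price per token. The $0.02-per-1K-token rate, the function name, and the token counts are all assumptions for illustration, not figures from the thread:

```python
# Hypothetical cost estimator: the $0.02/1K-token price is an assumed
# completion-model rate, and the token counts below are illustrative.
def query_cost(prompt_tokens: int, completion_tokens: int,
               price_per_1k: float = 0.02) -> float:
    """Estimate the dollar cost of a single LLM call."""
    return (prompt_tokens + completion_tokens) / 1000 * price_per_1k

# At that rate, ~7 cents per query corresponds to roughly 3,500 tokens
# per call, e.g. a large retrieved context plus the completion:
print(round(query_cost(3000, 500), 4))  # 0.07
```

Both factors in this product are tunable: retrieving fewer or smaller chunks per query shrinks the prompt, and a cheaper model lowers the per-token rate, which is where parameter iteration in llama-index would typically pay off.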