
Updated 2 years ago

Query time

I'm currently using llama_index. I tested it with an 850 KB PDF file (5 pages). I generated the index for that file using default settings, then queried it with "text-davinci-003", temperature=0.7, SentenceEmbeddingOptimizer, and response_mode="compact". The script takes around 20 seconds to finish. Is there anything I can do to reduce the response time?
7 comments
What type of index did you use, a vector index?
Yes, GPTVectorStoreIndex, although I tried with Chroma and it didn't help.
Also, the index is stored in local JSON files; it isn't regenerated before the query
Given what you told me, I think the main bottleneck is just OpenAI right now

I tried both gpt-3.5 and text-davinci-003, both take around 8-10 seconds for me right now
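One way to confirm that the OpenAI call, and not index loading, dominates the 20 seconds is to time each phase separately. A minimal sketch using only the standard library; the `time.sleep` calls are stand-ins for the real load and query steps, and the labels are hypothetical:

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label, timings):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    yield
    timings[label] = time.perf_counter() - start

timings = {}
with timed("load_index", timings):
    time.sleep(0.02)  # stand-in for loading the JSON index from disk
with timed("llm_query", timings):
    time.sleep(0.10)  # stand-in for the OpenAI completion call

for label, seconds in timings.items():
    print(f"{label}: {seconds:.3f}s")
```

If `llm_query` accounts for nearly all of the elapsed time, there is little to optimize locally, and improving perceived latency (e.g. streaming) is the more practical lever.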
You could try streaming the response, to make the wait feel shorter
Ah, that could work. Now that I think about it, the online service I tested for PDF Q&A displayed words one at a time, but I thought that was just a fancy effect they added for aesthetics. I'll try that, thanks.
haha slightly fancy, but also helps the UX when the words are appearing faster than you can read
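The UX benefit of streaming can be illustrated without calling any API. The toy generator below stands in for an LLM that emits one token at a time (llama_index query engines of that era typically expose a streaming option, but the exact flag name is not confirmed here): the first token arrives almost immediately, even though the total time is unchanged.

```python
import time

def fake_llm_stream(answer, per_token_delay=0.02):
    """Toy stand-in for an LLM that yields one token at a time."""
    for token in answer.split():
        time.sleep(per_token_delay)  # simulated per-token generation latency
        yield token + " "

start = time.perf_counter()
first_token_at = None
chunks = []
for chunk in fake_llm_stream("streaming makes a long answer feel much faster"):
    if first_token_at is None:
        first_token_at = time.perf_counter() - start  # time to first token
    print(chunk, end="", flush=True)  # render tokens as they arrive
    chunks.append(chunk)
total_time = time.perf_counter() - start

answer = "".join(chunks).strip()
```

With a blocking call the user waits `total_time` before seeing anything; with streaming they see output after `first_token_at`, a small fraction of the total.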