Updated 3 months ago

Hi friends How to improve response time

Hi friends, how can I improve query response time in llama-index?
3 comments
Any of these should help: setting a smaller chunk_size in the service_context, avoiding complex index structures where possible, or enabling streaming. Streaming doesn't reduce the total time, but it makes the response feel faster because text shows up as it is generated.
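For anyone who wants to try the chunk_size and streaming suggestions together, here is a minimal sketch. It assumes a legacy llama-index release that still ships ServiceContext (newer releases configure chunk size differently, so check the docs for your version), an LLM API key already configured, and a hypothetical "data" folder of documents. The helper names build_streaming_query_engine and stream_answer are made up for illustration.

```python
def build_streaming_query_engine(data_dir: str, chunk_size: int = 512):
    """Build a vector index with a custom chunk_size and return a streaming query engine."""
    # Imported lazily so this sketch can be loaded without llama-index installed.
    # Assumption: these import paths match a legacy release that has ServiceContext.
    from llama_index import ServiceContext, SimpleDirectoryReader, VectorStoreIndex

    documents = SimpleDirectoryReader(data_dir).load_data()
    # A smaller chunk_size means smaller text chunks are stored and retrieved.
    service_context = ServiceContext.from_defaults(chunk_size=chunk_size)
    # A plain vector index, rather than a more complex index structure.
    index = VectorStoreIndex.from_documents(documents, service_context=service_context)
    # streaming=True makes query() return a response that yields tokens as they arrive.
    return index.as_query_engine(streaming=True)


def stream_answer(query_engine, question: str) -> str:
    """Print tokens as they arrive and return the full answer text."""
    response = query_engine.query(question)
    parts = []
    for token in response.response_gen:
        print(token, end="", flush=True)
        parts.append(token)
    print()
    return "".join(parts)
```

Usage would look something like `stream_answer(build_streaming_query_engine("data"), "What is this document about?")`. The total generation time stays about the same; the first tokens just show up sooner.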
For chunk_size, what would be ideal?
Usually the default (1024) is the best balance between speed and the quality of the generated embeddings. I wouldn't go much lower than 512.
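To get a feel for why chunk_size affects speed, here is a rough back-of-the-envelope sketch. It assumes the retriever returns similarity_top_k chunks per query (2 is used here as an assumed value) and that each chunk holds at most chunk_size tokens, so the retrieved context sent to the LLM is bounded by their product. The function name is hypothetical, and real prompts also include the question and the prompt template.

```python
def estimated_context_tokens(chunk_size: int, similarity_top_k: int = 2) -> int:
    """Rough upper bound on retrieved-context tokens sent to the LLM per query."""
    return chunk_size * similarity_top_k


for size in (512, 1024, 2048):
    # Larger chunks mean more tokens for the LLM to read on every query.
    print(f"chunk_size={size}: up to {estimated_context_tokens(size)} context tokens")
```

Fewer context tokens generally means a faster LLM call, but chunks that are too small can lose the surrounding context that makes embeddings and answers good.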