
Updated 6 months ago

Hi friends, how to improve response time

At a glance

The community members discuss ways to improve query response time in llama-index. The suggestions include:

- Setting a smaller chunk_size in the service_context
- Avoiding the use of complex index structures if possible
- Enabling streaming

Regarding the ideal chunk_size value, the community members suggest that the default of 1024 is a good balance between speed and quality of generated embeddings, and that going much lower than 512 may not be advisable.

Hi friends, how can I improve the response time of a query in llama-index?
3 comments
Any of three things will improve the speed: setting a smaller chunk_size in the service_context, avoiding complex index structures where possible, or enabling streaming. Streaming won't shorten the total generation time, but it makes the response feel faster because output shows up as it is generated.
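To illustrate the streaming point, here is a minimal sketch in plain Python. It does not call llama-index at all: `fake_llm_tokens`, `blocking_query`, and `streaming_query` are hypothetical names made up for this example, and the "LLM" is just a generator over a fixed string. The log records the order of events, so you can see that streaming shows the first token right after it is generated, while the blocking version shows nothing until the end.

```python
from typing import Iterator, List


def fake_llm_tokens(answer: str, log: List[str]) -> Iterator[str]:
    """Hypothetical stand-in for an LLM that produces one token at a time."""
    for token in answer.split():
        log.append(f"generated:{token}")
        yield token


def blocking_query(answer: str, log: List[str]) -> str:
    """Non-streaming: nothing is shown until every token has been generated."""
    text = " ".join(fake_llm_tokens(answer, log))
    log.append("shown:full answer")
    return text


def streaming_query(answer: str, log: List[str]) -> str:
    """Streaming: each token is shown as soon as it is generated."""
    shown = []
    for token in fake_llm_tokens(answer, log):
        log.append(f"shown:{token}")
        shown.append(token)
    return " ".join(shown)


if __name__ == "__main__":
    for query in (blocking_query, streaming_query):
        events: List[str] = []
        query("chunks are retrieved first", events)
        print(query.__name__, events)
```

Both functions do the same amount of generation work; only the position of the first "shown" event in the log differs, which is why streaming feels faster without being faster overall.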
For chunk_size, what value would be ideal?
Usually the default (1024) is the best balance between speed and quality of the generated embeddings. I wouldn't go much lower than 512.
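A rough way to see the trade-off, as a sketch under simplifying assumptions: the code below splits a token list into fixed-size chunks with no overlap and no sentence awareness, which is not how llama-index actually splits text. `split_into_chunks`, `context_tokens`, and the `top_k=2` retrieval count are all hypothetical, chosen only for illustration. It shows that halving chunk_size roughly halves the retrieved text passed to the LLM per query, while doubling the number of chunks that need embeddings.

```python
from typing import List


def split_into_chunks(tokens: List[str], chunk_size: int) -> List[List[str]]:
    """Naive fixed-size splitter (assumption: no overlap, no sentence boundaries)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


def context_tokens(chunks: List[List[str]], top_k: int) -> int:
    """Tokens sent to the LLM if the first top_k chunks were the ones retrieved."""
    return sum(len(chunk) for chunk in chunks[:top_k])


if __name__ == "__main__":
    document = ["tok"] * 10_000  # placeholder 10,000-token document
    for chunk_size in (1024, 512, 256):
        chunks = split_into_chunks(document, chunk_size)
        print(
            f"chunk_size={chunk_size}: "
            f"{len(chunks)} chunks to embed, "
            f"{context_tokens(chunks, top_k=2)} context tokens per query"
        )
```

Smaller chunks mean a shorter prompt per query, but each chunk carries less context, which is consistent with the advice above not to go much below 512.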