Query speedup

Hi, I'm using a list index but I need to speed up the processing time. My use case doesn't allow me to risk missing important information, so a list index is required. Is there a way, maybe with tree summarization, to send out multiple API calls at once?
I think if you are using response_mode="tree_summarize" already, you can get a decent speedup using async

https://github.com/jerryjliu/llama_index/blob/main/examples/async/AsyncQueryDemo.ipynb


Otherwise, try increasing the chunk size (but only if you previously shrank it)
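
To illustrate the async suggestion above, here is a rough sketch based on the linked AsyncQueryDemo notebook. It assumes the older llama_index query API of that era (GPTListIndex.from_documents, index.query with use_async); the "data" directory and the question string are placeholders.

import nest_asyncio
nest_asyncio.apply()  # needed when running inside a notebook's event loop

from llama_index import GPTListIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("data").load_data()
index = GPTListIndex.from_documents(documents)

# use_async=True lets the tree_summarize step issue its LLM calls
# concurrently instead of one at a time.
response = index.query(
    "Summarize the key points across all of these documents.",
    response_mode="tree_summarize",
    use_async=True,
)
print(response)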
Some documents are 100+ pages. Any recommendations on how to speed it up? I don't care about the cost of the LLM as much as getting a faster response.
@Logan M I'm new to Discord, don't know if I need to tag you or not haha
Haha I'm on it

A tree index will be faster to query than a list index, but a little slower to build

Use mode="summarize" instead of response_mode for tree indexes though (if you are summarizing)
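
A minimal sketch of that suggestion, again assuming the old llama_index API (GPTTreeIndex.from_documents and per-query modes); the data directory and question are placeholders.

from llama_index import GPTTreeIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("data").load_data()

# Slower to build: the tree of summaries is constructed bottom-up at index time.
tree_index = GPTTreeIndex.from_documents(documents)

# Faster to query; mode="summarize" is the tree-index counterpart of the
# list index's response_mode="tree_summarize".
response = tree_index.query("What are the key findings?", mode="summarize")
print(response)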
Great! The key part is that I can't miss any information, so that limits me to a high similarity top-k on a vector index, or a list/tree index. Does a tree miss any information?
Is there no way to split up the index, ask the same question across all nodes simultaneously, and then synthesize the response?
Mmm, a tree summarizes a lot of info (it basically builds a bottom-up tree of summaries and then uses that to query).

Assuming it does a good job it should work well 🤔
That's a composable index! You could wrap a few list indexes with a top level index

https://gpt-index.readthedocs.io/en/latest/how_to/index_structs/composability.html
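
A hedged sketch of that composability pattern, following the linked docs of that era: a few list indexes wrapped under a top-level list index, so every sub-index gets queried and the answers are synthesized into one response. The import path, the ComposableGraph.from_indices signature, and the docs_a / docs_b / user_question names are assumptions and varied across versions.

from llama_index import GPTListIndex
from llama_index.indices.composability import ComposableGraph

# One list index per document set (docs_a / docs_b are placeholder document lists).
index_a = GPTListIndex.from_documents(docs_a)
index_b = GPTListIndex.from_documents(docs_b)

# Wrap them under a top-level list index; the summaries tell the top level
# what each sub-index covers. A list index at the root queries every
# sub-index, so nothing gets skipped.
graph = ComposableGraph.from_indices(
    GPTListIndex,
    [index_a, index_b],
    index_summaries=[
        "Summary of what document set A covers.",
        "Summary of what document set B covers.",
    ],
)

response = graph.query(user_question)
print(response)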
Here are my configurations for the tree index, I'll give it a try. Any recommendations?

{"query_str": user_question,
"mode": "embedding",
"service_context": service_context,
"verbose": True}

def initialize_service_context():
max_input_size = 8000
num_output = 1500
max_chunk_overlap = 20

prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)
llm_predictor = LLMPredictor(llm=ChatOpenAI(temperature=0, model_name="gpt-4", request_timeout=1500))

return ServiceContext.from_defaults(llm_predictor=llm_predictor, prompt_helper=prompt_helper)
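
Presumably those settings get used along the lines of the following sketch; the index variable (a previously built or loaded GPTTreeIndex) is an assumption, not shown in the snippet above.

service_context = initialize_service_context()

# `index` is assumed to be an existing GPTTreeIndex
response = index.query(
    user_question,
    mode="embedding",
    service_context=service_context,
    verbose=True,
)
print(response)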
Hmm looks pretty good to me! I think the mode could be either embedding or summarize, I'm not sure what will work best for you 😅
Weird, I tried those settings and the response came back empty...
That's... strange 🤔
It's happened many times before: it shows all the nodes but doesn't include them in the response. I'm running on the defaults now, will post back.
It usually doesn't happen on the defaults.
Hmm, I need to play more with the tree index lol