How do I set the output_cls and similarity_top_k with the retry query engine?

Plain Text
# This is what I want, but output_cls and similarity_top_k are not accepted as args
base_query_engine = index.as_query_engine(llm=llm, filters=filters)

query_engine_presentation_content = RetryQueryEngine(
    query_engine=base_query_engine,
    output_cls=PresentationContentListV1,
    similarity_top_k=10,
)
query_engine_presentation_outline = RetryQueryEngine(
    query_engine=base_query_engine,
    output_cls=PresentationOutlineV1,
    similarity_top_k=10,
)
13 comments
I think you'd set all that in base_query_engine?
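For example, something like this (a rough sketch, not tested against your setup; RelevancyEvaluator is just one evaluator RetryQueryEngine can use, and the imports assume the llama_index.core layout):

Plain Text
from llama_index.core.evaluation import RelevancyEvaluator
from llama_index.core.query_engine import RetryQueryEngine

# retriever/synthesizer settings go on the base engine, not on the retry wrapper
base_query_engine = index.as_query_engine(
    llm=llm,
    filters=filters,
    similarity_top_k=10,
    output_cls=PresentationContentListV1,
)

# RetryQueryEngine only wraps an existing engine and takes an evaluator
query_engine_presentation_content = RetryQueryEngine(
    query_engine=base_query_engine,
    evaluator=RelevancyEvaluator(llm=llm),
)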
Is there no way to share common properties in one (base query engine in this case) and create different variables that hold the properties that should differ?
I'm not sure what you mean?

RetryQueryEngine is just a wrapper on top of an existing query engine

(And a query engine is just a wrapper on top of a retriever and response synthesizer, both of which have settings that change depending on the type of index, retriever, and synthesizer)
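If it helps to see where each setting lives, here's a minimal sketch of composing the same thing explicitly (names like PresentationOutlineV1 come from your snippet; the imports assume llama_index.core):

Plain Text
from llama_index.core import get_response_synthesizer
from llama_index.core.query_engine import RetrieverQueryEngine

# similarity_top_k and filters are retriever-level settings
retriever = index.as_retriever(similarity_top_k=10, filters=filters)

# output_cls and the llm are response-synthesizer-level settings
synthesizer = get_response_synthesizer(llm=llm, output_cls=PresentationOutlineV1)

query_engine = RetrieverQueryEngine(retriever=retriever, response_synthesizer=synthesizer)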
In my scenario I want two different query engines because the output class differs (the rest of the args are the same, as you can see).

I am looking for a way to structure my code so I can re-use the common args and not have to define them twice (the llm, for example, which is the same for the two).
Ideally I would have something like this:

Plain Text
base = QueryEngine(...shared_args)

specific1 = QueryEngine(base_query_engine=base, ...specific_args1)
specific2 = QueryEngine(base_query_engine=base, ...specific_args2)

Is a pattern like this possible?
you could just put it in a dict

Plain Text
shared_args = {"similarity_top_k": 4, "filters": filters}

specific1 = QueryEngine(..., **shared_args)
specific2 = QueryEngine(..., **shared_args)
Thanks, good to know, not a Python dev 😛
haha no worries!
Just got this error:

worker-1 | [2024-04-18 15:55:25,266: INFO/ForkPoolWorker-7] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 400 Bad Request"
worker-1 | [2024-04-18 15:55:25,267: WARNING/ForkPoolWorker-7] Retrying llama_index.llms.openai.base.OpenAI._chat in 7.560941276281184 seconds as it raised BadRequestError: Error code: 400 - {'error': {'message': "This model's maximum context length is 16385 tokens. However, your messages resulted in 16441 tokens (16176 in the messages, 265 in the functions). Please reduce the length of the messages or functions.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}.


I thought the query engine automatically chunks API calls so this can't happen. What is going wrong here? :/

Plain Text
index = initialize_index(model)

base_args = {"llm": get_llm(model_name=model), "filters": get_document_filters(uuid)}

outline_query_engine = index.as_query_engine(
    output_cls=PresentationOutlineV1, similarity_top_k=15, **base_args,
)

outline = outline_query_engine.query(outline_query_str).response.dict()
Seems like a small token-counting issue (it's just barely over, too)
To change this, I might... artificially lower the context window size. Except the OpenAI class doesn't let you do this (without a small PR), so you can only modify it in the global settings
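Roughly like this (a sketch, assuming the global Settings object exposes a context_window setting; the exact number is just an example margin):

Plain Text
from llama_index.core import Settings

# Assumption: the prompt helper reads this global value when packing
# retrieved chunks into the LLM call, so leaving headroom for the ~265
# function-schema tokens keeps the request under the 16385-token limit.
Settings.context_window = 16000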
Good idea. Do you think this is an issue with the LlamaIndex implementation itself kind of ignoring the token count of the user query?
It's not ignoring it, it's that token counting can get very tricky (especially when using the output_cls option)
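For a rough sense of that overhead, you can measure the function schema yourself (an illustration only; it assumes pydantic v1's .schema() and tiktoken, and the count is approximate since OpenAI serializes functions in its own format):

Plain Text
import json
import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

# The structured-output class is sent as a function/tool schema, and those
# tokens count against the context window on top of the messages.
schema_json = json.dumps(PresentationOutlineV1.schema())  # .model_json_schema() on pydantic v2
schema_tokens = len(enc.encode(schema_json))
print(f"~{schema_tokens} tokens of function-schema overhead")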