Thanks @Logan M , I invested a little time in this problem and have a minimal example that shows what I would consider a bug. However, as you pointed out, using the global service context solves the problem.
Briefly, the problem: when I instantiate the index with a service context and a desired LLM and then create an actual query engine, the LLM is only used when the engine is created directly via `index.as_query_engine()`. When customizing, e.g. the retriever, the passed LLM is ignored and the default is used instead. In the code sketch below, only variant A uses the desired LLM; B and C fall back to the default. I have tried a few other setups, but this shows the core problem.
```python
from llama_index import ServiceContext, VectorStoreIndex, get_response_synthesizer
from llama_index.query_engine import RetrieverQueryEngine

# some_desired_llm, some_embedding_model, some_vector_store are defined elsewhere
service_context = ServiceContext.from_defaults(
    llm=some_desired_llm,
    embed_model=some_embedding_model,
)
index = VectorStoreIndex.from_vector_store(
    some_vector_store,
    service_context=service_context,
)

# Variant A - directly from the index (uses the desired llm)
rag_query = index.as_query_engine()

# Variant B - more granular control (falls back to the default llm)
rag_query = RetrieverQueryEngine.from_args(
    retriever=index.as_retriever(),
    response_synthesizer=get_response_synthesizer(streaming=True),
)

# Variant C - same, even when passing the service context explicitly
rag_query = RetrieverQueryEngine.from_args(
    retriever=index.as_retriever(),
    response_synthesizer=get_response_synthesizer(streaming=True),
    service_context=service_context,
)

response = rag_query.query("What's in the doc?")
```
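For reference, here is a minimal sketch of the global-service-context workaround mentioned above, assuming the pre-0.10 `llama_index` API where `set_global_service_context` is exported at the top level. With this set once at startup, variants B and C pick up the desired LLM as well:

```python
from llama_index import ServiceContext, set_global_service_context

# some_desired_llm and some_embedding_model are placeholders defined elsewhere,
# as in the sketch above.
service_context = ServiceContext.from_defaults(
    llm=some_desired_llm,
    embed_model=some_embedding_model,
)

# Register the service context globally so that components created without an
# explicit service_context (e.g. via RetrieverQueryEngine.from_args) fall back
# to it instead of the library default.
set_global_service_context(service_context)
```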