
Hi all, I am working on a RAG application for which I want to use one model for embeddings and another one (an LLM) for the question-answer flow.
For this I am using ServiceContext with the following configuration:

def setup_index(documents):
    embed_model = HuggingFaceEmbedding('sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2')
    service_context_embedding = ServiceContext.from_defaults(
        embed_model=embed_model,
        llm=None,
        chunk_size=1024,
    )
    return VectorStoreIndex.from_documents(documents, service_context=service_context_embedding)

After this I persist the index to a local folder.
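A minimal sketch of that persistence step (assuming index is the VectorStoreIndex returned by setup_index above):

index = setup_index(documents)
# Write the index's docstore, vector store, and index store to ./data
index.storage_context.persist(persist_dir="./data")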

When I load the index from the local folder with the following code:

def load_documents():
    # Create storage context from persisted data
    storage_context = StorageContext.from_defaults(persist_dir="./data")
    # Load index from storage context
    index = load_index_from_storage(storage_context)
    return index

I get this error:

Could not load OpenAI model. If you intended to use OpenAI, please check your OPENAI_API_KEY. Original error: No API key found for OpenAI.

Does anyone know what is going on? I want to build the index without an LLM and then supply the LLM via a ServiceContext in a response synthesizer.
You'll need to pass in the service_context when you are loading the index as well if service_context is not declared globally.
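For example, a sketch reusing the service_context_embedding from the question above:

storage_context = StorageContext.from_defaults(persist_dir="./data")
# Passing the same service context used at build time avoids falling
# back to the OpenAI defaults
index = load_index_from_storage(storage_context, service_context=service_context_embedding)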
I'm doing this in the service_context:

service_context_llm = ServiceContext.from_defaults(
    # llm is the LLM client, configured elsewhere with params=parameters,
    # apikey=api_key, project_id=project_id
    llm=llm,
    # System prompt (Spanish): "You are an AI assistant that answers questions
    # about job applications. You must not be biased or racist in your answers.
    # Always respond in Spanish."
    system_prompt="Eres un asistente de IA que responde sobre postulaciones a puestos de trabajo. "
                  "No debes ser sesgado ni racista en tus respuestas. Responde siempre en español.",
)
retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=10,
)
response_synthesizer = llama_index.response_synthesizers.get_response_synthesizer(
    response_mode="compact",
    service_context=service_context_llm,
    use_async=False,
    streaming=False,
)
return RetrieverQueryEngine(
    retriever=retriever,
    response_synthesizer=response_synthesizer,
)
And using the previously built index as a retriever
If you do this, you won't get the OpenAI error above
Actually, when you load the saved index back, it looks for the service_context; if it doesn't find one, it creates a new one with default values, which is OpenAI
But the error appears to come from the load_index_from_storage function
Yes, you load the index via load_index_from_storage
And that's before the ServiceContext initialization
Yeah, load_index_from_storage will create a new service context if it's not passed in.

Just pass it in and you should be good to go

index = load_index_from_storage(storage_context, service_context=service_context)
or set a global (as linked above)

from llama_index import set_global_service_context

set_global_service_context(service_context)
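With the global set, the load call from the question should then work unchanged, e.g. (a sketch assuming the service_context_embedding defined earlier):

set_global_service_context(service_context_embedding)
storage_context = StorageContext.from_defaults(persist_dir="./data")
# No service_context argument needed; the global one is picked up
index = load_index_from_storage(storage_context)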
So I have to initialize the ServiceContext before loading the index? In that case, do I need the following line of code?

response_synthesizer = llama_index.response_synthesizers.get_response_synthesizer(
    response_mode="compact",
    service_context=service_context_llm,
    use_async=False,
    streaming=False,
)

If I have a retriever over the index in which I declare the ServiceContext?
Yeah, you can still use that line of code.

Alternatively, you can do index.as_query_engine(response_mode="compact", similarity_top_k=10) to create the query engine
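For example, a minimal end-to-end sketch of that shortcut (assuming ./data holds the persisted index, service_context_llm is configured as above, and the sample question is hypothetical):

storage_context = StorageContext.from_defaults(persist_dir="./data")
index = load_index_from_storage(storage_context, service_context=service_context_llm)
# as_query_engine builds the retriever and response synthesizer internally
query_engine = index.as_query_engine(response_mode="compact", similarity_top_k=10)
response = query_engine.query("¿Qué experiencia laboral tiene el candidato?")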
Perfect! So in that case I don't need to initialize a VectorIndexRetriever either
Yup! The lower-level API you were using there is mostly for making customization easier, or for using separate steps/components 🙂
Thank you so much!