
Hi all, I am working on a RAG application for which I want to use one model for embeddings and another one (an LLM) for the question-answer flow.
For this I am using ServiceContext with the following configuration:

def setup_index(documents):
    embed_model = HuggingFaceEmbedding('sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2')
    service_context_embedding = ServiceContext.from_defaults(
        embed_model=embed_model,
        llm=None,
        chunk_size=1024,
    )
    return VectorStoreIndex.from_documents(documents, service_context=service_context_embedding)

After this I persist the index to a local folder.
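A minimal sketch of that persistence step (assuming index is the VectorStoreIndex returned by setup_index above):

index = setup_index(documents)
# Write the index's docstore, vector store, and index store to ./data
index.storage_context.persist(persist_dir="./data")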

When I load the index from the local folder with the following code:

def load_documents():
    # Create storage context from persisted data
    storage_context = StorageContext.from_defaults(persist_dir="./data")
    # Load index from storage context
    index = load_index_from_storage(storage_context)
    return index

I get this error:

Could not load OpenAI model. If you intended to use OpenAI, please check your OPENAI_API_KEY. Original error: No API key found for OpenAI.

Does anyone know what is going on? I want to build the index without an LLM and then supply the LLM via a ServiceContext in a response synthesizer.
You'll need to pass in the service_context when you are loading the index as well if service_context is not declared globally.
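For example, a sketch reusing the service_context_embedding from the question above:

storage_context = StorageContext.from_defaults(persist_dir="./data")
# Passing the same service context used at build time avoids falling
# back to the OpenAI defaults
index = load_index_from_storage(storage_context, service_context=service_context_embedding)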
I'm doing this in the service_context:

service_context_llm = ServiceContext.from_defaults(
    # llm is the LLM client, configured elsewhere with params=parameters,
    # apikey=api_key, project_id=project_id
    llm=llm,
    # System prompt (Spanish): "You are an AI assistant that answers questions
    # about job applications. You must not be biased or racist in your answers.
    # Always respond in Spanish."
    system_prompt="Eres un asistente de IA que responde sobre postulaciones a puestos de trabajo. "
                  "No debes ser sesgado ni racista en tus respuestas. Responde siempre en español.",
)
retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=10,
)
response_synthesizer = llama_index.response_synthesizers.get_response_synthesizer(
    response_mode="compact",
    service_context=service_context_llm,
    use_async=False,
    streaming=False,
)
return RetrieverQueryEngine(
    retriever=retriever,
    response_synthesizer=response_synthesizer,
)
And using the previously built index as a retriever
If you do this, you won't get the OpenAI error above
Actually, when you load the saved index back, it looks for the service_context; if it doesn't find one, it creates a new one with default values, which is OpenAI
But the error appears to come from the load_index_from_storage function
Yes, you load the index via load_index_from_storage
And that's before the ServiceContext initialization
Yeah, load_index_from_storage will create a new service context if it's not passed in.

Just pass it in and you should be good to go

index = load_index_from_storage(storage_context, service_context=service_context)
or set a global (as linked above)

from llama_index import set_global_service_context

set_global_service_context(service_context)
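With the global set, the load call from the question should then work unchanged, e.g. (a sketch assuming the service_context_embedding defined earlier):

set_global_service_context(service_context_embedding)
storage_context = StorageContext.from_defaults(persist_dir="./data")
# No service_context argument needed; the global one is picked up
index = load_index_from_storage(storage_context)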
So I have to initialize the ServiceContext before loading the index? In that case, do I need the following line of code?

response_synthesizer = llama_index.response_synthesizers.get_response_synthesizer(
    response_mode="compact",
    service_context=service_context_llm,
    use_async=False,
    streaming=False,
)

If I have a retriever over the index in which I declare the ServiceContext?
Yeah, you can still use that line of code.

Alternatively, you can do index.as_query_engine(response_mode="compact", similarity_top_k=10) to create the query engine
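For example, a minimal end-to-end sketch of that shortcut (assuming ./data holds the persisted index, service_context_llm is configured as above, and the sample question is hypothetical):

storage_context = StorageContext.from_defaults(persist_dir="./data")
index = load_index_from_storage(storage_context, service_context=service_context_llm)
# as_query_engine builds the retriever and response synthesizer internally
query_engine = index.as_query_engine(response_mode="compact", similarity_top_k=10)
response = query_engine.query("¿Qué experiencia laboral tiene el candidato?")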
Perfect! So in that case I don't need to initialize a VectorIndexRetriever either
Yup! The lower-level API you were using there is mostly for making customization easier, or for using separate steps/components 🙂
Thank you so much!