Would it be okay to store these separately in their own persist dirs as well, and just load these indices individually from their own respective storage contexts?
This is the file structure right now.
And I have set persist_dir as index folder
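Something like this is what I have in mind (just a sketch; the folder names mirror my layout, and I'm using the llama_index.core imports):

from llama_index.core import StorageContext, load_index_from_storage

# assuming two separate indices, each persisted to its own directory
math_index.storage_context.persist(persist_dir="index/math")
science_index.storage_context.persist(persist_dir="index/science")

# later, load each one back from its own storage context
math_index = load_index_from_storage(
    StorageContext.from_defaults(persist_dir="index/math")
)
science_index = load_index_from_storage(
    StorageContext.from_defaults(persist_dir="index/science")
)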
@nerdai I see what you mean, in that case, I will have index_math and index_science.
But then I'm trying to do st.session_state.chat_engine = index.as_chat_engine(chat_mode="best", verbose=True)
Do I have to create 2 chat engines then?
The goal is to have one chat engine that can load up 2 indices.
so Math and Science would each have their own index and query engine
and then you have a router query engine that sits atop those
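roughly like this (a sketch with the v0.10 imports; the variable names and tool descriptions are just illustrative):

from llama_index.core.query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector
from llama_index.core.tools import QueryEngineTool

# each subject index becomes a query engine, wrapped as a tool
# with a description the router's selector can route on
math_tool = QueryEngineTool.from_defaults(
    query_engine=index_math.as_query_engine(),
    description="Useful for answering math questions",
)
science_tool = QueryEngineTool.from_defaults(
    query_engine=index_science.as_query_engine(),
    description="Useful for answering science questions",
)

# the router picks the right underlying engine per query
query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),
    query_engine_tools=[math_tool, science_tool],
)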
Ok, this is helpful.
Just to confirm, can I still use query_engine as chat_engine?
So what I'm trying to do is exactly like this one but with multiple indices.
ah, I don't know if we have convenience methods from query engine to chat engine yet
as you know, these exist when working with an index
You should be able to turn the query engine into a QueryEngineTool, then supply that to an agent if using ChatMode.REACT, OPENAI, or BEST
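something like this sketch (the tool name and description are made up, and query_engine is the router engine from above):

from llama_index.core.agent import ReActAgent
from llama_index.core.tools import QueryEngineTool

# wrap the combined query engine as a tool the agent can call
router_tool = QueryEngineTool.from_defaults(
    query_engine=query_engine,
    name="math_and_science",
    description="Answers math and science questions from the indexed docs",
)

# a ReAct agent over that tool behaves like a chat engine
agent = ReActAgent.from_tools([router_tool], verbose=True)
response = agent.chat("What topics do the science docs cover?")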
do you know what kind of chat engine you're trying to build?
I've passed 'best' and 'condense_plus_context' as the chat_mode parameter in chat_engine = index.as_chat_engine(chat_mode=..., verbose=True) before.
Also, is there a way to specify which LLM to use?
I think the default has been GPT-3.5 and I couldn't find a way to change that.
yes, in v0.9.xx, which is now legacy, you need to work with ServiceContext objects and define the LLM there
then you simply supply the ServiceContext to the index/engine that requires it
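e.g. something like this (legacy v0.9 sketch; assumes docs is already loaded):

from llama_index import ServiceContext, VectorStoreIndex
from llama_index.llms import OpenAI

# v0.9 pattern: the LLM lives on the ServiceContext,
# which you then hand to whatever index/engine needs it
service_context = ServiceContext.from_defaults(llm=OpenAI(model="gpt-4"))
index = VectorStoreIndex.from_documents(docs, service_context=service_context)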
in v0.10.0, just released today, you can continue to use ServiceContext, but that is deprecated in favor of using Settings:
from llama_index.core import Settings
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI

# set these once, globally; indices and engines pick them up by default
Settings.llm = OpenAI(model="gpt-4")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
Ok, I think I did define the LLM in the ServiceContext and saved the index. But when I load the index again and use it as_chat_engine, GPT-3.5 is still being used instead of GPT-4.
Good job on the release! I've been modifying my code. Does Settings now automatically apply everywhere?
Let's say I have changed the Settings like the code you provided above, and use
ServiceContext.from_defaults().
Does this use the Settings?
It's giving me a warning:
<ipython-input-6-1153c1c48423>:27: DeprecationWarning: Call to deprecated class method from_defaults. (ServiceContext is deprecated, please use llama_index.settings.Settings instead.) -- Deprecated since version 0.10.0.
service_context = ServiceContext.from_defaults()
So I'm guessing it's not pulling it?
How would you suggest I get rid of ServiceContext and use Settings?
Oh yeah, ServiceContext won't use Settings in that way.
oh wait... sorry, you're right
let me see why you may be getting this warning
This is my old code:
index = VectorStoreIndex.from_documents(docs, service_context=service_context, show_progress=True)
Ok, so to modify this code, it will just be like this?
index = VectorStoreIndex.from_documents(docs, llm=llm, embed_model=embed_model, show_progress=True)
you can simply define the llm and embed model in these APIs now, as opposed to having to define them through a service_context
Gotcha. That makes sense.
index.as_chat_engine(chat_mode="best", verbose=True)
As for this, does it automatically pull Settings.llm then?
Yes, that's right, it should pull from the global Settings
unless you specify otherwise at the component level
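e.g. (the model name here is just an example of an override):

from llama_index.llms.openai import OpenAI

# uses the global Settings.llm unless one is passed explicitly
chat_engine = index.as_chat_engine(
    chat_mode="best",
    llm=OpenAI(model="gpt-4"),  # component-level override
    verbose=True,
)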
Ok, thank you. This is very helpful.
I will poke around; I still need to figure out how to turn the multi-index query engine into a chat engine.
My pleasure. Okay, sounds good! You know how to find us, should you run into any problems -- but hopefully it goes smoothly for you!