
Hey, is it possible to insert documents one by one into a ComposableGraph?
Not quite. Since a graph is made up of several sub-indexes, each of those sub-indexes usually contains a specific subset of documents.

You'd have to pick which sub-index to insert into ahead of time.

Although auto-inserting is an interesting idea to explore πŸ€”
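
For reference, inserting into a specific sub-index is just the normal insert call on that index. A minimal sketch (assuming the sub-index is a VectorStoreIndex you keep a handle to, with a placeholder ./data folder):

Plain Text
from llama_index import Document, SimpleDirectoryReader, VectorStoreIndex

# one of the sub-indexes that makes up the graph
docs = SimpleDirectoryReader("./data").load_data()
sub_index = VectorStoreIndex.from_documents(docs)

# later: insert a single new document into that specific sub-index
sub_index.insert(Document(text="some new text"))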
Thanks for your response!
And can I insert a new sub-index? In my case, instead of loading all documents at once, indexing them, and then creating the graph from those indexes and their summaries, I'm loading documents one by one. I want to index each one and add it to the graph, but I'm not sure if I can do that.
You'd have to re-create the graph in order to add a new sub-index πŸ€” If this is the approach you want to take, I would save each index individually and then build the graph at runtime.

Definitely, the operations around the graph could be better supported πŸ™‚
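
For example (just a sketch, with placeholder persist paths and summaries), each sub-index could be persisted to its own directory as it's built, and the graph rebuilt from them at startup:

Plain Text
from llama_index import (
    ComposableGraph,
    ListIndex,
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
)

# as each new batch of documents arrives, build an index for it and save it on its own
docs = SimpleDirectoryReader("./new_docs").load_data()
new_index = VectorStoreIndex.from_documents(docs)
new_index.storage_context.persist(persist_dir="./storage/index_3")

# at runtime: load every saved sub-index...
persist_dirs = ["./storage/index_1", "./storage/index_2", "./storage/index_3"]
indices = [
    load_index_from_storage(StorageContext.from_defaults(persist_dir=d))
    for d in persist_dirs
]

# ...and rebuild the graph from them (summaries are placeholders here)
graph = ComposableGraph.from_indices(
    ListIndex,
    indices,
    index_summaries=[f"summary of {d}" for d in persist_dirs],
)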
Hey, do you know if the graph can be saved into a DB (such as Weaviate or any other) or is it only used for querying at runtime?
Hmm, yea I think that won't work, just due to the structure of how the graph is built

You could definitely save the sub-indexes to a vector db though, assuming they are vector indexes
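
The pattern there is just to point the sub-index's storage context at the vector store when you build it. A rough sketch with Weaviate (exact constructor args vary a bit between llama_index / weaviate-client versions, and the URL and path are placeholders):

Plain Text
import weaviate
from llama_index import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.vector_stores import WeaviateVectorStore

# connect to an existing Weaviate instance
client = weaviate.Client("http://localhost:8080")

# back the sub-index with Weaviate instead of the default in-memory store
vector_store = WeaviateVectorStore(weaviate_client=client)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# this sub-index's vectors/nodes now live in Weaviate
docs = SimpleDirectoryReader("./data").load_data()
index1 = VectorStoreIndex.from_documents(docs, storage_context=storage_context)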
Thanks!
Edited: So what exactly is the purpose of graphs if we can't use them directly when querying into a DB or save them into the database? 😅
Wait, why can't you use them? πŸ˜…

You can save them to disk or S3 or a Google bucket. Just need to ensure all sub-indexes share the same storage context, then you can do graph.root_index.storage_context.persist() as well as load_graph_from_storage()
https://gpt-index.readthedocs.io/en/latest/how_to/index/composability.html#optional-persisting-the-graph

https://gpt-index.readthedocs.io/en/latest/how_to/storage/save_load.html#using-a-remote-backend

But lately development on graphs has slowed a bit in favor of the router query engine and sub question query engine
Ahh, sorry about that, I meant using them directly when querying into a DB. So basically, when using a graph, we need to save its indexes into a DB (if we're using one) and the graph to disk, then when querying we need to load the graph from disk and the indexes from the DB and recreate the graph at runtime?
Or by "Just need to ensure all sub-indexes share the same storage context" do you mean persisting both the indexes and the graph to disk, then just loading them and using that for querying?
hmmm.... actually I think to save the entire graph, the sub-indexes cannot be saved in a 3rd party vector db. If they are, you'll have to load each sub-index and re-construct the graph

If they aren't in a vector db, what I meant was something like this

Plain Text
# create
from llama_index import (
    ComposableGraph,
    ListIndex,
    StorageContext,
    VectorStoreIndex,
)

# all sub-indexes share one storage context so the whole graph can be persisted together
storage_context = StorageContext.from_defaults()
index1 = VectorStoreIndex.from_documents(docs, storage_context=storage_context)
index2 = ListIndex.from_documents(docs, storage_context=storage_context)

graph = ComposableGraph.from_indices(
    ListIndex,
    [index1, index2],
    index_summaries=[index1_summary, index2_summary],
    storage_context=storage_context,
)

# save
# set the ID
graph.root_index.set_index_id("my_id")

# persist to storage
graph.root_index.storage_context.persist(persist_dir="./storage")

# load 
from llama_index import StorageContext, load_graph_from_storage

storage_context = StorageContext.from_defaults(persist_dir="./storage")
graph = load_graph_from_storage(storage_context, root_id="my_id")
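
And for the other case (sub-indexes saved in a 3rd-party vector db), the rough idea is to re-attach an index object to each vector store and re-construct the graph by hand. A sketch, assuming a llama_index version that has VectorStoreIndex.from_vector_store, with placeholder Weaviate index names and summaries:

Plain Text
import weaviate
from llama_index import ComposableGraph, ListIndex, VectorStoreIndex
from llama_index.vector_stores import WeaviateVectorStore

client = weaviate.Client("http://localhost:8080")

# re-attach an index to each existing vector store collection
index1 = VectorStoreIndex.from_vector_store(
    WeaviateVectorStore(weaviate_client=client, index_name="Index1")
)
index2 = VectorStoreIndex.from_vector_store(
    WeaviateVectorStore(weaviate_client=client, index_name="Index2")
)

# then re-construct the graph at runtime
graph = ComposableGraph.from_indices(
    ListIndex,
    [index1, index2],
    index_summaries=["summary of index1", "summary of index2"],
)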
cool, thank you!