I'm not sure what you mean here 🤔 Can you give an example?
If I wanted to build essentially this, but have the ingestion and embedding of the docs done in, say, Airflow or a one-time process, save the embeddings to qdrant, and then run a chatbot off of that vector db, I would need to recreate the indexes/composable graph etc. from a different server. I am storing the embeddings in qdrant, but trying to determine how I might either persist the graph and index set to be consumed elsewhere, or rebuild them from the vector db
I don't want to run the embeddings generation every time I run the app
the tools generated for the individual year indexes and the graphs are stored locally or in memory in this example
Persisting the embeddings for the tools/sub-indexes is easy enough, assuming your qdrant server is persisting somewhere
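The one-time build step (in Airflow or wherever) would look roughly like this -- just a sketch, assuming a local qdrant on the default port, a ./data folder, and a made-up collection name:
import qdrant_client
from llama_index import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.vector_stores import QdrantVectorStore

# connect to the qdrant server that persists the embeddings
client = qdrant_client.QdrantClient(url="http://localhost:6333")
vector_store = QdrantVectorStore(client=client, collection_name="my_docs")
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# embeddings are generated once here and written into qdrant
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)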
Once you create the index for the first time using qdrant, you can "reload" it by setting up the vector store object to connect back, and then doing
index = VectorStoreIndex.from_vector_store(vector_store)
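Putting that together, the chatbot side would look something like this (again a sketch -- the URL and collection name are placeholders, use whatever your ingestion job wrote to):
import qdrant_client
from llama_index import VectorStoreIndex
from llama_index.vector_stores import QdrantVectorStore

# reconnect to the collection that already holds the embeddings
client = qdrant_client.QdrantClient(url="http://localhost:6333")
vector_store = QdrantVectorStore(client=client, collection_name="my_docs")

# nothing gets re-embedded here; the index is rebuilt from the vector store
index = VectorStoreIndex.from_vector_store(vector_store)
query_engine = index.as_query_engine()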
what version of llama_index is that? VectorStoreIndex isn't showing up or importing for me
at least, I am not finding the import, so I must have something wrong
VectorStoreIndex was introduced around 0.6.20 I think? It's just a cosmetic rename (GPTVectorStoreIndex is still supported)
from_vector_store was also added around the same time. But if you can't upgrade, there's another method that's just more verbose:
vector_store = [setup to connect to existing vector store]
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = GPTVectorStoreIndex([], storage_context=storage_context)
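Concretely, with qdrant that would be something like (sketch; the URL and collection name are assumptions):
import qdrant_client
from llama_index import GPTVectorStoreIndex, StorageContext
from llama_index.vector_stores import QdrantVectorStore

client = qdrant_client.QdrantClient(url="http://localhost:6333")
vector_store = QdrantVectorStore(client=client, collection_name="my_docs")
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# the empty node list means nothing gets embedded; the index just
# points at the existing vector store
index = GPTVectorStoreIndex([], storage_context=storage_context)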
Thanks, I'll try it out when I get home. It might be my IDE acting up and not resolving the import, or I might need to upgrade versions
@Logan M Thank you, I had to make some version updates but am now able to load the index. Not sure yet about recreating the per-year llama_index and graph index structures, but it's at least a start toward solving that problem
@Logan M If I want to split the ingestion/embedding-creation process from the instance where I run and use my llama_index indexes/langchain tools/chatbot, it seems like I will need to work out a method to create the embeddings separately from creating the llama_index indices. Right now, when I examine the index I have created, it has many bound methods and references to memory addresses, which I wouldn't expect to work on another machine even if I found a way to serialize and export the index.
Is there another example which shows how to construct these in a way that doesn't require the embeddings to be generated every time the indexes are created?
The graph tbh hasn't been well supported -- internally it's a little deprecated
And since you are using a remote vector db, it's a little complicated
Using the method I shared takes care of the sub-indexes -- all that's left is the root index
You can instantiate the root index the same way as the sub-indexes (assuming you passed in the storage_context when using from_indices)
Then, I thiiiiink you can re-create the graph like ComposableGraph([root_index, index1, index2, ...], root_index.index_id)
this is my best guess looking at the source code lol
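End-to-end, rebuilding the graph on another machine might look like this -- an untested sketch: the collection names are made up, and reading the 0.6.x source the constructor appears to take a dict of index_id -> index plus the root id rather than a list, so double-check against your version:
import qdrant_client
from llama_index import VectorStoreIndex
from llama_index.indices.composability import ComposableGraph
from llama_index.vector_stores import QdrantVectorStore

client = qdrant_client.QdrantClient(url="http://localhost:6333")

def load_index(collection_name):
    # reconnect to a collection that already holds embeddings; nothing is re-embedded
    vector_store = QdrantVectorStore(client=client, collection_name=collection_name)
    return VectorStoreIndex.from_vector_store(vector_store)

# one collection per year sub-index, plus one for the root index
# (placeholder names -- use whatever your ingestion job wrote to)
year_indexes = [load_index(f"docs_{year}") for year in (2020, 2021, 2022)]
root_index = load_index("root_index")

# first arg: dict mapping index_id -> index, second arg: the root index's id
graph = ComposableGraph(
    {ix.index_id: ix for ix in [root_index, *year_indexes]},
    root_index.index_id,
)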