Hi Everyone

Rendy Febry

Hi Everyone

Does anyone know what the purpose of the Index Store is?

If I want to have 2 separate indexes, each with its own set of documents, what's the best way to achieve that?

11 comments

WhiteFang_Jr

The index store is a component of LlamaIndex that contains lightweight index metadata. It stores additional state information created when building an index.

For more, you can check out: https://gpt-index.readthedocs.io/en/stable/core_modules/data_modules/storage/index_stores.html#index-stores

For the second question:
If the two sets of documents are completely unrelated, you can create two separate indexes and query each one as required.
If they are related, you can use a Composable Graph.
Linking one example here: https://github.com/jerryjliu/llama_index/blob/main/docs/examples/composable_indices/financial_data_analysis/DeepLakeDemo-FinancialData.ipynb

Rendy Febry

Actually, my problem is this: when I create multiple indexes with separate documents/nodes, querying one index should only search the documents/nodes that belong to that index, right? From my tests, that's not the case.

Rendy Febry

Also, I found there's a problem with the IndexStore when using PgVectorStore:
https://github.com/jerryjliu/llama_index/issues/7360

Rendy Febry

WDYT @Logan M

WhiteFang_Jr

If you have multiple indexes, say index_1 and index_2, then you will have to create the same number of query engines as well.

For example:

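from llama_index import VectorStoreIndex

# first index, built only from documents1
index_1 = VectorStoreIndex.from_documents(documents1)

# second index, built only from documents2
index_2 = VectorStoreIndex.from_documents(documents2)

# create first query engine
query_engine_1 = index_1.as_query_engine()

# create second query engine
query_engine_2 = index_2.as_query_engine()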


Now, depending on whether the query is for the first index or the second, you can use the matching query engine directly.
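A minimal sketch of that "pick the engine by condition" step, in case it helps. StubEngine is only a stand-in that mimics the .query(text) call, and the "finance"/"legal" topic keys and the route helper are made up for illustration; a real setup would plug in the two query engines created from the indexes.

```python
class StubEngine:
    """Stand-in for a query engine; it only mimics the .query(text) call."""

    def __init__(self, name):
        self.name = name

    def query(self, text):
        return f"[{self.name}] {text}"


# Hypothetical topic keys mapped to one engine per index.
engines = {
    "finance": StubEngine("index_1"),
    "legal": StubEngine("index_2"),
}


def route(text, engines, default_key):
    """Return the engine whose topic key appears in the query text."""
    lowered = text.lower()
    for key, engine in engines.items():
        if key in lowered:
            return engine
    # Fall back to a default engine when no key matches.
    return engines[default_key]


question = "Summarise the legal risks"
answer = route(question, engines, "finance").query(question)
print(answer)
```

Keyword matching is just the simplest possible condition; any rule that maps a query to exactly one engine would slot in the same way.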

Rendy Febry

@WhiteFang_Jr I know, I tried that already, but it doesn't work with PgVectorStore. For some reason, when using PgVectorStore the index store isn't updated, so each index doesn't know which documents/nodes belong to it. In the end, both query engines give the same result.

Logan M

@Rendy Febry the index store and docstore are not used when using a vector store integration, because the entire index is stored in the vector store.

If you make two postgres indexes, each in a different table (i.e. two distinct vector store objects) you'll get the behaviour you want
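To illustrate why sharing one table mixes results, here is a toy model in plain Python. ToyVectorStore is a hypothetical stand-in for one table, not a LlamaIndex or pgvector class, and the two-dimensional embeddings are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ToyVectorStore:
    """Holds (node_id, embedding, text) rows, like one table in Postgres."""

    rows: list = field(default_factory=list)

    def add(self, node_id, embedding, text):
        self.rows.append((node_id, embedding, text))

    def query(self, embedding, top_k=1):
        # The search scans every row in this store; it has no notion of
        # which index inserted a given row.
        def score(row):
            return sum(a * b for a, b in zip(row[1], embedding))

        ranked = sorted(self.rows, key=score, reverse=True)
        return [text for _, _, text in ranked[:top_k]]


# Case 1: two "indexes" share one store (one table).
shared = ToyVectorStore()
shared.add("a1", [1.0, 0.0], "doc from index A")
shared.add("b1", [0.0, 1.0], "doc from index B")
# A query issued on behalf of index A can still return index B's row.
leaked = shared.query([0.0, 1.0])

# Case 2: one store per index (one table each).
store_a, store_b = ToyVectorStore(), ToyVectorStore()
store_a.add("a1", [1.0, 0.0], "doc from index A")
store_b.add("b1", [0.0, 1.0], "doc from index B")
isolated = store_a.query([0.0, 1.0])

print(leaked)
print(isolated)
```

In the shared case the best match comes from the other index's document; with one store per index, the query can only ever see its own rows.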

Rendy Febry

"If you make two postgres indexes, each in a different table"

So there's no way to keep them in the same table?

Rendy Febry

@Logan M Actually, I found a way to get the index store and docstore updated when using the pgvector store: enable store_nodes_override.

western_index = VectorStoreIndex.from_documents(
    documents,
    service_context=self.service_context,
    storage_context=self.storage_context,
    store_nodes_override=True,
)

The problem is that the generated nodes_dict is somehow not respected by the query engine, which still searches all nodes instead of only the nodes that belong to that index.

----

"The index store and docstore is not used when using a vector store integration. This is because the entire index is stored in the vector store."

I don't think this statement is always true, because we can disable that behaviour by setting stores_text=False on the VectorStore. CMIIW.

Logan M

Yea that's fair, you can set the override, but tbh I never found it that helpful 😅

Creating different tables in pgvector seems easier and more organized, because then you aren't juggling local files

Rendy Febry

Fair enough. Considering how Postgres and SQL queries work in general, I don't think ID filtering would work well either, unless we had a dedicated SQL index for that.

Will try with different tables for now.
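For anyone curious what the two same-table options would look like at the SQL level, here is a rough sketch. It uses SQLite from the standard library as a stand-in for Postgres, leaves out the vector search entirely, and the nodes table, the index_name column and both helper functions are hypothetical, not anything PgVectorStore creates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nodes (node_id TEXT PRIMARY KEY, index_name TEXT, text TEXT)"
)
# Dedicated SQL index on the column used to separate the logical indexes.
conn.execute("CREATE INDEX idx_nodes_index_name ON nodes (index_name)")
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?, ?)",
    [
        ("w1", "western", "first western doc"),
        ("w2", "western", "second western doc"),
        ("e1", "eastern", "first eastern doc"),
    ],
)


def nodes_for_index(conn, index_name):
    """Option 1: filter on a column that has its own SQL index."""
    cur = conn.execute(
        "SELECT node_id FROM nodes WHERE index_name = ? ORDER BY node_id",
        (index_name,),
    )
    return [row[0] for row in cur.fetchall()]


def nodes_by_ids(conn, node_ids):
    """Option 2: filter by an explicit ID list; the IN clause grows with it."""
    if not node_ids:
        return []
    placeholders = ",".join("?" for _ in node_ids)
    cur = conn.execute(
        f"SELECT node_id FROM nodes WHERE node_id IN ({placeholders}) "
        "ORDER BY node_id",
        list(node_ids),
    )
    return [row[0] for row in cur.fetchall()]


print(nodes_for_index(conn, "western"))
print(nodes_by_ids(conn, ["e1"]))
```

The ID-list variant has to ship every node ID with each query, which is the part that seems unlikely to scale; the column filter avoids that but needs a schema the vector store integration would have to know about.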
