
Updated 9 months ago

Hi @Logan M. We noticed something

Hi @Logan M. We noticed something strange whilst removing our document store and index store from our LlamaIndex storage context. Our idea was that the docstore and index store were not really useful, since our vector store (PGVector) already contains all of the necessary info for our app to function.

So basically our question is: why does our app still fully work after removing the MongoDB-backed document store and index store from our storage context?

We changed:

document_store = MongoDocumentStore.from_uri(uri=MONGO_DB_URL)
index_store = MongoIndexStore.from_uri(uri=MONGO_DB_URL)

vector_store = PGVectorStore.from_params(
    async_connection_string=f"postgresql+asyncpg://{user}:{password}@{host}:{port}/{database}",
    connection_string=f"postgresql+psycopg2://{user}:{password}@{host}:{port}/{database}?sslmode=require",
    table_name=PG_VECTOR_DATABASE_DOC_TABLE_NAME,
    embed_dim=1536,
    hybrid_search=True,
    use_jsonb=True,
)

storage_context = StorageContext.from_defaults(
    docstore=document_store,
    index_store=index_store,
    vector_store=vector_store,
)


To

vector_store = PGVectorStore.from_params(
    async_connection_string=f"postgresql+asyncpg://{user}:{password}@{host}:{port}/{database}",
    connection_string=f"postgresql+psycopg2://{user}:{password}@{host}:{port}/{database}?sslmode=require",
    table_name=PG_VECTOR_DATABASE_DOC_TABLE_NAME,
    embed_dim=1536,
    hybrid_search=True,
    use_jsonb=True,
)

storage_context = StorageContext.from_defaults(
    vector_store=vector_store,
)


It would be really nice if you could help us understand why the docstore and index store are relevant in the first place, and what the implications of removing them could be.
3 comments
You get a docstore and an index store when you are storing your indexes locally. You can access the docstore to inspect the nodes, to update, delete, or modify their text, or to iterate over the nodes based on metadata etc.
In a nutshell, the docstore stores the text and metadata.
With vector stores, the text, metadata, and embeddings are combined together in the nodes. That is why you don't see any nodes in the docstore.

All the nodes are stored in the third-party store, and with the client you only form the connection to that store for querying.
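To make that concrete, here is a rough toy sketch in plain Python (no LlamaIndex or Postgres involved). `NodeRecord`, `cosine`, and `query` are made-up names for illustration, and this is not how PGVectorStore lays out its table. It only shows that when each stored row carries text, metadata, and an embedding together, a query can return the text without a separate docstore.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class NodeRecord:
    # Hypothetical stand-in for one row in a vector store table:
    # text, metadata, and embedding all live in the same record.
    node_id: str
    text: str
    embedding: List[float]
    metadata: Dict[str, str] = field(default_factory=dict)


def cosine(a: List[float], b: List[float]) -> float:
    # Plain cosine similarity; returns 0.0 if either vector is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def query(rows: List[NodeRecord], query_embedding: List[float], top_k: int = 1) -> List[NodeRecord]:
    # Rank rows by similarity and return the full records, text included.
    ranked = sorted(rows, key=lambda r: cosine(r.embedding, query_embedding), reverse=True)
    return ranked[:top_k]


rows = [
    NodeRecord("n1", "Invoices are due in 30 days.", [1.0, 0.0], {"source": "billing.md"}),
    NodeRecord("n2", "Reset your password from the login page.", [0.0, 1.0], {"source": "auth.md"}),
]

best = query(rows, [0.9, 0.1])[0]
print(best.node_id, best.text, best.metadata)
```

Nothing here needed a second store to look up the text, which is the situation described above.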
Yeah, if you didn't specify store_nodes_override=True when building your index, it wasn't even using the docstore/index store.

Really, the only reasons you'd use it are if
a) the vector db doesn't support storing nodes/text
b) you wanted to track inserted nodes/documents for upserts
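A small plain-Python sketch of those two points, as a toy model under stated assumptions rather than LlamaIndex's actual code. `writes_to_docstore` and `UpsertTracker` are hypothetical names. The first function mirrors the rule as described in this thread (nodes go to the docstore only if the vector store can't hold text, or the override is set); the class shows the kind of bookkeeping a docstore makes possible for upserts, by remembering a content hash per document ID.

```python
import hashlib
from typing import Dict


def writes_to_docstore(vector_store_stores_text: bool, store_nodes_override: bool) -> bool:
    # Toy version of the rule described above: the docstore is only written to
    # when the vector store cannot hold text, or when the override is set.
    return (not vector_store_stores_text) or store_nodes_override


class UpsertTracker:
    """Toy docstore role: remember a content hash per document ID."""

    def __init__(self) -> None:
        self.hashes: Dict[str, str] = {}

    def classify(self, doc_id: str, text: str) -> str:
        # Compare the new content hash with the one seen previously, if any.
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        previous = self.hashes.get(doc_id)
        self.hashes[doc_id] = digest
        if previous is None:
            return "insert"
        return "skip" if previous == digest else "update"


tracker = UpsertTracker()
results = [
    tracker.classify("doc-1", "hello"),         # first time this ID is seen
    tracker.classify("doc-1", "hello"),         # same content again
    tracker.classify("doc-1", "hello, world"),  # same ID, changed content
]
print(results)
print(writes_to_docstore(vector_store_stores_text=True, store_nodes_override=False))
```

Without some record like this, re-ingesting the same documents has no way to tell unchanged ones from changed ones, so that is the main thing to weigh before dropping the docstore.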
Thanks a lot for the explanation guys, will remove those then to reduce overhead!