
Updated 2 months ago

Can you retrieve original documents from a VectorStoreIndex?

Hi gang. I'm posting this question for clarification.

I have created a VectorStoreIndex (specifically a QdrantStoreIndex) on a bunch of documents. Can I retrieve the original documents from the VectorStoreIndex? If I can retrieve the original documents, then I'd like to build a SummaryIndex on top of them.

My understanding is that I cannot retrieve the original documents from the VectorStoreIndex, and that I would have to create a separate DocumentStore and build the SummaryIndex on top of that instead. Please clarify.
2 comments
You will only be able to retrieve the nodes.
But related nodes carry relationships if they were created from a Document object.

You can stitch those nodes together to form the full document.

Also, there is a method in the Qdrant vector store class that fetches up to 9999 nodes at once: https://github.com/run-llama/llama_index/blob/e2dca8bb021b36b8eaf38be953cb2496f029d680/llama-index-integrations/vector_stores/llama-index-vector-stores-qdrant/llama_index/vector_stores/qdrant/base.py#L300
Thanks @WhiteFang_Jr. Stitching the nodes does not feel like a practical approach, since the nodes have overlapping text and that could become a source of bugs.
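
For anyone who still wants to try stitching, here is a minimal sketch of one way to deal with the overlap, assuming each retrieved node carries the ID of its source document plus the character offsets of its text within that document. The `Chunk` class below is a hypothetical stand-in so the example runs without LlamaIndex installed. Its field names mirror the `ref_doc_id`, `start_char_idx` and `end_char_idx` attributes on LlamaIndex text nodes, but whether the offsets are actually populated depends on the node parser you used, so check your own nodes first.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical stand-in for a retrieved node (not a LlamaIndex class)."""
    ref_doc_id: str       # ID of the source document
    start_char_idx: int   # offset of the chunk's first character in the source
    end_char_idx: int     # offset one past the chunk's last character
    text: str


def stitch_documents(chunks):
    """Rebuild one string per source document, dropping overlapping text.

    Assumes len(chunk.text) == end_char_idx - start_char_idx. Any gap
    between chunks (text that no chunk covers) is simply missing from
    the result.
    """
    by_doc = defaultdict(list)
    for chunk in chunks:
        by_doc[chunk.ref_doc_id].append(chunk)

    documents = {}
    for doc_id, group in by_doc.items():
        group.sort(key=lambda c: c.start_char_idx)
        parts = []
        cursor = 0  # offset up to which the document is already rebuilt
        for chunk in group:
            if chunk.end_char_idx <= cursor:
                continue  # chunk is fully covered by earlier chunks
            # Skip the prefix that overlaps with what we already have.
            skip = max(0, cursor - chunk.start_char_idx)
            parts.append(chunk.text[skip:])
            cursor = chunk.end_char_idx
        documents[doc_id] = "".join(parts)
    return documents


# Usage: three overlapping chunks of one document, deliberately out of order.
original = "The quick brown fox jumps over the lazy dog"
chunks = [
    Chunk("doc-1", 25, 43, original[25:43]),
    Chunk("doc-1", 0, 19, original[0:19]),
    Chunk("doc-1", 10, 30, original[10:30]),
]
stitched = stitch_documents(chunks)
```

Because the merge is driven by offsets rather than by comparing text, repeated phrases in the document do not confuse it. If your nodes do not have offsets, this approach does not apply, and keeping the original documents in a separate store is probably the safer route.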