
Updated 9 months ago

Hi, I am using IngestionPipeline to

Hi, I am using IngestionPipeline to ingest into a Pinecone DB. I have specified only a vector store, not a docstore. For deduplication to be handled during upsert, is it necessary to add a docstore? If so, does the docstore have to be a database, or does any cloud location work?
Thanks in advance!
3 comments
Yes, you need to use a docstore.

The docstore can be either SimpleDocumentStore (saved to and loaded from disk) or a remote docstore such as Redis, MongoDB, etc. If you use an integration, be sure to install it first, e.g. `pip install llama-index-storage-docstore-redis`, and then import it with `from llama_index.storage.docstore.redis import RedisDocumentStore`.

I can see them all listed in the API ref
https://docs.llamaindex.ai/en/stable/api_reference/storage/docstore/simple/
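For anyone landing here later, a minimal sketch of attaching a docstore to the pipeline might look like the following. The helper name `build_pipeline` is hypothetical, the choice of `SentenceSplitter` and the `UPSERTS` strategy is an assumption rather than something stated in this thread, and a real Pinecone setup would also need an embedding model in `transformations`.

```python
def build_pipeline(vector_store=None):
    """Return an IngestionPipeline that tracks ingested documents in a docstore.

    `build_pipeline` is a hypothetical helper for illustration only.
    """
    # Imported lazily so this sketch can be loaded without llama-index installed.
    from llama_index.core.ingestion import DocstoreStrategy, IngestionPipeline
    from llama_index.core.node_parser import SentenceSplitter
    from llama_index.core.storage.docstore import SimpleDocumentStore

    return IngestionPipeline(
        # Assumption: add your embedding model to this list when a vector
        # store is attached, so nodes carry embeddings before being written.
        transformations=[SentenceSplitter()],
        # In-memory docstore; a remote one such as RedisDocumentStore
        # could be passed here instead.
        docstore=SimpleDocumentStore(),
        # Pass your Pinecone vector store here (None keeps the sketch local).
        vector_store=vector_store,
        docstore_strategy=DocstoreStrategy.UPSERTS,
    )
```

You would then call something like `build_pipeline(vector_store=my_store).run(documents=docs)`, where `my_store` and `docs` are your own objects.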
Got it, I will check this. Can you please confirm that this checks for duplicates at the doc_id level and not at the node ID (Pinecone ID) level?
It checks at the doc_id level.
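To make "doc_id level" concrete, here is a simplified, library-free model of that kind of check. It is not LlamaIndex's actual code; `content_hash`, `upsert`, and the `seen` dict are hypothetical names, and the assumption is that each doc_id is compared against a stored hash of the document's content.

```python
import hashlib


def content_hash(text: str) -> str:
    """Hash the document text so changes can be detected cheaply."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def upsert(seen: dict, doc_id: str, text: str) -> str:
    """Decide what to do with a document, keyed by doc_id (not node ID)."""
    new_hash = content_hash(text)
    old_hash = seen.get(doc_id)
    if old_hash is None:
        # First time this doc_id is seen: ingest it.
        seen[doc_id] = new_hash
        return "inserted"
    if old_hash != new_hash:
        # Same doc_id, different content: replace the old version.
        seen[doc_id] = new_hash
        return "updated"
    # Same doc_id and same content: nothing to do.
    return "skipped"


seen = {}
results = [
    upsert(seen, "doc-1", "hello"),
    upsert(seen, "doc-1", "hello"),
    upsert(seen, "doc-1", "hello, world"),
    upsert(seen, "doc-2", "hello"),
]
```

Note that `doc-2` is inserted even though its text matches the first version of `doc-1`, because in this model the lookup key is the doc_id, not the content or the node ID.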