Updated 6 months ago

Hi all I am trying to load a document

At a glance

A community member is trying to load a 400-page PDF into a local Chroma DB vector store. Embedding succeeds, but persisting the document fails because it hits Chroma's maximum batch size. The same document persisted without problems using LangChain. In the comments, another community member suggests that the Chroma vector store implementation may not have batching implemented yet and recommends, as a possible workaround, setting the insert_batch_size parameter when creating the vector store index, for example insert_batch_size=1024.

Hi all, I am trying to load a document into a Chroma DB vector store locally. It's a 400-page PDF file. It embedded fine, but when I go to persist it I hit the Chroma DB maximum batch size. Is there a way around this? The same doc persisted fine with LangChain.
1 comment
Hmm, I guess the Chroma vector store implementation hasn't implemented batching yet.

You can set batching at the top-level though

VectorStoreIndex(..., insert_batch_size=1024) or something like that
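To illustrate what batching the inserts means here, below is a minimal, library-independent sketch in plain Python. It is not the LlamaIndex or Chroma implementation: the helper names batched and insert_in_batches, and the insert_fn callback, are hypothetical and exist only for this example. The idea is simply to split the full list of records into slices no larger than the store's maximum batch size and insert each slice separately.

```python
from typing import Callable, Iterator, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def insert_in_batches(
    items: Sequence[T],
    insert_fn: Callable[[Sequence[T]], None],
    batch_size: int,
) -> int:
    """Call `insert_fn` once per batch and return the number of calls made.

    `insert_fn` is a hypothetical stand-in for whatever performs the real
    write, e.g. a function that wraps a vector store's add/insert call.
    """
    calls = 0
    for batch in batched(items, batch_size):
        insert_fn(batch)
        calls += 1
    return calls


if __name__ == "__main__":
    # Fake "store" that just records the batches it receives.
    received = []
    nodes = [f"node-{i}" for i in range(2500)]
    calls = insert_in_batches(nodes, received.append, batch_size=1024)
    print(calls, [len(b) for b in received])
```

The batch size only needs to stay at or below whatever limit the store enforces; 1024 is the value mentioned in the reply above, not a verified limit for Chroma.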