A community member is trying to load a 400-page PDF document into a local Chroma DB vector store, but hits Chroma's maximum batch size when persisting it. They note that the same document persisted without problems using LangChain. In the comments, another community member suggests that the Chroma vector store integration may not implement batching yet, and recommends setting the insert_batch_size parameter when creating the vector store index to a higher value, such as 1024, as a potential workaround.
Hi all, I am trying to load a document into a Chroma DB vector store locally. It’s a 400-page PDF file. It embedded fine, but when I go to persist it I hit the Chroma DB maximum batch size. Is there a way around this? The same doc persisted fine with LangChain.
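For anyone who lands here with the same error, this is a rough sketch of the general idea behind the batching workaround: split the records into chunks that stay under the store's limit and insert them one chunk at a time. It is not the library's own implementation. The helper names `batched` and `add_in_batches` are made up for illustration, the default of 1024 just mirrors the value mentioned in the comments, and `add_fn` is a placeholder for whatever insert call your store exposes (for example a Chroma collection's `add` method, assuming it accepts `ids`, `embeddings`, and `documents` keyword arguments).

```python
from typing import Callable, Iterator, List, Sequence


def batched(items: Sequence, batch_size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def add_in_batches(
    ids: Sequence[str],
    embeddings: Sequence[List[float]],
    documents: Sequence[str],
    add_fn: Callable[..., None],
    batch_size: int = 1024,  # assumed value; pick one below your store's limit
) -> int:
    """Call `add_fn` once per batch and return the number of calls made.

    `add_fn` is a hypothetical stand-in for the vector store's insert call.
    """
    if not (len(ids) == len(embeddings) == len(documents)):
        raise ValueError("ids, embeddings and documents must have equal length")

    calls = 0
    # The three sequences are sliced with the same boundaries, so each
    # batch keeps ids, embeddings and documents aligned.
    for id_batch, emb_batch, doc_batch in zip(
        batched(ids, batch_size),
        batched(embeddings, batch_size),
        batched(documents, batch_size),
    ):
        add_fn(
            ids=list(id_batch),
            embeddings=list(emb_batch),
            documents=list(doc_batch),
        )
        calls += 1
    return calls
```

With a real collection you would pass its insert method as `add_fn`. If you are going through a vector store index instead, the `insert_batch_size` parameter mentioned in the comments is meant to control the same kind of chunking, so try that first.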