
Updated last year

Hi there, is there a way to bulk insert

At a glance

The community member is looking for a way to bulk insert documents into an index instead of doing a loop and waiting for slow individual inserts. The comments suggest that using the from_documents() method might be a solution, as it should allow for batch inserts. One community member confirms that this method should work and append to an existing vector store. The community members agree that a bulk insert feature would be a useful addition.

Hi there, is there a way to bulk insert into an index instead of doing a loop and waiting for the slow pinecone inserts?

For more context, we now do this:

Plain Text
def insert_into_index(self, doc_file_path, model: str, doc_id=None):
    """Insert a new document into the global index."""
    self.initialize_index(doc_id, model)
    documents = self.loader_pdf.load_data(file=Path(doc_file_path))

    for document in documents:
        if doc_id is not None:
            document.doc_id = doc_id
        # insert every document, not only those with an explicit doc_id
        self.index.insert(document)
6 comments
not at the moment, we should really add that though
I thiiiiiink you might be able to just call from_documents() though to do a batch insert
VectorStoreIndex.from_documents(documents, storage_context=storage_context)

If storage context points to an existing vector store, it should, in theory, just append to it
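Putting the suggestion together, a minimal sketch might look like the following. This assumes a llama_index install with a Pinecone-backed vector store already set up; the helper name `bulk_insert_documents` and its parameters are made up for illustration, and the exact import paths vary between llama_index versions.

```python
from pathlib import Path

def bulk_insert_documents(doc_file_path, loader, storage_context):
    """Load a PDF and insert all of its documents in one batched call."""
    # local import so the sketch stays importable without llama_index installed
    from llama_index import VectorStoreIndex  # import path varies by version

    documents = loader.load_data(file=Path(doc_file_path))
    # from_documents() embeds and upserts in batches; if storage_context
    # wraps an existing Pinecone vector store, it should append to it
    # rather than create a new one
    return VectorStoreIndex.from_documents(documents, storage_context=storage_context)
```

The `storage_context` here would be something like `StorageContext.from_defaults(vector_store=pinecone_vector_store)`, pointing at the same index the per-document `insert()` loop was writing to.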
Ahh okay, thank you. Gonna try it!
And yeah I agree that that should probably be a feature. I have a 300 page document which takes 300 insert cycles :/
Yea not ideal πŸ˜… hopefully from_documents works, as it does batch inserts