
Updated 6 months ago

How would I upsert nodes in parallel

How would I upsert nodes in parallel with VectorStoreIndex and Pinecone, since I have so many nodes to upsert? I would have thought use_async would upsert each batch in parallel, but it just upserts sequentially.
index = VectorStoreIndex(nodes, storage_context=storage_context, use_async=True, insert_batch_size=1500)
2 comments
use_async is just for generating embeddings

If you want to insert in parallel, you could use some lower-level APIs. I might use the ingestion pipeline here

Plain Text
import asyncio

from llama_index.core import VectorStoreIndex
from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.openai import OpenAIEmbedding

pipeline = IngestionPipeline(transformations=[SentenceSplitter(), OpenAIEmbedding()])

nodes = await pipeline.arun(documents=documents)

# split the nodes into batches (batch size here is just an example;
# pick one that suits your rate limits)
batches = [nodes[i : i + 100] for i in range(0, len(nodes), 100)]

# one upsert coroutine per batch, awaited concurrently
jobs = [vector_store.async_add(node_batch) for node_batch in batches]
await asyncio.gather(*jobs)

index = VectorStoreIndex.from_vector_store(vector_store)
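The batch-and-gather part of this is plain asyncio, independent of Pinecone or LlamaIndex. A minimal self-contained sketch of the pattern, with a stand-in `fake_async_add` in place of the real `vector_store.async_add` (names and batch size are illustrative only):

```python
import asyncio

# stand-in for vector_store.async_add; a real store would do a network upsert here
async def fake_async_add(batch):
    await asyncio.sleep(0)  # yield control, simulating async I/O
    return len(batch)

async def main():
    nodes = list(range(10))  # stand-in for real nodes
    batch_size = 3
    batches = [nodes[i : i + batch_size] for i in range(0, len(nodes), batch_size)]
    # one coroutine per batch; gather runs them concurrently and
    # returns their results in the same order as the input
    results = await asyncio.gather(*(fake_async_add(b) for b in batches))
    return results

print(asyncio.run(main()))  # [3, 3, 3, 1]
```

Note that `asyncio.gather` takes the coroutines as separate positional arguments, which is why the list is unpacked with `*` rather than passed as-is.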
This makes sense, I will try this