
Async

At a glance

The community member is asking if they can parallelize the from_documents function, as the embedding process is taking a long time. They have provided their current code, which uses ChromaDB and LlamaIndex to create a vector store index.

In the comments, another community member suggests making the index creation asynchronous to speed up the ingestion process, and provides a link to an example in the LlamaIndex documentation.

There is no explicitly marked answer in the comments.
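The linked docs example is not reproduced in the thread, but a minimal sketch of the suggestion, assuming the pre-0.10 llama_index package used in the question: from_documents accepts a use_async=True flag that dispatches the embedding calls concurrently instead of batch by batch.

from llama_index import StorageContext, VectorStoreIndex

# storage_context is assumed to wrap the Chroma vector store set up in the
# question below; documents is assumed to be loaded earlier,
# e.g. via SimpleDirectoryReader.
index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
    use_async=True,  # embed batches concurrently rather than one at a time
    show_progress=True,
)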

Hello. Does anyone know if I can parallelize the from_documents function? The embedding process is taking a long time and I was wondering if it can be accomplished in parallel. My code currently is:
import chromadb
from llama_index import StorageContext, VectorStoreIndex
from llama_index.vector_stores import ChromaVectorStore

# Persist embeddings in a local Chroma collection
db = chromadb.PersistentClient(path="./polygon")
collection = db.get_or_create_collection("default")
vector_store = ChromaVectorStore(chroma_collection=collection)

# from_documents is a classmethod; passing the vector store through a
# StorageContext makes the embeddings land in Chroma rather than in memory.
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context, show_progress=True
)
query_engine = index.as_query_engine()

I want to speed up the ingestion of the documents.
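Although no answer is marked, a second route to the literal "parallelize" request is a hedged sketch using LlamaIndex's IngestionPipeline, whose run() method accepts num_workers to fan the split-and-embed work out across processes. The splitter, embedding model, and worker count below are illustrative assumptions, not from the thread.

import chromadb
from llama_index import VectorStoreIndex
from llama_index.embeddings import OpenAIEmbedding
from llama_index.ingestion import IngestionPipeline
from llama_index.node_parser import SentenceSplitter
from llama_index.vector_stores import ChromaVectorStore

db = chromadb.PersistentClient(path="./polygon")
vector_store = ChromaVectorStore(
    chroma_collection=db.get_or_create_collection("default")
)

# The pipeline splits documents, embeds the chunks, and writes the nodes
# straight into the Chroma collection.
pipeline = IngestionPipeline(
    transformations=[SentenceSplitter(), OpenAIEmbedding()],
    vector_store=vector_store,
)

# num_workers spreads the transformations across processes; documents is
# assumed to be loaded earlier in the script.
pipeline.run(documents=documents, num_workers=4)

index = VectorStoreIndex.from_vector_store(vector_store=vector_store)
query_engine = index.as_query_engine()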