
Chroma

I am trying to use ChromaDB to store the vectors and documents. The Chroma documentation says that Chroma uses a default embedding function (a sentence transformer) to create embeddings when you add documents. My question: when we use a LlamaIndex service_context that specifies, for example, the SentenceWindowNodeParser to create the chunks/nodes, does that mean LlamaIndex chunks the documents into nodes, computes the embeddings, and then passes the node embeddings, the original documents, and the LlamaIndex-created nodes to Chroma without Chroma doing any processing of its own? Does the embed_model argument of the LlamaIndex service_context override Chroma's embedding function? Sorry if that sounds convoluted! I'm just not clear on where the work is happening and who is doing it... LOL! 😆
PS. According to the Chroma docs, you're supposed to pass an embedding function when you create or load a collection. Does LlamaIndex manage this on our behalf?
collection = client.get_collection(name="my_collection", embedding_function=emb_fn)  # from the Chroma documentation
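
For reference, here's a minimal sketch of the setup being described, assuming the legacy llama-index 0.9 ServiceContext API (the collection name, data path, and embed model are placeholders):

```python
import chromadb
from llama_index import (
    ServiceContext,
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
)
from llama_index.node_parser import SentenceWindowNodeParser
from llama_index.vector_stores import ChromaVectorStore

# Create the collection WITHOUT an embedding_function: LlamaIndex will
# hand Chroma precomputed vectors, so Chroma never embeds anything itself.
client = chromadb.EphemeralClient()
collection = client.get_or_create_collection("my_collection")

# LlamaIndex owns both chunking (node parsing) and embedding.
service_context = ServiceContext.from_defaults(
    llm=None,  # no LLM is needed just to build the index
    node_parser=SentenceWindowNodeParser.from_defaults(window_size=3),
    embed_model="local:BAAI/bge-small-en-v1.5",  # any embed model works here
)

vector_store = ChromaVectorStore(chroma_collection=collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

documents = SimpleDirectoryReader("./data").load_data()  # placeholder path
index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
    service_context=service_context,
)
```

Note that the collection is created without an embedding_function; as the reply below confirms, LlamaIndex never asks Chroma to embed anything.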
Yea, llama-index is handling all the chunking and embedding; chroma is just storing everything 👍 The embed_model on the service_context is what computes the vectors, so Chroma's own embedding function never runs.
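
To see why Chroma's embedding function never fires, here's a toy sketch of the kind of raw Chroma calls LlamaIndex makes on its behalf: whenever add or query receives precomputed embeddings, Chroma skips its own embedding function and just stores or searches what it was given (the ids, vectors, and metadata below are made-up stand-ins):

```python
import chromadb

client = chromadb.EphemeralClient()
collection = client.get_or_create_collection("my_collection")

# Embeddings are supplied up front, so Chroma's default embedding
# function is never invoked; it simply stores what it is given.
collection.add(
    ids=["node-1"],
    embeddings=[[0.1, 0.2, 0.3]],  # in real use, computed by embed_model
    documents=["the node text"],
    metadatas=[{"window": "surrounding sentences"}],
)

# Queries from LlamaIndex likewise arrive with a precomputed vector.
results = collection.query(query_embeddings=[[0.1, 0.2, 0.3]], n_results=1)
print(results["documents"])
```

So the emb_fn from the Chroma docs only matters if you add raw documents to a collection directly; when LlamaIndex is in front, its embed_model is the only thing computing vectors.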