The community member is experiencing an issue when working with ChromaDB and loading an index from a VectorStoreIndex. They are getting an InvalidDimensionException error, which they suspect arises because the default embedding model ChromaDB used to build the collection does not produce the same embedding dimensions as the model used on the LlamaIndex/OpenAI side.
Other community members suggest that the issue is likely due to a mismatch between the embedding model used in the LlamaIndex library and the one used to create the index. They recommend checking the embedding model used by ChromaDB, which is reportedly the "sentence-transformers/all-MiniLM-L6-v2" model. One community member confirms that setting the HuggingFaceEmbeddings to use this model resolves the issue.
When working with chromadb directly, and loading the index from VectorStoreIndex.from_vector_store(), I get the following error when using the chat_repl()
chromadb.errors.InvalidDimensionException: Embedding dimension 768 does not match collection dimensionality 384
I am using OpenAI as the LLM. I'm assuming this is because when I do chroma_collection.upsert() (via their API), it uses their default embedding model, which doesn't match the dimensions that OpenAI expects?
Yeah, so there's two models in llama-index: the LLM and the embedding model
It looks like whichever embedding model you are using in llama-index is not the same as the embedding model that created the index. These need to be the same
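To see why the mismatch blows up at query time rather than at load time, here is a toy sketch (plain Python, not Chroma's actual implementation): the collection effectively locks in the dimensionality of the first vectors written to it, and every later embedding, including query embeddings, must match.

```python
# Toy illustration of Chroma's dimensionality check: the collection
# fixes its dimensionality from the first vector it receives, then
# rejects any embedding of a different length.
class ToyCollection:
    def __init__(self):
        self.dim = None
        self.vectors = []

    def upsert(self, embedding):
        if self.dim is None:
            self.dim = len(embedding)  # first insert fixes the dimensionality
        elif len(embedding) != self.dim:
            raise ValueError(
                f"Embedding dimension {len(embedding)} does not match "
                f"collection dimensionality {self.dim}"
            )
        self.vectors.append(embedding)


if __name__ == "__main__":
    collection = ToyCollection()
    # Chroma's default model (all-MiniLM-L6-v2) emits 384-dim vectors,
    # so indexing through Chroma's API locks the collection at 384.
    collection.upsert([0.0] * 384)
    try:
        # A different embedding model on the llama-index side emits
        # 768-dim vectors, so the query-time upsert/lookup fails.
        collection.upsert([0.0] * 768)
    except ValueError as e:
        print(e)  # mirrors the InvalidDimensionException message
```

This is why the fix is to make both sides use the same embedding model, not to change the LLM.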
Worked! If anyone else is using chromadb in a different pipeline, be sure to set HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
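For anyone wiring this up end to end, a minimal sketch of pointing LlamaIndex at the same model Chroma used. This is an assumption-laden config fragment, not a verified recipe: `HuggingFaceEmbeddings` is LangChain's class (as in the post above), wrapped for LlamaIndex via `LangchainEmbedding`; import paths here follow older llama-index releases and may differ in your version, and the client path and collection name are hypothetical placeholders.

```python
# Sketch (older llama-index API, paths may differ in your version):
# make the LlamaIndex embedding model match the 384-dim model that
# Chroma's default embedding function used to build the collection.
import chromadb
from langchain.embeddings import HuggingFaceEmbeddings
from llama_index import ServiceContext, VectorStoreIndex
from llama_index.embeddings import LangchainEmbedding
from llama_index.vector_stores import ChromaVectorStore

# Same model as Chroma's default embedding function (384 dims).
embed_model = LangchainEmbedding(
    HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
)
# OpenAI remains the LLM; only the embedding model is overridden.
service_context = ServiceContext.from_defaults(embed_model=embed_model)

client = chromadb.PersistentClient(path="./chroma")        # hypothetical path
collection = client.get_or_create_collection("my_docs")    # hypothetical name
vector_store = ChromaVectorStore(chroma_collection=collection)

index = VectorStoreIndex.from_vector_store(
    vector_store, service_context=service_context
)
index.as_chat_engine().chat_repl()
```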