Updated 3 months ago

Error adding to a collection in ChromaDB:


collection_name = "name"
vector_store = ChromaVectorStore(chroma_collection=collection_name)

storage_context = StorageContext.from_defaults(vector_store=vector_store)

raw_index = VectorStoreIndex.from_documents(
    parsed_docs,
    storage_context=storage_context,
    embed_model=Settings.embed_model
)




---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-41-79eec0778777> in <cell line: 7>()
      5 storage_context = StorageContext.from_defaults(vector_store=vector_store)
      6 
----> 7 raw_index = VectorStoreIndex.from_documents(
      8     parsed_docs,
      9     storage_context=storage_context,

6 frames
/usr/local/lib/python3.10/dist-packages/llama_index/vector_stores/chroma/base.py in add(self, nodes, **add_kwargs)
    263             documents.append(node.get_content(metadata_mode=MetadataMode.NONE))
    264 
--> 265         self._collection.add(
    266             embeddings=embeddings,
    267             ids=ids,

AttributeError: 'str' object has no attribute 'add'
19 comments
This isn't how you use chroma
Plain Text
db = chromadb.PersistentClient(path="./chroma_db")
chroma_collection = db.get_or_create_collection("quickstart")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
is one example
many on this page
Thanks, Logan.
I am following the llama-parse example at https://github.com/run-llama/llama_parse/blob/main/examples/demo_advanced.ipynb and building a raw_index and a recursive_index. I was able to build the indices in ChromaDB; however, how do I load them from disk? Here's the example I am referring to:

Plain Text
# save to disk

db = chromadb.PersistentClient(path="./chroma_db")
chroma_collection = db.get_or_create_collection("quickstart")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context, embed_model=embed_model
)

# load from disk
db2 = chromadb.PersistentClient(path="./chroma_db")
chroma_collection = db2.get_or_create_collection("quickstart")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
index = VectorStoreIndex.from_vector_store(
    vector_store,
    embed_model=embed_model,
)

# Query Data from the persisted index
query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
display(Markdown(f"{response}"))
My code:

Plain Text
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

raw_index = VectorStoreIndex.from_documents(
    parsed_docs,
    storage_context=storage_context,
    embed_model=Settings.embed_model
)

recursive_index = VectorStoreIndex(
    nodes=base_nodes + objects,
    storage_context=storage_context,
    embed_model=Settings.embed_model
)

I am trying to load the raw and recursive separately, and not sure where to specify in VectorStoreIndex.from_vector_store
The example shows the loading

You'd just create two vector store objects, one for each index, and call VectorStoreIndex.from_vector_store(vector_store)
Plain Text
# load from disk
db2 = chromadb.PersistentClient(path="./chroma_db")
chroma_collection = db2.get_or_create_collection("quickstart")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
index = VectorStoreIndex.from_vector_store(
    vector_store,
    embed_model=embed_model,
)
in the same collection or a different collection?
it would be a collection per index
ah ok, makes sense
So, I am getting an AssertionError when attempting to run: response_1 = raw_query_engine.query(query)
Code:

Plain Text
from llama_index.postprocessor.flag_embedding_reranker import (
    FlagEmbeddingReranker,
)

llm = MistralAI(
                model="mistral-small-latest",
                api_key=userdata.get('MISTRAL_API_KEY')
               )


reranker = FlagEmbeddingReranker(
    top_n=5,
    model="sentence-transformers/all-MiniLM-L6-v2",
)

raw_query_engine = raw_index.as_query_engine(
                                              similarity_top_k=15,
                                              node_postprocessors=[reranker],
                                              llm=llm
                                            )

recursive_query_engine = recursive_index.as_query_engine(
                                                          similarity_top_k=15,
                                                          node_postprocessors=[reranker],
                                                          verbose=True,
                                                          llm=llm
                                                        )
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at sentence-transformers/all-MiniLM-L6-v2 and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Error:

Plain Text
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-48-4badad8cf032> in <cell line: 3>()
      1 query = "What is the  Section 8 Rent Income in March 2023 at The Tillicum Apartments?"
      2 
----> 3 response_1 = raw_query_engine.query(query)
      4 print("\n***********New LlamaParse+ Basic Query Engine***********")
      5 print(response_1)

7 frames
/usr/local/lib/python3.10/dist-packages/llama_index/postprocessor/flag_embedding_reranker/base.py in _postprocess_nodes(self, nodes, query_bundle)
     71                 scores = [scores]
     72 
---> 73             assert len(scores) == len(nodes)
     74 
     75             for node, score in zip(nodes, scores):

AssertionError: 
Hmm, I don't think the flag embedding reranker is meant to be used with that model?
ah that's right.