Updated 4 months ago

Crash

At a glance

The community member is trying to use the BM25Retriever with Chroma, following the documentation, but their Jupyter Lab kernel keeps dying and restarting when they try to execute the VectorStoreIndex line. They are using a small PDF, so they don't think it's a memory overflow issue. The other community members suggest that the issue might be on Chroma's side, as they have faced similar issues with Chroma's stability and performance. One community member recommends switching to Qdrant instead of Chroma.

I'm trying to use BM25Retriever with Chroma, and this is the documentation I'm following: https://docs.llamaindex.ai/en/stable/examples/retrievers/bm25_retriever/#hybrid-retriever-with-bm25-chroma
When I try to execute this line: index = VectorStoreIndex(nodes=nodes, storage_context=storage_context), my Jupyter Lab kernel always dies and restarts.
I'm only using a very small PDF, so there shouldn't be any memory overflow issues.
Do you have any ideas of what's going on? Thanks!
6 comments
Are you sure it's small? 😅 What is len(nodes)?
print(len(nodes)) = 363
print(sum(len(str(node)) for node in nodes)) = 62013
This is just from 1 PDF with 13 pages.
It has 363 nodes because I'm using Sentence-Window Retrieval, which takes each sentence as a node unit (and records its surrounding 3 sentences as context in the metadata).
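To make the node count above easier to picture, here is a minimal pure-Python sketch of the sentence-window idea. It is not LlamaIndex's own parser: the function name `sentence_window_nodes`, the naive regex sentence splitter, and the reading of "surrounding 3 sentences" as up to 3 sentences on each side are all assumptions made for illustration.

```python
import re


def sentence_window_nodes(text, window_size=3):
    """Build one node per sentence, keeping neighbouring sentences as context.

    Hypothetical helper for illustration only; a real parser would use a
    more robust sentence splitter.
    """
    # Naive split: break on whitespace that follows '.', '!' or '?'.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    nodes = []
    for i, sentence in enumerate(sentences):
        # Clamp the window to the bounds of the document.
        start = max(0, i - window_size)
        end = min(len(sentences), i + window_size + 1)
        nodes.append(
            {
                "text": sentence,  # the unit that gets embedded
                "metadata": {"window": " ".join(sentences[start:end])},
            }
        )
    return nodes


sample = "One. Two. Three. Four. Five."
nodes = sentence_window_nodes(sample, window_size=1)
print(len(nodes))
print(nodes[2]["metadata"]["window"])
```

Because every sentence becomes its own node, a short document can still produce hundreds of nodes, and each node stores its neighbours a second time in the metadata.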
oh i faced this too... change vectorDB to Qdrant.

I've been facing a lot of issues with ChromaDB, whether it's data ingestion or vector search. I think their latest releases aren't stable.
It's on Chroma's side.
ah that's unfortunate...thanks for letting me know!
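A kernel that dies in Jupyter shows no traceback, so one way to check whether the failure is a native-level crash (which would fit the "Chroma's side" suggestion, though the thread does not confirm it) is to run the failing step in a separate Python process with `faulthandler` enabled. The sketch below uses only the standard library; `run_isolated` is a hypothetical helper name, and `os.abort()` is just a stand-in for the real indexing code.

```python
import subprocess
import sys


def run_isolated(code, timeout=300):
    """Run a Python snippet in a fresh interpreter and report how it exited.

    Hypothetical helper: returns (returncode, stderr). A nonzero return code
    with no ordinary Python traceback suggests the process was killed by a
    native crash rather than by a Python exception.
    """
    result = subprocess.run(
        # -X faulthandler makes CPython dump a traceback on fatal signals.
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode, result.stderr


# Stand-in for the failing step; replace with a script that builds the index.
crash_code = "import os; os.abort()"
returncode, stderr = run_isolated(crash_code)
print("exit code:", returncode)
print(stderr)
```

If the real indexing script exits this way, the text captured from stderr usually shows which call was running when the process died, which is more useful to include in a bug report than "the kernel restarted".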