Updated 3 months ago

Hi, is there any implementation

Hi, is there any mechanism in LlamaIndex to create a BM25 retriever from an existing ChromaVectorStore object for advanced RAG retrieval? Alternatively, how can I get a docstore object from an existing ChromaVectorStore, or retrieve all of its nodes?
6 comments
You should be able to get the text of all existing nodes from a ChromaVectorStore object like this:
chroma_vector_store.client.get()["documents"]
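For a large store, loading everything with a single `get()` call may not fit in memory. A minimal sketch of paging through the collection instead, assuming that `chroma_vector_store.client` is the underlying Chroma collection (as in the line above) and that its `get` method accepts `limit`, `offset`, and `include`. The helper name `iter_chroma_documents` and the batch size are illustrative, not part of either library:

```python
from typing import Any, Dict, Iterator, Optional, Tuple


def iter_chroma_documents(
    collection: Any, batch_size: int = 1000
) -> Iterator[Tuple[str, Optional[str], Optional[Dict[str, Any]]]]:
    """Yield (id, text, metadata) for every record, one page at a time.

    `collection` is assumed to behave like a Chroma collection: its `get`
    returns a dict with "ids", "documents", and "metadatas" lists.
    """
    offset = 0
    while True:
        page = collection.get(
            limit=batch_size,
            offset=offset,
            include=["documents", "metadatas"],
        )
        ids = page["ids"]
        if not ids:
            break  # no more records
        # Either list may be missing; pad with None so zip stays aligned.
        documents = page.get("documents") or [None] * len(ids)
        metadatas = page.get("metadatas") or [None] * len(ids)
        yield from zip(ids, documents, metadatas)
        offset += len(ids)
```

Usage would look something like `for node_id, text, metadata in iter_chroma_documents(chroma_vector_store.client): ...`. Each tuple could then be wrapped in a node object before being handed to the BM25 retriever.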
Thanks @Rohan. One more question: I have a relatively large ChromaDB file (163 GB). Following your suggestion, I tried to create a new BM25 retriever from the text in that ChromaDB, but building the retriever takes too long. Is that normal given the size of the ChromaDB file?
I haven't worked with a corpus that big, but the retriever tokenizes the nodes one by one, which is probably why it takes so long.
Is there any way to prepare the BM25 retriever data and store it on disk, so that next time it won't take as long as the first time?
I'm not exactly sure whether the corpus is persisted or the nodes are re-tokenized on every update. If only the new nodes are processed on an update, then it won't take as long as the first run.
Thanks, I will test that with a small corpus.
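On the persistence question: recent releases of the llama-index-retrievers-bm25 package appear to offer `persist` and `from_persist_dir` on `BM25Retriever`, which is worth checking against the installed version. Independently of that, the general idea of paying the tokenization cost only once can be sketched with the standard library alone. The whitespace tokenizer, the helper name `load_or_build_tokens`, and the pickle cache file below are assumptions made for illustration, not LlamaIndex's own mechanism:

```python
import pickle
from pathlib import Path
from typing import Iterable, List


def tokenize(text: str) -> List[str]:
    """Very simple stand-in tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def load_or_build_tokens(texts: Iterable[str], cache_path: str) -> List[List[str]]:
    """Return the tokenized corpus, reusing the on-disk cache when it exists."""
    path = Path(cache_path)
    if path.exists():
        # Later runs: skip tokenization and load the stored result.
        with path.open("rb") as f:
            return pickle.load(f)
    # First run: tokenize everything, then store the result for next time.
    tokens = [tokenize(t) for t in texts]
    with path.open("wb") as f:
        pickle.dump(tokens, f)
    return tokens
```

Note that this sketch never invalidates the cache: if the corpus changes, the cache file has to be deleted or rebuilt. Pickle files should also only be loaded from a trusted location.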