Find answers from the community

Hoaz
Hi, is there any mechanism in Llama-index to create a BM25 retriever from an already existing ChromaVectorStore object for advanced RAG retrieval? Alternatively, how can I get a docstore object from an existing ChromaVectorStore, or retrieve all nodes from an already existing ChromaVectorStore?
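A possible workaround, since ChromaVectorStore does not expose a docstore: read the records straight out of the underlying Chroma collection, rebuild TextNodes from them, and feed those to BM25Retriever. A minimal untested sketch; the client path and collection name are placeholders:

import chromadb
from llama_index.retrievers import BM25Retriever
from llama_index.schema import TextNode

client = chromadb.PersistentClient(path="./chroma_db")  # placeholder path
collection = client.get_collection("my_collection")      # placeholder name

# Pull every stored record back out of the Chroma collection.
records = collection.get(include=["documents", "metadatas"])

# Rebuild plain TextNodes from the raw documents and metadata.
nodes = [
    TextNode(id_=id_, text=text, metadata=meta or {})
    for id_, text, meta in zip(
        records["ids"], records["documents"], records["metadatas"]
    )
]

# Build a keyword (BM25) retriever over the reconstructed nodes.
bm25_retriever = BM25Retriever.from_defaults(nodes=nodes, similarity_top_k=5)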
6 comments
Hi, I am trying to use PrevNextNodePostprocessor to retrieve more nodes from the same document, but whenever I use this option I get: raise ValueError(f"doc_id {doc_id} not found.") ValueError: doc_id e4e59a11-a8d6-4141-b775-9b60d8af1788 not found.
This is the code I implemented to use this feature:
node_postprocessors = [
    PrevNextNodePostprocessor(
        docstore=storage_context.docstore,
        num_nodes=2,
        mode="both",
    )
]
At first I thought that maybe some retrieved nodes are the first or last chunk of a document, so there is no next/prev node related to them, but it seems this is not the main problem.
Any idea what I am doing wrong?
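One likely cause: PrevNextNodePostprocessor can only walk prev/next relationships for nodes that are actually present in the docstore, and an index built purely on a vector store can leave storage_context.docstore empty. A minimal sketch, assuming nodes is the same parsed-node list the index was built from:

from llama_index import StorageContext
from llama_index.postprocessor import PrevNextNodePostprocessor
from llama_index.storage.docstore import SimpleDocumentStore

# Register the parsed nodes so their prev/next links can be resolved.
docstore = SimpleDocumentStore()
docstore.add_documents(nodes)

storage_context = StorageContext.from_defaults(docstore=docstore)
node_postprocessors = [
    PrevNextNodePostprocessor(docstore=docstore, num_nodes=2, mode="both")
]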
2 comments
Hoaz
·

How can i

5 comments
Hi, I want to index a corpus of data and store it directly into a ChromaDB instance.
But this code only generates a storage folder and then stores the index into a file instead of the ChromaDB vector store.
Can anyone help?

chromadb_vs = ChromaVectorStore(chroma_collection=chromdb_collection)

print("INFO: Initializing the Service Context")
service_context = ServiceContext.from_defaults(llm=llm, embed_model="local")

print("INFO: Creating Vector Store index object")
index = VectorStoreIndex.from_documents(
    documents=documents,
    vector_store=chromadb_vs,
    service_context=service_context,
    show_progress=True,
)

print("INFO: Writing to disk as persistance")
index.vector_store.persist()
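A likely fix: from_documents() does not take the vector store directly, so the argument has no effect and the index falls back to the default JSON-file store. The documented pattern wraps the vector store in a StorageContext; a sketch reusing the variables above:

from llama_index import StorageContext, VectorStoreIndex

# Route embeddings into Chroma instead of the default SimpleVectorStore.
storage_context = StorageContext.from_defaults(vector_store=chromadb_vs)

index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
    service_context=service_context,
    show_progress=True,
)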
6 comments
🆘 HELP!! Does anyone have experience with non-Latin documents and data in llama-index, especially the Arabic alphabet? Are llama-index's default tokenizer and embeddings a good fit for Arabic documents? Any ideas or experience in this field?
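The defaults lean English-centric, so they may underperform on Arabic; one common approach is to swap in a multilingual embedding model with Arabic coverage. An untested sketch; the model name is only an example:

from llama_index import ServiceContext
from llama_index.embeddings import HuggingFaceEmbedding

# Any multilingual sentence-embedding model with Arabic support could go here.
embed_model = HuggingFaceEmbedding(model_name="intfloat/multilingual-e5-large")
service_context = ServiceContext.from_defaults(embed_model=embed_model)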
4 comments
Hi, I have indexed a large amount of data with llama-index and stored it on disk. The problem is that the default_vector_store.json file is bigger than 6 GB. Now, each time I want to load the data into a storage_context and create a new VectorStoreIndex to run queries, it takes more than half an hour just to load. Any ideas?
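One way out is to index into an on-disk vector database once and then reconnect to it, instead of reloading a multi-gigabyte default_vector_store.json on every run. A sketch using Chroma; the path and collection name are placeholders:

import chromadb
from llama_index import StorageContext, VectorStoreIndex
from llama_index.vector_stores import ChromaVectorStore

client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection("my_corpus")
vector_store = ChromaVectorStore(chroma_collection=collection)

# One-time build: writes embeddings into the Chroma collection.
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

# Later runs: reconnect without re-parsing a giant JSON file.
index = VectorStoreIndex.from_vector_store(vector_store)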
20 comments
Hoaz
·

Metadata

Hi everyone. I need a pre-retrieval, metadata-aware conditional query in the Chromadb vector store. Does llama-index provide any utility to specify metadata-conditioned retrieval? I can do some post-retrieval node processing, but at that stage any metadata conditioning drops the node count, sometimes to 0.
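llama-index does support pre-retrieval filtering here: MetadataFilters passed to the retriever are pushed down into Chroma's where clause, so the candidate set is narrowed before similarity search rather than after it. A minimal sketch; the key/value pair is a placeholder:

from llama_index.vector_stores.types import ExactMatchFilter, MetadataFilters

filters = MetadataFilters(
    filters=[ExactMatchFilter(key="category", value="finance")]
)

# Filtering happens inside the vector store query, not as a post-step,
# so similarity_top_k nodes are still returned when enough matches exist.
retriever = index.as_retriever(similarity_top_k=5, filters=filters)
query_engine = index.as_query_engine(filters=filters)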
2 comments
Hi, how can I set up a DocumentSummaryIndex using a custom LLM instead of the OpenAI default? I have set Mistral as my LLM in the service context, but it still makes some requests to OpenAI and I hit the OpenAI rate limit. BTW, my embed_model is also a custom embed model.
llm_model_name = "Mistralai/Mistral-7B-Instruct-v0.2"
llm = HuggingFaceLLM(model_name=llm_model_name)
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
What am I missing to make it work with my own LLM model rather than the OpenAI GPTs?
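A likely cause: DocumentSummaryIndex generates its per-document summaries through a response synthesizer, which defaults to OpenAI unless it is built from your service context. A sketch under that assumption:

from llama_index import DocumentSummaryIndex, get_response_synthesizer

# Build the summarizer from the same service context so summary
# generation also goes through the local Mistral model.
response_synthesizer = get_response_synthesizer(
    service_context=service_context,
    response_mode="tree_summarize",
)

index = DocumentSummaryIndex.from_documents(
    documents,
    service_context=service_context,
    response_synthesizer=response_synthesizer,
)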
1 comment
Does a node's metadata have any effect on the regular query method?
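Short answer: yes, insofar as metadata is injected into the text each node presents to the embedding model and the LLM, so it can shift both retrieval scores and answer synthesis. A small sketch of the exclusion knobs; the metadata values are placeholders:

from llama_index.schema import TextNode

node = TextNode(
    text="some chunk text",
    metadata={"file_name": "report.pdf"},
)

# Hide a key from the embedding input and/or the LLM prompt.
node.excluded_embed_metadata_keys = ["file_name"]
node.excluded_llm_metadata_keys = ["file_name"]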
1 comment