Find answers from the community

Hoaz
Hi, is there any mechanism in Llama-index to create a BM25 retriever from an already existing ChromaVectorStore object for advanced RAG retrieval? Alternatively, how can I get a docstore object from an existing ChromaVectorStore, or retrieve all nodes from an already existing ChromaVectorStore?
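A possible workaround, since ChromaVectorStore does not expose a docstore: read the records straight out of the underlying Chroma collection, rebuild TextNodes from them, and feed those to BM25Retriever. A minimal untested sketch; the client path and collection name are placeholders:

import chromadb
from llama_index.retrievers import BM25Retriever
from llama_index.schema import TextNode

client = chromadb.PersistentClient(path="./chroma_db")  # placeholder path
collection = client.get_collection("my_collection")      # placeholder name

# Pull every stored record back out of the Chroma collection.
records = collection.get(include=["documents", "metadatas"])

# Rebuild plain TextNodes from the raw documents and metadata.
nodes = [
    TextNode(id_=id_, text=text, metadata=meta or {})
    for id_, text, meta in zip(
        records["ids"], records["documents"], records["metadatas"]
    )
]

# Build a keyword (BM25) retriever over the reconstructed nodes.
bm25_retriever = BM25Retriever.from_defaults(nodes=nodes, similarity_top_k=5)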
6 comments
Hi, I am trying to use PrevNextNodePostprocessor to retrieve more nodes from the same document, but whenever I use this option I get: raise ValueError(f"doc_id {doc_id} not found.") ValueError: doc_id e4e59a11-a8d6-4141-b775-9b60d8af1788 not found.
This is the code I implemented to use this feature:
node_postprocessors = [
    PrevNextNodePostprocessor(
        docstore=storage_context.docstore,
        num_nodes=2,
        mode="both",
    )
]
At first I thought that maybe some retrieved nodes are the first or last chunk of a document, so there is no next/prev node related to them, but it seems this is not the main problem.
Any idea what I am doing wrong?
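One likely cause: PrevNextNodePostprocessor can only walk prev/next relationships for nodes that are actually present in the docstore, and an index built purely on a vector store can leave storage_context.docstore empty. A minimal sketch, assuming nodes is the same parsed-node list the index was built from:

from llama_index import StorageContext
from llama_index.postprocessor import PrevNextNodePostprocessor
from llama_index.storage.docstore import SimpleDocumentStore

# Register the parsed nodes so their prev/next links can be resolved.
docstore = SimpleDocumentStore()
docstore.add_documents(nodes)

storage_context = StorageContext.from_defaults(docstore=docstore)
node_postprocessors = [
    PrevNextNodePostprocessor(docstore=docstore, num_nodes=2, mode="both")
]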
2 comments
Hoaz
·

How can i

5 comments
Hi, I want to index a corpus of data and store it directly into a ChromaDB instance.
But this code only generates a storage folder and then stores the index into a file instead of the ChromaDB vector store.
Can anyone help?

chromadb_vs = ChromaVectorStore(chroma_collection=chromdb_collection)

print("INFO: Initializing the Service Context")
service_context = ServiceContext.from_defaults(llm=llm, embed_model="local")

print("INFO: Creating Vector Store index object")
index = VectorStoreIndex.from_documents(
    documents=documents,
    vector_store=chromadb_vs,
    service_context=service_context,
    show_progress=True,
)

print("INFO: Writing to disk as persistance")
index.vector_store.persist()
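A likely fix: from_documents() does not take the vector store directly, so the argument has no effect and the index falls back to the default JSON-file store. The documented pattern wraps the vector store in a StorageContext; a sketch reusing the variables above:

from llama_index import StorageContext, VectorStoreIndex

# Route embeddings into Chroma instead of the default SimpleVectorStore.
storage_context = StorageContext.from_defaults(vector_store=chromadb_vs)

index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
    service_context=service_context,
    show_progress=True,
)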
6 comments
🆘 HELP!! Does anyone have experience with non-Latin documents and data in llama-index, especially the Arabic alphabet? Are llama-index's default tokenizer and embeddings a good fit for Arabic documents? Any ideas or experience in this field?
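The defaults lean English-centric, so they may underperform on Arabic; one common approach is to swap in a multilingual embedding model with Arabic coverage. An untested sketch; the model name is only an example:

from llama_index import ServiceContext
from llama_index.embeddings import HuggingFaceEmbedding

# Any multilingual sentence-embedding model with Arabic support could go here.
embed_model = HuggingFaceEmbedding(model_name="intfloat/multilingual-e5-large")
service_context = ServiceContext.from_defaults(embed_model=embed_model)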
4 comments
Hi, I have indexed a large amount of data with llama-index and stored it on disk. The problem is that the default_vector_store.json file is bigger than 6 GB. Now, each time I want to load the data into a storage_context and create a new VectorStoreIndex to run queries, it takes more than half an hour just to load. Any ideas?
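One way out is to index into an on-disk vector database once and then reconnect to it, instead of reloading a multi-gigabyte default_vector_store.json on every run. A sketch using Chroma; the path and collection name are placeholders:

import chromadb
from llama_index import StorageContext, VectorStoreIndex
from llama_index.vector_stores import ChromaVectorStore

client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection("my_corpus")
vector_store = ChromaVectorStore(chroma_collection=collection)

# One-time build: writes embeddings into the Chroma collection.
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

# Later runs: reconnect without re-parsing a giant JSON file.
index = VectorStoreIndex.from_vector_store(vector_store)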
20 comments
Hoaz
·

Metadata

Hi everyone. I need a pre-retrieval, metadata-aware conditional query in the Chromadb vector store. Does llama-index provide any utility to specify metadata-conditioned retrieval? I can do some post-retrieval node processing, but at that stage any metadata conditioning drops the node count, sometimes to 0.
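llama-index does support pre-retrieval filtering here: MetadataFilters passed to the retriever are pushed down into Chroma's where clause, so the candidate set is narrowed before similarity search rather than after it. A minimal sketch; the key/value pair is a placeholder:

from llama_index.vector_stores.types import ExactMatchFilter, MetadataFilters

filters = MetadataFilters(
    filters=[ExactMatchFilter(key="category", value="finance")]
)

# Filtering happens inside the vector store query, not as a post-step,
# so similarity_top_k nodes are still returned when enough matches exist.
retriever = index.as_retriever(similarity_top_k=5, filters=filters)
query_engine = index.as_query_engine(filters=filters)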
2 comments
Hi, how can I set up a DocumentSummaryIndex using a custom LLM instead of the OpenAI default? I have set Mistral as my LLM in the service context, but it still makes some requests to OpenAI and I hit the OpenAI rate limit. BTW, my embed_model is also a custom embed model.
llm_model_name = "Mistralai/Mistral-7B-Instruct-v0.2"
llm = HuggingFaceLLM(model_name=llm_model_name)
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
What am I missing to make it work with my own LLM model rather than the OpenAI GPTs?
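A likely cause: DocumentSummaryIndex generates its per-document summaries through a response synthesizer, which defaults to OpenAI unless it is built from your service context. A sketch under that assumption:

from llama_index import DocumentSummaryIndex, get_response_synthesizer

# Build the summarizer from the same service context so summary
# generation also goes through the local Mistral model.
response_synthesizer = get_response_synthesizer(
    service_context=service_context,
    response_mode="tree_summarize",
)

index = DocumentSummaryIndex.from_documents(
    documents,
    service_context=service_context,
    response_synthesizer=response_synthesizer,
)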
1 comment
Does a node's metadata have any effect on the regular query method?
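Short answer: yes, insofar as metadata is injected into the text each node presents to the embedding model and the LLM, so it can shift both retrieval scores and answer synthesis. A small sketch of the exclusion knobs; the metadata values are placeholders:

from llama_index.schema import TextNode

node = TextNode(
    text="some chunk text",
    metadata={"file_name": "report.pdf"},
)

# Hide a key from the embedding input and/or the LLM prompt.
node.excluded_embed_metadata_keys = ["file_name"]
node.excluded_llm_metadata_keys = ["file_name"]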
1 comment