Find answers from the community

sysfor
Offline, last seen 4 weeks ago
Joined September 25, 2024
I'm having an issue reading from a local SimpleVectorStore and can't figure it out. The query is an exact replica of a question in the ingested docs. It works fine in the Qdrant setup I have (using other code, obviously, but the same concepts).

(code in thread)
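
A minimal sketch (not the code from the thread) of persisting and reloading an index backed by the default SimpleVectorStore; the data directory, persist path, and query string are assumptions, and the key point is that the same embed model has to be in effect at build time and at query time.

Plain Text
# Hedged sketch, not the thread's code; paths and data are placeholders.
from llama_index.core import (
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
)

# build and persist (SimpleVectorStore is the default when no vector_store is passed)
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)
index.storage_context.persist(persist_dir="./storage")

# reload from disk later and query with the same embed model active
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)
print(index.as_query_engine().query("a question copied verbatim from the ingested docs"))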
6 comments
With 0.10.7, is pip3 install llama-index-vector-stores-chroma all that's required for from llama_index.vector_stores.chroma import ChromaVectorStore? It's installed, but it's the only import that isn't resolving after migrating to 0.10.7. Curious if there is something else I need to do here?

Plain Text
Requirement already satisfied: llama-index-vector-stores-chroma in ./env/lib/python3.10/site-packages (0.1.2)

Requirement already satisfied: llama-index-core<0.11.0,>=0.10.1 in ./env/lib/python3.10/site-packages (from llama-index-vector-stores-chroma) (0.10.9)
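
For reference, a hedged sanity check of the 0.10-style import once the integration package is installed; the in-memory client and collection name below are placeholders, and if the import still fails the usual suspects (an assumption here) are a leftover pre-0.10 llama-index install in the same environment or an editor/language server pointed at a different interpreter.

Plain Text
# Hedged smoke test for the namespaced 0.10.x import; names are placeholders.
import chromadb
from llama_index.vector_stores.chroma import ChromaVectorStore

client = chromadb.EphemeralClient()                        # in-memory, fine for a quick check
collection = client.get_or_create_collection("smoke_test")
vector_store = ChromaVectorStore(chroma_collection=collection)
print(type(vector_store))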
2 comments
sysfor

Error

I'm using MistralAIEmbedding along with ChromaDB as a vector store.

Getting the following error now (Mistral has a dimensionality of 1024): chromadb.errors.InvalidDimensionException: Embedding dimension 1536 does not match collection dimensionality 1024

For example, using faiss I can specify the dimensionality:

d = 1024
faiss_index = faiss.IndexFlatL2(d)

d = property(_swigfaiss.Index_d_get, _swigfaiss.Index_d_set, doc=r""" vector dimension""")

^ I don't see anything like this in Chroma. Is this possible to do using ChromaVectorStore?
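
Not as a constructor argument, to my knowledge: Chroma fixes a collection's dimensionality from the first embeddings written to it rather than taking an explicit d like faiss.IndexFlatL2(d), so the 1536 in the error suggests a default (OpenAI-sized) embedding is sneaking in on the ingest or query path while the collection was built at Mistral's 1024. A hedged sketch of keeping everything at 1024 by passing the Mistral embed model explicitly; the collection name, paths, and data loading are assumptions.

Plain Text
# Hedged sketch: one embed model end to end so ingest and query dimensions agree.
import chromadb
from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.embeddings.mistralai import MistralAIEmbedding
from llama_index.vector_stores.chroma import ChromaVectorStore

embed_model = MistralAIEmbedding(model_name="mistral-embed", api_key="...")

client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection("docs_mistral")   # becomes 1024-dim on first write
vector_store = ChromaVectorStore(chroma_collection=collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context, embed_model=embed_model
)
# the index keeps the embed model, so queries are also embedded at 1024 dims
query_engine = index.as_query_engine()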
6 comments
When setting up my vector database (Qdrant) I didn't use enable_hybrid=True, and I would like to compare the differences in quality. Is there a way to do this after the fact with LlamaIndex, or would I need to rebuild the entire vector DB from scratch?
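
To my understanding, hybrid mode needs sparse vectors that the Qdrant integration only generates at ingest time, so an existing dense-only collection can't simply be toggled; re-ingesting into a second, hybrid-enabled collection lets the two be compared side by side. A hedged sketch, where the collection name, URL, data loading, and top-k values are all assumptions:

Plain Text
# Hedged sketch: new hybrid collection alongside the existing dense-only one.
import qdrant_client
from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.vector_stores.qdrant import QdrantVectorStore

client = qdrant_client.QdrantClient(url="http://localhost:6333")

hybrid_store = QdrantVectorStore(
    client=client,
    collection_name="docs_hybrid",   # fresh collection; sparse vectors are created at ingest
    enable_hybrid=True,              # uses a default sparse encoder (fastembed), to my knowledge
)
storage_context = StorageContext.from_defaults(vector_store=hybrid_store)

documents = SimpleDirectoryReader("./data").load_data()
hybrid_index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

hybrid_engine = hybrid_index.as_query_engine(
    vector_store_query_mode="hybrid",
    similarity_top_k=5,
    sparse_top_k=10,
)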
5 comments
Are there some methods for speeding up synthesize? The LLM calls within synthesize are the most time-consuming part of my RAG pipeline when looking at tracing via Phoenix.
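
A hedged sketch of the usual levers, with the caveat that which one helps depends on the synthesizer actually in use: refine and tree_summarize can issue an LLM call per retrieved chunk, while compact packs chunks into as few calls as possible; a smaller similarity_top_k means less text to synthesize over; and streaming at least hides latency. The index construction below is only a stand-in for the real pipeline.

Plain Text
# Hedged sketch: fewer / larger LLM calls during synthesis.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, get_response_synthesizer

index = VectorStoreIndex.from_documents(SimpleDirectoryReader("./data").load_data())

synth = get_response_synthesizer(response_mode="compact", streaming=True)
query_engine = index.as_query_engine(
    response_synthesizer=synth,
    similarity_top_k=3,   # fewer nodes -> less context to synthesize over
)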
1 comment
Where is the correct place to print out the final prompt that's being sent to the LLM? Would I need to leverage the callback manager in the RetrieverQueryEngine when assigning the variable vector_query_engine? Any good examples?

Plain Text
    
def _query_index(self, query_engine: RetrieverQueryEngine, query: str) -> RESPONSE_TYPE:
    embedded_query = Settings.embed_model.get_text_embedding(query)
    response = query_engine.query(QueryBundle(query_str=query, embedding=embedded_query))

    return response

def _create_query_engine(self) -> RetrieverQueryEngine:
    vector_index = VectorStoreIndex.from_vector_store(vector_store=self.vector_store,
                                                      embed_model=Settings.embed_model)

    vector_retriever = VectorIndexRetriever(index=vector_index, similarity_top_k=5)

    vector_query_engine = RetrieverQueryEngine(
        retriever=vector_retriever,
        response_synthesizer=self.response_synthesizer,
        node_postprocessors=[SimilarityPostprocessor(similarity_cutoff=0.50)],
    )

    vector_query_engine.update_prompts({"response_synthesizer:text_qa_template": self.qa_prompt_tmpl})

    return vector_query_engine

def query_rag(self, query: str) -> Dict[str, Any]:
    vector_query_engine = self._create_query_engine()

    response = self._query_index(query_engine=vector_query_engine, query=query)
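
For reference, a hedged sketch of the built-in callback hooks that record the fully formatted prompt without custom plumbing inside _create_query_engine; whether either fits this class's setup is an assumption.

Plain Text
# Hedged sketch: global hooks that capture every LLM input (the final prompt).
import llama_index.core
from llama_index.core import Settings
from llama_index.core.callbacks import CallbackManager, CBEventType, LlamaDebugHandler

# Option 1: print every LLM prompt and completion to stdout.
llama_index.core.set_global_handler("simple")

# Option 2: record events and inspect them after query_rag(...) has run.
llama_debug = LlamaDebugHandler(print_trace_on_end=True)
Settings.callback_manager = CallbackManager([llama_debug])
# ...run the query, then:
# for start_event, end_event in llama_debug.get_event_pairs(CBEventType.LLM):
#     print(start_event.payload)   # includes the formatted prompt/messages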
5 comments
Is there a benefit to using one of these ways of getting the response over the other?

response = engine.query(query)

response = engine.query(QueryBundle(query_str=query, embedding=embedded_query))

Plain Text
    def query_index(self, query_engines: List[BaseQueryEngine], queries: List[str]):
        for query in queries:
            embedded_query = Settings.embed_model.get_text_embedding(query)
            for engine in query_engines:
                response = engine.query(query)
                <...or...>
                response = engine.query(QueryBundle(query_str=query, 
                                                    embedding=embedded_query))

Plain Text
    vector_index = VectorStoreIndex.from_vector_store(vector_store=rag.vector_store, 
                                               embed_model=Settings.embed_model)

    query_engine0 = vector_index.as_query_engine(llm=Settings.llm,
                                         similarity_top_k=15, 
                                         node_postprocessors=[
                                                SimilarityPostprocessor(similarity_cutoff=0.60), 
                                                cohere_rerank
                                            ]
                                        )
3 comments
What would be the best way to extract keywords from a query string? Would it be best to call a fast LLM (~7B), or is there something native inside LlamaIndex to do this? I would like to extract keywords from the query and then use a node_postprocessor to look for said keywords in node metadata. Not sure how practical this is, but I find some instances in my testing where I think this would be useful.
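
A hedged sketch of an LLM-free option: the regex/stopword helper behind the keyword-table index can pull keywords from the query string, and KeywordNodePostprocessor can then require them in retrieved nodes. Two caveats, both assumptions worth checking: the postprocessor matches against node text rather than metadata specifically (and may need spacy installed), so filtering strictly on metadata would likely mean a small custom postprocessor.

Plain Text
# Hedged sketch: keyword extraction without an extra LLM call.
from llama_index.core.indices.keyword_table.utils import simple_extract_keywords
from llama_index.core.postprocessor import KeywordNodePostprocessor

query = "example user question about qdrant hybrid search"
keywords = list(simple_extract_keywords(query, max_keywords=5))
print(keywords)

postprocessor = KeywordNodePostprocessor(required_keywords=keywords)
# then pass node_postprocessors=[postprocessor] when building the query engine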
3 comments
sysfor

Rag

I'm less than thrilled with my RAG results and looking to see if anyone has some suggested reads they found useful around metrics, root causing, etc. I am reading this at the moment - https://blog.llamaindex.ai/evaluating-the-ideal-chunk-size-for-a-rag-system-using-llamaindex-6207e5d3fec5 - which has good information, but admittedly I'm not a RAG expert, so there could be much better reads I am overlooking. The RAG implementation is basically: I scraped a bunch of websites related to a topic. When asking it questions - ones where I know the data exists, and in some cases using the exact title from the metadata - it's not finding them and instead returns stuff from other, unrelated blog texts.
12 comments
sysfor

Service context

Curious if it's possible to use Mistral for the service context? If I replace it with OpenAI it works just fine.

Plain Text
llm = MistralAI(model="mistral-tiny", 
                api_key=MISTRAL_API_KEY)
....
print(type(llm))
<class 'llama_index.llms.mistral.MistralAI'>
....

service_context = ServiceContext.from_defaults(llm=llm,
                                    embed_model=embed_model,
                                    system_prompt=SPF)
....
index = VectorStoreIndex.from_documents(
    documents=[web_docs, pdf_docs],
    service_context=service_context,
    storage_context=storage_context,
    show_progress=True
)


It's throwing an OpenAI error. Appears to be trying to default to OpenAI: Could not load OpenAI model. If you intended to use OpenAI, please check your OPENAI_API_KEY.
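
For comparison, a hedged sketch of a fully explicit setup: the usual cause of that message is some component (often the embed model or an LLM default) still resolving to OpenAI, so pinning both the LLM and the embedding model to Mistral rules that out. The import paths below are the 0.10.x packages, names like MISTRAL_API_KEY, SPF, web_docs, pdf_docs, and storage_context are reused from the snippet above, and flattening the two document lists is an assumption about what they hold.

Plain Text
# Hedged sketch reusing names from the snippet above; 0.10.x import paths.
from llama_index.core import ServiceContext, VectorStoreIndex
from llama_index.embeddings.mistralai import MistralAIEmbedding
from llama_index.llms.mistralai import MistralAI

llm = MistralAI(model="mistral-tiny", api_key=MISTRAL_API_KEY)
embed_model = MistralAIEmbedding(model_name="mistral-embed", api_key=MISTRAL_API_KEY)

# with both models explicit, nothing should fall back to the OpenAI defaults
service_context = ServiceContext.from_defaults(
    llm=llm,
    embed_model=embed_model,
    system_prompt=SPF,
)

index = VectorStoreIndex.from_documents(
    documents=web_docs + pdf_docs,   # flat list of Documents rather than a list of lists
    service_context=service_context,
    storage_context=storage_context,
    show_progress=True,
)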
5 comments