Find answers from the community

sysfor
Offline, last seen 4 weeks ago
Joined September 25, 2024
I'm having an issue reading from a local SimpleVectorStore and can't figure it out. The query is an exact replica of a question in the ingested docs. It works fine in the Qdrant setup I have (using other code, obviously, but the same concepts).

(code in thread)
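
A minimal sketch (not the code from the thread) of persisting and reloading an index backed by the default SimpleVectorStore; the data directory, persist path, and query string are assumptions, and the key point is that the same embed model has to be in effect at build time and at query time.

Plain Text
# Hedged sketch, not the thread's code; paths and data are placeholders.
from llama_index.core import (
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
)

# build and persist (SimpleVectorStore is the default when no vector_store is passed)
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)
index.storage_context.persist(persist_dir="./storage")

# reload from disk later and query with the same embed model active
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)
print(index.as_query_engine().query("a question copied verbatim from the ingested docs"))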
6 comments
With 0.10.7, is pip3 install llama-index-vector-stores-chroma all that's required for from llama_index.vector_stores.chroma import ChromaVectorStore? It's installed, but it's the only import that isn't resolving after migrating to 0.10.7. Curious if there is something else I need to do here?

Plain Text
Requirement already satisfied: llama-index-vector-stores-chroma in ./env/lib/python3.10/site-packages (0.1.2)

Requirement already satisfied: llama-index-core<0.11.0,>=0.10.1 in ./env/lib/python3.10/site-packages (from llama-index-vector-stores-chroma) (0.10.9)
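
For reference, a hedged sanity check of the 0.10-style import once the integration package is installed; the in-memory client and collection name below are placeholders, and if the import still fails the usual suspects (an assumption here) are a leftover pre-0.10 llama-index install in the same environment or an editor/language server pointed at a different interpreter.

Plain Text
# Hedged smoke test for the namespaced 0.10.x import; names are placeholders.
import chromadb
from llama_index.vector_stores.chroma import ChromaVectorStore

client = chromadb.EphemeralClient()                        # in-memory, fine for a quick check
collection = client.get_or_create_collection("smoke_test")
vector_store = ChromaVectorStore(chroma_collection=collection)
print(type(vector_store))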
2 comments
sysfor

Error

I'm using MistralAIEmbedding along with ChromaDB as a vector store.

Getting the following error now (Mistral has a dimensionality of 1024): chromadb.errors.InvalidDimensionException: Embedding dimension 1536 does not match collection dimensionality 1024

For example, using faiss I can specify the dimensionality:

d = 1024
faiss_index = faiss.IndexFlatL2(d)

d = property(_swigfaiss.Index_d_get, _swigfaiss.Index_d_set, doc=r""" vector dimension""")

^ I don't see anything like this in Chroma. Is this possible to do using ChromaVectorStore?
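
Not as a constructor argument, to my knowledge: Chroma fixes a collection's dimensionality from the first embeddings written to it rather than taking an explicit d like faiss.IndexFlatL2(d), so the 1536 in the error suggests a default (OpenAI-sized) embedding is sneaking in on the ingest or query path while the collection was built at Mistral's 1024. A hedged sketch of keeping everything at 1024 by passing the Mistral embed model explicitly; the collection name, paths, and data loading are assumptions.

Plain Text
# Hedged sketch: one embed model end to end so ingest and query dimensions agree.
import chromadb
from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.embeddings.mistralai import MistralAIEmbedding
from llama_index.vector_stores.chroma import ChromaVectorStore

embed_model = MistralAIEmbedding(model_name="mistral-embed", api_key="...")

client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection("docs_mistral")   # becomes 1024-dim on first write
vector_store = ChromaVectorStore(chroma_collection=collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context, embed_model=embed_model
)
# the index keeps the embed model, so queries are also embedded at 1024 dims
query_engine = index.as_query_engine()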
6 comments
When setting up my vector database (Qdrant) I didn't use enable_hybrid=True, and I would like to compare the differences in quality. Is there a way to do this after the fact with LlamaIndex, or would I need to rebuild the entire vector DB from scratch?
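
To my understanding, hybrid mode needs sparse vectors that the Qdrant integration only generates at ingest time, so an existing dense-only collection can't simply be toggled; re-ingesting into a second, hybrid-enabled collection lets the two be compared side by side. A hedged sketch, where the collection name, URL, data loading, and top-k values are all assumptions:

Plain Text
# Hedged sketch: new hybrid collection alongside the existing dense-only one.
import qdrant_client
from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.vector_stores.qdrant import QdrantVectorStore

client = qdrant_client.QdrantClient(url="http://localhost:6333")

hybrid_store = QdrantVectorStore(
    client=client,
    collection_name="docs_hybrid",   # fresh collection; sparse vectors are created at ingest
    enable_hybrid=True,              # uses a default sparse encoder (fastembed), to my knowledge
)
storage_context = StorageContext.from_defaults(vector_store=hybrid_store)

documents = SimpleDirectoryReader("./data").load_data()
hybrid_index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

hybrid_engine = hybrid_index.as_query_engine(
    vector_store_query_mode="hybrid",
    similarity_top_k=5,
    sparse_top_k=10,
)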
5 comments
Are there some methods for speeding up synthesize? The LLM calls within synthesize are the most time-consuming part of my RAG pipeline when looking at tracing via Phoenix.
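
A hedged sketch of the usual levers, with the caveat that which one helps depends on the synthesizer actually in use: refine and tree_summarize can issue an LLM call per retrieved chunk, while compact packs chunks into as few calls as possible; a smaller similarity_top_k means less text to synthesize over; and streaming at least hides latency. The index construction below is only a stand-in for the real pipeline.

Plain Text
# Hedged sketch: fewer / larger LLM calls during synthesis.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, get_response_synthesizer

index = VectorStoreIndex.from_documents(SimpleDirectoryReader("./data").load_data())

synth = get_response_synthesizer(response_mode="compact", streaming=True)
query_engine = index.as_query_engine(
    response_synthesizer=synth,
    similarity_top_k=3,   # fewer nodes -> less context to synthesize over
)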
1 comment
Where is the correct place to print out the final prompt that's being sent to the LLM? Would I need to leverage the callback manager in the RetrieverQueryEngine when assigning the variable vector_query_engine? Any good examples?

Plain Text
    
def _query_index(self, query_engine: RetrieverQueryEngine, query: str) -> RESPONSE_TYPE:
    embedded_query = Settings.embed_model.get_text_embedding(query)
    response = query_engine.query(QueryBundle(query_str=query, embedding=embedded_query))

    return response

def _create_query_engine(self) -> RetrieverQueryEngine:
    vector_index = VectorStoreIndex.from_vector_store(vector_store=self.vector_store,
                                                      embed_model=Settings.embed_model)

    vector_retriever = VectorIndexRetriever(index=vector_index, similarity_top_k=5)

    vector_query_engine = RetrieverQueryEngine(
        retriever=vector_retriever,
        response_synthesizer=self.response_synthesizer,
        node_postprocessors=[SimilarityPostprocessor(similarity_cutoff=0.50)],
    )

    vector_query_engine.update_prompts({"response_synthesizer:text_qa_template": self.qa_prompt_tmpl})

    return vector_query_engine

def query_rag(self, query: str) -> Dict[str, Any]:
    vector_query_engine = self._create_query_engine()

    response = self._query_index(query_engine=vector_query_engine, query=query)
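
For reference, a hedged sketch of the built-in callback hooks that record the fully formatted prompt without custom plumbing inside _create_query_engine; whether either fits this class's setup is an assumption.

Plain Text
# Hedged sketch: global hooks that capture every LLM input (the final prompt).
import llama_index.core
from llama_index.core import Settings
from llama_index.core.callbacks import CallbackManager, CBEventType, LlamaDebugHandler

# Option 1: print every LLM prompt and completion to stdout.
llama_index.core.set_global_handler("simple")

# Option 2: record events and inspect them after query_rag(...) has run.
llama_debug = LlamaDebugHandler(print_trace_on_end=True)
Settings.callback_manager = CallbackManager([llama_debug])
# ...run the query, then:
# for start_event, end_event in llama_debug.get_event_pairs(CBEventType.LLM):
#     print(start_event.payload)   # includes the formatted prompt/messages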
5 comments
Is there a benefit to using one of these ways of getting the response over the other?

response = engine.query(query)

response = engine.query(QueryBundle(query_str=query, embedding=embedded_query))

Plain Text
    def query_index(self, query_engines: List[BaseQueryEngine], queries: List[str]):
        for query in queries:
            embedded_query = Settings.embed_model.get_text_embedding(query)
            for engine in query_engines:
                response = engine.query(query)
                <...or...>
                response = engine.query(QueryBundle(query_str=query, 
                                                    embedding=embedded_query))

Plain Text
    vector_index = VectorStoreIndex.from_vector_store(vector_store=rag.vector_store, 
                                               embed_model=Settings.embed_model)

    query_engine0 = vector_index.as_query_engine(llm=Settings.llm,
                                         similarity_top_k=15, 
                                         node_postprocessors=[
                                                SimilarityPostprocessor(similarity_cutoff=0.60), 
                                                cohere_rerank
                                            ]
                                        )
3 comments
What would be the best way to extract keywords from a query string? Would it be best to call a fast LLM (~7B), or is there something native inside LlamaIndex to do this? I would like to extract keywords from the query and then use a node_postprocessor to look for said keywords in node metadata. Not sure how practical this is, but I find some instances in my testing where I think this would be useful.
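
A hedged sketch of an LLM-free option: the regex/stopword helper behind the keyword-table index can pull keywords from the query string, and KeywordNodePostprocessor can then require them in retrieved nodes. Two caveats, both assumptions worth checking: the postprocessor matches against node text rather than metadata specifically (and may need spacy installed), so filtering strictly on metadata would likely mean a small custom postprocessor.

Plain Text
# Hedged sketch: keyword extraction without an extra LLM call.
from llama_index.core.indices.keyword_table.utils import simple_extract_keywords
from llama_index.core.postprocessor import KeywordNodePostprocessor

query = "example user question about qdrant hybrid search"
keywords = list(simple_extract_keywords(query, max_keywords=5))
print(keywords)

postprocessor = KeywordNodePostprocessor(required_keywords=keywords)
# then pass node_postprocessors=[postprocessor] when building the query engine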
3 comments
sysfor

Rag

I'm less than thrilled with my RAG results and looking to see if anyone has some suggested reads they found useful around metrics, root causing, etc. I am reading this at the moment - https://blog.llamaindex.ai/evaluating-the-ideal-chunk-size-for-a-rag-system-using-llamaindex-6207e5d3fec5 - which has good information, but admittedly I'm not a RAG expert, so there could be much better reads I am overlooking. The RAG implementation is basically: I scraped a bunch of websites related to a topic. When asking it questions - ones where I know the data exists, and in some cases using the exact title from the metadata - it's not finding them and instead returns stuff from other, unrelated blog texts.
12 comments
sysfor

Service context

Curious if it's possible to use Mistral for the service context? If I replace it with OpenAI it works just fine.

Plain Text
llm = MistralAI(model="mistral-tiny", 
                api_key=MISTRAL_API_KEY)
....
print(type(llm))
<class 'llama_index.llms.mistral.MistralAI'>
....

service_context = ServiceContext.from_defaults(llm=llm,
                                    embed_model=embed_model,
                                    system_prompt=SPF)
....
index = VectorStoreIndex.from_documents(
    documents=[web_docs, pdf_docs],
    service_context=service_context,
    storage_context=storage_context,
    show_progress=True
)


It's throwing an OpenAI error. Appears to be trying to default to OpenAI: Could not load OpenAI model. If you intended to use OpenAI, please check your OPENAI_API_KEY.
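
For comparison, a hedged sketch of a fully explicit setup: the usual cause of that message is some component (often the embed model or an LLM default) still resolving to OpenAI, so pinning both the LLM and the embedding model to Mistral rules that out. The import paths below are the 0.10.x packages, names like MISTRAL_API_KEY, SPF, web_docs, pdf_docs, and storage_context are reused from the snippet above, and flattening the two document lists is an assumption about what they hold.

Plain Text
# Hedged sketch reusing names from the snippet above; 0.10.x import paths.
from llama_index.core import ServiceContext, VectorStoreIndex
from llama_index.embeddings.mistralai import MistralAIEmbedding
from llama_index.llms.mistralai import MistralAI

llm = MistralAI(model="mistral-tiny", api_key=MISTRAL_API_KEY)
embed_model = MistralAIEmbedding(model_name="mistral-embed", api_key=MISTRAL_API_KEY)

# with both models explicit, nothing should fall back to the OpenAI defaults
service_context = ServiceContext.from_defaults(
    llm=llm,
    embed_model=embed_model,
    system_prompt=SPF,
)

index = VectorStoreIndex.from_documents(
    documents=web_docs + pdf_docs,   # flat list of Documents rather than a list of lists
    service_context=service_context,
    storage_context=storage_context,
    show_progress=True,
)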
5 comments