Secondly, what is Fusion RAG and how does it compare against BM25 and re-ranking algorithms?

How can I use Fusion RAG without BM25? (Is it necessary to integrate BM25?)
Please stand by @payload as I search the knowledge base. I normally take about 20 seconds or so. :wave:
----------------------
Fusion RAG (Reciprocal Rank Fusion) is a method used in LlamaIndex to combine retrieval results from multiple queries and multiple indexes. The retrieved nodes are re-ranked according to the Reciprocal Rank Fusion algorithm, which re-ranks retrieval results efficiently, without heavy computation or reliance on external models. The method is described in the referenced paper, and the LlamaIndex implementation is credited to @Raduaschl on GitHub for their example implementation.
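To make the fusion step concrete, here is a minimal, self-contained sketch of Reciprocal Rank Fusion (not LlamaIndex's internal code; the function name and example document ids are illustrative):

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document accumulates 1 / (k + rank) for every list it
    appears in; k=60 is the constant commonly used with RRF.
    """
    scores = defaultdict(float)
    for ranked_list in rankings:
        for rank, doc_id in enumerate(ranked_list, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# "doc_c" appears in both retrievers' results, so it beats "doc_b",
# which only one retriever returned.
vector_results = ["doc_a", "doc_b", "doc_c"]
keyword_results = ["doc_a", "doc_c", "doc_d"]
print(reciprocal_rank_fusion([vector_results, keyword_results]))
# → ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```

Because the score depends only on rank positions, RRF can fuse retrievers whose raw scores live on completely different scales, which is why it needs no external model.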

BM25 is a ranking function used by search engines to rank matching documents according to their relevance to a given search query. In LlamaIndex, a BM25-based retriever can be fused with a vector index to capture both keyword matches and semantic relations in input queries. The BM25 retriever and the vector retriever each produce a score, and the Reciprocal Rank Fusion algorithm uses the two rankings to re-sort the nodes.
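For intuition about what the BM25 retriever contributes, here is a rough, self-contained sketch of the BM25 scoring formula (not the implementation LlamaIndex uses; the corpus and parameter values are illustrative):

```python
import math

def bm25_score(query_terms, doc, corpus, k1=1.5, b=0.75):
    """Score one tokenized document against a query using BM25.

    corpus is a list of tokenized documents; idf uses the standard
    BM25 formulation with 0.5 smoothing.
    """
    avgdl = sum(len(d) for d in corpus) / len(corpus)
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)  # document frequency
        idf = math.log((len(corpus) - df + 0.5) / (df + 0.5) + 1)
        tf = doc.count(term)  # term frequency in this document
        score += idf * (tf * (k1 + 1)) / (tf + k1 * (1 - b + b * len(doc) / avgdl))
    return score

corpus = [
    ["rag", "fusion", "combines", "retrievers"],
    ["bm25", "ranks", "documents", "by", "keywords"],
    ["vectors", "capture", "semantic", "meaning"],
]
# The document that actually contains the query keywords wins.
print(bm25_score(["bm25", "keywords"], corpus[1], corpus))  # positive
print(bm25_score(["bm25", "keywords"], corpus[2], corpus))  # 0.0
```

Because BM25 rewards exact keyword overlap, fusing it with a vector retriever covers queries where semantic similarity alone misses the right terms.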
To use Fusion RAG without BM25, replace the BM25 retriever with any other retriever compatible with the fusion retriever; the exact steps depend on the alternative retriever you choose. The BM25 retriever is not strictly necessary for Fusion RAG, but it is used in the examples in the LlamaIndex documentation because it is a well-established and effective keyword-ranking method.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
@kapa.ai how can i use only rag fusion using llamaindex and a vector store
Let me check the knowledge base for you. One moment! :books:
----------------------
To use only RAG Fusion with LlamaIndex and a Vector Store, you would need to set up your Vector Store and then use it to create a VectorStoreIndex. After that, you can use this index to create a query engine and make queries. Here's a simplified example of how you might do this:

First, import the necessary modules and set up your Vector Store:

Plain Text
from llama_index.core import VectorStoreIndex

# Assuming you have a pre-existing vector_store
vector_store = your_vector_store


Next, create a VectorStoreIndex from your Vector Store:

Plain Text
index = VectorStoreIndex.from_vector_store(vector_store)


Then, create a query engine from your index:

Plain Text
query_engine = index.as_query_engine()


Finally, you can use your query engine to make queries:

Plain Text
query_str = "Your query here"
response = query_engine.query(query_str)
print(str(response))


Please note that this is a simplified example; the actual code may vary depending on your specific use case and the type of Vector Store you are using. For more detailed examples and information, refer to the LlamaIndex documentation.
Sources:
  • Building a (Very Simple) Vector Store from Scratch
  • Using VectorStoreIndex
  • Advanced RAG with temporal filters using LlamaIndex and KDB.AI vector store
  • Building a Live RAG Pipeline over Google Drive Files
  • Using a Vector Store as an Index
----------------------
@Logan M @WhiteFang_Jr ?
Yes, you can use RAG fusion with a vector store as well. You'll need to get the retriever from the index and pass it to the QueryFusionRetriever.

I did it like this:
Plain Text
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.retrievers import QueryFusionRetriever

# Get a retriever from the index attached to the vector store
retriever = vector_index.as_retriever(similarity_top_k=TOP_K_VALUE)

# Wrap that retriever in a QueryFusionRetriever
retriever = QueryFusionRetriever(
    [retriever],
    similarity_top_k=10,
    num_queries=0,  # adjust based on how many variations of the query you want generated
    mode="reciprocal_rerank",
    use_async=False,
    verbose=True,
)

# Use the retriever in a query_engine
query_engine = RetrieverQueryEngine.from_args(retriever)

You can follow this doc for more: https://docs.llamaindex.ai/en/stable/examples/retrievers/relative_score_dist_fusion/?h=queryfusionretriever
Can I create a chat engine? I am working on a support bot, so I need chat history.
Yes,
Plain Text
from llama_index.core.chat_engine import CondensePlusContextChatEngine

chat_engine = CondensePlusContextChatEngine.from_defaults(
    retriever=retriever,
)
My data is in a Qdrant vector store, so I am guessing I can overwrite the retriever there as well and use the index as a chat engine.
No, you'll need a QueryFusionRetriever instance; you get one by passing the current retriever from the index into it.
Got it
  1. Create the vector index as a retriever
  2. Pass it to the query fusion retriever
  3. Then pass that retriever to the condense plus context chat engine
Will try it and update the results, thanks
One question that I have: are you adding query fusion just for the re-ranker?
I need to rewrite the user query, as it might not be good for retrieval. If the query is poor, the retrieved nodes will be irrelevant (this is the issue I am facing currently). So I need to be able to rewrite the query (like HyDE query translation, which is essentially query rewriting); that's how I came across Fusion RAG and BM25, and why I am trying this approach.

Do you have any suggestions?
The chat engine that I have provided uses condense mode, in which the user query is transformed based on the user's conversation history.
Actually, I had that doubt as well when I tracked the traces in Langfuse. So you are suggesting I should directly use re-ranking?
No, not re-ranking. From your statement I understood that the user query may not be well-formed in some cases, right?

If that's the only issue, then the chat engine shared above can handle it.

Do give it a try
Sure, I will try Fusion RAG with re-ranking and post the results. Thanks for the help, Mr Fang 😃