----------------------
Yes, in LlamaIndex you can perform hybrid search, which combines results from both semantic search (embedding similarity) and keyword search. This is particularly useful when embeddings fail to return text chunks containing the query's keywords during retrieval. There are two main ways to achieve this:
- Use a vector database that has built-in hybrid search functionality. A complete list of supported vector stores is available in the LlamaIndex documentation.
- Set up a local hybrid search mechanism with BM25 (see the sketch after this list).
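For the second option, here is a minimal sketch of a local hybrid setup that fuses a vector retriever with LlamaIndex's BM25 retriever using reciprocal rank fusion. The exact import paths and the `llama-index-retrievers-bm25` package requirement are assumptions that may vary between LlamaIndex versions, and the `"data"` directory is a placeholder:

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.retrievers import QueryFusionRetriever
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.retrievers.bm25 import BM25Retriever  # assumes llama-index-retrievers-bm25 is installed

# Build a vector index over local documents (default in-memory vector store)
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Semantic retriever (embedding similarity) and keyword retriever (BM25)
vector_retriever = index.as_retriever(similarity_top_k=2)
bm25_retriever = BM25Retriever.from_defaults(
    docstore=index.docstore, similarity_top_k=2
)

# Fuse the two ranked result lists into one hybrid retriever
hybrid_retriever = QueryFusionRetriever(
    [vector_retriever, bm25_retriever],
    similarity_top_k=2,
    num_queries=1,  # disable query generation, just fuse the two result lists
    mode="reciprocal_rerank",
)

query_engine = RetrieverQueryEngine.from_args(hybrid_retriever)
response = query_engine.query("What did the author do growing up?")
```

Reciprocal rank fusion merges the two ranked lists without requiring their scores to be comparable, which is why it is a common choice for this kind of local fusion.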
In the context of weighting the influence of metadata and text chunks on the similarity score, you can use the `alpha` parameter in hybrid search. This parameter determines the weighting between the two methods (`alpha=0` -> BM25, `alpha=1` -> vector search). By default, `alpha=0.75` is used, which leans strongly toward vector search. Here is an example of how to use it:
```python
# Query in hybrid mode; alpha=0.0 relies entirely on BM25 keyword scoring
query_engine = index.as_query_engine(
    vector_store_query_mode="hybrid", similarity_top_k=2, alpha=0.0
)
response = query_engine.query(
    "What did the author do growing up?",
)
```
In this example, setting `alpha=0.0` favors BM25, a popular ranking function used by search engines to estimate the relevance of a document to a given search query.
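Note that the `alpha` parameter only takes effect when the index is backed by a vector store with native hybrid support. As an illustration only (Weaviate is an assumed choice here, not something the snippet above requires), building such an index might look like this:

```python
import weaviate
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, StorageContext
from llama_index.vector_stores.weaviate import WeaviateVectorStore  # assumes llama-index-vector-stores-weaviate is installed

# Connect to a locally running Weaviate instance (placeholder deployment)
client = weaviate.connect_to_local()

# Back the index with Weaviate so hybrid mode and alpha are handled by the store
vector_store = WeaviateVectorStore(weaviate_client=client, index_name="LlamaIndex")
storage_context = StorageContext.from_defaults(vector_store=vector_store)

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
```

With an index built this way, the `as_query_engine(...)` call above delegates the hybrid scoring (and the `alpha` weighting) to the vector store itself.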
You can find more details and examples in the LlamaIndex documentation.
----------------------