----------------------
Yes, in LlamaIndex you can perform hybrid search, which combines results from both semantic search (embedding similarity) and keyword search. This is particularly useful when embeddings fail to return text chunks containing the query's keywords during retrieval. There are two main ways to achieve this:
- Use a vector database that has built-in hybrid search functionality. A complete list of supported vector stores is available in the LlamaIndex documentation.
- Set up a local hybrid search mechanism with BM25 (see the sketch after this list).
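For the second option, here is a minimal sketch of a local hybrid setup that fuses a vector retriever with LlamaIndex's BM25 retriever using reciprocal rank fusion. The exact import paths and the `llama-index-retrievers-bm25` package requirement are assumptions that may vary between LlamaIndex versions, and the `"data"` directory is a placeholder:

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.retrievers import QueryFusionRetriever
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.retrievers.bm25 import BM25Retriever  # assumes llama-index-retrievers-bm25 is installed

# Build a vector index over local documents (default in-memory vector store)
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Semantic retriever (embedding similarity) and keyword retriever (BM25)
vector_retriever = index.as_retriever(similarity_top_k=2)
bm25_retriever = BM25Retriever.from_defaults(
    docstore=index.docstore, similarity_top_k=2
)

# Fuse the two ranked result lists into one hybrid retriever
hybrid_retriever = QueryFusionRetriever(
    [vector_retriever, bm25_retriever],
    similarity_top_k=2,
    num_queries=1,  # disable query generation, just fuse the two result lists
    mode="reciprocal_rerank",
)

query_engine = RetrieverQueryEngine.from_args(hybrid_retriever)
response = query_engine.query("What did the author do growing up?")
```

Reciprocal rank fusion merges the two ranked lists without requiring their scores to be comparable, which is why it is a common choice for this kind of local fusion.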
In the context of weighting the influence of metadata and text chunks on the similarity score, you can use the `alpha` parameter in hybrid search. This parameter determines the weighting between the two methods (`alpha=0` -> BM25, `alpha=1` -> vector search). By default, `alpha=0.75` is used, which leans strongly toward vector search. Here is an example of how to use it:
```python
# Query in hybrid mode; alpha=0.0 relies entirely on BM25 keyword scoring
query_engine = index.as_query_engine(
    vector_store_query_mode="hybrid", similarity_top_k=2, alpha=0.0
)
response = query_engine.query(
    "What did the author do growing up?",
)
```
In this example, setting `alpha=0.0` favors BM25, a popular ranking function used by search engines to estimate the relevance of a document to a given search query.
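Note that the `alpha` parameter only takes effect when the index is backed by a vector store with native hybrid support. As an illustration only (Weaviate is an assumed choice here, not something the snippet above requires), building such an index might look like this:

```python
import weaviate
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, StorageContext
from llama_index.vector_stores.weaviate import WeaviateVectorStore  # assumes llama-index-vector-stores-weaviate is installed

# Connect to a locally running Weaviate instance (placeholder deployment)
client = weaviate.connect_to_local()

# Back the index with Weaviate so hybrid mode and alpha are handled by the store
vector_store = WeaviateVectorStore(weaviate_client=client, index_name="LlamaIndex")
storage_context = StorageContext.from_defaults(vector_store=vector_store)

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
```

With an index built this way, the `as_query_engine(...)` call above delegates the hybrid scoring (and the `alpha` weighting) to the vector store itself.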
You can find more details and examples in the LlamaIndex documentation.
----------------------