Updated 10 months ago

Hello, I have an issue when using

At a glance

The community member is having an issue with the QueryFusionRetriever where they are using similarity_top_k=8 but only seeing 3 chunks. They ask how to configure more chunks.

Other community members provide suggestions, such as setting similarity_top_k=12 for each retriever and then the overall fusion retriever will take the top 12 across all indexes. They also discuss how to print the chunks selected from each index and a warning related to combining a vector index and BM25 retriever.

There is no explicitly marked answer in the comments.

Hello, I have an issue when using QueryFusionRetriever: I'm using similarity_top_k=8, but I only see 3 chunks in the results. How can I configure it to return more chunks? Thanks
10 comments
Do you have a code sample?

It might be removing duplicate chunks
Oh, that's the wrong place to put the top-k

You probably meant to have something like this

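# Import paths assume llama-index >= 0.10 (the llama_index.core package); older versions differ.
from llama_index.core import VectorStoreIndex
from llama_index.core.chat_engine import CondenseQuestionChatEngine
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.retrievers import QueryFusionRetriever
from llama_index.core.retrievers.fusion_retriever import FUSION_MODES

# Repeat for each vector store: every per-index retriever gets its own top-k.
index = VectorStoreIndex.from_vector_store(
    vector_store=vector_store,
    embed_model=embed_model,
)
indexes.append(index.as_retriever(similarity_top_k=12))

# The fusion retriever merges the per-index results and keeps the overall top 12.
retriever = QueryFusionRetriever(
    retrievers=indexes,
    llm=llm,
    mode=FUSION_MODES.SIMPLE,
    similarity_top_k=12,
    num_queries=1,  # 1 disables extra query generation
    use_async=False,
    verbose=True,
)

query_engine = RetrieverQueryEngine.from_args(
    retriever,
    llm=llm,
    qa_prompt_template=qa_prompt_template,
    refine_prompt_template=refine_prompt_template,
)

chat_engine = CondenseQuestionChatEngine.from_defaults(
    query_engine=query_engine,
    verbose=True,
)
response = chat_engine.chat(question)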
Now every retriever returns 12 nodes, and the overall fusion retriever then takes those results and returns the top 12 across all indexes
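To illustrate why fewer chunks than similarity_top_k can come back, here is a minimal sketch of a "simple" fusion step in plain Python. It is not the library's code: the `simple_fusion` function and the `(text, score)` tuples are hypothetical stand-ins, and it assumes duplicates are merged by chunk text with the highest score kept.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for retriever output: (chunk_text, score) pairs.
Hit = Tuple[str, float]


def simple_fusion(results_per_retriever: List[List[Hit]], top_k: int) -> List[Hit]:
    """Merge hits, keep the best score per distinct chunk, return the top_k."""
    best: Dict[str, float] = {}
    for hits in results_per_retriever:
        for text, score in hits:
            # Duplicate chunks collapse into one entry with the highest score.
            if text not in best or score > best[text]:
                best[text] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


# Two retrievers return five hits in total, but "chunk 2" appears in both.
index_a = [("chunk 1", 0.91), ("chunk 2", 0.80), ("chunk 3", 0.42)]
index_b = [("chunk 2", 0.85), ("chunk 4", 0.77)]
fused = simple_fusion([index_a, index_b], top_k=12)
print(fused)
```

Here five hits go in and four come out, even with top_k=12, because the shared chunk is counted once. The top-k on the fusion retriever is only an upper bound; the per-retriever top-k decides how many candidates exist in the first place.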
@Logan M how can I print the results from each index in indexes to see which chunks have been selected?
Like, chunks from every index? Or just the overall top 12?
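One possible way to inspect the chunks from every retriever is to call each one directly and print what it returns. This is a sketch, not an answer given in the thread: `print_retrieved_chunks` is a hypothetical helper, and it assumes each retriever exposes `retrieve(query)` returning objects with a `score` attribute and a `node.get_content()` method, as LlamaIndex's NodeWithScore does.

```python
def print_retrieved_chunks(retrievers, query, preview_chars=80):
    """Run `query` through each retriever and print what it returned."""
    per_retriever = []
    for i, retriever in enumerate(retrievers):
        nodes = retriever.retrieve(query)  # assumed: list of NodeWithScore-like objects
        per_retriever.append(nodes)
        print(f"retriever {i}: {len(nodes)} nodes")
        for n in nodes:
            # Show a short single-line preview of each chunk next to its score.
            preview = n.node.get_content()[:preview_chars].replace("\n", " ")
            print(f"  score={n.score}  {preview}")
    return per_retriever
```

With the code from the thread, this would be called as `print_retrieved_chunks(indexes, question)`. For the overall top 12, passing `[retriever]` (the fusion retriever) should work the same way, since it also exposes `retrieve`.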
@Logan M I want to combine my vector index and BM25 retrievers for reciprocal rerank fusion:
bm25_retriever = BM25Retriever.from_defaults(nodes=nodes, similarity_top_k=12)
I get this warning: "BM25Retriever does not support embeddings, skipping...". What could be the issue?
That's a benign print. It means the query bundle sent to the BM25 retriever had embeddings attached to it, and it's just saying it's not using those. It's still running retrieval with BM25
Is there a way to remove this warning, maybe by sending the query in a different way?
Hmm, a PR is probably needed to remove it, lol
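For reference, the reciprocal rerank fusion mentioned in the thread scores each chunk by its rank in every result list rather than by its raw similarity score, which is what makes it suitable for mixing vector and BM25 results. Below is a minimal sketch of the general technique in plain Python. The `reciprocal_rank_fusion` function, the chunk IDs, and the constant k=60 are illustrative assumptions, not the library's implementation.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(
    rankings: List[List[str]], k: float = 60.0, top_k: int = 12
) -> List[str]:
    """Fuse ranked lists of chunk IDs: score(d) = sum of 1 / (k + rank of d)."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for position, chunk_id in enumerate(ranking):
            # Ranks are 1-based, so the first hit contributes 1 / (k + 1).
            scores[chunk_id] += 1.0 / (k + position + 1)
    return sorted(scores, key=lambda cid: scores[cid], reverse=True)[:top_k]


# Hypothetical ranked results from a vector retriever and a BM25 retriever.
vector_ranking = ["a", "b", "c"]
bm25_ranking = ["b", "d", "a"]
fused_ids = reciprocal_rank_fusion([vector_ranking, bm25_ranking])
print(fused_ids)
```

Chunks that rank well in both lists ("b" and "a" here) rise to the top, while the differing score scales of BM25 and embedding similarity never need to be compared directly.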