Can you run the code outside of a notebook? The traceback is truncated and I can't tell where the issue is
Or what did your setup look like?
I see the traceback is in an embedding similarity calculation, but no idea how that broke
Ok, let me run it separately and send you results.
Hello, I have the exact same problem, and I've also opened an issue on GitHub
May anyone help me? @Logan M @WhiteFang_Jr
Yea I saw the issue. Providing any way to replicate would be helpful
@Logan M I'm trying a fusion retriever with a BM25 retriever and a VectorIndexRetriever. There's just one PDF.
Here's a piece of the code:
# load the storage and the query engine
storage_context = StorageContext.from_defaults(persist_dir=STORE_FILES_PATH)
index = load_index_from_storage(storage_context)
reranker = RankGPTRerank(llm=llm, top_n=RERANKER_TOP_N, verbose=True)
vector_store_retriever = get_retriever("vector", index, SIMILARITY_TOP_K)
print(f"FOUND RETRIEVER ++++ vector_store_retriever: {vector_store_retriever}")
bm25_retriever = get_retriever("bm25", index, SIMILARITY_TOP_K)
print(f"FOUND RETRIEVER ++++ bm25_retriever: {bm25_retriever}")
# query_engine = index.as_query_engine(streaming=True, similarity_top_k=SIMILARITY_TOP_K, node_postprocessors=[reranker])
# query_engine_no_rerank = index.as_query_engine(streaming=True, similarity_top_k=SIMILARITY_TOP_K)
fusion_retriever = QueryFusionRetriever(
    [vector_store_retriever, bm25_retriever],
    similarity_top_k=SIMILARITY_TOP_K,
    num_queries=3,  # set this to 1 to disable query generation
    mode="reciprocal_rerank",
    verbose=True,
    use_async=True,
)
retrieved = fusion_retriever.retrieve(QueryBundle("what are the baseline requirements for Availability?"))
query_engine_fusion = RetrieverQueryEngine.from_args(fusion_retriever)
while True:
    response = query_engine_fusion.query("what are the baseline requirements for Availability?")
    print("+" * 10 + " LLM RESPONSE " + "+" * 10 + "\n")
    display_response(response)
THIS PIECE OF CODE: retrieved = fusion_retriever.retrieve(QueryBundle("what are the baseline requirements for Availability?"))
already raises the exception
@gaurang.u.bhatt have you solved this issue somehow?
@Logan M I've found that if I set num_queries=1 on the QueryFusionRetriever, the exception does not occur
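For reference, this is the same declaration as above, just with query generation disabled:
# Workaround: num_queries=1 disables the LLM-based query generation step,
# so only the original query is sent to the sub-retrievers.
fusion_retriever = QueryFusionRetriever(
    [vector_store_retriever, bm25_retriever],
    similarity_top_k=SIMILARITY_TOP_K,
    num_queries=1,  # set to 1 to disable query generation
    mode="reciprocal_rerank",
    verbose=True,
    use_async=True,
)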
Trying to reproduce, this works fine for me. The error really indicates that somehow a vector store query result doesn't have embeddings?
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.retrievers.bm25 import BM25Retriever
from llama_index.core.retrievers import QueryFusionRetriever
Settings.chunk_size = 256
# load documents
documents = SimpleDirectoryReader("./docs/examples/data/paul_graham/").load_data()
# create the index
index = VectorStoreIndex.from_documents(documents)
# create the retriever
vector_retriever = index.as_retriever(similarity_top_k=2)
bm25_retriever = BM25Retriever.from_defaults(index=index, similarity_top_k=2)
# create the fusion retriever
two_fusion_retriever = QueryFusionRetriever(
    [vector_retriever, bm25_retriever],
    num_queries=3,
    mode="reciprocal_rerank",
    use_async=True,
)
bm25_fusion_retriever = QueryFusionRetriever(
    [bm25_retriever],
    num_queries=3,
    mode="reciprocal_rerank",
    use_async=True,
)
vector_fusion_retriever = QueryFusionRetriever(
    [vector_retriever],
    num_queries=3,
    mode="reciprocal_rerank",
    use_async=True,
)
nodes = two_fusion_retriever.retrieve("What did the author do growing up?")
print(len(nodes))
nodes = bm25_fusion_retriever.retrieve("What did the author do at Viaweb?")
print(len(nodes))
nodes = bm25_fusion_retriever.retrieve("What did the author do at Interleaf?")
print(len(nodes))
Yep, I think this is the cause, but the retrievers always retrieve the nodes
I did not know that I could specify num_queries on retrievers other than the fusion one! I'll try.
oh, they are all fusion retrievers
I was just trying to see if a particular retriever inside the fusion retriever was causing an issue
ah yeah, sorry... my mistake.
Yep, for me too all the retrievers return nodes on their own, but when I use them in a fusion retriever, that exception occurs
In order to reproduce the issue, could you try putting two retrievers inside a QueryFusionRetriever?
So I tried this with a few queries
two_fusion_retriever = QueryFusionRetriever(
    [vector_retriever, bm25_retriever],
    num_queries=3,
    mode="reciprocal_rerank",
    use_async=True,
)
But I was not able to find any errors
what top_k are you using in this declaration?
I'm going to try the same top_k to see if it crashes
nah, it crashes anyway...
So what I'm going to try now is to get the queries generated by the fusion retriever and pass them to the retrieve() method of both retrievers, to see if they return anything; see the sketch below
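Something like this rough sketch is what I have in mind (note that _get_queries is a private helper of QueryFusionRetriever, so its name and return shape may differ between llama-index versions):
# Rough debugging sketch: generate the extra queries the fusion retriever would
# use, then run each one through both sub-retrievers individually.
# ASSUMPTION: _get_queries is a private method and may change across versions.
original_query = "what are the baseline requirements for Availability?"
generated_queries = fusion_retriever._get_queries(original_query)
print(f"Generated queries: {[q.query_str for q in generated_queries]}")
for query_bundle in generated_queries:
    for name, retriever in [("vector", vector_store_retriever), ("bm25", bm25_retriever)]:
        nodes = retriever.retrieve(query_bundle)
        print(f"{name} -> '{query_bundle.query_str}': {len(nodes)} nodes")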
I really don't know what to think
Guys, I'll leave the "solution" here
I was using a 2B LLM (Gemini) that did not generate a correct query, so the fusion retriever just received NoneType
small LLMs are awful to use in these contexts.
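If you still want the generated queries, a rough sketch of the alternative is to point the fusion retriever at a more capable model for query generation; this assumes your llama-index version accepts an explicit llm argument (otherwise it falls back to Settings.llm), and stronger_llm is just a placeholder for whatever larger model you have available:
# Rough sketch: keep query generation, but use a more capable LLM for it.
# ASSUMPTIONS: the `llm` argument is supported in your llama-index version,
# and `stronger_llm` is a placeholder for a larger model.
fusion_retriever = QueryFusionRetriever(
    [vector_store_retriever, bm25_retriever],
    llm=stronger_llm,
    similarity_top_k=SIMILARITY_TOP_K,
    num_queries=3,
    mode="reciprocal_rerank",
    verbose=True,
    use_async=True,
)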
@Leonardo Oliva Hi, I'm currently struggling with the same problem. How did you extract the queries from the hybrid retriever? Could you provide a small code sample? That would really help me
If you're using a small LLM (<=7B), it usually does not generate the correct query format and llamaindex will dump a traceback