Hi,
I am getting an error with metrics = ["hit_rate", "mrr", "precision", "recall", "ap", "ndcg"]: precision, recall, ap, and ndcg are reported as invalid metric names. I have upgraded my llama-index package to 0.11.14. Can you highlight the cause of this error?
12 comments
Hi, which doc example are you following? Would you mind sharing the link and also the full error that you are getting?
Hi,
So basically I am using a different approach; my RAG pipeline code is:
summarizer = TreeSummarize(
    service_context=ServiceContext.from_defaults(
        llm=llm, embed_model=embed_model, chunk_size=1024
    )
)
service_context = ServiceContext.from_defaults(chunk_size=1024, llm=llm, embed_model=embed_model)
index = VectorStoreIndex.from_documents(pdfdocuments, service_context=service_context)
retriever = index.as_retriever(similarity_top_k=3)
p = QueryPipeline(verbose=True)
p.add_modules(
    {
        "input": InputComponent(),
        "retriever": retriever,
        "summarizer": summarizer,
    }
)
p.add_link("input", "retriever")
p.add_link("input", "summarizer", dest_key="query_str")
p.add_link("retriever", "summarizer", dest_key="nodes")
output = p.run(input='How is the current economic and political climate expected to impact consumer behavior during Ramazan 2023?')
output_str = output.response

I want to evaluate this RAG pipeline. I tried both RAGAS and TruLens, and both gave me an import error; I tried different solutions but none of them worked. I posted my problem in this channel and got this link from Jack: https://docs.llamaindex.ai/en/stable/examples/evaluation/retrieval/retriever_eval/. Following that page, I used this code to set up the metrics:
from llama_index.core.evaluation import RetrieverEvaluator

metrics = ["hit_rate", "mrr", "precision", "recall", "ap", "ndcg"]
retriever_evaluator = RetrieverEvaluator.from_metric_names(
    metrics, retriever=retriever
)

and got this error:
ValueError: Invalid metric name: precision
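My guess is that my installed version only registers some metric names and rejects the rest, roughly like this stdlib-only sketch (the supported set below is an assumption for illustration, not the real list from the library):

```python
# Hypothetical sketch of a metric-name registry that raises this error.
# SUPPORTED_METRICS is an assumed set, not LlamaIndex's actual registry.
SUPPORTED_METRICS = {"hit_rate", "mrr"}

def from_metric_names(metric_names):
    resolved = []
    for name in metric_names:
        if name not in SUPPORTED_METRICS:
            # unrecognized names fail fast with the message I am seeing
            raise ValueError(f"Invalid metric name: {name}")
        resolved.append(name)
    return resolved

try:
    from_metric_names(["hit_rate", "mrr", "precision"])
except ValueError as e:
    print(e)  # Invalid metric name: precision
```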

Also, I want to know how to get the expected IDs from the service context I have used.
Can you guide me on how to get the expected node IDs and the document metadata related to my index?
When you say "expected", what do you mean here? Could you please elaborate on this?
Basically I mean the node IDs under which the chunks are being saved, along with their metadata.
The LlamaIndex retrieval evaluator works on the same principle: it compares the retrieved node IDs (or nodes) with the expected ones and calculates the retrieval metrics.
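For intuition, this is how I understand that comparison, as a simplified sketch for a single query (not LlamaIndex's actual implementation):

```python
# Illustrative computation of retrieval metrics for one query:
# compare retrieved node IDs against expected (ground-truth) node IDs.
def retrieval_metrics(retrieved_ids, expected_ids):
    expected = set(expected_ids)
    hits = [i for i, node_id in enumerate(retrieved_ids) if node_id in expected]
    hit_rate = 1.0 if hits else 0.0                # was any expected id retrieved?
    mrr = 1.0 / (hits[0] + 1) if hits else 0.0     # reciprocal rank of first hit
    precision = len(hits) / len(retrieved_ids)     # relevant among retrieved
    recall = len(set(retrieved_ids) & expected) / len(expected)  # retrieved among relevant
    return {"hit_rate": hit_rate, "mrr": mrr, "precision": precision, "recall": recall}

# made-up IDs for illustration: expected node "n3" is retrieved at rank 3
print(retrieval_metrics(["n1", "n7", "n3"], ["n3", "n9"]))
```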
the node id comes from the node

Normally you generate a dataset by generating a question from a node, where it's assumed the retriever should return that node in order to answer that question.

From that process, you have the ground-truth ID, and you can compare it to the retrieved IDs.
retrieved_docs = retriever.retrieve("How is the current economic and political climate expected to impact consumer behavior during Ramazan 2023?")
print("Retrieved document IDs:", [doc.id for doc in retrieved_docs])

From the above code I can extract the retrieved IDs, but how can I get the ground-truth IDs? Can you please guide me?
Plain Text
retrieved_docs = retriever.retrieve("...")
for node_with_score in retrieved_docs:
  print(node_with_score.node.id_)


Ground-truth IDs depend on how you build your dataset. We have utils to do this for you, or you can build it yourself.
https://docs.llamaindex.ai/en/stable/module_guides/evaluating/usage_pattern_retrieval/
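If you build it yourself, the dataset is essentially a mapping from each generated question to the ID(s) of the node(s) it was generated from; a minimal hand-rolled sketch (the questions and IDs here are made up for illustration):

```python
# Hand-rolled ground-truth dataset: question -> IDs of the node(s) it was
# generated from (made-up data for illustration).
qa_dataset = {
    "What chunk sizes does the pipeline use?": ["node-12"],
    "Which embedding model is configured?": ["node-4"],
}

def query_is_hit(question, retrieved_ids, qa_dataset):
    """True if any ground-truth node id for the question was retrieved."""
    expected = set(qa_dataset[question])
    return any(node_id in expected for node_id in retrieved_ids)

print(query_is_hit("Which embedding model is configured?", ["node-4", "node-9"], qa_dataset))
```

You would run your retriever on each question, collect the retrieved IDs, and feed both sides into whatever metrics you want.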
Hi Logon,
The code I provided earlier crashed, so I have written new code from the llama-index documentation:
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
parser = SentenceSplitter(chunk_size=1024)
nodes = parser.get_nodes_from_documents(pdfdocuments)
Settings.embed_model = embed_model
Settings.llm = llm
Settings.transformations = [SentenceSplitter(chunk_size=1024)]
index = VectorStoreIndex(
    nodes, transformations=Settings.transformations
)
retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=3,
)
response_synthesizer = get_response_synthesizer(
    response_mode="tree_summarize",
)
query_engine = RetrieverQueryEngine(
    retriever=retriever,
    response_synthesizer=response_synthesizer,
)
response = query_engine.query('How is the current economic and political climate expected to impact consumer behavior during Ramazan 2023?')
print(response)

Can you please clarify: since ServiceContext is deprecated and Settings is now used to configure the LLM, how does the LLM summarize the retrieved documents when it is not passed to the response_synthesizer?
And also, if I have already split the documents into nodes with a chunk size of 1024, does Settings.transformations = [SentenceSplitter(chunk_size=1024)] split those nodes further?
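For context, my current understanding is that Settings acts as a global fallback that components read from when no LLM is passed explicitly, roughly like this toy sketch (illustration only, not LlamaIndex source code); please correct me if that's wrong:

```python
# Toy sketch of a global-settings fallback pattern (illustration only,
# not LlamaIndex's actual code).
class Settings:
    llm = None  # set once, read everywhere

def get_synthesizer_llm(llm=None):
    # an explicit argument wins; otherwise fall back to the global setting
    return llm if llm is not None else Settings.llm

Settings.llm = "my-global-llm"
print(get_synthesizer_llm())            # falls back to the global setting
print(get_synthesizer_llm("override"))  # explicit argument wins
```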