
Updated 8 months ago

Is there a difference between summary

Is there a difference between the summary index and the document summary index? Can anyone help me better understand these two?
Summary index sends all nodes to the LLM to refine an answer.

Document summary index generates a summary for each document, and uses those summaries to pick which document(s) to use for the query.
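To make the contrast concrete, here is a toy sketch in plain Python. It is not llama-index code: the function names, the sample documents, and the summaries are all made up for illustration, and simple keyword overlap stands in for the LLM or embedding calls the real indexes would make.

```python
# Toy illustration only. None of these names are llama-index APIs;
# keyword overlap stands in for the LLM / embedding calls.

docs = {
    "cf_treatments": "Cystic fibrosis treatments include airway clearance and CFTR modulators.",
    "asthma_basics": "Asthma is managed with inhaled corticosteroids and bronchodilators.",
}

# Hypothetical per-document summaries, like the ones a document summary
# index would generate and store at build time.
summaries = {
    "cf_treatments": "treatments for cystic fibrosis",
    "asthma_basics": "overview of asthma management",
}


def words(text):
    """Lowercase a string and split it into a set of words."""
    return set(text.lower().replace(".", "").split())


def summary_index_nodes(query):
    """Summary-index style: every node is passed on, whatever the query is."""
    return list(docs.values())


def document_summary_nodes(query, top_k=1):
    """Document-summary style: rank documents by their summary, keep top_k."""
    ranked = sorted(
        summaries,
        key=lambda doc_id: len(words(query) & words(summaries[doc_id])),
        reverse=True,
    )
    return [docs[doc_id] for doc_id in ranked[:top_k]]


query = "cystic fibrosis treatments"
all_nodes = summary_index_nodes(query)
picked_nodes = document_summary_nodes(query)
print(len(all_nodes), len(picked_nodes))
```

The first function ignores the query when choosing what to send on, while the second uses only the summaries to decide which documents are worth reading.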
Thank you. I am having another problem.
I created a DocumentSummaryIndexLLMRetriever as shown:

Plain Text
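# Import added for completeness (llama_index.core package layout)
from llama_index.core.indices.document_summary import DocumentSummaryIndexLLMRetriever

retriever = DocumentSummaryIndexLLMRetriever(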
    index=doc_summary_index,
    choice_batch_size=10,
    choice_top_k=2,
)

retrieved_nodes = retriever.retrieve("Dé un resumen de los tratamientos para la fibrosis quística")
print(len(retrieved_nodes))
but it raises an error:

Plain Text
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-35-3639e5dd6690> in <cell line: 1>()
----> 1 retrieved_nodes = retriever.retrieve("Dé un resumen de los tratamientos para la fibrosis quística")
      2 print(len(retrieved_nodes))

3 frames
/usr/local/lib/python3.10/dist-packages/llama_index/core/instrumentation/dispatcher.py in wrapper(func, instance, args, kwargs)
    272             )
    273             try:
--> 274                 result = func(*args, **kwargs)
    275             except BaseException as e:
    276                 self.event(SpanDropEvent(span_id=id_, err_str=str(e)))

/usr/local/lib/python3.10/dist-packages/llama_index/core/base/base_retriever.py in retrieve(self, str_or_query_bundle)
    242                 payload={EventPayload.QUERY_STR: query_bundle.query_str},
    243             ) as retrieve_event:
--> 244                 nodes = self._retrieve(query_bundle)
    245                 nodes = self._handle_recursive_retrieval(query_bundle, nodes)
    246                 retrieve_event.on_end(

/usr/local/lib/python3.10/dist-packages/llama_index/core/indices/document_summary/retrievers.py in _retrieve(self, query_bundle)
     96                 query_str=query_str,
     97             )
---> 98             raw_choices, relevances = self._parse_choice_select_answer_fn(
     99                 raw_response, len(summary_nodes)
    100             )

/usr/local/lib/python3.10/dist-packages/llama_index/core/indices/utils.py in default_parse_choice_select_answer_fn(answer, num_choices, raise_error)
    102                     "answer_num: <int>, answer_relevance: <float>"
    103                 )
--> 104         answer_num = int(line_tokens[0].split(":")[1].strip())
    105         if answer_num > num_choices:
    106             continue

IndexError: list index out of range
Can you help me with this error? Why is it occurring, and how do I fix it?
The LLM is selecting which summary to use, and its answer isn't in a format that llama-index can parse: the parser splits each line on ":" and a line without one raises the IndexError.

The LLM retriever is a tad flaky this way; tbh I would just use the embedding retriever.
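For anyone curious why the parse fails, the sketch below reproduces it in plain Python. `strict_parse` is a simplified stand-in that reuses the expression from the last traceback frame; it is not the library's actual function. `tolerant_parse` is a hypothetical alternative that skips lines it cannot read, assuming the "<label>: <int>, <label>: <number>" line shape hinted at by the error message in the traceback. The sample answers are invented.

```python
import re


def strict_parse(answer):
    """Simplified stand-in for the failing line: assumes every line has a ':'."""
    nums = []
    for line in answer.strip().split("\n"):
        line_tokens = line.split(",")
        # Same expression as in the traceback; [1] fails when no ':' is present.
        nums.append(int(line_tokens[0].split(":")[1].strip()))
    return nums


# Assumed line shape: "<label>: <int>, <label>: <number>"
LINE_PATTERN = re.compile(
    r"^[^:,]+:\s*(\d+)\s*,\s*[^:,]+:\s*(\d+(?:\.\d+)?)$"
)


def tolerant_parse(answer, num_choices):
    """Hypothetical fallback: ignore lines that do not match the assumed shape."""
    choices = []
    for line in answer.splitlines():
        match = LINE_PATTERN.match(line.strip())
        if match is None:
            continue  # free-text commentary from the LLM, skip it
        choice_num = int(match.group(1))
        if 1 <= choice_num <= num_choices:
            choices.append((choice_num, float(match.group(2))))
    return choices


# Invented examples of what an LLM might return.
well_formed = "Doc: 2, Relevance: 8\nDoc: 1, Relevance: 5"
free_text = "El documento 2 es el más relevante.\nDoc: 2, Relevance: 8"

print(strict_parse(well_formed))
try:
    strict_parse(free_text)
except IndexError as exc:
    print("strict parser failed:", exc)
print(tolerant_parse(free_text, num_choices=2))
```

A function like `tolerant_parse` only papers over the symptom; if the LLM returns nothing in the expected shape, the result is simply empty, which is one more reason to prefer the embedding retriever.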