Updated 2 years ago

summarize fail

Plain Text
llm_predictor_chatgpt = LLMPredictor(llm=ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo"))
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor_chatgpt, chunk_size=1024)
# get the summary prompt
summary_prompt = ""
with open("summary_prompt.txt", "r") as f:
    summary_prompt = f.read()
summary_query = summary_prompt
print(f"Summary Query length {len(summary_query)}")
text = f"{post.title}\n{post.subtitle}\n{post.content}"
document = Document(text, article.url)
document_summary_index = DocumentSummaryIndex.from_documents([document], service_context=service_context)
index = document_summary_index.as_query_engine()
summary = index.query(summary_query)
print(f"Summary: {summary}")
7 comments
Plain Text
File "/Users/zachhandley/Documents/GitHub/AI-ChannelBot/article_summarizer.py", line 28, in summarize_article
    summary = index.query(summary_query)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/llama_index/indices/query/base.py", line 23, in query
    response = self._query(str_or_query_bundle)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/llama_index/query_engine/retriever_query_engine.py", line 142, in _query
    nodes = self._retriever.retrieve(query_bundle)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/llama_index/indices/base_retriever.py", line 21, in retrieve
    return self._retrieve(str_or_query_bundle)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/llama_index/indices/document_summary/retrievers.py", line 80, in _retrieve
    raw_choices, relevances = self._parse_choice_select_answer_fn(
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/llama_index/indices/utils.py", line 100, in default_parse_choice_select_answer_fn
    answer_num = int(line_tokens[0].split(":")[1].strip())
                     ~~~~~~~~~~~~~~~~~~~~~~~~~^^^
IndexError: list index out of range
Could you try reading the file like this?
Plain Text
# open the file with an explicit encoding; the with block closes it afterwards
with open("data.txt", "r", encoding="utf-8") as file1:
    # read the file
    read_content = file1.read()
It looks like the LLM isn't following instructions: the query is failing at the "selecting" step. This seems to be a small bug that shows up when the document summary index holds only one document; it shouldn't need to "select" a document when there is only one.

Although if you just want the summary of a file, I would use a ListIndex with response_mode="tree_summarize"

Plain Text
from llama_index import ListIndex
index = ListIndex.from_documents([document])
query_engine = index.as_query_engine(response_mode="tree_summarize")
# query through the query engine, not the index itself
response = query_engine.query("Summarize this document")
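For anyone curious why the "selecting" step blows up: the last frame of the traceback does `line_tokens[0].split(":")[1]` on each line of the LLM's answer. Here is a rough standalone sketch of that failure, assuming the selection prompt expects lines shaped like `Doc: 1, Relevance: 9` (that format is my assumption, and `strict_doc_number` / `parse_choice_line` are hypothetical helper names, not llama_index functions).

```python
from typing import Optional, Tuple


def strict_doc_number(line: str) -> int:
    """Mirror the expression from the traceback; raises IndexError if no ':' is present."""
    line_tokens = line.split(",")
    return int(line_tokens[0].split(":")[1].strip())


def parse_choice_line(line: str) -> Optional[Tuple[int, float]]:
    """Tolerant variant: return (doc_number, relevance), or None if the line is malformed."""
    tokens = line.split(",")
    if len(tokens) != 2:
        return None
    doc_part = tokens[0].split(":")
    rel_part = tokens[1].split(":")
    if len(doc_part) != 2 or len(rel_part) != 2:
        return None
    try:
        return int(doc_part[1].strip()), float(rel_part[1].strip())
    except ValueError:
        # The pieces were present but not numeric.
        return None


if __name__ == "__main__":
    for line in ["Doc: 1, Relevance: 9", "This article covers pricing, mostly"]:
        print(repr(line), "->", parse_choice_line(line))
```

If the model replies in free-form prose instead of the expected format, the strict version fails exactly like the traceback, while the tolerant one just skips the line.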
Gotcha, noted. Quick question: what would you recommend if I want to select 10 times from a list of things? Is an agent the right thing to use?
I did end up getting it to work.
Rather than using the document summary index as a query engine, I changed how I was asking for the summary: I used get_document_summary instead, and that worked really well.
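In case it helps someone else, a minimal sketch of that approach. It assumes the index exposes `get_document_summary(doc_id)` (as DocumentSummaryIndex does) and that the doc id is whatever was passed as the second argument to `Document`, which I believe was `article.url` in the original post. `get_summary` is just a hypothetical wrapper name.

```python
def get_summary(summary_index, doc_id: str) -> str:
    """Return the summary stored for doc_id when the index was built.

    summary_index is assumed to be a DocumentSummaryIndex, or any object
    exposing get_document_summary(doc_id). No retrieval or "select" step
    is involved, so the answer-parsing code from the traceback is never hit.
    """
    return summary_index.get_document_summary(doc_id)


# Hypothetical usage with the objects from the original post:
# summary = get_summary(document_summary_index, article.url)
# print(f"Summary: {summary}")
```

Note that this returns the summary produced at index-build time, so a custom prompt passed to a query engine would not be applied here.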