Hi, we are trying to summarize very long text

Hi, we are trying to summarize very long text. The use case: we extract the entire chat conversation between our customer and our customer service agent, then generate a summary so it's easier to hand off to another CS agent. We tried plain OpenAI calls, but we hit the token limit even with gpt-3.5-turbo-16k. I was thinking we could use llama_index for this use case. Have you guys tried this before? Any patterns we can use?

Would really appreciate it if you could point me in the right direction. TIA!
There are also a few other approaches, like:

Plain Text
from llama_index import ListIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("./data").load_data()

index = ListIndex.from_documents(documents)

# tree_summarize summarizes the chunks bottom-up into one final summary
response = index.as_query_engine(response_mode="tree_summarize").query("Summarize this text")
For a chat though, leaving the response mode as the default might work better, not sure.
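If you want to try that, here's a minimal sketch of the same setup with the default response mode (the default is a refine-style mode in this version of llama_index, as far as I know):

Plain Text
from llama_index import ListIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("./data").load_data()
index = ListIndex.from_documents(documents)

# No response_mode override: the default refines an answer chunk by chunk,
# which can keep more turn-by-turn detail for a chat transcript
response = index.as_query_engine().query("Summarize this conversation")
print(response)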
The example is working fine for a small document. However, I encounter this error with a big one:

WARNING:llama_index.llms.openai_utils:Retrying llama_index.llms.openai_utils.acompletion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised APIConnectionError: Error communicating with OpenAI.
@Logan M does the example you gave also use OpenAI?
Yeah, it uses OpenAI

That's a bit of an odd error. I'm going to assume OpenAI is just having some server issues?
Not sure, maybe it's a rate limit issue, I dunno
Do you mind sending the full code for the example you gave?
Sorry for being greedy 😄
I tried running it without OpenAI and it downloaded llama
Yea that's the local fallback
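If you'd rather not hit that fallback at all, one hedged option is to pin the LLM explicitly (this assumes the llama_index.llms.OpenAI wrapper from this era and an OPENAI_API_KEY in the environment):

Plain Text
from llama_index import ListIndex, ServiceContext, SimpleDirectoryReader
from llama_index.llms import OpenAI

# Pin the LLM so llama_index never falls back to downloading a local model
service_context = ServiceContext.from_defaults(llm=OpenAI(model="gpt-3.5-turbo-16k"))

documents = SimpleDirectoryReader("./data").load_data()
index = ListIndex.from_documents(documents, service_context=service_context)
response = index.as_query_engine(response_mode="tree_summarize").query("Summarize this text")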
I believe you that you have it set up right. Normally if it's a rate limit issue it will say RateLimitError 🤔
Are you just using OpenAI free credits right now, or nah?
I am using a paid OpenAI account
I just ran the entire code you gave:

Plain Text
import os
import logging
import sys

from llama_index import (
    SimpleDirectoryReader,
    ListIndex,
)

documents = SimpleDirectoryReader("./data").load_data()
index = ListIndex.from_documents(documents)
response = index.as_query_engine(response_mode="tree_summarize").query("Summarize this text")
Yeah, for me that always works 🤔 Maybe there's something weird about the text you are reading/sending?
Maybe, could it be the emojis? 😄
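One quick, hedged way to check is to inspect what the reader actually loaded (this assumes the Document objects expose a .text attribute, as they do in this version of llama_index):

Plain Text
from llama_index import SimpleDirectoryReader

documents = SimpleDirectoryReader("./data").load_data()

# Look for unusually large documents or lots of non-ASCII characters
# (emojis, smart quotes) that might be bloating the request payload
for i, doc in enumerate(documents):
    non_ascii = sum(1 for ch in doc.text if ord(ch) > 127)
    print(f"doc {i}: {len(doc.text)} chars, {non_ascii} non-ASCII")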
I am encountering this error with the example you gave, any idea what the issue is? It is using a local llama model:

llama_print_timings: load time = 92952.89 ms
llama_print_timings: sample time = 254.52 ms / 252 runs (1.01 ms per token, 990.10 tokens per second)
llama_print_timings: prompt eval time = 224784.60 ms / 1031 tokens (218.03 ms per token, 4.59 tokens per second)
llama_print_timings: eval time = 118937.99 ms / 251 runs (473.86 ms per token, 2.11 tokens per second)
llama_print_timings: total time = 345072.56 ms
Exception ignored in: <function Llama.__del__ at 0x124d45d00>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/llama_cpp/llama.py", line 1558, in __del__
TypeError: 'NoneType' object is not callable
Looks like a max token issue with llama
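If it is a context overflow, one sketch is to configure the local model explicitly instead of relying on the "local" default (the LlamaCPP wrapper and its parameter names here are taken from the llama_index docs of this era, so treat them as assumptions):

Plain Text
from llama_index import ListIndex, ServiceContext, SimpleDirectoryReader
from llama_index.llms import LlamaCPP

# Give llama.cpp a bigger context window and keep chunks small enough to fit
llm = LlamaCPP(
    context_window=3900,           # what llama_index budgets prompts against
    max_new_tokens=256,
    model_kwargs={"n_ctx": 4096},  # the actual llama.cpp context size
)
service_context = ServiceContext.from_defaults(llm=llm, chunk_size=1024)

documents = SimpleDirectoryReader("./data").load_data()
index = ListIndex.from_documents(documents, service_context=service_context)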
I was able to get a result but still have this error:


Spanish rule ended in 1898 with Spain's defeat in the Spanish–American
Exception ignored in: <function Llama.__del__ at 0x11e884fe0>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/llama_cpp/llama.py", line 1558, in __del__
TypeError: 'NoneType' object is not callable
Isn't chunking supported for llama?
I didn't encounter the max token issue using OpenAI
Tried this, but it didn't work:

Plain Text
service_context = ServiceContext.from_defaults(llm='local', chunk_size_limit=3000)
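One thing worth double-checking (an assumption on my part, since the snippet doesn't show it): the service_context has to actually be passed to the index, and in this version chunk_size replaces the deprecated chunk_size_limit kwarg:

Plain Text
from llama_index import ListIndex, ServiceContext, SimpleDirectoryReader

# chunk_size replaces the deprecated chunk_size_limit kwarg
service_context = ServiceContext.from_defaults(llm="local", chunk_size=3000)

documents = SimpleDirectoryReader("./data").load_data()
index = ListIndex.from_documents(documents, service_context=service_context)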
Is this really an error? I think it's just getting printed once your script has finished running.

I've seen this too; it's just a harmless llama-cpp-python bug that fires while the program is shutting down
I see, got it. Thanks!