
Hi, we are trying to summarize very long text

Hi, we are trying to summarize very long text. The use case: we extract the entire chat conversation between our customer and our customer service agent, then generate a summary so it's easier to hand off to another CS agent. We tried pure OpenAI calls, but we hit the token limit even with gpt-3.5-16k. I was thinking we could use llama_index for this use case. Have you guys tried this before? Any patterns we can use?

Would really appreciate it if you could point me in the right direction. TIA!
There are also a few other approaches, like:

Plain Text
from llama_index import SimpleDirectoryReader, ListIndex

# Load every file under ./data as documents
documents = SimpleDirectoryReader("./data").load_data()

# A ListIndex keeps all chunks and walks over them at query time
index = ListIndex.from_documents(documents)

# tree_summarize merges chunk-level summaries bottom-up into one answer
response = index.as_query_engine(response_mode="tree_summarize").query("Summarize this text")
For a chat though, leaving the response mode as the default might work better too, not sure
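For the chat-handoff use case specifically, something along these lines might be closer. A minimal sketch, assuming the exported transcript is already on disk as plain text; the file path and the query wording are placeholders, not from this thread:

Plain Text
from llama_index import Document, ListIndex

# Hypothetical path -- wherever the exported chat transcript lives
transcript = open("./data/chat_transcript.txt").read()

# Wrap the raw string in a Document instead of using SimpleDirectoryReader
index = ListIndex.from_documents([Document(text=transcript)])

# tree_summarize summarizes chunk by chunk and then merges the partial
# summaries, so the whole conversation never has to fit in one prompt
query_engine = index.as_query_engine(response_mode="tree_summarize")
response = query_engine.query(
    "Summarize this conversation for a handoff to another support agent"
)
print(response)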
the example works fine for a small document. however, I encounter this error with a big one:

Plain Text
WARNING:llama_index.llms.openai_utils:Retrying llama_index.llms.openai_utils.acompletion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised APIConnectionError: Error communicating with OpenAI.
@Logan M does the example you gave also use OpenAI?
Yea it uses openai

That's a bit of an odd error, I'm going to assume openai is just having some server issues?
not sure. maybe it's a rate limit issue i dunno
do you mind sending the full code for the example you gave?
sorry for being greedy 😄
i tried running it without OpenAI and it downloaded llama
Yea that's the local fallback
I believe you that you have it set up right. Normally if it's a rate limit issue it will say RateLimitError 🤔
Are you just using openai free credits right now or nah?
i am using a paid openai
i just ran the entire code you gave:

Plain Text
import os
import logging
import sys

from llama_index import (
    SimpleDirectoryReader,
    ListIndex,
)

documents = SimpleDirectoryReader("./data").load_data()

index = ListIndex.from_documents(documents)

response = index.as_query_engine(response_mode="tree_summarize").query("Summarize this text")
Yea, like for me that always works 🤔 maybe there's something weird about the text you are reading/sending?
maybe, could it be the emojis 😄
i am encountering this error with the example you gave, any ideas on what the issue is? it is using a local llama model

Plain Text
llama_print_timings:        load time = 92952.89 ms
llama_print_timings:      sample time =   254.52 ms /  252 runs   (   1.01 ms per token,  990.10 tokens per second)
llama_print_timings: prompt eval time = 224784.60 ms / 1031 tokens (  218.03 ms per token,    4.59 tokens per second)
llama_print_timings:        eval time = 118937.99 ms /  251 runs   (  473.86 ms per token,    2.11 tokens per second)
llama_print_timings:       total time = 345072.56 ms

Exception ignored in: <function Llama.__del__ at 0x124d45d00>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/llama_cpp/llama.py", line 1558, in __del__
TypeError: 'NoneType' object is not callable
looks like a max token issue with llama
i was able to get a result but still have this error:

Plain Text
Spanish rule ended in 1898 with Spain's defeat in the Spanish–American
Exception ignored in: <function Llama.__del__ at 0x11e884fe0>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/llama_cpp/llama.py", line 1558, in __del__
TypeError: 'NoneType' object is not callable
isn't chunking supported for llama?
i didn't encounter the max token issue using OpenAI
tried this but it didn't work:

Plain Text
from llama_index import ServiceContext

service_context = ServiceContext.from_defaults(llm='local', chunk_size_limit=3000)
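If it really is a context-window problem with the local model, configuring the llama.cpp LLM explicitly instead of the 'local' shortcut might help. A sketch, assuming a llama_index version that ships the LlamaCPP wrapper; the parameter values are illustrative, not from this thread:

Plain Text
from llama_index import ListIndex, ServiceContext, SimpleDirectoryReader
from llama_index.llms import LlamaCPP

# Illustrative values -- tune for your model and machine
llm = LlamaCPP(
    model_url=None,        # or a model download URL
    model_path=None,       # both None falls back to the default local model
    temperature=0.1,
    max_new_tokens=256,    # leave room in the window for the output
    context_window=3900,   # a little under the model's true limit
    verbose=True,
)

# Smaller chunks keep each summarize call well inside the window
service_context = ServiceContext.from_defaults(llm=llm, chunk_size=1024)

documents = SimpleDirectoryReader("./data").load_data()
index = ListIndex.from_documents(documents, service_context=service_context)
response = index.as_query_engine(response_mode="tree_summarize").query("Summarize this text")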
Is this really an error? I think this is getting printed once your script has finished running.

I've seen this too, it's just a harmless llama.cpp bug when the program is shutting down
i see got it. thanks!