Find answers from the community

donvito
Joined September 25, 2024
I tried to set this:

Plain Text
service_context = ServiceContext.from_defaults(llm='local', chunk_size_limit=3000)


but I am still getting this error with llama2-13B, the default local model:

Plain Text
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/llama_cpp/llama.py", line 900, in _create_completion
    raise ValueError(
ValueError: Requested tokens (3993) exceed context window of 3900


Any ideas what I am doing wrong?
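One possible fix, as a sketch rather than a confirmed solution (assuming a 0.6-era LlamaIndex API): tell the prompt helper the model's actual context window so retrieved chunks plus the reserved output fit inside it, and use a smaller chunk size for headroom. The 3900 figure comes from the llama.cpp error above; num_output=256 is an assumed value.

Python
from llama_index import PromptHelper, ServiceContext

# Match the context window llama.cpp reports (3900) and reserve
# room for the completion so prompt + output stay under the limit.
prompt_helper = PromptHelper(
    context_window=3900,
    num_output=256,
    chunk_overlap_ratio=0.1,
)
service_context = ServiceContext.from_defaults(
    llm="local",
    prompt_helper=prompt_helper,
    chunk_size_limit=1024,  # smaller chunks leave headroom in the prompt
)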
3 comments
Hi, I have a use case where my company wants questions answered using the exact text we feed into the LLM. Is this even possible? How can it be done? It is a document chat/query use case.
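One direction, sketched under the assumption of a 0.x LlamaIndex with VectorStoreIndex (documents stands in for your already-loaded documents): query the index, then return the retrieved source chunks verbatim alongside, or instead of, the generated answer.

Python
from llama_index import VectorStoreIndex

# documents: assumed loaded elsewhere, e.g. via SimpleDirectoryReader
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(similarity_top_k=2)
response = query_engine.query("your question here")

# The exact chunks the answer was grounded in:
for source in response.source_nodes:
    print(source.node.get_text())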
3 comments
Hi, we are trying to summarize very long text. The use case: we extract the entire chat conversation between our customer and our customer service agent, then generate a summary so it's easier to hand off to another CS agent. We tried pure OpenAI calls, but we hit the token limit even with gpt-3.5-turbo-16k. I was thinking we could use llama_index for this. Have you tried this before? Any patterns we can use?

Would really appreciate it if you could point me in the right direction. TIA!
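One pattern worth trying, as a sketch (assuming a 0.7-era LlamaIndex; long_transcript is a hypothetical string holding the full conversation): tree_summarize summarizes each chunk, then summarizes the summaries, so the whole transcript never has to fit into a single prompt.

Python
from llama_index import Document, ListIndex

doc = Document(text=long_transcript)
index = ListIndex.from_documents([doc])
# tree_summarize builds a bottom-up summary over all chunks
query_engine = index.as_query_engine(response_mode="tree_summarize")
summary = query_engine.query(
    "Summarize this conversation for handoff to another support agent."
)
print(summary)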
26 comments
I am getting this answer when using llm=ChatOpenAI. Even though I indexed my entire data set, it seems it is not added to the context. Any ideas how I can get it to answer more accurately?

Answer: The context provided is about ... Therefore, the original answer remains the same.
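That "original answer remains the same" phrasing typically comes from the refine response mode. One thing to check, sketched with the LangChain-era API used elsewhere in these threads (documents is a stand-in for your loaded data): make sure the chat model is wired into the service context, and try a compact response mode with more retrieved chunks so each LLM call actually sees the indexed context.

Python
from langchain.chat_models import ChatOpenAI
from llama_index import LLMPredictor, ServiceContext, VectorStoreIndex

llm_predictor = LLMPredictor(llm=ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo"))
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)
index = VectorStoreIndex.from_documents(documents, service_context=service_context)
# "compact" stuffs more retrieved text into each call than "refine"
query_engine = index.as_query_engine(similarity_top_k=3, response_mode="compact")
response = query_engine.query("your question here")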
6 comments
Hi, does GPTSimpleVectorIndex support changing the LLM predictor? I checked my usage and it is still falling back to text-davinci-003. Here's the gist of the code.

# define LLM
llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="gpt-3.5-turbo", max_tokens=512))
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, prompt_helper=prompt_helper)
index = GPTSimpleVectorIndex.from_documents(
    documents, service_context=service_context
)
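A likely culprit, though only a guess from the snippet: gpt-3.5-turbo is a chat model, so it should be wrapped in LangChain's ChatOpenAI rather than the completion-style OpenAI class; otherwise the call can fall back to the default completion model, text-davinci-003. A minimal sketch of that change:

Python
from langchain.chat_models import ChatOpenAI
from llama_index import GPTSimpleVectorIndex, LLMPredictor, ServiceContext

# ChatOpenAI targets the chat-completions endpoint that
# gpt-3.5-turbo actually lives on
llm_predictor = LLMPredictor(
    llm=ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo", max_tokens=512)
)
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)
index = GPTSimpleVectorIndex.from_documents(documents, service_context=service_context)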
4 comments
Costs

Hi, is there a way to limit the OpenAI API tokens generated in LlamaIndex? I just want to control costs since I am exploring using my own funds. 😄
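Two levers worth trying, sketched under the assumption of a LlamaIndex version that ships the callbacks module: cap completion length with max_tokens on the LLM, and count tokens per run so spend stays visible.

Python
import tiktoken
from llama_index import ServiceContext
from llama_index.callbacks import CallbackManager, TokenCountingHandler
from llama_index.llms import OpenAI

token_counter = TokenCountingHandler(
    tokenizer=tiktoken.encoding_for_model("gpt-3.5-turbo").encode
)
service_context = ServiceContext.from_defaults(
    llm=OpenAI(model="gpt-3.5-turbo", max_tokens=256),  # hard cap on output length
    callback_manager=CallbackManager([token_counter]),
)
# after running queries against an index built with this service context:
print(token_counter.total_llm_token_count)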
2 comments