Any ideas on how to protect against JSON failures when using output_cls

Any ideas on how to protect against JSON failures when using output_cls:
Plain Text
  File "/Users/bmax/src/pursuit/ai/lib/python3.8/site-packages/typing_extensions.py", line 2562, in wrapper
    return __arg(*args, **kwargs)
  File "/Users/bmax/src/pursuit/ai/lib/python3.8/site-packages/pydantic/main.py", line 1026, in parse_raw
    raise pydantic_core.ValidationError.from_exception_data(cls.__name__, [error])
pydantic_core._pydantic_core.ValidationError: 1 validation error for ContactList
__root__
  Unterminated string starting at: line 232 column 15 (char 4738) [type=value_error.jsondecode, input_value='{\n  "contacts": [\n    ... Clogged Weekdays 6 a.m', input_type=str]
Plain Text
            output = self.output_cls.parse_raw(function_call["arguments"])

this is the line that's failing
Looks to me like the LLM ran out of room writing the JSON response 👀 '{\n "contacts": [\n ... Clogged Weekdays 6 a.m' is missing the closing brace
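For reference, a truncated payload like that reproduces the same error with pydantic alone; the field names below are made up, only the contacts key is known from the traceback:
Plain Text
from typing import List

from pydantic import BaseModel, ValidationError


class Contact(BaseModel):
    # hypothetical fields; the real ContactList schema isn't shown in the thread
    name: str
    notes: str


class ContactList(BaseModel):
    contacts: List[Contact]


# JSON cut off mid-string, like the LLM output in the traceback above
truncated = '{\n  "contacts": [\n    {"name": "Acme Drains", "notes": "Clogged Weekdays 6 a.m'

try:
    ContactList.parse_raw(truncated)
except ValidationError as err:
    print(err)  # reports the JSON decode failure (unterminated string)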
oh that makes sense
there is a shit ton of data in this one...
okay, let me rephrase the question 😅
is there a way to have like a dynamic max_tokens?
I think it's already dynamic by default 😅 it might actually just be running out of room
Plain Text
openai_model = 'gpt-3.5-turbo-16k'
# set context window
context_window = 14300
# set number of output tokens
max_tokens = 1500
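For context, those numbers roughly correspond to a setup like this (a sketch against the legacy ServiceContext API; exact kwargs, e.g. num_output, may vary by llama-index version):
Plain Text
from llama_index import ServiceContext
from llama_index.llms import OpenAI

# the LLM's own cap on generated tokens
llm = OpenAI(model="gpt-3.5-turbo-16k", max_tokens=1500)

service_context = ServiceContext.from_defaults(
    llm=llm,
    context_window=14300,  # token budget llama-index assumes for the model
    num_output=1500,       # room llama-index reserves for the response
)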
but like... sometimes I don't need 1500 for max_tokens
set max_tokens=None
sometimes I need way less
sometimes I need more
let me try this None magic
looks like it doesn't work, something errors deep in the OpenAI code
You'll need to set num_outputs in the service context then too
like, llama-index has to leave room for tokens to generate
I do set num_outputs
should I do None?
cool. Might want to like... double that
but then that means my context window for the nodes will be smaller?
less context per call?
yes, 13,500 instead of 14,500
or something like that
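rough arithmetic, using the numbers from the config above:
Plain Text
context_window = 14300                        # what llama-index is told the model can hold
num_output = 1500                             # reserved for the generated answer
prompt_budget = context_window - num_output   # ~12,800 tokens left for the prompt + retrieved nodes
print(prompt_budget)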
like there's two sides to this coin
letting the model generate as much as it can (max_tokens) and configuring llama-index to leave room for that (num_outputs)
got it, that's where max_tokens = None comes in
that worked
max_tokens=None and num_outputs=1500
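So the combination that ended up working, sketched against the same legacy API as above (again, kwarg names are an assumption and may differ by version):
Plain Text
from llama_index import ServiceContext
from llama_index.llms import OpenAI

# max_tokens=None leaves the output cap to the API, so the model can use
# whatever room remains in the context window
llm = OpenAI(model="gpt-3.5-turbo-16k", max_tokens=None)

service_context = ServiceContext.from_defaults(
    llm=llm,
    context_window=14300,
    num_output=1500,  # llama-index still reserves this much room for the response
)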