We use PydanticProgramExtractor to get a list of tags as well as a summary, and we see a strange error where content is repeated endlessly. This then causes the validation to fail.

This is our code:

Plain Text
# Imports implied by the snippet (paths as of llama_index 0.10.x)
from llama_index.core.extractors import PydanticProgramExtractor
from llama_index.program.openai import OpenAIPydanticProgram

EXTRACT_TEMPLATE_STR = """\
Here is the content of a section:
----------------
{context_str}
----------------
Given the contextual information, extract out a {class_name} object.\
"""

openai_program_summary = OpenAIPydanticProgram.from_defaults(
    llm=get_llm(model=MODEL_BASIC),
    output_cls=NodeSummaryMetadata,
    prompt_template_str="You must answer in the same language as the context given. {input}",
    extract_template_str=EXTRACT_TEMPLATE_STR,
)

openai_program_keywords = OpenAIPydanticProgram.from_defaults(
    llm=get_llm(model=MODEL_BASIC),
    output_cls=NodeKeywordsMetadata,
    prompt_template_str="You must answer in the same language as the context given. {input}",
    extract_template_str=EXTRACT_TEMPLATE_STR,
)

summary_extractor = PydanticProgramExtractor(program=openai_program_summary, input_key="input", num_workers=12)
keywords_extractor = PydanticProgramExtractor(program=openai_program_keywords, input_key="input", num_workers=12)
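For context, the thread never shows the two `output_cls` models. Judging from the `excerpt_keywords` field visible in the error payload, they are presumably small Pydantic models along these lines (everything here except the `excerpt_keywords` field name is an assumption):

```python
from typing import List

from pydantic import BaseModel, Field


class NodeSummaryMetadata(BaseModel):
    # Assumed shape -- this model is not shown in the thread.
    section_summary: str = Field(description="Concise summary of the node's content.")


class NodeKeywordsMetadata(BaseModel):
    # `excerpt_keywords` matches the field name visible in the error payload.
    excerpt_keywords: List[str] = Field(description="Keywords extracted from the node.")
```

Bounding the keyword list (e.g. "at most 10 keywords" in the field description, or a length validator) can reduce the chance of the runaway generation shown in the error below.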
And here is an example of an error:

Plain Text
1 validation error for NodeKeywordsMetadata
__root__
  Unterminated string starting at: line 1 column <NUMBER> (char <NUMBER>) (type=value_error.jsondecode; msg=Unterminated string starting at; doc={"excerpt_keywords":["MMS-feasibility","EFX-feasible","strong envy","reallocation","valuation function","PR algorithm","2-partition","3-partition","invariants","Lemma 4","allocation","agents","bundle","scenario","allocation scenario","allocation X","agent 1","agent 2","agent 3","valuation functions","valid partition","output","favourite bundle","favourite","max","min","feasibility","case","observation","proof","analysis","valid","run","pick","choose","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe","observe; pos=<NUMBER>; lineno=1; colno=<NUMBER>)
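The validation failure wraps a plain JSON decode error: the model kept emitting `"observe"` tokens until it hit its output limit, leaving the final string unterminated. A minimal stdlib reproduction of that decode error (the payload here is made up):

```python
import json

# Raw LLM output cut off mid-string, so the JSON document never closes.
truncated = '{"excerpt_keywords": ["allocation", "agents", "observe'

try:
    json.loads(truncated)
except json.JSONDecodeError as exc:
    print(exc.msg)  # Unterminated string starting at
```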


Anything we are doing wrong here?
Nothing you are doing wrong, just the LLM having a freak out it seems lol
Something to do with the content it is reading, I guess?
What LLM are you using?
Try playing with the temperature a bit
We use a temp of 0.0. This is Azure GPT-3.5 Turbo 0125, but we also saw this same situation using the same model with OpenAI's API.
It is quite rare though so can't really reproduce it easily...
yea seems like something in the input prompt is causing the LLM to just freak out -- not much can be done I think, besides maybe figuring out what piece of text caused this and if it needs to be cleaned?
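One cheap guard along those lines: before handing the raw output to validation (or when deciding which nodes to clean and retry), flag outputs where a single token repeats many times in a row, as in the "observe" run above. A hypothetical stdlib sketch (`looks_degenerate` and the threshold are made up for illustration):

```python
import re


def looks_degenerate(text: str, threshold: int = 10) -> bool:
    """Return True if any single word repeats `threshold`+ times in a row.

    Catches runaway generations like "observe","observe","observe",...
    before (or instead of) letting JSON/Pydantic validation blow up.
    """
    pattern = rf"\b(\w+)\b(?:\W+\1\b){{{threshold},}}"
    return re.search(pattern, text) is not None


raw = '{"excerpt_keywords":["' + '","'.join(["observe"] * 40) + '"]}'
print(looks_degenerate(raw))   # True
print(looks_degenerate('{"excerpt_keywords":["allocation","agents"]}'))  # False
```

Nodes that trip this check could be logged so you can inspect the source text and decide whether it needs cleaning before extraction.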