
Updated 3 months ago


Hi, does anybody have an example of a working PydanticProgramExtractor? I might be missing something obvious. I was following an example in the docs for 0.10.15 and am getting a Field required [type=missing, input_value={}, input_type=dict] error.

def run_metadata_pipeline(nodes, node_parser):
    openai_program = OpenAIPydanticProgram.from_defaults(
        output_cls=NodeMetadata,
        prompt_template_str="{input}",
        # extract_template_str=EXTRACT_TEMPLATE_STR
    )
    program_extractor = PydanticProgramExtractor(
        program=openai_program,
        input_key="input",
        show_progress=True,
        metadata_mode=MetadataMode.EMBED,
        workers=3,
    )
    extractor = [
        QuestionsAnsweredExtractor(questions=3, metadata_mode=MetadataMode.EMBED, workers=3),
        SummaryExtractor(summaries=["prev", "self", "next"], workers=3),
    ]
    pipeline = IngestionPipeline(transformations=[program_extractor])
    return pipeline.run(nodes=nodes, in_place=False, show_progress=True)


here's my NodeMetadata class:

class NodeMetadata(BaseModel):
    """Node metadata."""

    description: str = Field(
        ...,
        description="A concise one-sentence description of what this text chunk is useful for.",
    )
    terms: List[str] = Field(
        ...,
        description="A list of keywords used in this text chunk.",
    )
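For context, the error in the question can be reproduced directly, without any LLM in the loop, by validating an empty dict against this model. A minimal sketch, assuming Pydantic v2 (which produces the "Field required [type=missing, ...]" message shown above):

```python
from typing import List
from pydantic import BaseModel, Field, ValidationError


class NodeMetadata(BaseModel):
    """Node metadata."""

    description: str = Field(
        ..., description="A concise one-sentence description of what this text chunk is useful for."
    )
    terms: List[str] = Field(
        ..., description="A list of keywords used in this text chunk."
    )


# When the LLM returns an empty (or incomplete) object, validation fails
# with the same "Field required" error seen in the extraction pipeline.
try:
    NodeMetadata.model_validate({})
except ValidationError as e:
    print(e)
```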

The thing is that it works for a while, but the error is thrown while processing 1-2% of the nodes. If you are aware of a bug in this regard, please let me know. Thanks.
7 comments
Hi,
Can you share the whole error traceback
sure. here we go:
Are you using open source LLM?
nope.
llm = "gpt-3.5-turbo-0125"
embeddings = "text-embedding-3-small"
this error basically just means the LLM did not output the proper fields
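Since the failure is the LLM occasionally omitting required fields, one common mitigation is to simply retry the call when validation fails. A generic sketch of that idea (the `extract_with_retry` helper and the `call_llm` stand-in are hypothetical, not part of the LlamaIndex API):

```python
from typing import Callable, Type, TypeVar
from pydantic import BaseModel, ValidationError

T = TypeVar("T", bound=BaseModel)


def extract_with_retry(call_llm: Callable[[], dict],
                       output_cls: Type[T],
                       max_retries: int = 3) -> T:
    """Re-invoke the LLM when its raw output fails Pydantic validation."""
    last_err = None
    for _attempt in range(max_retries):
        raw = call_llm()  # the raw dict the LLM produced
        try:
            return output_cls.model_validate(raw)
        except ValidationError as err:
            last_err = err  # LLM omitted a required field; try again
    raise last_err


# Simulated flaky LLM: returns an empty dict first, then a valid one.
class Meta(BaseModel):
    description: str


outputs = iter([{}, {"description": "a chunk about retries"}])
meta = extract_with_retry(lambda: next(outputs), Meta)
print(meta.description)
```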
any way we can manage? It is very unreliable then.
That's the nature of using LLMs to output structured data right now
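Another way to soften this, if occasional empty metadata is acceptable, is to relax the schema itself: give the fields defaults so a partial LLM response degrades gracefully instead of aborting the pipeline. A sketch assuming Pydantic v2 (the `TolerantNodeMetadata` name is illustrative):

```python
from typing import List
from pydantic import BaseModel, Field


class TolerantNodeMetadata(BaseModel):
    """Same schema as NodeMetadata, but with required-ness relaxed
    so a missing field does not raise during extraction."""

    description: str = Field(
        default="", description="A concise one-sentence description of the chunk."
    )
    terms: List[str] = Field(
        default_factory=list, description="Keywords used in this text chunk."
    )


# An empty LLM response now validates instead of raising "Field required".
meta = TolerantNodeMetadata.model_validate({})
print(meta)
```

The trade-off is that you then have to detect and possibly re-process nodes whose metadata came back empty.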