I am trying to extract structure data

At a glance

The community member is trying to extract structured data from unstructured text using structured_predict, but is encountering an issue where llama-index is unable to parse a datetime.date field in their Pydantic model, throwing an "Invalid Date Format" error. The community members suggest writing a custom validator for the field to handle the formatting, as the LLM is writing a string that Pydantic cannot parse. They also discuss whether this functionality could be provided out-of-the-box in llama-index, as date parsing is a common requirement. Additionally, a community member inquires about why llama-index expects Pydantic v1 for BaseModels, and the response explains that the recent v0.11.x release has added full Pydantic v2 support, which was a significant undertaking.

ddhiraj

I am trying to extract structured data from unstructured text using structured_predict, but if my Pydantic model has a field of type datetime.date, llama-index is not able to parse the string and throws an "Invalid Date Format" error. How can this be fixed?

11 comments

Logan M

Seems like the LLM is writing a string that pydantic can't parse into a datetime format?

Have you tried writing a custom validator for that field? You could help format the field and create the proper type

ddhiraj

That's right, the LLM is writing a string that pydantic is not able to parse

ddhiraj

Haven't tried to write a custom validator though.

ddhiraj

When using the instructor library, this parsing is done by the library, right?

Logan M

I would go the validator route, then you can handle it directly in the pydantic class
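For reference, the validator route could look roughly like the sketch below. It assumes pydantic v2 (`field_validator` with `mode="before"`); on the pydantic.v1 layer the equivalent decorator is `@validator("issued_on", pre=True)`. The `Invoice` model, its field names, and the `DATE_FORMATS` list are hypothetical and only there for illustration; the formats the LLM actually emits will depend on your prompt and data.

```python
from datetime import date, datetime
from typing import Any

from pydantic import BaseModel, field_validator

# Hypothetical list of formats the LLM might emit; adjust to your data.
# Note that "%d/%m/%Y" assumes day-first ordering for slash-separated dates.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%B %d, %Y", "%d %B %Y")


class Invoice(BaseModel):  # hypothetical model, not from the thread
    vendor: str
    issued_on: date

    @field_validator("issued_on", mode="before")
    @classmethod
    def parse_issued_on(cls, value: Any) -> Any:
        # Non-string input (e.g. an actual date object) is passed through
        # to pydantic's own validation unchanged.
        if not isinstance(value, str):
            return value
        text = value.strip()
        # Try each known format in turn and return the first match.
        for fmt in DATE_FORMATS:
            try:
                return datetime.strptime(text, fmt).date()
            except ValueError:
                continue
        raise ValueError(f"Unrecognized date format: {value!r}")


invoice = Invoice(vendor="Acme", issued_on="March 5, 2024")
print(invoice.issued_on)
```

Because the validator runs before pydantic's own date parsing, the model class can be handed to structured_predict as before, and any string the LLM writes in one of the listed formats should be converted to a `datetime.date`. Anything else still raises a validation error, which is probably what you want rather than silently guessing.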

Logan M

I'm not sure on instructor or how it works tbh

ddhiraj

I was thinking maybe this could be provided as an out-of-the-box method in llama-index

ddhiraj

since date is a pretty common format

ddhiraj

@Logan M - by the way, I would like to know why llama-index is expecting pydantic.v1 for BaseModels?

Logan M

Because v0.10.x and below are built entirely on pydantic v1, and use the pydantic.v1 layer to maintain compatibility

We actually JUST released v0.11.x, with one of the main changes being full pydantic v2 support (no more v1). You'd be surprised just how breaking of a change this was under the hood, it was a huge amount of work to migrate 😅

ddhiraj

oh daam!!!
really appreciate all the hard work being done by you guys. You guys are too awesome
