Structured Data Extraction with o1-mini and o1-preview

Hi, I am trying to extract a pydantic model and am having success with "gpt-4o" and "gpt-4o-mini", but I cannot get it working with "o1-preview" or "o1-mini".
Can anyone help with how to perform structured data extraction with o1-mini and o1-preview, please?

Python
from typing import List

from pydantic import BaseModel
from llama_index.core import Document, VectorStoreIndex
from llama_index.llms.openai import OpenAI


class Biography(BaseModel):
    """Data model for a biography."""

    name: str
    best_known_for: List[str]
    extra_info: str

documents = [Document(
    text="My name is John Dewberry, I am known for my back flips and I live under a mushroom in Alberta"
)]

index = VectorStoreIndex.from_documents(documents)

llms = ["gpt-4o", "gpt-4o-mini", "o1-preview", "o1-mini"]
for llm_name in llms:

    llm = OpenAI(model=llm_name, temperature=0.1)

    query_engine = index.as_query_engine(
        output_cls=Biography, response_mode="compact", llm=llm
    )

    response = query_engine.query("Who is John DewBerry?")

    print(f"{llm_name}: {response.response.model_dump_json()}")  # this fails with BadRequestError: Error code: 400 - {'error': {'message': "Unsupported parameter: 'tool_choice' is not supported with this model."
o1 does not support tools/functions afaik
Thanks Logan. I see that OpenAI suggests chaining: "The o1 class of models currently doesn't have structured outputs support, but we can re-use existing structured outputs functionality from gpt-4o-mini by chaining two requests together. This flow currently requires two calls, but the second gpt-4o-mini call cost should be minimal compared to the o1-preview/o1-mini calls."
Is chaining like this possible in LlamaIndex?
I guess so - Is that QueryPipeline?
OpenAI doesn't have full context here. I don't think putting two LLM calls together here makes total sense πŸ€”
Maybe I am being unclear, Logan. I need a pydantic response, but on occasion I also need a powerful model like o1. So I could ask o1 to perform the complex task and deliver the result as JSON in the completion output (requested in the prompt). I could then pass this JSON string (which may not be well formed or may not validate) to an LLM such as gpt-4o-mini and convert it via function calling into a well-structured pydantic response. I know this is possible; I was just wondering if there are tools for this kind of workflow in LlamaIndex.
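Roughly this kind of two-call chain, as a rough, untested sketch (the prompts here are just illustrative; structured_predict on the gpt-4o-mini side handles the function-calling conversion):

Python
from typing import List

from pydantic import BaseModel
from llama_index.core import PromptTemplate
from llama_index.llms.openai import OpenAI


class Biography(BaseModel):
    name: str
    best_known_for: List[str]
    extra_info: str


context = (
    "My name is John Dewberry, I am known for my back flips "
    "and I live under a mushroom in Alberta"
)

# Call 1: o1 does the heavy lifting and answers in (possibly loose) JSON.
raw = OpenAI(model="o1-preview").complete(
    f"Context: {context}\n\nWho is John Dewberry? "
    "Reply as JSON with keys 'name', 'best_known_for' (list of strings) "
    "and 'extra_info'."
).text

# Call 2: gpt-4o-mini coerces the raw text into the pydantic model via
# function calling.
bio = OpenAI(model="gpt-4o-mini").structured_predict(
    Biography,
    PromptTemplate("Convert the following into the requested schema:\n{raw}"),
    raw=raw,
)
print(bio.model_dump_json())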
I mean, maybe, if you wrote the flow using lower-level components, like using a retriever and an LLM directly inside a workflow: https://docs.llamaindex.ai/en/stable/module_guides/workflow/#workflows
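Something along these lines, as an untested sketch (it assumes the Biography model and index from the original post; the step logic and prompts are illustrative):

Python
from llama_index.core import PromptTemplate
from llama_index.core.workflow import StartEvent, StopEvent, Workflow, step
from llama_index.llms.openai import OpenAI


class BiographyFlow(Workflow):
    @step
    async def extract(self, ev: StartEvent) -> StopEvent:
        # ev.index and ev.query are whatever you pass to run() below
        retriever = ev.index.as_retriever()
        nodes = retriever.retrieve(ev.query)
        context = "\n".join(n.get_content() for n in nodes)

        # Call 1: o1 does the reasoning and answers in (possibly loose) JSON
        raw = OpenAI(model="o1-preview").complete(
            f"Context:\n{context}\n\nQuestion: {ev.query}\n"
            "Reply as JSON with keys 'name', 'best_known_for' and 'extra_info'."
        ).text

        # Call 2: gpt-4o-mini validates the raw text into the pydantic model
        # (Biography is the model from the original post)
        bio = OpenAI(model="gpt-4o-mini").structured_predict(
            Biography,
            PromptTemplate(
                "Convert the following into the requested schema:\n{raw}"
            ),
            raw=raw,
        )
        return StopEvent(result=bio)


# result = await BiographyFlow(timeout=120).run(
#     index=index, query="Who is John Dewberry?"
# )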