Does anybody know if there's an ACCUMULATE-style extraction program in LlamaIndex? Where an extraction is attempted for each chunk individually and then ACCUMULATED into a response? Along the same lines, is there a program where you can augment with context, and where you can reference source_nodes? Additionally, is there any output_parser that selects the correct results from ACCUMULATE? The closest I've seen is references to structured refine, but I'm not even really sure what kind of extraction program that is using.
you could use accumulate with structured outputs I think, but you'd need some tricks to parse the output (since it dumps to string)

structured outputs in a query engine use either function calling (if supported) or just prompting the llm to produce structured output
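roughly something like this (untested sketch; `Invoice`, the data path, and the query are placeholders, not an official recipe):

Python
from pydantic import BaseModel
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

class Invoice(BaseModel):
    vendor: str
    total: float

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

# "accumulate" runs the query against each retrieved chunk individually
# and joins the per-chunk answers into one response
query_engine = index.as_query_engine(
    response_mode="accumulate",
    output_cls=Invoice,
)
response = query_engine.query("Extract the vendor and total from each invoice.")
print(str(response))          # joined per-chunk outputs, dumped to a string
print(response.source_nodes)  # the chunks each extraction came from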
Hmm yeah, I feel like maybe we should redesign to allow a program kwarg for a query engine or response synthesizer so you can also select the structured output type (prompting vs functions vs new api...)
you can set it on the llm

llm = OpenAI(..., pydantic_program_mode="...")

Not every LLM supports function calling, but function calling will be the most reliable
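spelled out a bit more (sketch; the model name is just an example):

Python
from llama_index.llms.openai import OpenAI
from llama_index.core.types import PydanticProgramMode

# the mode can also be passed as a plain string, e.g. "function"
llm = OpenAI(model="gpt-4", pydantic_program_mode=PydanticProgramMode.FUNCTION)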
Ah ok, does "openai" use assistants and "function" use plain function calling API?
yessir (we added a more general function api, for any LLM that supports function calling/tool calling)
Thanks, last Q: what's the difference between llm and default?
Is my understanding correct?
Plain Text
| PydanticProgramMode | Description |
|---------------------|-------------|
| default  | Traditional prompt extraction: the template is formatted and the resulting string is sent to the LLM along with the context and instructions. Note: this mode gives no specific instructions on how the model should format its output; that is left to the template and query. |
| openai   | Uses the [new OpenAI API](https://openai.com/blog/function-calling-and-other-api-updates) function calling to extract a structured object, leveraging OpenAI's native JSON fine-tuning. Only works with Azure/OpenAI LLMs that natively support function calling with the new API. |
| function | Uses function calling to extract a structured object by leveraging JSON generation instructions/fine-tuning. Only works with LLMs that natively support function calling. |
| llm      | For LLMs without function calling, leverages JSON generation instructions to attempt to extract structured data. |
default picks function if supported, otherwise uses llm

llm sets it to LLMTextCompletionProgram explicitly
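so for a non-function-calling LLM, the llm mode is roughly equivalent to building the program yourself, something like (sketch based on the docs; `Song` and the prompt are placeholders):

Python
from pydantic import BaseModel
from llama_index.core.program import LLMTextCompletionProgram

class Song(BaseModel):
    title: str
    length_seconds: int

# prompts the LLM for JSON matching the schema and parses the text output,
# so it works even without native function/tool calling
program = LLMTextCompletionProgram.from_defaults(
    output_cls=Song,
    prompt_template_str="Generate an example song from the movie {movie_name}.",
)
output = program(movie_name="The Shining")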
So then does 'None' use a regular text completion without Structured Output?
Happy to document all of these modes, etc. in the code. I think we're missing some good documentation around structured outputs.
hmm, None would probably lead to default
Oh no, I'm so confused. It seems that llama_index/core/response_synthesizers/refine.py:485 attempts to:
Plain Text
structured_response = cast(
    StructuredRefineResponse, structured_response
)
query_satisfied = structured_response.query_satisfied

But structured_response is my Pydantic BaseModel. What am I missing here? Why does it try to cast my Pydantic model to a different one, StructuredRefineResponse?
I think, from what I'm seeing, StructuredRefine essentially has its own Pydantic model (StructuredRefineResponse) that it uses with the PydanticProgram and does not accept the user's models?
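for reference, that internal model looks roughly like this (paraphrased from refine.py, field descriptions approximate):

Python
from pydantic import BaseModel, Field

class StructuredRefineResponse(BaseModel):
    answer: str = Field(
        description="The answer for the given query, based on the context."
    )
    query_satisfied: bool = Field(
        description="True if there was enough context to satisfy the query."
    )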
It still accepts the user's pydantic model. Otherwise, output_cls wouldn't work πŸ‘€
Yeah, I think the problem is structured_answer_filtering=True is broken. I can take a deeper look this weekend 😐
ah yea, that might clash if you specify both at the same time
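i.e. a combination like this (hypothetical repro; `MyModel` is a stand-in for the user's own model):

Python
from pydantic import BaseModel
from llama_index.core import get_response_synthesizer

class MyModel(BaseModel):
    title: str

# structured_answer_filtering wraps synthesis in the internal
# StructuredRefineResponse program, which clashes with a user output_cls
synth = get_response_synthesizer(
    response_mode="refine",
    output_cls=MyModel,
    structured_answer_filtering=True,
)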