Does anybody know if there's an ACCUMULATE-style extraction program in LlamaIndex? Where an extraction is attempted for each chunk individually and then ACCUMULATED into a response? Along the same lines, is there a program where you can augment with context, and where you can reference source_nodes? Additionally, is there any output_parser that selects the correct results from ACCUMULATE? The closest I've seen is references to structured refine, but I'm not even really sure what kind of extraction program that is using.
you could use accumulate with structured outputs I think, but you'd need some tricks to parse the output (since it dumps to string)

structured outputs in a query engine use either function calling (if supported) or just prompting the llm to produce structured output
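roughly something like this (untested sketch; `Invoice`, the data path, and the query are placeholders, not an official recipe):

Python
from pydantic import BaseModel
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

class Invoice(BaseModel):
    vendor: str
    total: float

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

# "accumulate" runs the query against each retrieved chunk individually
# and joins the per-chunk answers into one response
query_engine = index.as_query_engine(
    response_mode="accumulate",
    output_cls=Invoice,
)
response = query_engine.query("Extract the vendor and total from each invoice.")
print(str(response))          # joined per-chunk outputs, dumped to a string
print(response.source_nodes)  # the chunks each extraction came from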
Hmm yeah, I feel like maybe we should redesign to allow a program kwarg for a query engine or response synthesizer so you can also select the structured output type (prompting vs functions vs new api...)
you can set it on the llm

llm = OpenAI(..., pydantic_program_mode="...")

Not every LLM supports function calling, but function calling will be the most reliable
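spelled out a bit more (sketch; the model name is just an example):

Python
from llama_index.llms.openai import OpenAI
from llama_index.core.types import PydanticProgramMode

# the mode can also be passed as a plain string, e.g. "function"
llm = OpenAI(model="gpt-4", pydantic_program_mode=PydanticProgramMode.FUNCTION)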
Ah ok, does "openai" use assistants and "function" use plain function calling API?
yessir (we added a more general function api, for any LLM that supports function calling/tool calling)
Thanks, last Q: what's the difference between llm and default?
Is my understanding correct?
Plain Text
| PydanticProgramMode | Description |
|---------------------|-------------|
| default  | Traditional prompt extraction: the template is formatted and the resulting string is sent to the LLM along with the context and instructions. Note: this mode gives no specific instructions on how the model should format its output; that is left to the template and query. |
| openai   | Uses the [new OpenAI API](https://openai.com/blog/function-calling-and-other-api-updates) function calling to extract a structured object, leveraging OpenAI's native JSON fine-tuning. Only works with Azure/OpenAI LLMs that natively support function calling with the new API. |
| function | Uses function calling to extract a structured object by leveraging JSON generation instructions/fine-tuning. Only works with LLMs that natively support function calling. |
| llm      | For LLMs without function calling, leverages JSON generation instructions to attempt to extract structured data. |
default picks function if supported, otherwise uses llm

llm sets it to LLMTextCompletionProgram explicitly
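so for a non-function-calling LLM, the llm mode is roughly equivalent to building the program yourself, something like (sketch based on the docs; `Song` and the prompt are placeholders):

Python
from pydantic import BaseModel
from llama_index.core.program import LLMTextCompletionProgram

class Song(BaseModel):
    title: str
    length_seconds: int

# prompts the LLM for JSON matching the schema and parses the text output,
# so it works even without native function/tool calling
program = LLMTextCompletionProgram.from_defaults(
    output_cls=Song,
    prompt_template_str="Generate an example song from the movie {movie_name}.",
)
output = program(movie_name="The Shining")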
So then does 'None' use a regular text completion without Structured Output?
Happy to document all of these modes, etc. in the code. I think we're missing some good documentation around structured outputs.
hmm, None would probably lead to default
Oh no, I'm so confused. It seems that llama_index/core/response_synthesizers/refine.py:485 attempts to:
Plain Text
structured_response = cast(
    StructuredRefineResponse, structured_response
)
query_satisfied = structured_response.query_satisfied

But structured_response is my Pydantic BaseModel. What am I missing here? Why does it try to cast my Pydantic model to a different one, StructuredRefineResponse?
I think, from what I'm seeing, StructuredRefine essentially has its own Pydantic model (StructuredRefineResponse) that it uses with the PydanticProgram and does not accept the user's models?
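for reference, that internal model looks roughly like this (paraphrased from refine.py, field descriptions approximate):

Python
from pydantic import BaseModel, Field

class StructuredRefineResponse(BaseModel):
    answer: str = Field(
        description="The answer for the given query, based on the context."
    )
    query_satisfied: bool = Field(
        description="True if there was enough context to satisfy the query."
    )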
It still accepts the user's pydantic model. Otherwise, output_cls wouldn't work πŸ‘€
Yeah, I think the problem is structured_answer_filtering=True is broken. I can take a deeper look this weekend 😐
ah yea, that might clash if you specify both at the same time
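i.e. a combination like this (hypothetical repro; `MyModel` is a stand-in for the user's own model):

Python
from pydantic import BaseModel
from llama_index.core import get_response_synthesizer

class MyModel(BaseModel):
    title: str

# structured_answer_filtering wraps synthesis in the internal
# StructuredRefineResponse program, which clashes with a user output_cls
synth = get_response_synthesizer(
    response_mode="refine",
    output_cls=MyModel,
    structured_answer_filtering=True,
)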