
Updated 3 months ago

structured output

Hello, I want to get multiple structured outputs from an LLM using LlamaIndex. For example, when I query the LLM, the answer should look like this:
query: List the most populous cities in England.

Plain Text
{"country": "England", "city": "London", "population": "10m"},
{"country": "England", "city": "Birmingham", "population": "2.5m"},
{"country": "England", "city": "Manchester", "population": "2m"}


How can I do this?
7 comments
Depends on the LLM. Most often some basic prompt engineering asking it to format the answer as JSON is enough.
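A minimal sketch of that prompt-engineering approach. The `ask_llm` function here is a hypothetical stand-in for whatever model call you actually make (LlamaIndex or otherwise); the canned reply just mimics the shape a well-prompted model tends to produce.

```python
import json

def ask_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM call; returns a canned JSON reply.
    return (
        '[{"country": "England", "city": "London", "population": "10m"},'
        ' {"country": "England", "city": "Birmingham", "population": "2.5m"}]'
    )

def query_structured(question: str) -> list[dict]:
    # Ask for JSON only, then parse; json.loads raises if the model ignored the format.
    prompt = (
        "Answer the question as a JSON array of objects with keys "
        '"country", "city", and "population". Return only JSON, no prose.\n'
        f"Question: {question}"
    )
    return json.loads(ask_llm(prompt))

cities = query_structured("List the most populous cities in England.")
```

In practice you would wrap the `json.loads` in a try/except and retry (or re-prompt) when the model drifts out of format.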

Just pass in the schema.
Second this. I was able to get LLMs to follow a simplified JSON/CUE format by providing a "schema" and a few examples.

https://github.com/hofstadter-io/hof/blob/_dev/flow/chat/prompts/dm.cue
Yes. Just make sure not to ask for too complex a schema; you can programmatically merge and reshape the results later. With a complex schema, the LLM will either struggle and give incorrect output, or the generated output may contain less relevant content.
Also pass in some examples; most often they help, unless it's a small model (around 4B parameters), in which case you're better off dropping the examples.
Big models like GPT-4 and Gemini are much better at this type of stuff in my experience. The smaller models tended to get the format right but failed to fill in the details correctly based on user input.
I used a format that could be parsed & processed by CUE because the "merge" is much better and has great correctness guarantees
By the way, forgot to mention: if the model supports system prompts, you can leverage that.
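For example, you can move the format instructions into a system message so every user turn inherits them. The payload below is an OpenAI-style chat message list as an illustration; adapt it to whatever client or LlamaIndex chat interface you use.

```python
# Format rules live in the system message; the user message stays a plain question.
system = (
    "You are a data extractor. Always respond with a JSON array of objects "
    'with keys "country", "city", and "population". Return only JSON, no prose.'
)
messages = [
    {"role": "system", "content": system},
    {"role": "user", "content": "List the most populous cities in England."},
]
```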