Structured data

I have a bit more of a targeted use case that maybe someone has tried, and I'd like to discuss it.
Essentially, I'm trying to provide a list of questions for an LLM to answer from a document (who was involved, what day did this occur?) and return the answers in a more structured format (CSV, JSON, whatever) to begin building a structured table that could then be queried with SQL (probably via an NL-to-SQL model).
I'm trying to understand if anyone has tried generating more structured data from an LLM's summarised responses, and whether there is a good llamaindex/langchain way to go about this, short of writing entirely custom prompts.
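As a hedged sketch of the "structured answers into a SQL-queryable table" part of the question (the answer dicts, table name, and column names below are all hypothetical, not from the thread): once the LLM returns one JSON object per document, loading them into SQLite takes only the standard library.

```python
import sqlite3

# Hypothetical extracted answers, one JSON-shaped dict per source document.
answers = [
    {"who": "Alice; Bob", "date": "2023-07-14"},
    {"who": "Carol", "date": "2023-08-02"},
]

conn = sqlite3.connect(":memory:")  # use a file path for a persistent table
conn.execute("CREATE TABLE incidents (who TEXT, date TEXT)")
# Named placeholders let each answer dict map straight onto a row.
conn.executemany(
    "INSERT INTO incidents (who, date) VALUES (:who, :date)", answers
)
rows = conn.execute(
    "SELECT who FROM incidents WHERE date > '2023-08-01'"
).fetchall()
print(rows)  # [('Carol',)]
```

An NL-to-SQL model would then just emit queries like the `SELECT` above against this table.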
Have you come across the pydantic program in llama index?

If you define a class (i.e. a pydantic model), you could use it to extract structured data that would be easy enough to slot into a db
https://gpt-index.readthedocs.io/en/stable/examples/output_parsing/openai_pydantic_program.html
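Roughly, that looks like the sketch below (the `Incident` model and its fields are hypothetical examples for the "who was involved / what day" questions above; the `OpenAIPydanticProgram` call is shown commented out since it needs an OpenAI key):

```python
from pydantic import BaseModel

# Define the answers you want as a pydantic model.
class Incident(BaseModel):
    people_involved: list[str]
    date: str

# With llama_index installed and OPENAI_API_KEY set, extraction would look
# roughly like:
#
#   from llama_index.program import OpenAIPydanticProgram
#   program = OpenAIPydanticProgram.from_defaults(
#       output_cls=Incident,
#       prompt_template_str=(
#           "Extract who was involved and what day this occurred "
#           "from the following document:\n{document}"
#       ),
#   )
#   incident = program(document=document_text)
#
# The validated model then slots straight into a db row or JSON:
incident = Incident(people_involved=["Alice", "Bob"], date="2023-07-14")
row = incident.dict()
print(row)  # {'people_involved': ['Alice', 'Bob'], 'date': '2023-07-14'}
```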

We recently did something similar-ish, extracting data points about common problems in our github issues
https://github.com/jerryjliu/llama_index/blob/main/docs/examples/usecases/github_issue_analysis.ipynb
I've been meaning to expand this to other modules to make using/automating it a bit easier, but it's definitely possible the way it is now
How tightly coupled is this to OpenAI? Would it be feasible to expand it to use the LlamaCPP bindings?
Thank you for sharing this, though; it's definitely in the realm of what I'd be looking to do
ohhhh it's super tied to OpenAI. It's using their function-calling API

I have seen llama2 used with a similar function-calling API (i.e. llama-api, which we also support). But tbh open-source models are not great with structured outputs right now πŸ€”