
Updated 2 years ago

Structured input to output

Hey my dudes, I'm back at it again. What's the best tool for retrieving specific data from unstructured input, where the fields to fill are defined up front by a structured input?
I thought of something like this,
with a program that knows what to search for and fills in several values:

first, as input:
{ "languages": [ { "Type": "", "years": "" }, { "Type": "", "years": "" } ] }
and as output:
{ "languages": [ { "Type": "Python", "years": "3" }, { "Type": "java", "years": "1" } ] }
(here "Python" is a value found in the doc, and "3" is found there too, at the same time?)
Is it possible to write a query that fills in all the values at once? Or do I need to retrieve one value at a time and fill it into my structured output?
Since I want to retrieve information, not invent it the way the "create an album" example does, what's the best way to do that?
I don't think any existing tool matches my request for now
Or even if I defined the language type up front and only wanted the years
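A minimal sketch of the "fill everything in one query" idea, using only the standard library. The helper names `build_extraction_prompt` and `same_shape` are made up for illustration, no LLM is actually called, and the reply string is invented. It only shows how one prompt can carry the whole empty template, and how the reply can be checked against that template afterwards.

```python
import json


def build_extraction_prompt(template: dict, document: str) -> str:
    """Ask for every field of the template in a single request."""
    return (
        "Fill in the JSON template using only facts stated in the document. "
        "Leave a value as an empty string if the document does not state it. "
        "Reply with JSON only.\n\n"
        f"Template:\n{json.dumps(template, indent=2)}\n\n"
        f"Document:\n{document}"
    )


def same_shape(template, filled) -> bool:
    """True if `filled` has the same keys and nesting as `template`."""
    if isinstance(template, dict):
        return (
            isinstance(filled, dict)
            and set(filled) == set(template)
            and all(same_shape(template[k], filled[k]) for k in template)
        )
    if isinstance(template, list):
        if not isinstance(filled, list):
            return False
        if not template:
            return True
        # Every item in the reply must look like the first template item.
        return all(same_shape(template[0], item) for item in filled)
    # Leaves of the template are empty strings, so expect strings back.
    return isinstance(filled, str)


template = {"languages": [{"Type": "", "years": ""}]}
# Invented reply, standing in for what a model might send back.
reply = '{"languages": [{"Type": "Python", "years": "3"}, {"Type": "java", "years": "1"}]}'
filled = json.loads(reply)
print(same_shape(template, filled))
```

The shape check says nothing about whether the values are true to the document, only that the reply can be slotted into the structure that was asked for.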
I think this is possible with the pydantic programs (or the newer pydantic output parsers)

For example

Plain Text
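from typing import List

from pydantic import BaseModel

# one extracted skill and how long it has been used
class Experience(BaseModel):
    skill_name: str
    num_years_experience: int

# top-level container the parser should return
class Experiences(BaseModel):
    experiences: List[Experience]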
isn't it just hallucinating info here?
[Attachment: image.png]
just to produce a fake album that matches the structured request?
or does this info come from specific data?
Nah it's hallucinating. The only input it has is that input string a cell or two above. You could modify that string to have details from the document you are parsing

Or, you can use the output parser to use this with an index in a more normal manner lol
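One rough way to catch the hallucination case is to check that each extracted value actually occurs in the source text. This is only a sketch: `ungrounded_values` is a made-up helper, the document and the extracted dict are invented, and a plain substring match is crude (a short value like "3" will match almost anywhere).

```python
def ungrounded_values(extracted: dict, document: str) -> list:
    """Return (field, value) pairs whose value never appears in the document."""
    text = document.lower()
    missing = []
    for field, value in extracted.items():
        # Empty values are fine: they mean "not found", not "invented".
        if value and str(value).lower() not in text:
            missing.append((field, value))
    return missing


document = "Alice has 3 years of Python experience and 1 year of Java."
# Invented model output; "Acme" is not mentioned anywhere in the document.
extracted = {"skill": "Python", "years": "3", "employer": "Acme"}
print(ungrounded_values(extracted, document))
```

Anything this returns is worth a second look before it goes into the structured output.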
I thought we merged the output parser
yeah so the schema I created on the first screen isn't possible for now
Can I have more details on the JSON format? Like having this:
class Experience(BaseModel):
    skill_name: str
    num_years_experience: int

class Experiences(BaseModel):
    experiences: List[Experience]
I mean, I still think it's possible imo
Yea this might work too! I haven't tried this yet either
So using pydantic, you can convert to/from JSON super easily. But constructing it as a class is just really easy (and it's what the function calling API from OpenAI expects under the hood: a pydantic class that was converted to a JSON schema)
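To make "under the hood" a bit more concrete, here is a hand-written approximation of the JSON schema for the Experiences class, wrapped in the name/description/parameters shape that OpenAI function definitions use. Assumptions: the function name and description are made up, the arguments string is invented, and the schema pydantic actually generates is laid out differently (it uses references for nested models).

```python
import json

# JSON Schema roughly equivalent to the Experiences model, written by hand.
experiences_schema = {
    "type": "object",
    "properties": {
        "experiences": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "skill_name": {"type": "string"},
                    "num_years_experience": {"type": "integer"},
                },
                "required": ["skill_name", "num_years_experience"],
            },
        }
    },
    "required": ["experiences"],
}

# Function definition: a name, a description, and the schema as "parameters".
# The name and description are hypothetical.
function_spec = {
    "name": "record_experiences",
    "description": "Record the skills and years of experience found in a document.",
    "parameters": experiences_schema,
}

# The model sends its arguments back as a JSON string; this one is invented.
arguments = '{"experiences": [{"skill_name": "Python", "num_years_experience": 3}]}'
parsed = json.loads(arguments)
print(parsed["experiences"][0]["skill_name"])
```

With pydantic the schema and the parsing step come from the class itself, which is why defining the class is the easy route.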
I tried to use pydantic but I couldn't use it with Azure :/
So I couldn't use the class I created
The only thing that "matched" what I wanted was the langchain thing
Plain Text
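# imports as used by the legacy llama_index / langchain releases from the time of this thread
from langchain.output_parsers import ResponseSchema, StructuredOutputParser
from llama_index.output_parsers import LangchainOutputParser
from llama_index.prompts.default_prompts import DEFAULT_REFINE_PROMPT_TMPL
from llama_index.prompts.prompts import QuestionAnswerPrompt, RefinePrompt

# define output schema (descriptions are French prompts, English gloss in the comments)
response_schemas = [
    ResponseSchema(name="doc_name", description="nom de document"),  # document name
    ResponseSchema(name="data", description="date de création du document"),  # document creation date
    ResponseSchema(name="site", description="qui est l'organisme qui a créé le document ?"),  # which organisation created the document?
    ResponseSchema(name="apave", description="quel est le nom de la personne représentante de l'APAVE"),  # name of the person representing APAVE
    ResponseSchema(name="8.5", description="quel est le nom de la partie 8.5"),  # title of section 8.5
    ResponseSchema(name="page8.5", description="page de début de la partie 8.5"),  # first page of section 8.5
    ResponseSchema(name="19011", description="a quelle norme se référer pour les lignes directrices de la partie 9.2.2"),  # which standard to refer to for the guidelines in section 9.2.2
]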

# define output parser
lc_output_parser = StructuredOutputParser.from_response_schemas(response_schemas)
output_parser = LangchainOutputParser(lc_output_parser)

# base prompt for the chatbot
# from llama_index import Prompt
template = (
...
)

# format each prompt with output parser instructions
fmt_qa_tmpl = output_parser.format(template)
fmt_refine_tmpl = output_parser.format(DEFAULT_REFINE_PROMPT_TMPL)
qa_prompt = QuestionAnswerPrompt(fmt_qa_tmpl, output_parser=output_parser)
refine_prompt = RefinePrompt(fmt_refine_tmpl, output_parser=output_parser)

# query index
query_engine = index.as_query_engine(
    similarity_top_k=3,
    text_qa_template=qa_prompt, 
    refine_template=refine_prompt, 
)
# query text (French): "answer the questions in a detailed and precise way, be 100% sure"
response = query_engine.query(
    "remplis les questions d'une manière détaillée et precise, soit sur a 100%", 
)

print(str(response))

with open("donnes.json", "w") as dt:
    dt.write(str(response))
doing something like that
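One thing to watch with writing `str(response)` straight into a `.json` file: depending on the library version, the text may be JSON wrapped in a Markdown code fence, or a Python-style dict with single quotes, and neither is valid JSON on its own. A best-effort sketch for normalising it first, standard library only. `response_to_dict` is a made-up helper and the reply string is invented.

```python
import ast
import json
import re


def response_to_dict(text: str):
    """Best-effort conversion of a model reply into Python data."""
    # Case 1: the reply wraps its JSON in a Markdown code fence.
    fenced = re.search(r"`{3}(?:json)?\s*(.*?)`{3}", text, re.DOTALL)
    if fenced:
        text = fenced.group(1)
    text = text.strip()
    try:
        # Case 2: plain JSON.
        return json.loads(text)
    except json.JSONDecodeError:
        # Case 3: Python literal such as {'a': 'b'}; raises if it is neither.
        return ast.literal_eval(text)


fence = "`" * 3
# Invented reply in the fenced style.
reply = f'{fence}json\n{{"doc_name": "Audit report"}}\n{fence}'
data = response_to_dict(reply)

# This string is what would be written to donnes.json in place of str(response).
json_text = json.dumps(data, ensure_ascii=False, indent=2)
print(json_text)
```

`ensure_ascii=False` keeps accented French characters readable in the file.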
Yea, it needs support for the function calling API. Has Azure added that yet? If they have, we should patch that on our end.

We are actually also about to merge our own azure LLM class, that is hopefully less error prone compared to langchain lol
I'll take a look, I'm not sure but I think so
https://stackoverflow.com/questions/76543136/how-to-do-function-calling-using-azure-openai I guess I was wrong. Someone there says it's "supposed" to work, but in fact it's not working
Lol sheesh, sounds like Microsoft 🥲
Classic Microsoft hahah
one day it's working, one day it's not
we can only pray for them