Updated 5 months ago

I'm trying to query some data using this

I'm trying to query some data using this: https://gpt-index.readthedocs.io/en/latest/examples/output_parsing/LangchainOutputParserDemo.html.

Unfortunately, when I ingest some PDFs, I get wrong results when I query data that sits in a table like this one (see picture).
Some of the results look right, but they are not.

Attachment

10 comments

In fact I get these results

{"Codes ISIN": "FR0010298596, FR0010298604, FR0010298612",

When I ask it to find the 4 numbers that are highlighted.

My program runs the basic vector store with top_k retrieval over my PDF.

Is that the right approach, given that I also query regular text in the PDF and that part works?

Or do I need to implement it another way,

such as this one: https://gpt-index.readthedocs.io/en/latest/examples/query_engine/pdf_tables/recursive_retriever.html (can it be used with the LangChain output parser?). Or do you know a better way to do it with the LangChain output parser?

Tough problem, tables in PDFs suuuuck to parse 🫠 Using Camelot to structure the table data might help, assuming Camelot works lol
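For what it's worth, here's a rough sketch of what the Camelot idea could look like. It pulls each table out of the PDF and flattens every data row into a "header: value" line, so each cell stays attached to its column name before the text is indexed. Assumptions, not something confirmed in this thread: the `camelot-py` package is installed, the table has ruled lines (Camelot's default "lattice" mode), and the first extracted row is the header. `rows_to_records` and `pdf_tables_as_text` are hypothetical helper names, and the `"Part"` column in the usage note is made up.

```python
def rows_to_records(rows):
    """Flatten a table (list of rows, first row = header) into one
    'header: value; header: value' string per data row."""
    header, *body = rows
    return ["; ".join(f"{h}: {v}" for h, v in zip(header, row)) for row in body]


def pdf_tables_as_text(pdf_path, pages="1"):
    """Extract tables from a PDF with Camelot and return flattened row strings.

    Assumes camelot-py is installed and that the first row of each
    extracted table holds the column names.
    """
    import camelot  # third-party; imported here so the helper above works without it

    records = []
    for table in camelot.read_pdf(pdf_path, pages=pages):
        # table.df is a pandas DataFrame holding the raw cell strings
        records.extend(rows_to_records(table.df.values.tolist()))
    return records
```

Something like `rows_to_records([["Codes ISIN", "Part"], ["FR0010298596", "A"]])` would give `["Codes ISIN: FR0010298596; Part: A"]`. Those strings could then be indexed as their own documents next to the regular PDF text, so the existing top_k retrieval setup stays as it is.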

assuming it does haha

yeah, it's starting to get a bit too tricky for me

but I'll manage to succeed