
Updated 6 months ago


Hello, hope everyone is doing well. I am working on a way to create indexes for PDF files (about 36 pages). The document contains many tables and other structures, so what is a clever way to parse it into nodes? I am worried that parsing will split a table in two, and that without context about what the table is about, the retrieval step will only return one part of the table.

4 comments

Seldo

Have you seen this page of the docs? https://docs.llamaindex.ai/en/stable/examples/query_engine/pdf_tables/recursive_retriever.html#load-in-document-and-tables

Seldo

It shows how to parse tables out of a PDF.

Felipe Damascena

Thanks, I had not seen it; I was looking at the node parser documentation. It is interesting: I can use it to separate the text from the tables.
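For anyone reading later, here is a minimal sketch of the idea in plain Python. It is not the LlamaIndex API: `Node`, `build_nodes`, and `max_chars` are hypothetical names, and it assumes the tables have already been extracted from the PDF as separate elements in reading order. Each table becomes a single node that is never split, and the text just before it is stored as context so retrieval knows what the table is about.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str  # "text" or "table"
    text: str
    metadata: dict = field(default_factory=dict)


def build_nodes(elements, max_chars=200):
    """Turn (kind, text) pairs, given in reading order, into nodes.

    Tables are kept whole; prose is cut into fixed-size windows.
    """
    nodes = []
    last_text = ""
    for kind, text in elements:
        if kind == "table":
            # Never split a table; attach the preceding prose as context.
            nodes.append(Node("table", text, {"context": last_text}))
        else:
            # Naive fixed-size split, used here only for illustration.
            for start in range(0, len(text), max_chars):
                nodes.append(Node("text", text[start:start + max_chars]))
            # Remember the tail of this prose for the next table.
            last_text = text[-max_chars:]
    return nodes


elements = [
    ("text", "Table 1 lists revenue by quarter."),
    ("table", "quarter | revenue\nQ1 | 10\nQ2 | 12"),
    ("text", "x" * 450),
]
nodes = build_nodes(elements)
```

With this input the table stays in one node and carries the sentence that introduces it, while the long text is split into several smaller nodes. A real pipeline would probably replace the fixed-size split with a sentence-aware one.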

FFelipe Damascena

I will do some tests
