
Updated 5 months ago


Hello, hope everyone is doing well. I am working on a way to create indexes for PDF files (about 36 pages). The pages contain many tables and other structures, so my question is: what is a clever way to parse such a document into nodes? I am worried that parsing will split a table in two and that, without context about what the table is about, the retrieval step will return only one part of the table.
4 comments
It shows parsing tables out of a PDF
Thanks, I had not seen it; I was looking at the node parser documentation. It is interesting: I can use it to separate the text from the tables.
I will do some tests.
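
For anyone landing here with the same worry, below is a minimal sketch of the idea discussed above: keep each table as a single node and prepend the text that comes right before it, so the retriever never sees half a table without context. It is plain Python and not the LlamaIndex API. `Block`, `Node`, `build_nodes`, and `max_chars` are hypothetical names made up for illustration, and it assumes a PDF parser has already separated the document into text blocks and table blocks.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    """One extracted piece of the PDF (hypothetical structure)."""
    kind: str      # "text" or "table"
    content: str


@dataclass
class Node:
    """One unit to index (hypothetical structure)."""
    text: str
    is_table: bool


def build_nodes(blocks: List[Block], max_chars: int = 500) -> List[Node]:
    """Group text blocks into size-limited nodes, but never split a table.

    Each table becomes exactly one node, prefixed with the text block
    that immediately precedes it (often a caption or introduction).
    """
    nodes: List[Node] = []
    buffer: List[str] = []   # text blocks waiting to be emitted
    last_text = ""           # most recent text block, used as table context

    def flush() -> None:
        # Emit the pending text blocks as one node.
        if buffer:
            nodes.append(Node("\n".join(buffer), False))
            buffer.clear()

    for block in blocks:
        if block.kind == "table":
            flush()
            # Tables ignore max_chars on purpose: they stay whole.
            text = f"{last_text}\n{block.content}" if last_text else block.content
            nodes.append(Node(text, True))
        else:
            pending = sum(len(b) for b in buffer)
            if buffer and pending + len(block.content) > max_chars:
                flush()
            buffer.append(block.content)
            last_text = block.content
    flush()
    return nodes


# Example with made-up content.
blocks = [
    Block("text", "Intro paragraph."),
    Block("text", "Table 1: revenue by year."),
    Block("table", "year | revenue\n2022 | 10\n2023 | 12"),
    Block("text", "Closing remarks."),
]
nodes = build_nodes(blocks, max_chars=1000)
for node in nodes:
    print("TABLE" if node.is_table else "TEXT ", repr(node.text))
```

The design choice here is that the size limit applies only to text, so a large table can exceed `max_chars`. If that is a problem for the embedding model, one option is to index a short summary of the table and keep the full table as the payload returned at retrieval time.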