
Updated 2 years ago

when parsing a document with multiple

At a glance

The community member is asking if it is possible to split nodes based on page when parsing a document with multiple pages, given that the text is in markdown format and pages are separated by '\x0c'. Another community member suggests that the user can manually split the text and create nodes ahead of time using any method they design.

When parsing a document with multiple pages, is it possible to split the nodes by page, provided each page is within the token/character limit?

I have text in markdown format. Pages are separated by '\x0c'.

Could I try splitting on every '\x0c'?
2 comments
(Pages are important here because I'm working with legal filings, where the page citation matters for both citation and context.)
You can manually split your text and create nodes ahead of time, using any method you design 🙂
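For what it's worth, here is a minimal sketch of that manual approach in plain Python. It splits on '\x0c' and records the page number in each node's metadata so citations can point back to the page. `PageNode`, `split_by_page`, and the `max_chars` default are hypothetical names and values for illustration, not any library's API; you would swap `PageNode` for whatever node class your framework expects. Cutting oversized pages into fixed-size character chunks is also just an assumption about how you might want to handle the limit.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PageNode:
    """Hypothetical stand-in for your framework's node type."""
    text: str
    metadata: dict = field(default_factory=dict)


def split_by_page(document: str, max_chars: int = 4000) -> list[PageNode]:
    """Create one node per page, splitting on the form-feed character.

    Pages longer than max_chars (an assumed character limit) are cut into
    fixed-size chunks, and every chunk keeps its original page number.
    """
    nodes: list[PageNode] = []
    for page_number, page in enumerate(document.split("\x0c"), start=1):
        page = page.strip()
        if not page:
            # Skip blank pages; numbering still counts them.
            continue
        for start in range(0, len(page), max_chars):
            nodes.append(
                PageNode(
                    text=page[start:start + max_chars],
                    metadata={"page": page_number},
                )
            )
    return nodes
```

Because the page number is taken from the position in the split, blank pages are dropped without shifting the numbering of later pages. If your limit is in tokens rather than characters, you would replace the character slicing with a tokenizer-based split.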