I am unable to get the page label and

At a glance

The community member is unable to get the page label and file name for PDFs parsed with LlamaParse. Other community members suggest trying JSON mode and, for query results, reading the source nodes on the response object to retrieve the document details, including metadata and segmentation information. They indicate that the response object format may depend on the LLM (large language model) or vector DB used. One community member explains that all retrieved nodes are available through the source nodes and gives an example of how to access them, and the original poster confirms that this worked.

datadaba

I am unable to get the page label and file name for PDFs in LlamaParse

10 comments

WhiteFang_Jr

Did you try the JSON mode?

WhiteFang_Jr

https://github.com/run-llama/llama_parse/blob/main/examples/demo_json.ipynb

verdverm

Is there a similar method for getting a JSON response back (with the details of the documents used) from query()?

WhiteFang_Jr

I don't think so! All the used nodes can be accessed via source nodes in the response object itself.

verdverm

OK, that sounds familiar. Will those source nodes contain the various metadata, document details, and segmentation (where in the file) information?

verdverm

Is there a reference to what the response object looks like? Does it depend on the LLM / Vector DB?

WhiteFang_Jr

Yes

WhiteFang_Jr

When you call .query() and get a response, you can check all the retrieved source nodes like this: print(response.source_nodes)
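
To go one step past printing the whole list, here is a minimal sketch of pulling just the file name and page label out of each source node. It assumes each entry in response.source_nodes exposes .node.metadata (a dict) and .score, and that the loader stored the values under the keys "file_name" and "page_label"; those key names are an assumption and may differ or be missing depending on how the PDF was parsed. The helper summarize_sources and the fake_response stand-in are hypothetical names used only so the example runs without building an index.

```python
from types import SimpleNamespace


def summarize_sources(response):
    """Collect file name, page label, and score for each retrieved source node."""
    rows = []
    for item in response.source_nodes:
        # Metadata is a plain dict filled in when the document was loaded.
        # The key names below are assumptions; .get() returns None if absent.
        metadata = item.node.metadata
        rows.append(
            {
                "file_name": metadata.get("file_name"),
                "page_label": metadata.get("page_label"),
                "score": item.score,
            }
        )
    return rows


# Hypothetical stand-in for a real query response (not a library object).
fake_response = SimpleNamespace(
    source_nodes=[
        SimpleNamespace(
            node=SimpleNamespace(
                metadata={"file_name": "report.pdf", "page_label": "3"}
            ),
            score=0.82,
        ),
        # A node whose loader did not record any file or page details.
        SimpleNamespace(node=SimpleNamespace(metadata={}), score=0.41),
    ]
)

for row in summarize_sources(fake_response):
    print(row)
```

With a real query engine you would pass the object returned by .query() instead of fake_response. If both fields come back as None, it is worth printing one node's full metadata dict to see which keys the parser actually set.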

datadaba

Thanks guys

datadaba

It worked for me
