This is probably a simple question, and the answer is probably written somewhere on the Docs page, but I could not find it. How do I preserve the page number of each page in a long PDF, so that vector search (or any other) results show an excerpt plus a page number? Thank you.
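from pypdf import PdfReader  # assumed; PyPDF2 exposes the same class name
from llama_index.core import Document  # assumed; the import path depends on the LlamaIndex version

documents = []
with open(path, 'rb') as f:
    pdf = PdfReader(f)
    print("Metadata: ", pdf.metadata)
    # Number pages from 1 so the stored value matches what a PDF viewer shows
    for page_number, page in enumerate(pdf.pages, start=1):
        documents.append(Document(text=page.extract_text(), metadata={"page_number": page_number}))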
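
The idea of carrying a page number through to search results can be sketched without any PDF or vector library. The example below is only an illustration under stated assumptions: `PageChunk`, `build_chunks`, and `search` are hypothetical names, the page texts are made up, and the keyword-overlap ranking is a toy stand-in for real vector similarity. The point is that the page number is attached as metadata when each page is wrapped, so it is still available when a result is returned.

```python
from dataclasses import dataclass


@dataclass
class PageChunk:
    """Hypothetical stand-in for a document object that carries metadata."""
    text: str
    metadata: dict


def build_chunks(page_texts):
    """Wrap each page's text with a 1-based page number in its metadata."""
    return [
        PageChunk(text=text, metadata={"page_number": number})
        for number, text in enumerate(page_texts, start=1)
    ]


def search(chunks, query, top_k=2, excerpt_len=60):
    """Toy keyword-overlap ranking; a real setup would use vector similarity."""
    query_words = set(query.lower().split())
    scored = []
    for chunk in chunks:
        overlap = len(query_words & set(chunk.text.lower().split()))
        if overlap:
            scored.append((overlap, chunk))
    # Highest overlap first; ties keep page order because the sort is stable
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [
        (chunk.text[:excerpt_len], chunk.metadata["page_number"])
        for _, chunk in scored[:top_k]
    ]


# Made-up page texts standing in for the output of a PDF text extractor
pages = [
    "Introduction to the annual report",
    "Revenue grew in the second quarter",
    "Appendix with revenue tables by quarter",
]
results = search(build_chunks(pages), "revenue quarter")
for excerpt, page_number in results:
    print(f"p.{page_number}: {excerpt}")
```

Each result is an (excerpt, page number) pair. With a real index the same pattern should apply as long as the retriever returns the stored metadata alongside the matched text, though that depends on the library in use.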