I'm using the PDFReader to read in my document. Is there a way to specify how many pages you want in one chunk? I think for now, the default is to have each page as one document object. I'd like to have 5-10 pages in one document object.
I'd like to try out the document summary index with a 100-page document. With the default setup, building the document summary index is very slow, since it generates a summary for each doc id.
For now, you might have to just manually concat the per-page document objects into a single document. Something like this: Document("\n".join(doc.text for doc in documents))
It would also be a pretty easy PR change to the reader to make the number of pages per document configurable.
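To get 5-10 pages per document object rather than everything in one, the same concat idea can be applied in groups. Here's a minimal sketch of that, not something the reader provides: merge_pages and pages_per_chunk are made-up names for illustration, and it only works on plain page strings so it doesn't depend on any particular library version.

```python
from typing import List


def merge_pages(page_texts: List[str], pages_per_chunk: int = 5) -> List[str]:
    """Join consecutive page texts into chunks of at most `pages_per_chunk` pages.

    Hypothetical helper for illustration; not part of any reader API.
    """
    if pages_per_chunk < 1:
        raise ValueError("pages_per_chunk must be at least 1")
    return [
        "\n".join(page_texts[start:start + pages_per_chunk])
        for start in range(0, len(page_texts), pages_per_chunk)
    ]


# Assumed usage, where `documents` is the list of per-page document objects
# returned by the reader and each one exposes a `.text` attribute:
#   merged_texts = merge_pages([doc.text for doc in documents], pages_per_chunk=5)
#   merged_docs = [Document(text) for text in merged_texts]
# The Document constructor arguments may differ between library versions.
```

With a 100-page document and pages_per_chunk=5 this gives 20 document objects, so the summary index would generate 20 summaries instead of 100. The last chunk is simply shorter when the page count doesn't divide evenly.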
Yeah, that's kinda what I was trying to do. Thanks for the instructions. To go one step further: would the document summary index be a good fit for a multi-document chatbot? I was using a custom setup (kinda like QASummaryGraph) for a single-document chatbot and it worked super well. Now I want to expand it to multi-document chat, so I'm looking for a better index struct that can handle both questions that require routing and questions that require synthesis.