jokubas.s
Offline, last seen 3 months ago
Joined September 25, 2024
I've got a PDF file that is ~3000 pages long, and it takes over a minute to load. Is SimpleDirectoryReader dependent on hardware, or are there any workarounds or more efficient ways to load files?
1 comment
Still having a problem with incorrect page_labels: after SentenceSplitter, the correct page_label is lost, and the document metadata points to the last page of the PDF. I can't find any solution to this problem.
11 comments
When I run a PDF through a pipeline with a SentenceSplitter, it loses the correct page_label; that is, page_label displays as the last page of the PDF. Any solutions/ideas?
3 comments
Hi, I switched my code to a pipeline + SentenceSplitter for PDFs, and now all of the page_labels are the last page of the PDF. (In a 369-page PDF, all the page_labels in the vector store are 369.) Any workarounds? Here's some code:
5 comments
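For anyone hitting the same page_label problem, here is a minimal sketch of the behaviour I'd expect: each page is split on its own, and every chunk gets a copy of that page's metadata. This is plain Python and not LlamaIndex's SentenceSplitter; the `Chunk` class and `split_pages` function are hypothetical stand-ins, and the word-based splitting is only an assumption for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Chunk:
    """Hypothetical stand-in for a document or node: text plus metadata."""
    text: str
    metadata: Dict[str, str] = field(default_factory=dict)


def split_pages(pages: List[Chunk], chunk_size: int) -> List[Chunk]:
    """Split each page separately so chunks keep their own page's metadata."""
    chunks: List[Chunk] = []
    for page in pages:
        words = page.text.split()
        for start in range(0, len(words), chunk_size):
            piece = " ".join(words[start:start + chunk_size])
            # Copy the metadata so chunks never share one mutable dict.
            chunks.append(Chunk(text=piece, metadata=dict(page.metadata)))
    return chunks


pages = [
    Chunk("alpha beta gamma delta", {"page_label": "1"}),
    Chunk("epsilon zeta", {"page_label": "2"}),
]
chunks = split_pages(pages, chunk_size=2)
labels = [c.metadata["page_label"] for c in chunks]
```

If the labels on real nodes all equal the last page instead, one thing worth checking is whether the pages were merged into a single document, or share one metadata dict, before splitting.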
Hi, I'm trying to set up a multi-tenant program. The users are all separate and have separate index locations (User1_Index/IndexedPDF1; User2_Index/IndexedPdf2...). I'm having trouble when two or more users load a new index from storage: one user's load overrides the other user's loaded index. Example: User1 has PDF1, User2 loads PDF2, and now both User1 and User2 have PDF2 "loaded", but of course User2 gets an empty response due to my metadata filter. Should I have just one big index with all the indices, or is there a way to split the index loading between users?
1 comment
Hi, I have a question about index separation for multiple users. Right now, my project has two main functions: one selects and loads the index, the other handles chatting with the document. Everything seems fine with one user, but when a second user logs in and loads a file, both logged-in users have the same file loaded, instead of two separate files for two separate users. My question is: apart from multiprocessing as in the Flask example, are there any other ways to separate the index loading for users?
2 comments
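The symptom described above (one user's load replacing everyone's index) is what you would see if the loaded index lives in a single shared global. A rough sketch of one alternative is to keep a dictionary of loaded indexes keyed by user ID. Everything here is hypothetical: `PerUserIndexCache` and `fake_loader` are illustrative names, and in a real app the loader would presumably wrap whatever call loads the index from that user's storage directory.

```python
from threading import Lock
from typing import Callable, Dict, Generic, TypeVar

T = TypeVar("T")


class PerUserIndexCache(Generic[T]):
    """Holds one loaded index per user instead of one shared global."""

    def __init__(self, loader: Callable[[str], T]) -> None:
        self._loader = loader  # hypothetical: loads an index from a directory
        self._indexes: Dict[str, T] = {}
        self._lock = Lock()  # guards the dict when requests run in threads

    def load(self, user_id: str, persist_dir: str) -> T:
        """Load an index and store it under this user's ID only."""
        index = self._loader(persist_dir)
        with self._lock:
            self._indexes[user_id] = index
        return index

    def get(self, user_id: str) -> T:
        """Return the index this user loaded; raises KeyError if none."""
        with self._lock:
            return self._indexes[user_id]


def fake_loader(persist_dir: str) -> dict:
    # Stand-in for a real index loader; returns a plain dict for the demo.
    return {"persist_dir": persist_dir}


cache = PerUserIndexCache(fake_loader)
cache.load("user1", "User1_Index/IndexedPDF1")
cache.load("user2", "User2_Index/IndexedPdf2")
user1_index = cache.get("user1")
user2_index = cache.get("user2")
```

The chat function would then look up the index by the logged-in user's ID (for example, from the session) rather than reading a module-level variable. This only helps within a single process; with several worker processes, each worker would hold its own cache.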