
Updated 6 months ago

Also would like to know, what strategy

At a glance

The community member is looking for a strategy to index 2000-page native PDF files without using LlamaParse, because index generation is taking a long time. A commenter suggests increasing embed_batch_size on the embedding model, since the slowdown is likely due to the large amount of data to embed. The community member reports that this did not help and is still seeking an alternative strategy.

datadaba

I would also like to know what strategy to follow when indexing 2000-page PDFs without LlamaParse. It is a native PDF file.
It is taking a lot of time to generate an index.

5 comments

Logan M

Probably just increasing embed_batch_size on your embedding model.

Logan M

It's likely a lot of data to embed.
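
For readers following along, here is a minimal sketch of why a larger embed_batch_size can help when there is a lot of text to embed. It does not use LlamaIndex itself: `embed_in_batches` and `CountingEmbedder` are hypothetical stand-ins, and the figure of 4000 chunks (about 2 chunks per page for 2000 pages) is an assumption for illustration. In LlamaIndex, embed_batch_size is typically passed when constructing the embedding model.

```python
from typing import Callable, List


def embed_in_batches(
    texts: List[str],
    embed_fn: Callable[[List[str]], List[List[float]]],
    batch_size: int,
) -> List[List[float]]:
    """Embed texts by sending them to embed_fn in groups of batch_size."""
    vectors: List[List[float]] = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        vectors.extend(embed_fn(batch))
    return vectors


class CountingEmbedder:
    """Hypothetical stand-in for a real embedding model; counts how often it is called."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, batch: List[str]) -> List[List[float]]:
        self.calls += 1
        # Fake one-dimensional "embedding": the length of each text.
        return [[float(len(text))] for text in batch]


# Assumption: a 2000-page PDF split into roughly 2 chunks per page.
chunks = [f"chunk {i}" for i in range(4000)]

for batch_size in (10, 100):
    embedder = CountingEmbedder()
    embed_in_batches(chunks, embedder, batch_size)
    print(f"batch_size={batch_size}: {embedder.calls} embedding calls")
# batch_size=10: 400 embedding calls
# batch_size=100: 40 embedding calls
```

A larger batch only cuts the number of round trips to the embedding model. It does not make PDF parsing faster, and it does not reduce the total amount of text that has to be embedded.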

datadaba

Sure, thanks for your help. I will try this.

datadaba

It is not helping either. Is there any other strategy that I can try?
Thanks

