The community member is looking for a strategy to index 2000-page PDF files without using LlamaParse, because the current process takes a long time. A commenter suggests increasing embed_batch_size on the embedding model, since the large volume of data may be causing the slowdown. The community member reports that this did not help and is still looking for an alternative strategy.
I would also like to know what strategy to follow when indexing 2000-page PDFs without LlamaParse. It is a native PDF file, and it takes a long time to generate an index.
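To make the embed_batch_size suggestion concrete, here is a minimal sketch of what a larger batch size changes: the number of embedding requests, not the amount of text processed. It uses only the Python standard library. The names batched, embed_all and fake_embed_batch are hypothetical and are not part of any library, and the assumption of one chunk per page is for illustration only.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Vector = List[float]


def batched(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def embed_all(
    chunks: Sequence[str],
    embed_batch: Callable[[List[str]], List[Vector]],
    batch_size: int,
) -> Tuple[List[Vector], int]:
    """Embed every chunk and return the vectors plus the number of requests made."""
    vectors: List[Vector] = []
    requests = 0
    for batch in batched(chunks, batch_size):
        vectors.extend(embed_batch(batch))  # one call per batch
        requests += 1
    return vectors, requests


def fake_embed_batch(texts: List[str]) -> List[Vector]:
    # Hypothetical stand-in for a real embedding call: one tiny vector per text.
    return [[float(len(text))] for text in texts]


# Assumption: one chunk per page of a 2000-page PDF (real chunk counts are usually higher).
chunks = [f"page {i}" for i in range(2000)]

_, small = embed_all(chunks, fake_embed_batch, batch_size=10)
_, large = embed_all(chunks, fake_embed_batch, batch_size=100)
print(small, large)  # 200 20
```

If raising the batch size makes no difference, as reported here, the request count is probably not the bottleneck. Timing text extraction, chunking and embedding separately would show which stage is actually slow.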