Updated 4 months ago

I am looking for a text extraction use

At a glance

The community member is looking for a text extraction use case, where they have a large corpus, are doing semantic chunking and storing it in a vector database. They want to retrieve the top-k matches from the index, re-rank and organize the insights into a document by topic. The community member is considering using LLMs for this task, but is also wondering if the llamaindex abstractions could be useful beyond chunking and indexing.

In the comments, another community member suggests that a response synthesizer like tree summarize or accumulate could be helpful for this use case.

I am looking for help with a text extraction use case. Essentially, there is a large corpus, and I am doing semantic chunking and storing the chunks in a vector DB. I want to retrieve the top_k matches from the index, re-rank them, and organize the insights into a document by topic. Of course, I can raw-dog all of this with LLMs, but I am trying to figure out if any of the LlamaIndex abstractions might be useful beyond chunking and indexing. For reference, top_k here might be on the order of ~1000. Of course, there is a summarization element to it, but the idea is not to dump everything into the context window at once and do some generation.
1 comment
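Probably a response synthesizer like tree summarize or accumulate could help here. Both spread the work over multiple LLM calls instead of putting every retrieved chunk into one prompt.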
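To make the suggestion concrete, here is a rough, library-free sketch of the two patterns those response modes are named after, applied per topic. It is not LlamaIndex's implementation. The names `tree_summarize`, `accumulate`, `organize_by_topic`, `fan_in`, and `stub_summarize` are hypothetical, and the sketch assumes each retrieved chunk already carries a topic label (for example from clustering or an LLM tagging pass). The `summarize` callable stands in for whatever LLM call you would use.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# Hypothetical type: takes a batch of texts, returns one summary string.
Summarizer = Callable[[List[str]], str]


def tree_summarize(texts: List[str], summarize: Summarizer, fan_in: int = 10) -> str:
    """Reduce many texts to one by summarizing batches of `fan_in`, level by level."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    if not texts:
        return ""
    level = list(texts)
    # Each pass shrinks the list by roughly a factor of fan_in,
    # so no single call ever sees more than fan_in texts.
    while len(level) > 1:
        level = [summarize(level[i:i + fan_in]) for i in range(0, len(level), fan_in)]
    return level[0]


def accumulate(texts: List[str], summarize: Summarizer) -> List[str]:
    """Run the summarizer on each text separately and keep every result."""
    return [summarize([text]) for text in texts]


def organize_by_topic(
    chunks: List[Tuple[str, str]], summarize: Summarizer, fan_in: int = 10
) -> Dict[str, str]:
    """Group (topic, text) pairs by topic, then tree-summarize each group."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for topic, text in chunks:
        groups[topic].append(text)
    return {
        topic: tree_summarize(texts, summarize, fan_in)
        for topic, texts in groups.items()
    }


def stub_summarize(batch: List[str]) -> str:
    """Placeholder for an LLM call; replace with a real summarization prompt."""
    return f"[{len(batch)} merged]"


if __name__ == "__main__":
    # Assumed input: re-ranked chunks, each already tagged with a topic.
    reranked = [("pricing", f"chunk {i}") for i in range(1000)]
    reranked.append(("support", "only chunk"))
    for topic, section in organize_by_topic(reranked, stub_summarize).items():
        print(topic, "->", section)
```

With ~1000 chunks and `fan_in=10`, the tree version needs three levels of calls per topic, and a topic with a single chunk is returned unchanged without any call. The accumulate version is the better fit when you want one extracted insight per chunk and plan to organize them afterwards.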