Did anyone try to use recursive

At a glance

The community member who posted the original question is concerned about using the recursive retriever with embedded tables in a large number of complex documents. They believe that the recursive retriever may not be effective in this scenario, as it may not be able to associate the tables with the specific documents they belong to. The community member suggests creating a complex index structure for each document, but notes that using an LLM to decide from hundreds of possibilities may not be efficient. They propose an embedding-based routing approach as a potential solution and are currently working on it, but are open to other suggestions.

In the comments, other community members discuss the use of embeddings in the recursive retriever and whether a solution could be implemented with the current recursive retriever class. One community member provides a minimal example implementation, which the original poster finds helpful, and both agree that the interface needs some rethinking.

DrSebastianK

Did anyone try to use the recursive retriever with embedded tables on several hundred documents? If someone has a large number of complex documents, each containing different embedded tables, I don't think we can use the recursive retriever effectively. Am I right, or did I misunderstand something? We could create a complex index structure for each of the docs separately, but it wouldn't be efficient to use an LLM to decide which of hundreds of possibilities to use. Some kind of embedding-based routing would be a great idea in my opinion. I'm currently working on it, but let me know if there is a better way.

12 comments

Logan M

Doesn't the recursive retriever already use embeddings?

Logan M

Or I guess it depends on how you set it up

DrSebastianK

It does, but what I've experienced is that if a table doesn't explicitly contain information specific to the context, it won't be retrieved. So if I have a document about product A that contains a table with mostly general information (size, weight, etc.), the recursive retriever won't know which document this table was in. Say I want to compare two products whose documents both contain the same basic table: it won't know that table 1 belongs to product A and table 2 belongs to product B.
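
To make the problem concrete, here is a small sketch of why two generic tables can look identical to a similarity search. It is only an illustration under stated assumptions: `embed` is a toy bag-of-words stand-in for a real embedding model, and the table contents are hypothetical. It does not use the recursive retriever itself.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical table chunks: same generic columns, and neither one
# mentions the product it was extracted from.
table_from_doc_a = "size 10 cm weight 2 kg color black"
table_from_doc_b = "size 12 cm weight 3 kg color black"

query = embed("What is the weight of product A?")
score_a = cosine(query, embed(table_from_doc_a))
score_b = cosine(query, embed(table_from_doc_b))

# Both tables match only on the word "weight", so the scores tie and
# nothing in the chunks links either table to product A.
tie = math.isclose(score_a, score_b)
print(score_a, score_b, tie)
```

With a real embedding model the scores would not be exactly equal, but the same effect applies: the table chunk carries no signal about which document it came from.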

Logan M

I guess the idea would be to have an IndexNode for product A and product B, and those index nodes point to retrievers for data specific to that product, including tables

Logan M

Basically the top level retrieves the product, and then recursively retrieves information about that product

Logan M

I think that makes sense?
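
For readers following along, the two-level idea described in the last few comments can be sketched without any library. This is a rough illustration, not the LlamaIndex API and not the contents of any example posted in this thread: `embed` is a toy bag-of-words stand-in for an embedding model, and `product_summaries`, `product_chunks`, and `recursive_retrieve` are hypothetical names. The top level plays the role of the per-product IndexNode, and the second level searches only that product's own chunks, tables included.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_match(query, candidates):
    """Return the key of the candidate text most similar to the query."""
    q = embed(query)
    return max(candidates, key=lambda key: cosine(q, embed(candidates[key])))


# Top level: one short summary per product (hypothetical data).
product_summaries = {
    "product_a": "Product A: compact travel kettle",
    "product_b": "Product B: large office kettle",
}

# Second level: each product's own chunks, including its generic table.
product_chunks = {
    "product_a": {
        "a_text": "Heats water quickly and folds flat.",
        "a_table": "size 10 cm weight 2 kg color black",
    },
    "product_b": {
        "b_text": "Serves a whole team and keeps water warm.",
        "b_table": "size 12 cm weight 3 kg color black",
    },
}


def recursive_retrieve(query):
    """Pick the product first, then search only inside that product."""
    product = top_match(query, product_summaries)
    chunk_id = top_match(query, product_chunks[product])
    return product, chunk_id, product_chunks[product][chunk_id]


print(recursive_retrieve("What is the weight of product A?"))
print(recursive_retrieve("What is the weight of product B?"))
```

Because the product is chosen by similarity against the summaries rather than by an LLM, the first step stays cheap even with hundreds of products, and the table is only ever compared against chunks from the document it belongs to.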

DrSebastianK

Yes. Would this be possible with the current recursive retriever class? Do you have an example of this implementation?

Logan M

Definitely possible. Let me quickly take a look at the docs and try to write a minimal example.

Logan M

Wow, that was a bit of a ride. I hope this is somewhat helpful: https://gist.github.com/logan-markewich/3472a11d2f5aa3c976232f7fa76f4272

Logan M

I think the interface to this needs some rethinking lol

DrSebastianK

Amazing! Still trying to wrap my head around this, but it's definitely what I was looking for. Thanks a lot!

Logan M

Yeah, I agree, the interface needs some tweaks. While writing that, it was actually a little hard to figure out what I was doing lol

Personally I'd like to remove all the dicts from the interface, that's the weirdest part to me 😅
