is there a way to do hybrid retriever

is there a way to do a hybrid retriever / bm25 retriever without storing and loading from the filesystem (docstore / objectstore)?
7 comments
If you manually wrote the bm25 algorithm to generate the sparse vectors and threw that into a vector store, then yes πŸ™‚
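A rough sketch of that manual approach (not a LlamaIndex API, just the BM25 math with the usual k1/b defaults; the actual vector-store upsert is left out and the corpus is made up):

```python
import math
from collections import Counter

# Toy corpus; in practice these would be your node texts.
docs = [
    "hybrid retrieval combines dense and sparse vectors",
    "bm25 is a classic sparse ranking function",
    "dense embeddings capture semantic similarity",
]

k1, b = 1.5, 0.75  # standard BM25 parameters

tokenized = [d.lower().split() for d in docs]
N = len(tokenized)
avgdl = sum(len(t) for t in tokenized) / N

# Document frequency per term -> IDF. This is why adding or removing a
# document forces a full recompute: every IDF depends on the whole corpus.
df = Counter(term for toks in tokenized for term in set(toks))
idf = {t: math.log(1 + (N - n + 0.5) / (n + 0.5)) for t, n in df.items()}
vocab = {t: i for i, t in enumerate(sorted(idf))}

def bm25_sparse_vector(tokens: list[str]) -> dict[int, float]:
    """Return {term_id: bm25_weight} for one document."""
    tf = Counter(tokens)
    dl = len(tokens)
    return {
        vocab[t]: idf[t] * (f * (k1 + 1)) / (f + k1 * (1 - b + b * dl / avgdl))
        for t, f in tf.items()
    }

sparse_vectors = [bm25_sparse_vector(toks) for toks in tokenized]
# Each entry can now be upserted into any vector store that accepts
# sparse (index, value) pairs alongside a dense embedding.
print(sparse_vectors[1])
```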

The problem with bm25 is that if any text is added or removed, the entire index needs to be recalculated

I've been meaning to switch to the newer (and faster) bm25s library so that people can at least save the retriever to disk
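For reference, a minimal sketch of what saving and reloading looks like with the bm25s library (based on its documented API as I understand it; the corpus and index path here are made up):

```python
import bm25s

corpus = [
    "hybrid retrieval combines dense and sparse signals",
    "bm25s is a fast pure-python bm25 implementation",
]

# Build the index in memory.
retriever = bm25s.BM25()
retriever.index(bm25s.tokenize(corpus))

# Persist to disk and reload later, instead of re-indexing every run.
retriever.save("bm25s_index", corpus=corpus)
reloaded = bm25s.BM25.load("bm25s_index", load_corpus=True)

results, scores = reloaded.retrieve(bm25s.tokenize("fast bm25"), k=1)
print(results[0, 0], scores[0, 0])
```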
Not sure. It's very small, which is nice, so it should be much faster than SPLADE. But they really didn't leave much info for benchmarking
In any case though, seems like some minor tweaks are needed to support it properly in llama-index (need to set that IDF param in the config for example)
Yeah conceptually seems useful, but TBD
Recalculating the whole index doesn't seem too nice.
I saw Qdrant has BM25, but it's all saved to the cloud, not to disk as a docstore
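For what it's worth, Qdrant can also run locally (in memory, or persisted to a local path) rather than only against the cloud, and the IDF behaviour mentioned above is a collection-level setting. A hedged sketch with qdrant-client, assuming a recent version that supports the IDF modifier; the collection name and sizes are made up:

```python
from qdrant_client import QdrantClient, models

# Local mode: ":memory:" keeps everything in RAM; path="./qdrant_data"
# persists to disk without any cloud instance.
client = QdrantClient(":memory:")

client.create_collection(
    collection_name="hybrid_demo",
    vectors_config={
        "dense": models.VectorParams(size=384, distance=models.Distance.COSINE),
    },
    sparse_vectors_config={
        # The IDF modifier tells Qdrant to apply IDF weighting server-side,
        # which is the config tweak referred to above.
        "bm25": models.SparseVectorParams(modifier=models.Modifier.IDF),
    },
)
```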