Inactive
Updated 5 months ago
is there a way to do hybrid retriever
Vi
5 months ago
Is there a way to do a hybrid retriever / BM25 retriever without storing to and loading from the filesystem (docstore / object store)?
7 comments
Logan M
5 months ago
If you manually wrote the BM25 algorithm to generate the sparse vectors and threw that into a vector store, then yes.
The problem with bm25 is that if any text is added or removed, the entire index needs to be recalculated
I've been meaning to switch to the newer (and faster) bm25s library so that people can at least save the retriever to disk
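As a rough illustration of the idea above, here is a minimal in-memory sketch that computes BM25 weights as sparse vectors (term index to weight) without touching the filesystem. Everything in it is an assumption made for illustration: the `InMemoryBM25` class, the whitespace tokenizer, the `k1`/`b` defaults, and the Lucene-style IDF formula are not taken from llama-index or any particular vector store. The `add` method refits from scratch, which shows why adding text forces every document's weights to be recalculated.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (assumption; real setups use better ones).
    return text.lower().split()


class InMemoryBM25:
    """Hypothetical helper: BM25 weights as sparse vectors, kept in memory."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        self.docs = [tokenize(d) for d in docs]
        self._fit()

    def _fit(self):
        # Corpus-wide statistics: average length, document frequency, IDF.
        n = len(self.docs)
        self.avgdl = sum(len(d) for d in self.docs) / n
        df = Counter(term for d in self.docs for term in set(d))
        self.vocab = {t: i for i, t in enumerate(sorted(df))}
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}

    def add(self, text):
        # Any new text changes avgdl and IDF, so all weights become stale.
        self.docs.append(tokenize(text))
        self._fit()

    def doc_vector(self, i):
        # Sparse vector {term_index: BM25 weight} for document i.
        doc = self.docs[i]
        norm = self.k1 * (1 - self.b + self.b * len(doc) / self.avgdl)
        return {
            self.vocab[t]: self.idf[t] * f * (self.k1 + 1) / (f + norm)
            for t, f in Counter(doc).items()
        }

    def query_vector(self, query):
        # Query terms get weight 1.0; unknown terms are dropped.
        return {self.vocab[t]: 1.0 for t in set(tokenize(query)) if t in self.vocab}


def dot(q, d):
    # Sparse dot product; with the vectors above this equals the BM25 score.
    return sum(w * d.get(i, 0.0) for i, w in q.items())


docs = [
    "the cat sat on the mat",
    "dogs chase the cat",
    "bm25 ranks documents by term weight",
]
index = InMemoryBM25(docs)
q = index.query_vector("cat mat")
scores = [dot(q, index.doc_vector(i)) for i in range(len(docs))]
best = max(range(len(docs)), key=scores.__getitem__)

# Adding a document shifts the weight of "cat" in document 0.
before = index.doc_vector(0)[index.vocab["cat"]]
index.add("another cat document")
after = index.doc_vector(0)[index.vocab["cat"]]
```

In a vector store that accepts sparse vectors, the dictionaries returned by `doc_vector` are what would be upserted. After a call to `add`, the previously stored vectors no longer match the corpus statistics, which is the recalculation problem described above.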
PwnosaurusRex
5 months ago
Any thoughts on BM42, @Logan M?
https://qdrant.tech/articles/bm42/#
Logan M
5 months ago
Not sure. It's very small, which is nice, so it should be much faster than SPLADE. But they really didn't leave much info for benchmarking.
Logan M
5 months ago
In any case, it seems like some minor tweaks are needed to support it properly in llama-index (the IDF param needs to be set in the config, for example).
PwnosaurusRex
5 months ago
Yeah conceptually seems useful, but TBD
Vi
5 months ago
Recalculating the whole index doesn't seem too nice.
Vi
5 months ago
I saw Qdrant has BM25, but everything is saved to the cloud, not to disk as a docstore.