Updated 6 months ago

Anyone here have a good doc on what

Anyone here have a good doc on what these options mean? I know default = vector, and sparse = bm25(f?)... If I use QueryFusionRetriever to build a "hybrid", is that functionally the same as just calling hybrid below? What about semantic_hybrid? How about text_search, is that just a keyword search? I thought sparse is basically text, but does some extra stuff to better weigh certain issues (e.g. overuse of the keyword)?

https://docs.llamaindex.ai/en/latest/api_reference/storage/vector_store/?h=vectorstorequery#llama_index.core.vector_stores.types.VectorStoreQueryMode

    DEFAULT = "default"
    SPARSE = "sparse"
    HYBRID = "hybrid"
    TEXT_SEARCH = "text_search"
    SEMANTIC_HYBRID = "semantic_hybrid"
22 comments
Tbh, most of these beyond default are only supported by certain vector dbs
postgresdb is my primary one
followed the docs on the site for rerank hybrid, but noticed the other options...
sparse is sparse embeddings yes (bm25, splade, etc.)

hybrid is dense + sparse

text is just normal text search (think ctrl+f)

no idea what semantic hybrid is lmao
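
To make the text vs. sparse distinction above concrete, here is a minimal pure-Python sketch. It is not LlamaIndex or any vector db's code: `text_search` and `bm25_scores` are hypothetical helper names, the tokenizer is a naive whitespace split, and `k1`/`b` are just commonly used BM25 defaults. Real sparse retrievers (SPLADE, db-specific implementations) will differ.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption for this sketch).
    return text.lower().split()


def text_search(query, docs):
    """Ctrl+F style: indices of docs that contain the query as a substring."""
    q = query.lower()
    return [i for i, d in enumerate(docs) if q in d.lower()]


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each doc for the query (toy implementation)."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    # Document frequency: in how many docs each term appears.
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates: repeating a keyword helps less and less.
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "postgres hybrid search",
    "postgres postgres postgres postgres postgres postgres",
    "vector search with embeddings",
]
matches = text_search("postgres", docs)
scores = bm25_scores("postgres", docs)
```

Substring search only says whether a doc matches; BM25 ranks the matches, and the keyword-stuffed second doc scores well under six times the first even though it repeats the term six times.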
postgres supports default and hybrid
oh wait, and text and sparse
is that identical to hybrid?
(minus the rerank-y bits? :P)
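
For comparison, fusing two separate retrievers on the client side usually means merging ranked lists. One common method is reciprocal rank fusion, sketched below in plain Python. `reciprocal_rank_fusion` is a hypothetical helper and `k=60` is just the conventional constant; whether this gives the same results as a store's built-in hybrid mode depends on how that store combines dense and sparse scores, so treat it as an illustration only.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists of doc ids; each hit contributes 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda d: scores[d], reverse=True)


dense_hits = ["a", "b", "c"]   # e.g. from a vector (default) retriever
sparse_hits = ["b", "d", "a"]  # e.g. from a BM25 / sparse retriever
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
```

Docs that show up in both lists ("a" and "b") float to the top, which is the basic idea behind most hybrid setups.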
ngl postgres is so confusing lol, I'm not entirely sure what's going on there
np, I've been testing the results and seeing what's returned
the interwebz calls things different stuff, sometimes text = bm25 = sparse
sometimes text is just ctrl F, etc...
Nice, will dig around there too.
If I build a vector store with postgres with hybrid on, does text_search use the tsvector column? While sparse does BM25 on...the same column? Essentially the same as performing a FTS natively in postgres?
I think both are using tsv
judging by the code
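
For reference, a native Postgres full-text search against a tsvector column looks roughly like the query built below. This is a sketch, not what the LlamaIndex Postgres store actually runs: the table name `documents`, the column `text_search_tsv`, and the helper `build_fts_query` are all made up for illustration. Note that `ts_rank` is Postgres's built-in ranking function, which is not BM25.

```python
def build_fts_query(table, tsv_column, query_text, limit=5):
    """Build a parameterized full-text search query (psycopg-style %s params).

    `table` and `tsv_column` are interpolated directly, so they must come
    from trusted code, never from user input.
    """
    sql = (
        f"SELECT id, text, "
        f"ts_rank({tsv_column}, plainto_tsquery('english', %s)) AS rank "
        f"FROM {table} "
        f"WHERE {tsv_column} @@ plainto_tsquery('english', %s) "
        f"ORDER BY rank DESC LIMIT %s"
    )
    # The query text is passed twice: once for ranking, once for matching.
    params = (query_text, query_text, limit)
    return sql, params


# Hypothetical table and column names.
sql, params = build_fts_query("documents", "text_search_tsv", "product name")
```

Comparing what this returns against the store's text_search and sparse modes is one way to check whether they behave the same as native FTS.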
Happy to see this. It’s next on my list to try to improve retrieval of product names.