Anyone here have a good doc on what

At a glance

The community members are discussing the different query modes available in the LlamaIndex library, such as default, sparse, hybrid, text_search, and semantic_hybrid. They are trying to understand the differences between these modes and how they work, especially in the context of using PostgreSQL as the vector store.

The community members note that not all of these modes are supported by all vector databases. For PostgreSQL, default and hybrid are named first, with text_search and sparse added later in the thread. They also discuss using QueryFusionRetriever to build a "hybrid" query, and whether that is functionally the same as using the hybrid mode directly.

The community members provide some explanations of the different modes, such as sparse being related to sparse embeddings and BM25, and text_search being a simple keyword search. However, they are unsure about the exact differences and how they work in PostgreSQL.

Overall, the community members are trying to understand the various query modes and how they can be used to improve their retrieval of product names, but they are still exploring and testing the different


PwnosaurusRex

Anyone here have a good doc on what these options mean? I know default = vector, and sparse = bm25(f?)... if I use QueryFusionRetriever to build a "hybrid", is that functionally the same as just calling hybrid below? What about semantic_hybrid? How about text_search... is that just a keyword search? I thought sparse is basically text, but does some extra stuff to better weigh certain issues (i.e. overuse of the keyword)?

https://docs.llamaindex.ai/en/latest/api_reference/storage/vector_store/?h=vectorstorequery#llama_index.core.vector_stores.types.VectorStoreQueryMode


    DEFAULT = "default"
    SPARSE = "sparse"
    HYBRID = "hybrid"
    TEXT_SEARCH = "text_search"
    SEMANTIC_HYBRID = "semantic_hybrid"

22 comments

Logan M

Tbh, most of these beyond default are only supported by certain vector dbs

PwnosaurusRex

postgresdb is my primary one

PwnosaurusRex

followed the docs on the site for rerank hybrid, but noticed the other options...

Logan M

sparse is sparse embeddings yes (bm25, splade, etc.)

hybrid is dense + sparse

text is just normal text search (think ctrl+f)

no idea what semantic hybrid is lmao
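
To make the "ctrl+f" versus sparse distinction above concrete, here is a minimal pure-Python sketch. It is not LlamaIndex or Postgres code; the function names, the toy documents, and the `k1`/`b` defaults are assumptions chosen for illustration. It contrasts a plain substring match with a BM25-style score, where repeated use of a keyword gives diminishing returns.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (an assumption; real systems stem and drop stop words).
    return text.lower().split()


def keyword_match(query, docs):
    # "ctrl+F" style: indices of documents containing the query as a substring.
    q = query.lower()
    return [i for i, doc in enumerate(docs) if q in doc.lower()]


def bm25_scores(query, docs, k1=1.5, b=0.75):
    # BM25-style score of every document against the query.
    tokenized = [tokenize(doc) for doc in docs]
    n = len(tokenized)
    avgdl = sum(len(toks) for toks in tokenized) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates, and long documents are penalized.
            length_norm = 1 - b + b * len(toks) / avgdl
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


docs = [
    "red widget",
    "widget widget widget widget spare parts catalog",
    "blue gadget",
]
matches = keyword_match("widget", docs)
scores = bm25_scores("widget", docs)
```

The substring match only says which documents contain the keyword, while the BM25-style score ranks them: the second document repeats "widget" four times but scores well under four times the first one.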

Logan M

postgres supports default and hybrid

PwnosaurusRex

so using rerank fusion in the example: https://docs.llamaindex.ai/en/latest/examples/vector_stores/postgres/?h=postgres+hybrid#improving-hybrid-search-with-queryfusionretriever

Logan M

oh wait, and text and sparse

PwnosaurusRex

is that identical to hybrid?

PwnosaurusRex

(minus the rerank-y bits? :P)
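
For context on the fusion question, reciprocal rank fusion is one common way to merge two ranked result lists, for example one from a vector retriever and one from a sparse retriever. The sketch below is pure Python and only an illustration: the function name, the document ids, and the constant `k=60` are assumptions, and whether the linked example or the built-in hybrid mode combines results this way is not confirmed in this thread.

```python
def reciprocal_rank_fusion(ranked_lists, k=60):
    # Each document earns 1 / (k + rank) from every list it appears in.
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical result ids from two retrievers over the same index.
vector_hits = ["a", "b", "c"]
sparse_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([vector_hits, sparse_hits])
```

Documents that both retrievers return ("a" and "b" here) rise to the top because they collect a contribution from each list. A store-level hybrid mode may weight or merge the two result sets differently, so comparing the returned nodes side by side is the safest check.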

Logan M

ngl postgres is so confusing lol, I'm not entirely sure what's going on there

PwnosaurusRex

np, I've been testing the results and seeing what's returned

Logan M

Attachment

PwnosaurusRex

the interwebz calls things different stuff, sometimes text = bm25 = sparse

PwnosaurusRex

sometimes text is just ctrl F, etc...

PwnosaurusRex

Nice, will dig around there too.

PwnosaurusRex

If I build a vector store with postgres with hybrid on, does text_search use the tsvector column? While sparse does BM25 on...the same column? Essentially the same as performing a FTS natively in postgres?

Logan M

I think both are using tsv

judging by the code
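
As a rough picture of what a native Postgres full-text search over a tsvector column looks like, here is a small Python helper that only builds the SQL string and its parameters. The table name, the column name `text_search_tsv`, and the helper itself are hypothetical, and this is not taken from the LlamaIndex source. Note that Postgres's built-in `ts_rank` is a frequency-based ranking, not BM25.

```python
def build_fts_query(table, query_text, limit=5,
                    tsv_column="text_search_tsv", language="english"):
    # table, tsv_column and language are interpolated directly into the SQL,
    # so only pass trusted values; user input goes through the %s placeholders.
    sql = (
        f"SELECT id, ts_rank({tsv_column}, plainto_tsquery('{language}', %s)) AS rank "
        f"FROM {table} "
        f"WHERE {tsv_column} @@ plainto_tsquery('{language}', %s) "
        "ORDER BY rank DESC LIMIT %s"
    )
    # The query text appears twice: once for ranking, once for filtering.
    params = (query_text, query_text, limit)
    return sql, params


sql, params = build_fts_query("data_items", "blue widget")
```

The `(sql, params)` pair is shaped for a DB-API driver call such as `cursor.execute(sql, params)`. If text_search and sparse both read the same tsvector column, as guessed above, they would differ only in how the query is built and ranked.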

nice

thanks boss

Happy to see this. It’s next on my list to try to improve retrieval of product names.
