Hybrid

At a glance

A community member is exploring hybrid search using the Qdrant vector DB integration. After enabling hybrid search, their Docker image grew to 8GB and the instance could no longer handle the workload. Another community member explains that Qdrant requires a model to generate sparse embeddings for hybrid search, that this model is loaded in-process by default, and that this can be customized to ping a hosted endpoint instead. A link is provided to documentation showing the functions for customizing hybrid search with Qdrant.

Turner

Hi, not sure if this is the right place for this, but I'm currently exploring hybrid search using the Qdrant vector DB integration. I currently have the embeddings in the online instance (free tier) of Qdrant.
The issue is that when using just Qdrant without hybrid search, the entire RAG pipeline runs smoothly (including the hosted backend service for it).
After enabling hybrid search, it seems the PyTorch library became a dependency (I may be wrong about when torch was needed, but it is a dependency when using Qdrant).
After enabling hybrid search, my Docker image size bloats to 8GB, and the instance that was previously hosting and running the app is suddenly unable to handle the workload.

Now I'm not sure if that is normal behaviour, but I'm asking to see whether the only workaround is to upgrade the instance I'm working on, or whether I missed something.

4 comments

Logan M

This is correct: Qdrant requires a model to generate sparse embeddings for hybrid search to work.

By default, this just loads a model in-process.

You can customize this to ping a hosted endpoint instead, though.

Logan M

https://docs.llamaindex.ai/en/stable/examples/vector_stores/qdrant_hybrid/#advanced-customizing-hybrid-search-with-qdrant

Logan M

That shows the functions for customizing it.
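
For anyone landing here later, a rough sketch of what "ping a hosted endpoint" could look like. The linked docs describe passing `sparse_doc_fn` and `sparse_query_fn` to `QdrantVectorStore`, each taking a list of texts and returning a tuple of (indices, values). Everything else below is an assumption for illustration: the endpoint URL, the request and response JSON shape, and the helper names `make_remote_sparse_fn` and `_post_json` are hypothetical and would need to match whatever service you actually host.

```python
import json
from typing import Callable, Dict, List, Tuple
from urllib import request

# One list of token indices and one list of weights per input text.
SparseBatch = Tuple[List[List[int]], List[List[float]]]

# Hypothetical URL; replace with your own hosted sparse-embedding service.
SPARSE_ENDPOINT = "https://example.com/sparse-embed"


def _post_json(url: str, payload: Dict) -> Dict:
    """POST a JSON payload and return the decoded JSON response."""
    req = request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def make_remote_sparse_fn(
    url: str = SPARSE_ENDPOINT,
    post: Callable[[str, Dict], Dict] = _post_json,
) -> Callable[[List[str]], SparseBatch]:
    """Build a function that fetches sparse vectors from a remote service.

    Assumed (hypothetical) response shape:
    {"vectors": [{"indices": [...], "values": [...]}, ...]}
    """

    def sparse_fn(texts: List[str]) -> SparseBatch:
        body = post(url, {"texts": texts})
        indices = [[int(i) for i in v["indices"]] for v in body["vectors"]]
        values = [[float(x) for x in v["values"]] for v in body["vectors"]]
        return indices, values

    return sparse_fn


# Wiring it in (needs llama-index-vector-stores-qdrant and a Qdrant client;
# "my_collection" and `client` are placeholders):
#
# from llama_index.vector_stores.qdrant import QdrantVectorStore
# vector_store = QdrantVectorStore(
#     collection_name="my_collection",
#     client=client,
#     enable_hybrid=True,
#     sparse_doc_fn=make_remote_sparse_fn(),
#     sparse_query_fn=make_remote_sparse_fn(),
# )
```

The idea is that the sparse model then lives in the separate service rather than in the app's own process, so the app image should not need to bundle the model or torch for this step. Whether that brings the image back to its earlier size depends on what else pulls in torch.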

Turner

thanks, I will check that out 👍
