I'm looking to improve the context retrieval performance of my RAG system. We're currently using Qdrant with approximately 100k vectors. I experimented with Qdrant's hybrid search following this documentation: https://docs.llamaindex.ai/en/stable/examples/vector_stores/qdrant_hybrid/. While I really liked the improved results from hybrid search, the response time increased dramatically, from around 700 ms to between 8 and 11 seconds, which is impractical for my application.
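For context, my setup follows the linked example roughly like this (simplified; the collection name, connection details, and top-k values are placeholders, not my exact configuration):

```python
from qdrant_client import QdrantClient
from llama_index.core import VectorStoreIndex, StorageContext
from llama_index.vector_stores.qdrant import QdrantVectorStore

client = QdrantClient(host="localhost", port=6333)  # placeholder connection

# enable_hybrid=True makes LlamaIndex produce sparse vectors
# (via a local FastEmbed model by default) alongside the dense ones.
vector_store = QdrantVectorStore(
    collection_name="my_docs",  # placeholder name
    client=client,
    enable_hybrid=True,
    batch_size=20,
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_vector_store(vector_store)

# Hybrid query mode: dense and sparse results are retrieved and fused.
query_engine = index.as_query_engine(
    vector_store_query_mode="hybrid",
    similarity_top_k=2,
    sparse_top_k=12,
)
```

The 8–11 s latency appears once `vector_store_query_mode="hybrid"` is enabled; with plain dense search the same queries return in roughly 700 ms.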
Does anyone have suggestions on how to optimize this response time?