
Updated 2 months ago

Hey everyone, I'm in a bit of trouble

Hey everyone, I'm in a bit of trouble right now and hope you can help me. I'm running LlamaIndex with Qdrant in hybrid mode, which requires uploading both dense and sparse vectors. According to the docs, LlamaIndex generates the sparse vectors locally, and as a result the upload is very slow.
What solutions do we currently have to load our data efficiently? Is there any model that is preferred over others?
Also, if anyone can share a notebook with some related code, we would really appreciate it.
Thanks a lot, guys!
1 comment
Generating sparse vectors is definitely a bottleneck. We deployed our own model on the Hugging Face Inference API, and then provided the customization hooks to call it when generating sparse vectors.

Basically, if you can run the sparse embeddings on a GPU (either locally or behind an API), that's the way to do it.
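For anyone wanting to try this, here is a minimal sketch of the hook approach. `QdrantVectorStore` accepts `sparse_doc_fn` / `sparse_query_fn` callables that must return a pair of (token indices, weights) lists per document; the endpoint URL, the response shape (`[{token_id: weight}, ...]`), and the `remote_sparse_fn` name are assumptions about our deployment, not official API, so adapt them to your own service:

```python
# Sketch (assumptions noted inline): wiring a remote sparse-embedding
# service into LlamaIndex's Qdrant hybrid mode instead of computing
# sparse vectors locally.
from typing import Dict, List, Tuple


def weights_to_sparse(
    batch: List[Dict[int, float]],
) -> Tuple[List[List[int]], List[List[float]]]:
    """Convert per-document {token_id: weight} maps into the
    (indices, values) pair expected from sparse_doc_fn / sparse_query_fn."""
    indices = [sorted(doc) for doc in batch]
    values = [[doc[i] for i in idx] for doc, idx in zip(batch, indices)]
    return indices, values


def remote_sparse_fn(texts: List[str]):
    # Hypothetical HTTP call to a SPLADE-style sparse model behind an
    # inference API; the URL and JSON shape are placeholders.
    import requests

    resp = requests.post(
        "https://YOUR-SPARSE-ENDPOINT/embed", json={"inputs": texts}
    )
    # JSON object keys arrive as strings; coerce back to ints/floats.
    batch = [{int(k): float(v) for k, v in doc.items()} for doc in resp.json()]
    return weights_to_sparse(batch)


# Then pass the hooks when building the store (uncomment in a real setup):
# vector_store = QdrantVectorStore(
#     client=client,
#     collection_name="hybrid_demo",
#     enable_hybrid=True,
#     sparse_doc_fn=remote_sparse_fn,
#     sparse_query_fn=remote_sparse_fn,
# )
```

Batching the texts into a single request is what recovers the throughput here: the GPU-backed service embeds the whole batch at once instead of LlamaIndex running the sparse model document-by-document on CPU.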