Updated 3 months ago

late-chunking/chunked_pooling at main · ...

Hi, just a random feature request. Is it possible to create an embeddings class for jina.ai that leverages their new "late embeddings" approach via HF? Here is the sample code, which still has a bunch of custom methods: https://github.com/jina-ai/late-chunking/tree/main/chunked_pooling

blog: https://jina.ai/news/late-chunking-in-long-context-embedding-models/
4 comments
I think this could be implemented as a node parser/text splitter?

But also, the accuracy improvements don't entirely convince me that it's worth the effort to personally contribute this myself 😅
🌶️
But for real, looking at the code here, most of it could be copy-pasted into a node-parser integration package implementing the BaseNodeParser class
https://github.com/jina-ai/late-chunking/blob/main/chunked_pooling/chunking.py
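For anyone skimming, the core idea behind late chunking as described in the linked blog is: embed the whole document once at the token level, then mean-pool the token vectors inside each chunk's span. A minimal sketch of just that pooling step is below. It is not the code from the jina-ai repo and not a BaseNodeParser implementation; the function name `late_chunk_pool` is hypothetical, and the token embeddings are hard-coded toy values standing in for the output of a long-context model.

```python
from typing import List, Sequence, Tuple


def late_chunk_pool(
    token_embeddings: Sequence[Sequence[float]],
    spans: Sequence[Tuple[int, int]],
) -> List[List[float]]:
    """Mean-pool token embeddings inside each (start, end) token span.

    Hypothetical helper for illustration only. `token_embeddings` is assumed
    to come from ONE forward pass over the full document, so every token
    vector already carries document-wide context. `end` is exclusive.
    """
    pooled: List[List[float]] = []
    for start, end in spans:
        if not 0 <= start < end <= len(token_embeddings):
            raise ValueError(f"invalid span ({start}, {end})")
        window = token_embeddings[start:end]
        dim = len(window[0])
        # Average each dimension over the tokens that fall in this chunk.
        pooled.append(
            [sum(vec[i] for vec in window) / len(window) for i in range(dim)]
        )
    return pooled


# Toy data: 4 tokens with 2-dimensional embeddings, split into 2 chunks.
toy_tokens = [[1.0, 0.0], [3.0, 2.0], [0.0, 4.0], [2.0, 2.0]]
toy_spans = [(0, 2), (2, 4)]
chunk_vectors = late_chunk_pool(toy_tokens, toy_spans)
print(chunk_vectors)  # [[2.0, 1.0], [1.0, 3.0]]
```

In a node-parser integration, the spans would presumably come from whatever splitter produces the chunk boundaries, mapped from character offsets to token offsets.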


I see it actually uses our semantic splitter too 👀