ColBERT actually has its own storage format. We're kind of stuck with what they implemented unless we want to reimplement the whole algorithm (I think, anyway; I haven't looked too deeply into it).
However, traditional vector dbs store a single dense vector per document, while ColBERT produces a bag of per-token vectors and scores them with late interaction (MaxSim), so the two models are quite different. Separately, a few vector dbs support sparse vectors alongside dense ones (Qdrant and Pinecone).
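To make the distinction concrete, here's a minimal numpy sketch (not ColBERT's actual code, and the dimensions are made up): a traditional retriever scores one vector pair with one dot product, while late interaction lets every query token pick its best-matching document token and sums those maxima.

```python
import numpy as np

rng = np.random.default_rng(0)

# Traditional dense retrieval: one vector per query/document, one dot product.
q_vec = rng.standard_normal(128)
d_vec = rng.standard_normal(128)
dense_score = q_vec @ d_vec

# ColBERT-style late interaction: a matrix of per-token vectors on each side.
# (Real ColBERT normalizes its embeddings; omitted here for brevity.)
q_toks = rng.standard_normal((8, 128))    # 8 query tokens
d_toks = rng.standard_normal((300, 128))  # 300 document tokens

# MaxSim: each query token takes the max similarity over all document
# tokens, and the per-token maxima are summed into the document score.
sim = q_toks @ d_toks.T                   # (8, 300) token-level similarities
colbert_score = sim.max(axis=1).sum()
```

That per-token document matrix is also why ColBERT needs its own storage format: you're indexing hundreds of vectors per document instead of one.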
Did you look into RAGatouille? It's a full pipeline for ColBERT/ColBERTv2 and similar late-interaction retrievers. They have connector code for LangChain and an integration with Vespa, and their README also links to the LlamaIndex codebase (lots of updates on the author's Twitter): https://github.com/bclavie/RAGatouille
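If it helps, here's a rough sketch of what using it looks like, based on my reading of the README (method names, parameters, and the checkpoint name are taken from there; treat this as an outline rather than tested code):

```python
from ragatouille import RAGPretrainedModel

# Load a pretrained ColBERTv2 checkpoint.
RAG = RAGPretrainedModel.from_pretrained("colbert-ir/colbertv2.0")

# Build a ColBERT index over a small document collection.
RAG.index(
    collection=[
        "ColBERT stores per-token embeddings for each document.",
        "Traditional dense retrievers store one vector per document.",
    ],
    index_name="demo_index",
)

# Late-interaction search against the index.
results = RAG.search(query="how does ColBERT store documents?", k=3)
```

The library handles the ColBERT index format for you, so you don't have to touch the upstream storage code directly.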
@jimmy6dof thanks, I'll dive deeper into it. I have an existing Qdrant cluster with 100M+ vectors, so I need something that works with what I already have.