
Can we use llama_cpp embeddings as well? I.e., instead of HuggingFaceEmbedding, use https://github.com/abetlen/llama-cpp-python/blob/main/examples/high_level_api/high_level_api_embedding.py (llm.create_embedding("Hello world!"))? Does it make sense to standardize on one serving layer rather than mixing them?
Usually, for the best embeddings, you want a model actually trained for embeddings/retrieval (like bge-base-en-v1.5).
But otherwise, we would have to add an embeddings integration for llama.cpp.
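For reference, llama-cpp-python's `create_embedding` returns an OpenAI-style response dict with the vector under `data[0]["embedding"]`. Below is a minimal, self-contained sketch of consuming such a response for retrieval-style similarity scoring; the response literal is a fabricated stand-in (three made-up numbers), not real model output, so no model file is needed to run it:

```python
import math

# Fabricated stand-in for the dict that llm.create_embedding("...") returns
# in llama-cpp-python's high-level API. The vector values are illustrative only.
fake_response = {
    "object": "list",
    "data": [{"object": "embedding", "embedding": [0.1, 0.3, 0.5], "index": 0}],
    "model": "example-model",
}

def extract_embedding(response):
    """Pull the first embedding vector out of an OpenAI-style embedding response."""
    return response["data"][0]["embedding"]

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

query_vec = extract_embedding(fake_response)
doc_vec = [0.1, 0.3, 0.5]  # identical vector, so similarity rounds to 1.0
print(round(cosine_similarity(query_vec, doc_vec), 6))
```

Whichever serving layer you pick, the downstream retrieval code only needs the raw vector, so standardizing on one embedding source mostly matters for consistency (the same text must always map to the same vector space).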
Understood. But for serving, would you suggest the native HuggingFaceEmbedding or https://github.com/huggingface/optimum?