
Updated 8 months ago

i am looking to use xinference through their openai-compatible rest api. for llm usage there is llama-index-llms-openai-like, but there is no equivalent for embeddings.
20 comments
do we need to somehow extend the openai embedding metadata with the hosted xinference models?
or should we wait for llama-index-embeddings-openai-like (i.e. open a feature request)
You can try creating a custom embedding class: https://docs.llamaindex.ai/en/stable/examples/embeddings/custom_embeddings.html#custom-embeddings-implementation

Define your endpoint and update the embedding methods so that they call your model.
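For reference, here is a rough, stdlib-only sketch of the HTTP call such a custom class would wrap, assuming the server follows the OpenAI embeddings request/response shape (`POST <base_url>/embeddings` with `model` and `input`, vectors returned under `data[i].embedding`). The function names and the placeholder API key are made up for illustration and are not part of llama-index or xinference.

```python
import json
import urllib.request


def build_embedding_request(base_url, model, texts, api_key="not-needed"):
    """Build a POST request for an OpenAI-style /embeddings endpoint."""
    payload = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/embeddings",
        data=payload,
        headers={
            "Content-Type": "application/json",
            # Many self-hosted servers ignore the key; this value is a placeholder.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def parse_embeddings(response_body):
    """Return the vectors from an OpenAI-style response, ordered by input index."""
    items = sorted(response_body["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def embed(base_url, model, texts):
    """Send the request and return one vector per input text."""
    request = build_embedding_request(base_url, model, texts)
    with urllib.request.urlopen(request) as response:
        return parse_embeddings(json.load(response))
```

A custom embedding class would then call something like `embed(...)` from its query and text embedding methods.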


Second option would be to use https://github.com/run-llama/llama_index/blob/main/llama-index-integrations/embeddings/llama-index-embeddings-ollama/llama_index/embeddings/ollama/base.py

Not sure if this will work out of the box by just providing the base_url to your model but can be a good place to start
Also: if you want to contribute this feature, you are most welcome!! 💪
@WhiteFang_Jr i am not familiar with ollama, but from https://github.com/ollama/ollama/issues/305 it's unclear if they also have an openai-compatible api
the custom embedding class is not what i want, i think
i will try to monkeypatch the openai predefined models
and see how far i get with that
the new code structure is a bit ... euhm ... complex
in volume it's 90% poetry.lock files
It's pretty organized tbh

For openai-like embeddings, you can just change the api_base and model_name kwargs for OpenAIEmbedding
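A minimal sketch of that suggestion, assuming the llama-index-embeddings-openai package is installed. The helper names, the base URL and the model UID in the usage note are placeholders, not values from this thread, and the dummy API key assumes the server does not check it.

```python
def openai_like_embedding_kwargs(base_url, model_uid, api_key="not-needed"):
    """Collect the kwargs that point OpenAIEmbedding at a self-hosted server."""
    return {
        "model_name": model_uid,  # name/UID of the model as served by the endpoint
        "api_base": base_url,     # e.g. the server's /v1 root
        "api_key": api_key,       # placeholder; assumes the server ignores it
    }


def build_embed_model(base_url, model_uid):
    """Create an OpenAIEmbedding that talks to an OpenAI-compatible endpoint."""
    # Imported lazily so this module loads even without llama-index installed.
    from llama_index.embeddings.openai import OpenAIEmbedding

    return OpenAIEmbedding(**openai_like_embedding_kwargs(base_url, model_uid))
```

Usage would look something like `build_embed_model("http://localhost:9997/v1", "my-embedding-model").get_text_embedding("hello")`, with the URL and model UID swapped for whatever your xinference deployment actually exposes.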
@Logan M thx, i'll have a look
wrt the code, i guess it's in some hybrid state and the intent is to make them all separate repos? anyway, atm it looks odd but i understand why this is better
the get_engine code won't like unknown models
but it looks easy to patch it
if you pass model_name="model" it will skip get_engine
@Logan M ah, thanks. now i see it.
@Logan M wrt llm openai vs openai-like, is the only difference the context length and whether a model is chat and/or generate?
pretty much 🙂