VLLM Configuration Issues in Llama Index

I noticed a couple of things.
  1. vLLM needs the model parameter in the payload, but I don't see it being sent from llama-index-llms-vllm.
  2. Why is the payload to /chat/completions being translated into a prompt? It should be messages: [{role:'',content:''},{role:'',content:''}] (see the payload sketch after this list).
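For reference, a minimal sketch of the request shape that vLLM's OpenAI-compatible /v1/chat/completions endpoint expects, with both a model field and a messages list; the server URL and model name below are assumed placeholders, not values taken from llama-index-llms-vllm.

```python
import requests

# Minimal sketch of an OpenAI-style chat request to a vLLM server.
# Base URL and model name are assumed placeholders for illustration.
payload = {
    "model": "meta-llama/Meta-Llama-3-8B-Instruct",  # the request fails if "model" is missing
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
}
resp = requests.post("http://localhost:8000/v1/chat/completions", json=payload)
print(resp.json()["choices"][0]["message"]["content"])
```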
For the second point,
I think the structure of the vLLM server class was likely created that way when it was first added.
So this needs modification then.
The reason is that you cannot use /chat/completions; we should use /completions, because the payload is not compatible with the OpenAI format.
I hope I am making it clear.
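To make the endpoint difference concrete, here is a sketch of the prompt-style payload that vLLM's OpenAI-compatible /v1/completions endpoint accepts, which matches the flat prompt string the class currently builds; the URL and model name are assumed placeholders.

```python
import requests

# Sketch of a prompt-style request to vLLM's OpenAI-compatible /v1/completions
# endpoint; unlike /chat/completions, it takes a single "prompt" string.
# Base URL and model name are assumed placeholders for illustration.
payload = {
    "model": "meta-llama/Meta-Llama-3-8B-Instruct",
    "prompt": "Hello! How are you?",
    "max_tokens": 64,
}
resp = requests.post("http://localhost:8000/v1/completions", json=payload)
print(resp.json()["choices"][0]["text"])
```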