
I am having an issue trying to use Open-Orca/OpenOrca-Platypus2-13B. I am getting [/INST] all over the place and the model keeps chatting with itself. I am currently using vLLM as an "OpenAI-like" server.

I looked around and found an issue that suggested using the stop parameter in the API. This made everything work a lot better:

Plain Text
curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "Open-Orca/OpenOrca-Platypus2-13B",
        "stop": ["[INST]", "[/INST]"],
        "messages": [
            {"role": "user", "content": "What is the square root of two"}
        ] }'


But I can't tell whether there is a way for LlamaIndex to do this as well. I have read through the docs and looked at the code but couldn't figure out if there was an easier way to do this. Any ideas?
You can set this under additional_kwargs in the OpenAILike constructor.
I don't know for certain whether vLLM is doing any prompt templating, though. A sketch of this is below.
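Here is a minimal sketch of that idea, assuming the vLLM server from the curl example above is reachable at http://localhost:8000/v1; the api_key value and the import path are assumptions (the path varies across LlamaIndex versions), so adjust for your setup:

Plain Text
from llama_index.llms.openai_like import OpenAILike

# Sketch: pass the stop sequences through additional_kwargs so they are
# forwarded to the completions API on every request.
llm = OpenAILike(
    model="Open-Orca/OpenOrca-Platypus2-13B",
    api_base="http://localhost:8000/v1",  # assumed vLLM address
    api_key="fake",  # vLLM does not verify the key by default (assumption)
    is_chat_model=True,
    additional_kwargs={"stop": ["[INST]", "[/INST]"]},
)

response = llm.complete("What is the square root of two?")
print(response.text)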

I might recommend also setting up messages_to_prompt and completion_to_prompt hooks to properly format requests using OpenAILike πŸ‘€ (see the sketch below)
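As a sketch of those hooks, assuming a Llama-2-style [INST] template (check the model card for the actual format, since OpenOrca-Platypus2 may use a different one); these hooks only take effect when the model is treated as a completion endpoint (is_chat_model=False):

Plain Text
from llama_index.llms.openai_like import OpenAILike

# Hypothetical formatting hooks; the exact template should come from
# the model card. This assumes a Llama-2-style [INST] wrapper.
def messages_to_prompt(messages):
    prompt = ""
    for message in messages:
        if message.role == "system":
            prompt += f"<<SYS>>\n{message.content}\n<</SYS>>\n\n"
        elif message.role == "user":
            prompt += f"[INST] {message.content} [/INST]\n"
        elif message.role == "assistant":
            prompt += f"{message.content}\n"
    return prompt

def completion_to_prompt(completion):
    return f"[INST] {completion} [/INST]\n"

llm = OpenAILike(
    model="Open-Orca/OpenOrca-Platypus2-13B",
    api_base="http://localhost:8000/v1",  # assumed vLLM address
    api_key="fake",  # assumption, as above
    is_chat_model=False,  # hooks apply on the completion path
    messages_to_prompt=messages_to_prompt,
    completion_to_prompt=completion_to_prompt,
)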
Thanks! I checked the docs:
https://docs.llamaindex.ai/en/stable/api_reference/llms/openai_like/

And I did not see the option for messages_to_prompt for OpenAILike.
I will try adding this to the kwargs too.
The stop entry in the kwargs seems to have almost solved it,
except now I get a [/SYS] at the end? So odd.
Ah that's awesome. Thank you