If you run a server locally and that server already serves a model behind an endpoint, all you likely need are the endpoint URL and credentials:
```bash
# ollama, for example
OLLAMA_LLM_MAX_TOKENS=20_000
OLLAMA_MODEL=llama3.1
OLLAMA_BASE_URL=http://localhost:11434
```

```python
# use
from llama_index.llms.ollama import Ollama

llm = Ollama(**config)
```
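To make that `config` concrete, here is a minimal sketch of the packing step. It assumes the variables above are already in the process environment (exported, or loaded with something like python-dotenv), and the mapping of the max-token value onto Ollama's `num_ctx` option is an assumption about how you want it used:

```python
import os

from llama_index.llms.ollama import Ollama

# Pack the OLLAMA_* variables into constructor keyword arguments.
config = {
    "model": os.environ["OLLAMA_MODEL"],
    "base_url": os.environ["OLLAMA_BASE_URL"],
    # Assumption: surface the max-token setting as Ollama's num_ctx option.
    "additional_kwargs": {"num_ctx": int(os.environ["OLLAMA_LLM_MAX_TOKENS"])},
}

llm = Ollama(**config)
print(llm.complete("Reply with a single word."))
```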
```bash
# or LM Studio
LMSTUDIO_API_BASE=http://localhost:1234/v1
LMSTUDIO_API_KEY=lm-studio
```

```python
# use
from llama_index.llms.openai import OpenAI

llm = OpenAI(**config)
```
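The same idea works for LM Studio, since it exposes an OpenAI-compatible API. This sketch only packs the base URL and key, and assumes the server answers with whatever model is currently loaded regardless of the model name the client sends:

```python
import os

from llama_index.llms.openai import OpenAI

# Pack the LMSTUDIO_* variables into OpenAI-client keyword arguments.
config = {
    "api_base": os.environ["LMSTUDIO_API_BASE"],
    "api_key": os.environ["LMSTUDIO_API_KEY"],
}

llm = OpenAI(**config)
print(llm.complete("Reply with a single word."))
```

If the `OpenAI` wrapper insists on a known OpenAI model name, `OpenAILike` from the `llama-index-llms-openai-like` package accepts arbitrary names and takes the same `api_base`/`api_key` arguments.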
Just convert the options and pack the `config` as needed.
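If you juggle several backends, a small helper can do that conversion generically. This is only a sketch, assuming each variable maps onto a constructor keyword once the prefix is stripped and the name is lowercased; keys that don't match a parameter (like `LLM_MAX_TOKENS` above) still need to be renamed or dropped by hand:

```python
import os


def pack_config(prefix: str) -> dict:
    """Collect PREFIX_* environment variables into a kwargs dict."""
    config = {}
    for name, value in os.environ.items():
        if not name.startswith(prefix):
            continue
        key = name[len(prefix):].lower()  # e.g. OLLAMA_BASE_URL -> base_url
        try:
            config[key] = int(value)      # numeric values; underscores like 20_000 are fine
        except ValueError:
            config[key] = value
    return config


# e.g. pack_config("OLLAMA_")
# -> {"llm_max_tokens": 20000, "model": "llama3.1", "base_url": "http://localhost:11434"}
```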