
I started an OpenAI-compatible server

I started an OpenAI-compatible server using vLLM with the model "NousResearch/Meta-Llama-3-8B-Instruct", then I constructed OpenAI(model=model, openai_api_key=openai_api_key, openai_api_base=openai_api_base, request_timeout=60.0), but got an "Unknown model" error. How do I use a local vLLM model?
If you run a server locally, and that server is configured to run a model behind an endpoint, all you likely need is the endpoint and credentials:

Plain Text
# Ollama, for example (.env-style settings)
OLLAMA_LLM_MAX_TOKENS=20_000
OLLAMA_MODEL=llama3.1
OLLAMA_BASE_URL=http://localhost:11434
# use
from llama_index.llms.ollama import Ollama
llm = Ollama(**config)  # config = the settings above mapped to constructor kwargs

# or LM Studio
LMSTUDIO_API_BASE=http://localhost:1234/v1
LMSTUDIO_API_KEY=lm-studio
# use
from llama_index.llms.openai import OpenAI
llm = OpenAI(**config)  # config = api_base / api_key from the settings above


Just convert the options and pack the config as needed.
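
As a minimal sketch of what "convert the options and pack the config" can look like, assuming current llama-index keyword names (model, base_url, request_timeout for Ollama; model, api_base, api_key for the OpenAI-style classes) and the default vLLM port 8000. OpenAILike is not in the original answer; it ships in the llama-index-llms-openai-like package and takes the same kwargs as OpenAI, but without the "Unknown model" name check:

Plain Text
# Minimal sketch: the .env-style options above, packed as constructor kwargs.
from llama_index.llms.ollama import Ollama
from llama_index.llms.openai_like import OpenAILike

# Local Ollama server
ollama_config = {
    "model": "llama3.1",
    "base_url": "http://localhost:11434",
    "request_timeout": 60.0,
}
llm = Ollama(**ollama_config)

# Any OpenAI-compatible server (vLLM, LM Studio, ...).
# OpenAI(**config) as above works when the client recognizes the model name;
# OpenAILike accepts arbitrary model names.
vllm_config = {
    "model": "NousResearch/Meta-Llama-3-8B-Instruct",
    "api_base": "http://localhost:8000/v1",  # assumption: default vLLM port
    "api_key": "not-needed",                 # placeholder; vLLM ignores it unless --api-key is set
    "is_chat_model": True,
}
llm = OpenAILike(**vllm_config)

print(llm.complete("Hello"))

If you stay with the plain OpenAI class, the same api_base/api_key packing applies; the OpenAILike variant is just the one intended for OpenAI-compatible servers that expose non-OpenAI model names.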