Hi, I am going crazy trying to find a solution to this. I am working with a proxy server for OpenAI models, and I'm using an SSH tunnel to reach the server on my localhost.
from openai import OpenAI

# client pointed at the SSH tunnel to the proxy instead of api.openai.com
client = OpenAI(base_url="http://localhost:8000", api_key="sk-xxx")

response = client.chat.completions.create(
    model="gpt_35_turbo",  # the model name my proxy exposes
    messages=[
        {
            "role": "user",
            "content": "this is a test request, write a short poem",
        }
    ],
)
print(response)
This works perfectly. I need to use this with LlamaIndex, so I need to define my llm and embed_model and pass them into my ServiceContext like so:
# LlamaIndex imports (legacy 0.9.x-style ServiceContext API);
# note this OpenAI is LlamaIndex's LLM wrapper, not the openai client class above
from llama_index import ServiceContext, PromptHelper
from llama_index.llms import OpenAI
from llama_index.embeddings import OpenAIEmbedding
from llama_index.text_splitter import SentenceSplitter

llm = OpenAI(model="text-davinci-003", temperature=0, max_tokens=256)
embed_model = OpenAIEmbedding()
text_splitter = SentenceSplitter(chunk_size=1024, chunk_overlap=20)
prompt_helper = PromptHelper(
    context_window=4096,
    num_output=256,
    chunk_overlap_ratio=0.1,
    chunk_size_limit=None,
)
service_context = ServiceContext.from_defaults(
    llm=llm,
    embed_model=embed_model,
    text_splitter=text_splitter,
    prompt_helper=prompt_helper,
)
where I would be calling my own "ada-02" embedding model through the proxy as well. How can I make this work with my setup? I have been totally unable to find an answer to this anywhere and I've already wasted days trying to fix it.
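For context, here is a minimal sketch of what I imagine the setup should look like, assuming LlamaIndex's OpenAI and OpenAIEmbedding wrappers accept api_base / api_key overrides (those parameter names are my assumption, and I still don't know how to wire in my "gpt_35_turbo" / "ada-02" proxy names):

# minimal sketch, assuming the wrappers support api_base/api_key overrides
from llama_index import ServiceContext
from llama_index.llms import OpenAI
from llama_index.embeddings import OpenAIEmbedding

llm = OpenAI(
    model="gpt-3.5-turbo",             # unsure how to use my proxy's "gpt_35_turbo" alias here
    api_base="http://localhost:8000",  # assumed override pointing at the SSH tunnel
    api_key="sk-xxx",
    temperature=0,
    max_tokens=256,
)

embed_model = OpenAIEmbedding(
    api_base="http://localhost:8000",  # assumed override; defaults to text-embedding-ada-002
    api_key="sk-xxx",
)

service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)

Is something like this correct, or is there a different way to point LlamaIndex at a custom endpoint?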