At a glance

The community member is working with a proxy server for OpenAI models and is using an SSH tunnel to reach the server on their localhost. They are able to use the OpenAI client to create a response, but they need to use this setup with LlamaIndex. They are trying to define their llm and embedding model and pass them to their service_context, but are running into issues.

The community members have tried several approaches, including setting the api_base kwarg in the embedding and llm constructor to point to their server, but encountered issues with unexpected keyword arguments and missing attributes. They have provided some code snippets and error messages they encountered.

The solution provided in the comments is to use from llama_index.llms import OpenAILike and create the llm with llm = OpenAILike(api_base="http://localhost:8000", model="name_you_gave_to_the_model", api_key="sk-xxx"). When another community member asks whether the setup is Azure-specific and points to the dedicated Azure classes, the original poster notes that those do not work for them, but the OpenAILike approach does.

Hi, I am going crazy trying to find a solution to this. I am working with a proxy server for OpenAI models. I'm using an SSH tunnel to hit the server on my localhost.

Plain Text
from openai import OpenAI

# The proxy is reached on localhost:8000 through the SSH tunnel.
client = OpenAI(base_url="http://localhost:8000", api_key="sk-xxx")

response = client.chat.completions.create(model="gpt_35_turbo", messages=[
    {
        "role": "user",
        "content": "this is a test request, write a short poem"
    }
])

print(response)

This works perfectly. I need to use this with LlamaIndex, so I need to define my llm and embedding_model and pass them to my service_context like so:

Plain Text
from llama_index import ServiceContext, PromptHelper
from llama_index.llms import OpenAI
from llama_index.embeddings import OpenAIEmbedding
from llama_index.text_splitter import SentenceSplitter

llm = OpenAI(model="text-davinci-003", temperature=0, max_tokens=256)
embed_model = OpenAIEmbedding()
text_splitter = SentenceSplitter(chunk_size=1024, chunk_overlap=20)
prompt_helper = PromptHelper(
    context_window=4096,
    num_output=256,
    chunk_overlap_ratio=0.1,
    chunk_size_limit=None,
)

service_context = ServiceContext.from_defaults(
    llm=llm,
    embed_model=embed_model,
    text_splitter=text_splitter,
    prompt_helper=prompt_helper,
)

where I would be calling my own "ada-002" model through the server as well. How can I make this work with my setup? I am totally unable to find any answer to this anywhere and I've already wasted days trying to fix it.
6 comments
Set the api_base kwarg in the embedding and llm constructors to point to your server
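With the legacy (pre-0.10) llama_index import paths used elsewhere in this thread, that suggestion presumably looks something like the sketch below, assuming the proxy accepts any API key and exposes the model under a standard name such as gpt-3.5-turbo (for a custom model name, see the OpenAILike solution further down):

Plain Text
from llama_index.llms import OpenAI
from llama_index.embeddings import OpenAIEmbedding

# Both constructors accept api_base, so requests go through the local proxy.
llm = OpenAI(model="gpt-3.5-turbo", api_base="http://localhost:8000", api_key="sk-xxx")
embed_model = OpenAIEmbedding(api_base="http://localhost:8000", api_key="sk-xxx")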
Hi Logan and thanks for the answer. I tried this:

Plain Text
llm = OpenAI(model='gpt_35_turbo', api_base="http://localhost:8000")
embed_model = OpenAI(model='Azure-Text-Embedding-ada-002', api_base="http://localhost:8000")

and I get this

Plain Text
---> 31 llm = OpenAI(model='gpt_35_turbo', api_base="http://localhost:8000")
     32 embed_model = OpenAI(model='Azure-Text-Embedding-ada-002', api_base_url="http://localhost:8000")
     39 doc_processor = DataLoader()

TypeError: OpenAI.__init__() got an unexpected keyword argument 'model'


Is there another way to do it?
A small update:

Plain Text
from llama_index.llms import OpenAI as OpenAILLM
llm = OpenAILLM(model="gpt_35_turbo", base_url="http://localhost:8000")
embed_model = OpenAILLM(model='Azure-Text-Embedding-ada-002', api_base_url="http://localhost:8000")

returns :

Plain Text
AttributeError: 'OpenAI' object has no attribute 'get_text_embedding_batch'

Seems that the LLM worked (?) but the embedding model is not well defined yet. Any ideas?
SOLUTION:
Plain Text
from llama_index.llms import OpenAILike

llm = OpenAILike(api_base="http://localhost:8000",
                 model="name_you_gave_to_the_model",
                 api_key="sk-xxx")


Please put this properly in the documentation.... :(((
Is this just Azure? We have specific Azure classes too
That one does not work for me, but this one does.
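For completeness, a sketch of how the working pieces might fit together, assuming the embeddings are also served behind the same proxy under the default OpenAI name; the api_base/api_key kwargs on OpenAIEmbedding are an assumption here, and a proxy-specific embedding name would need additional handling:

Plain Text
from llama_index import ServiceContext
from llama_index.llms import OpenAILike
from llama_index.embeddings import OpenAIEmbedding

# OpenAILike accepts an arbitrary model name served by the proxy.
llm = OpenAILike(api_base="http://localhost:8000",
                 model="name_you_gave_to_the_model",
                 api_key="sk-xxx")

# Point the stock OpenAIEmbedding at the same proxy.
embed_model = OpenAIEmbedding(api_base="http://localhost:8000", api_key="sk-xxx")

service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)

Depending on how the proxy serves the model, OpenAILike may also need is_chat_model=True so that chat completions are used.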