Hi, I am going crazy trying to find a solution to this. I am working with a proxy server for OpenAI models, and I'm using an SSH tunnel to reach the server on my localhost.
from openai import OpenAI

# client pointed at the SSH tunnel to the proxy instead of api.openai.com
client = OpenAI(base_url="http://localhost:8000", api_key="sk-xxx")

response = client.chat.completions.create(
    model="gpt_35_turbo",  # the model name my proxy exposes
    messages=[
        {
            "role": "user",
            "content": "this is a test request, write a short poem",
        }
    ],
)
print(response)
This works perfectly. I need to use this with LlamaIndex, so I need to define my llm and embed_model and pass them into my ServiceContext like so:
# LlamaIndex imports (legacy 0.9.x-style ServiceContext API);
# note this OpenAI is LlamaIndex's LLM wrapper, not the openai client class above
from llama_index import ServiceContext, PromptHelper
from llama_index.llms import OpenAI
from llama_index.embeddings import OpenAIEmbedding
from llama_index.text_splitter import SentenceSplitter

llm = OpenAI(model="text-davinci-003", temperature=0, max_tokens=256)
embed_model = OpenAIEmbedding()
text_splitter = SentenceSplitter(chunk_size=1024, chunk_overlap=20)
prompt_helper = PromptHelper(
    context_window=4096,
    num_output=256,
    chunk_overlap_ratio=0.1,
    chunk_size_limit=None,
)
service_context = ServiceContext.from_defaults(
    llm=llm,
    embed_model=embed_model,
    text_splitter=text_splitter,
    prompt_helper=prompt_helper,
)
where I would be calling my own "ada-02" embedding model through the proxy as well. How can I make this work with my setup? I have been totally unable to find an answer to this anywhere and I've already wasted days trying to fix it.
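For context, here is a minimal sketch of what I imagine the setup should look like, assuming LlamaIndex's OpenAI and OpenAIEmbedding wrappers accept api_base / api_key overrides (those parameter names are my assumption, and I still don't know how to wire in my "gpt_35_turbo" / "ada-02" proxy names):

# minimal sketch, assuming the wrappers support api_base/api_key overrides
from llama_index import ServiceContext
from llama_index.llms import OpenAI
from llama_index.embeddings import OpenAIEmbedding

llm = OpenAI(
    model="gpt-3.5-turbo",             # unsure how to use my proxy's "gpt_35_turbo" alias here
    api_base="http://localhost:8000",  # assumed override pointing at the SSH tunnel
    api_key="sk-xxx",
    temperature=0,
    max_tokens=256,
)

embed_model = OpenAIEmbedding(
    api_base="http://localhost:8000",  # assumed override; defaults to text-embedding-ada-002
    api_key="sk-xxx",
)

service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)

Is something like this correct, or is there a different way to point LlamaIndex at a custom endpoint?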