
Updated 4 months ago

Getting streaming working with GPT-35-Turbo and (ideally) the Azure endpoints too

At a glance
hmm maybe not atm
6 comments
return LLMPredictor(
    llm=AzureOpenAI(
        openai_api_key=jwt,
        temperature=0,
        max_tokens=512,
        deployment_name='text-davinci-003',
        model_kwargs={
            "api_key": jwt,
            "api_base": openai.api_base,
            "api_type": openai.api_type,
            "api_version": openai.api_version,
        },
    )
)

Using AzureOpenAI like this doesn't work with streaming either. It would be great to have this working with both Azure and GPT-3.5-Turbo.
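Since the snippet above depends on several Azure settings all being present at once, a small pre-flight check can help rule out a missing value before debugging streaming itself. This is only a sketch: `AzureSettings`, `missing_settings` and `to_model_kwargs` are hypothetical helper names, not part of LangChain or the openai package, and the example values are placeholders.

```python
from dataclasses import dataclass, fields


@dataclass
class AzureSettings:
    """Hypothetical container for the values the snippet above passes in."""
    api_key: str
    api_base: str
    api_type: str
    api_version: str
    deployment_name: str


def missing_settings(settings):
    """Return the names of settings that are empty or None."""
    return [f.name for f in fields(settings) if not getattr(settings, f.name)]


def to_model_kwargs(settings):
    """Build the model_kwargs dict used above, failing early if a value is unset."""
    missing = missing_settings(settings)
    if missing:
        raise ValueError("Missing Azure settings: " + ", ".join(missing))
    return {
        "api_key": settings.api_key,
        "api_base": settings.api_base,
        "api_type": settings.api_type,
        "api_version": settings.api_version,
    }


# Placeholder values for illustration only.
example = AzureSettings(
    api_key="placeholder-key",
    api_base="https://example-resource.invalid",
    api_type="azure",
    api_version="",  # deliberately left empty to show the check firing
    deployment_name="text-davinci-003",
)
print(missing_settings(example))
```

The idea is just to fail with a readable message when, say, the version is unset, rather than with an opaque error from the endpoint.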
One of my team is looking at it too
Yeah! I'm waiting for that too
We got this working. The Azure examples folder works fine, including for streaming
sweet! @Runonthespot what did you change?
It was more a case of getting all the Azure parameters right: the combination of base_url, the deployment endpoint, setting the version and api_type, and then carefully setting the context window size. Some gotchas: currently (28/03) text-embedding-ada-002 has a max size of 4096, so when using the LangChain embedding code you have to set max_chunk_size to 1 (LangChain confusingly uses this name for the number of chunks to send to the embedding endpoint at a time). We discovered this while experimenting with the OpenAI and LangChain versions of the embedding models and predictor models. We also found that we generally seem to get decent QA results with a chunk size of 2048.
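To make the "number of chunks to send at a time" point concrete, here is a minimal sketch in plain Python of what a batch size of 1 means. It does not call LangChain or the Azure endpoint; `batch_texts` is a hypothetical helper that only illustrates the batching idea, where each yielded list would correspond to one embedding request.

```python
def batch_texts(texts, batch_size=1):
    """Yield successive lists of at most batch_size texts.

    With batch_size=1, every request would carry exactly one chunk.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


chunks = ["first chunk", "second chunk", "third chunk"]

# One chunk per request.
for batch in batch_texts(chunks, batch_size=1):
    print(len(batch), batch)
```

With a larger batch size the same three chunks would go out in fewer requests, which is the behaviour the setting above switches off.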