It was more a case of getting all the Azure parameters right: the combination of base_url, deployment endpoint, API version, api_type, and then carefully setting the context window size. Some gotchas: currently (28/03) text-embedding-ada-002 has a max size of 4096, so when using langchain you have to set chunk_size to 1 (langchain confusingly uses this parameter name for the number of texts sent to the embedding model per request, not a token limit). We discovered this while experimenting with the OpenAI and Langchain versions of the embedding and predictor models. We also found we generally get decent QA results with a 2048 chunk size when splitting documents.
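For reference, a minimal sketch of that setup, assuming the langchain/openai library versions from around that date; the resource name, key, deployment name, and API version below are placeholders, not our actual values:

```python
import os
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Azure OpenAI connection settings (placeholder values)
os.environ["OPENAI_API_TYPE"] = "azure"
os.environ["OPENAI_API_BASE"] = "https://<your-resource>.openai.azure.com/"
os.environ["OPENAI_API_KEY"] = "<your-key>"
os.environ["OPENAI_API_VERSION"] = "2023-03-15-preview"

# chunk_size here is the *batch size* (texts per embedding request),
# not a token limit -- hence setting it to 1 for the Azure endpoint.
embeddings = OpenAIEmbeddings(
    deployment="<your-ada-002-deployment>",
    chunk_size=1,
)

# Separately, split source documents into ~2048-character chunks for QA.
splitter = RecursiveCharacterTextSplitter(chunk_size=2048, chunk_overlap=0)
```

Note the two unrelated uses of "chunk size": the embeddings batch size of 1, and the text-splitter chunk length of 2048.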