A community member recently updated the llama-index-llms-openai package to use GPT-4o mini but hit an error when setting the max_tokens limit. GPT-4o mini's context window is 128K tokens, yet the error message said the model supports at most 16384 completion tokens. Other community members suggested double-checking that GPT-4o mini is actually being used, and recommended making the prompt cleaner and more descriptive to improve reliability with GPT-4o mini on SQL/database queries.
I recently updated llama-index-llms-openai 0.1.26 to use 4o-mini. OpenAI's website says the context window is 128K tokens, but when I tried to set that as the limit, I got:
Plain Text
Error code: 400 - {'error': {'message': 'max_tokens is too large: 120000. This model supports at most 16384 completion tokens, whereas you provided 120000.', 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': None}}
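The error itself points at the fix: in the OpenAI API, max_tokens caps the completion (output), not the total context. GPT-4o mini can read up to 128K tokens of input, but it can emit at most 16384 tokens per response, so max_tokens has to stay at or below that. A minimal sketch of a working setup (the prompt is illustrative, and exact defaults may differ by llama-index version):

Python
from llama_index.llms.openai import OpenAI

# max_tokens limits the *completion* (output), not the context window.
# GPT-4o mini accepts up to ~128K input tokens but generates at most
# 16384 tokens per response, so max_tokens must be <= 16384.
llm = OpenAI(model="gpt-4o-mini", max_tokens=16384)

response = llm.complete("Summarize the schema of the `orders` table.")
print(response.text)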
A month back I asked about GPT-4 and SQL queries. GPT-3.5 will follow the instructions and run the query, but GPT-4 refuses to query the DB directly. Has anything changed since then?
Many thanks @WhiteFang_Jr! Could you say more, or share some good/bad examples of a "more clean and more descriptive" prompt for optimal reliability with GPT-4o mini and SQL/DB queries?
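For anyone landing here later: the contrast below is only an illustration of what that advice usually means, not an example from the thread. The table names, columns, and dialect are made up. The idea is to name the dialect, spell out the available schema, and constrain the output format rather than leaving the model to guess.

Plain Text
Vague:   "Get me the sales numbers."

Cleaner: "You are querying a SQLite database. Using only the tables
          orders(id, customer_id, total, created_at) and
          customers(id, name), write one SELECT statement that returns
          total sales per customer for 2024. Return only the SQL, with
          no explanation."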