Just got this error:
worker-1 | [2024-04-18 15:55:25,266: INFO/ForkPoolWorker-7] HTTP Request: POST
https://api.openai.com/v1/chat/completions "HTTP/1.1 400 Bad Request"
worker-1 | [2024-04-18 15:55:25,267: WARNING/ForkPoolWorker-7] Retrying llama_index.llms.openai.base.OpenAI._chat in 7.560941276281184 seconds as it raised BadRequestError: Error code: 400 - {'error': {'message': "This model's maximum context length is 16385 tokens. However, your messages resulted in 16441 tokens (16176 in the messages, 265 in the functions). Please reduce the length of the messages or functions.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}.
I thought the query engine automatically chunked API calls, so this couldn't happen. What is going wrong here? :/ Here's the relevant code:
index = initialize_index(model)
base_args = {"llm": get_llm(model_name=model), "filters": get_document_filters(uuid)}

# Structured output via output_cls; similarity_top_k=15 retrieves the top 15 nodes
outline_query_engine = index.as_query_engine(
    output_cls=PresentationOutlineV1,
    similarity_top_k=15,
    **base_args,
)
outline = outline_query_engine.query(outline_query_str).response.dict()
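
In case it's useful, this is roughly how I'd check how much retrieved text actually ends up in the prompt (just a sketch, not my real code; it assumes the engine concatenates the top-k nodes into a single call, and I'm guessing at the gpt-3.5-turbo encoding from the 16385-token limit in the error):

# Rough sketch: estimate the tokens that similarity_top_k=15 pulls into one prompt.
# Reuses the index, filters, and query string from the snippet above.
import tiktoken

retriever = index.as_retriever(similarity_top_k=15, filters=get_document_filters(uuid))
nodes = retriever.retrieve(outline_query_str)

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")  # assumption: the 16k model from the error
total_tokens = sum(len(enc.encode(n.get_content())) for n in nodes)
print(f"retrieved {len(nodes)} nodes, ~{total_tokens} tokens of context")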