Hi all,
I was running this notebook:
https://docs.llamaindex.ai/en/stable/examples/query_engine/RouterQueryEngine.html
I was trying to change the default OpenAI API calls to open-source models. For the vector index I was able to use the Llama-2-70b-chat-hf model via Anyscale (my setup is sketched after the traceback below), but the same model used for the summary index gave this error:
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Selecting query engine 0: The choice is specifically tailored for summarization questions related to Paul Graham's essay on What I Worked On..
HTTP Request: POST https://api.endpoints.anyscale.com/v1/chat/completions "HTTP/1.1 400 Bad Request"
---------------------------------------------------------------------------
BadRequestError Traceback (most recent call last)
<ipython-input-66-6f1a493620ea> in <cell line: 1>()
----> 1 response = query_engine.query("What is the summary of the document?")
BadRequestError: Error code: 400 - {'generated_text': None, 'tool_calls': None, 'embedding_outputs': None, 'logprobs': None, 'num_input_tokens': None, 'num_input_tokens_batch': None, 'num_generated_tokens': None, 'num_generated_tokens_batch': None, 'preprocessing_time': None, 'generation_time': None, 'timestamp': 1707664137.3785233, 'finish_reason': None, 'error': {'message': 'rayllm.backend.llm.error_handling.PromptTooLongError: Input too long. Recieved 6209 tokens, but the maximum input length is 4096 tokens. (Request ID: b16c0bba-c7dd-470c-8660-539de4450d13)', 'internal_message': 'rayllm.backend.server.openai_compat.openai_exception.OpenAIHTTPException (Request ID: b16c0bba-c7dd-470c-8660-539de4450d13)', 'code': 400, 'type': 'OpenAIHTTPException', 'param': {}}, 'num_total_tokens': 0, 'num_total_tokens_batch': 0, 'total_time': None}
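For reference, my swap from the OpenAI defaults to Anyscale looked roughly like the sketch below. This is a minimal sketch, assuming llama_index 0.9.x-style imports; the "data" directory and the API key placeholder are illustrative, not copied from my actual notebook:

from llama_index import ServiceContext, SimpleDirectoryReader, SummaryIndex, VectorStoreIndex
from llama_index.llms import Anyscale

# Anyscale-hosted open-source model in place of the default OpenAI LLM
llm = Anyscale(
    model="meta-llama/Llama-2-70b-chat-hf",
    api_key="ANYSCALE_API_KEY",  # placeholder
)
service_context = ServiceContext.from_defaults(llm=llm)

documents = SimpleDirectoryReader("data").load_data()  # illustrative path

# The vector index works fine with this LLM...
vector_index = VectorStoreIndex.from_documents(documents, service_context=service_context)
vector_query_engine = vector_index.as_query_engine()

# ...but querying the summary index (tree_summarize, as in the notebook)
# raises the PromptTooLongError shown above.
summary_index = SummaryIndex.from_documents(documents, service_context=service_context)
summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize", use_async=True
)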
Then I thought this error was caused by the OpenAI API call, so I changed it as shown below, but the error was not resolved.