Hi all,

I was running the notebook below:

https://docs.llamaindex.ai/en/stable/examples/query_engine/RouterQueryEngine.html

I was trying to switch from the default OpenAI API calls to open-source models. For the vector index I was able to use the Llama-2-70b-chat-hf model via Anyscale, but the same model failed when I used it for the summary index.
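For reference, here is roughly how the swap looked. This is a minimal sketch, assuming post-v0.10 llama-index imports; the data directory and API key are placeholders, not the notebook's actual values:

from llama_index.core import SimpleDirectoryReader, SummaryIndex, VectorStoreIndex
from llama_index.llms.anyscale import Anyscale

llm = Anyscale(
    model="meta-llama/Llama-2-70b-chat-hf",
    api_key="ANYSCALE_API_KEY",  # placeholder
)

documents = SimpleDirectoryReader("data").load_data()  # placeholder directory

# Same documents, two indexes: one for retrieval, one for summarization.
vector_index = VectorStoreIndex.from_documents(documents)
summary_index = SummaryIndex.from_documents(documents)

vector_query_engine = vector_index.as_query_engine(llm=llm)
summary_query_engine = summary_index.as_query_engine(
    llm=llm, response_mode="tree_summarize"
)

Querying through the router, the summary engine produced this error: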


HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Selecting query engine 0: The choice is specifically tailored for summarization questions related to Paul Graham's essay on What I Worked On..
HTTP Request: POST https://api.endpoints.anyscale.com/v1/chat/completions "HTTP/1.1 400 Bad Request"
---------------------------------------------------------------------------
BadRequestError Traceback (most recent call last)
<ipython-input-66-6f1a493620ea> in <cell line: 1>()
----> 1 response = query_engine.query("What is the summary of the document?")

BadRequestError: Error code: 400 - {'generated_text': None, 'tool_calls': None, 'embedding_outputs': None, 'logprobs': None, 'num_input_tokens': None, 'num_input_tokens_batch': None, 'num_generated_tokens': None, 'num_generated_tokens_batch': None, 'preprocessing_time': None, 'generation_time': None, 'timestamp': 1707664137.3785233, 'finish_reason': None, 'error': {'message': 'rayllm.backend.llm.error_handling.PromptTooLongError: Input too long. Recieved 6209 tokens, but the maximum input length is 4096 tokens. (Request ID: b16c0bba-c7dd-470c-8660-539de4450d13)', 'internal_message': 'rayllm.backend.server.openai_compat.openai_exception.OpenAIHTTPException (Request ID: b16c0bba-c7dd-470c-8660-539de4450d13)', 'code': 400, 'type': 'OpenAIHTTPException', 'param': {}}, 'num_total_tokens': 0, 'num_total_tokens_batch': 0, 'total_time': None}


Then I thought this error was caused by the OpenAI API call, so I changed it as below, but the error was not resolved:
from llama_index.core.query_engine import RouterQueryEngine
from llama_index.core.selectors import PydanticSingleSelector
from llama_index.llms.openai import OpenAI

# Route between the two tools; the selector itself now uses gpt-3.5-turbo.
query_engine = RouterQueryEngine(
    selector=PydanticSingleSelector.from_defaults(
        llm=OpenAI(model="gpt-3.5-turbo-0125")
    ),
    query_engine_tools=[
        summary_tool,
        vector_tool,
    ],
)

After that I tried using "mistralai/Mixtral-8x7B-Instruct-v0.1" as the open-source model for the summary index, and it worked.

So I am not sure why it did not work earlier and why changing the model fixed it.
Seems like a token counting issue. Changing the selector would not have helped, since the selector LLM only picks which tool to run; the summarization call that 400s is still made with the Anyscale Llama-2 model. By default llama-index counts tokens with a GPT-style tokenizer, so text packed to fit Llama-2's 4096-token window by that count can overshoot it under Llama-2's own tokenizer (the server counted 6209 tokens). Mixtral-8x7B worked because its context window is much larger (32k).
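To see the mismatch concretely, you can count the same text with both tokenizers. This is a hypothetical check; the file path is a placeholder, and the Llama-2 tokenizer repo requires approved Hugging Face access:

import tiktoken
from transformers import AutoTokenizer

text = open("data/paul_graham_essay.txt").read()  # placeholder path

# Default llama-index counting uses a GPT-style tiktoken encoding...
gpt_count = len(tiktoken.encoding_for_model("gpt-3.5-turbo").encode(text))
# ...while the Anyscale server counts with Llama-2's own tokenizer.
llama_count = len(
    AutoTokenizer.from_pretrained("meta-llama/Llama-2-70b-chat-hf").encode(text)
)
print(gpt_count, llama_count)  # the Llama-2 count is typically noticeably higher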

Try setting the global tokenizer to achieve better token counting
https://docs.llamaindex.ai/en/stable/module_guides/models/llms.html#a-note-on-tokenization
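A minimal sketch of that fix, assuming post-v0.10 llama-index imports (older versions expose set_global_tokenizer instead of Settings):

from transformers import AutoTokenizer
from llama_index.core import Settings

# Count tokens with the same tokenizer the served model uses.
Settings.tokenizer = AutoTokenizer.from_pretrained(
    "meta-llama/Llama-2-70b-chat-hf"
).encode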