Hi, I'm getting this error when using sql_database:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[89], line 1
----> 1 response = query_engine.query("Algum comentário menciona a empresa BAT?")
      2 response

File ~/opt/miniconda3/envs/chatbot/lib/python3.9/site-packages/llama_index/indices/query/base.py:23, in BaseQueryEngine.query(self, str_or_query_bundle)
     21 if isinstance(str_or_query_bundle, str):
     22     str_or_query_bundle = QueryBundle(str_or_query_bundle)
---> 23 response = self._query(str_or_query_bundle)
     24 return response

File ~/opt/miniconda3/envs/chatbot/lib/python3.9/site-packages/llama_index/indices/struct_store/sql_query.py:279, in BaseSQLTableQueryEngine._query(self, query_bundle)
    276 metadata["sql_query"] = sql_query_str
    278 if self._synthesize_response:
--> 279     response_str = self._service_context.llm_predictor.predict(
    280         self._response_synthesis_prompt,
    281         query_str=query_bundle.query_str,
    282         sql_query=sql_query_str,
    283         sql_response_str=raw_response_str,
    284     )
    285 else:
    286     response_str = raw_response_str

File ~/opt/miniconda3/envs/chatbot/lib/python3.9/site-packages/llama_index/llm_predictor/base.py:123, in LLMPredictor.predict(self, prompt, **prompt_args)
    ...
    239     f"Please use a prompt that is less than {context_window} tokens."
    240 )
    241 return max_token

ValueError: The prompt is too long for the model. Please use a prompt that is less than 4097 tokens.
3 comments
I think this is a case where the returned result from the SQL query is too big for the LLM 🤔

We should have better handling for this tbh
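
A quick way to confirm that diagnosis (a minimal sketch, assuming the standard NLSQLTableQueryEngine setup implied by the traceback; the SQLite connection string below is a placeholder for your own database): disable response synthesis so the raw SQL result comes back untouched, then look at the generated SQL and how large the result actually is before it ever reaches the LLM.

from sqlalchemy import create_engine
from llama_index import SQLDatabase
from llama_index.indices.struct_store.sql_query import NLSQLTableQueryEngine

# Placeholder connection string -- swap in your own database.
engine = create_engine("sqlite:///comments.db")
sql_database = SQLDatabase(engine)

# synthesize_response=False skips the final LLM call that is overflowing the
# 4097-token context window, so the query still runs end to end.
query_engine = NLSQLTableQueryEngine(
    sql_database=sql_database,
    synthesize_response=False,
)

response = query_engine.query("Algum comentário menciona a empresa BAT?")
print(response.metadata["sql_query"])  # the generated SQL (the key set in the traceback above)
print(len(str(response)))              # rough size of the raw result that would be stuffed into the prompt

If that printed length is huge, the error is exactly what the traceback shows: the whole SQL result is being pasted into a single response-synthesis prompt.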
Is there a way to break the output into chunks and have the agent read these chunks separately? It would mean a higher charge from OpenAI, but it would solve my problem.
Yea exactly, that would be the solution. Under the hood, the code should likely use a response synthesizer to generate the response rather than a low-level LLM call.
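
A rough sketch of that chunked workaround (not built-in handling: the chunk size, the tree_summarize mode, and reusing the query_engine built with synthesize_response=False in the sketch above are all assumptions): split the raw SQL result into token-sized pieces and let a response synthesizer answer over them with several smaller LLM calls instead of one oversized prompt.

from llama_index.response_synthesizers import get_response_synthesizer
from llama_index.text_splitter import TokenTextSplitter

question = "Algum comentário menciona a empresa BAT?"

# `query_engine` is the NLSQLTableQueryEngine from the sketch above, created
# with synthesize_response=False, so this string is the raw SQL result.
raw_result = str(query_engine.query(question))

# Split the big result into chunks that fit comfortably in the context window.
splitter = TokenTextSplitter(chunk_size=1024, chunk_overlap=0)
chunks = splitter.split_text(raw_result)

# tree_summarize answers over each chunk and then merges the partial answers,
# so no single prompt has to hold the whole result.
synthesizer = get_response_synthesizer(response_mode="tree_summarize")
answer = synthesizer.get_response(query_str=question, text_chunks=chunks)
print(answer)

As noted above, this does cost more OpenAI calls, since the synthesizer queries the model once per chunk plus a final merge step.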