
Updated 2 years ago

Hey so I am trying to use a SubQuestionQueryEngine

At a glance

The community member is encountering an error when using a SubQuestionQueryEngine, specifically an OutputParserException due to the LLM not generating valid JSON. The comments suggest that using a more capable model like GPT-4 or text-davinci-003 may resolve the issue, as they are less prone to this problem compared to GPT-3.5. There is also a discussion on whether the SubQuestionQueryEngine is the appropriate tool for the task at hand, which is to find the most relevant blog post for a given query. Some community members suggest that a normal vector search may be sufficient, while others argue that the SubQuestionQueryEngine is better suited for compare/contrast type queries. Finally, one community member shares their debugging experience, which led them to increase the token limit and set the service context correctly in the LLMQuestionGenerator object, which resolved the JSON decoding error for them.

Hey so I am trying to use a SubQuestionQueryEngine and I got this error after trying to run the query
Plain Text
  response = await query_engine.aquery(script_prompt)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/llama_index/indices/query/base.py", line 30, in aquery
    response = await self._aquery(str_or_query_bundle)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/llama_index/query_engine/sub_question_query_engine.py", line 124, in _aquery
    sub_questions = await self._question_gen.agenerate(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/llama_index/question_gen/llm_generators.py", line 78, in agenerate
    parse = self._prompt.output_parser.parse(prediction)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/llama_index/question_gen/output_parser.py", line 10, in parse
    json_dict = parse_json_markdown(output)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/llama_index/output_parsers/utils.py", line 18, in parse_json_markdown
    raise OutputParserException(f"Got invalid JSON object. Error: {e}")
llama_index.output_parsers.base.OutputParserException: Got invalid JSON object. Error: Unterminated string starting at: line 49 column 22 (char 4229)
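For context, here is a minimal sketch of how a SubQuestionQueryEngine is typically wired up before an aquery call like the one in the trace above; the directory path, tool name, and description are assumptions, not the original setup.

Plain Text
from llama_index import SimpleDirectoryReader, VectorStoreIndex
from llama_index.tools import QueryEngineTool, ToolMetadata
from llama_index.query_engine import SubQuestionQueryEngine

# Build a vector index over the blog posts and expose it as a tool
documents = SimpleDirectoryReader("./blog_posts").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine_tools = [
    QueryEngineTool(
        query_engine=index.as_query_engine(),
        metadata=ToolMetadata(name="blog_posts", description="My blog posts"),
    )
]

# The sub question engine asks the LLM to decompose the query into sub-questions (as JSON)
query_engine = SubQuestionQueryEngine.from_defaults(query_engine_tools=query_engine_tools)

# script_prompt as in the traceback above; run inside an async function or notebook
response = await query_engine.aquery(script_prompt)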
10 comments
Are you using gpt-3.5? This error is likely because the LLM didn't output valid JSON for llama-index to parse.

text-davinci-003 or gpt-4 make this mistake less often (tbh gpt-3.5 performance has tanked the last few months πŸ’© )
Gotcha, so I need to use GPT-4 for this specific query?
Seems like it? Or text-davinci-003 (slightly cheaper, faster). At least they should be more dependable
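A minimal sketch of swapping in GPT-4 through a service context, assuming the langchain-based LLMPredictor path from 0.6.x-era llama_index; the temperature setting is an assumption:

Plain Text
from llama_index import LLMPredictor, ServiceContext
from langchain.chat_models import ChatOpenAI

# Use GPT-4 for more dependable JSON output from the question generator
llm_predictor = LLMPredictor(llm=ChatOpenAI(model_name="gpt-4", temperature=0))
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)

For text-davinci-003, langchain's completion-style OpenAI class would be used in place of ChatOpenAI.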
Should I be using a SubQuestionQueryEngine here? I am essentially trying to ask it which of my blog posts is relevant to the query, and I figure if I use a SubQuestionQueryEngine it'll be more accurate
I think the normal vector search should do a good enough job for a query like that; the sub question query engine is more for compare/contrast type queries
Well, if it is specifically to select the best blog posts, is that not a compare/contrast type query?
I thought that because it needs to find the "best" option, it would be better to compare each option to the others
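For the plain vector search alternative suggested above, a minimal sketch reusing an index like the one in the earlier setup; the top-k value and query wording are assumptions:

Plain Text
# A normal vector query engine over the same blog post index, no sub-question decomposition
query_engine = index.as_query_engine(similarity_top_k=3)
response = query_engine.query("Which blog post is most relevant to <topic>?")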
FWIW, I was encountering the exact same issue, but was able to fix it. I had tried text-davinci-003, gpt-3.5, and gpt-4 and was getting the same JSONDecodeError. I put in some light debugging, and it appears that regardless of model type, it was not generating valid JSON and was producing the exact same length of partial JSON each time.

That led me to believe the issue is that the token limit is set to 256, so the model stops partway through generating the JSON object. Increasing the token limit fixed the issue.

The issue I specifically had was that when the SubQuestionQueryEngine class was instantiated, I was not setting the service context in the right way. I was setting the service context on the ResponseSynthesizer object, but it needs to be set on the LLMQuestionGenerator object. That fixed the issue for me (at least the JSONDecodeError). Maybe it makes sense to set the default higher than 256, since most use cases for a compare and contrast query will involve several files? Or at least to be clear this could be an issue.
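A minimal sketch of that fix, assuming 0.6.x-era APIs; the 512-token limit and model choice are assumptions, and query_engine_tools is the tool list built earlier:

Plain Text
from llama_index import LLMPredictor, ServiceContext
from llama_index.question_gen.llm_generators import LLMQuestionGenerator
from llama_index.query_engine import SubQuestionQueryEngine
from langchain.chat_models import ChatOpenAI

# Raise the output token limit so the generated sub-question JSON isn't cut off at 256 tokens
llm_predictor = LLMPredictor(llm=ChatOpenAI(model_name="gpt-3.5-turbo", max_tokens=512))
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)

# The service context has to reach the question generator, not only the response synthesizer
question_gen = LLMQuestionGenerator.from_defaults(service_context=service_context)

query_engine = SubQuestionQueryEngine.from_defaults(
    question_gen=question_gen,
    query_engine_tools=query_engine_tools,
    service_context=service_context,
)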
Oh wow, great debugging!

I think now if you pass the service context into the sub question query engine directly, it gets used for both the response synthesizer and the question generator πŸ™

It's a little difficult to set different defaults per query engine like that, but definitely something to look into.

One note is that you can set a global service context so you don't have to pass it in everywhere (in newish versions that is, 0.6.20+ I think)

Plain Text
from llama_index import set_global_service_context

# service_context built earlier (e.g. ServiceContext.from_defaults(...));
# after this call it is used anywhere a service context isn't passed explicitly
set_global_service_context(service_context)
@Logan M Great! Thanks for the quick response! Missed the global service context setting! I think in most cases with complex routing or compositions, setting different service contexts per component is likely anyway.