At a glance

The community member is experiencing an error when using the subquestion query engine from the documentation. When they ask a question unrelated to the document they provided, it causes a ValueError: No valid JSON found in output error. However, when they query the same question again, the error disappears and the system says it does not know the answer, which is the desired behavior. The community member does not want this unpredictable error to occur.

In the comments, another community member suggests that the issue is because the language model is not following instructions and not writing proper JSON. They recommend the community member file a GitHub issue or wrap the code in a try-except block as a potential solution.

The community member provides the code they are using, which includes the LLM they are using (zephyr-7b-alpha) and the configuration details. They also mention that they will try using a try-catch block as a solution.

There is no explicitly marked answer in the post or comments.

Hello everyone, can anyone tell me how to prevent this error, or suggest a fix? I am just using the sub-question query engine from the documentation, and when I ask a question unrelated to the document I provided, it throws an error. But when I run the same query again, the error disappears and it simply says it does not know (which is what I want). I don't want this kind of unpredictable error popping up.


ValueError: No valid JSON found in output: This is not a valid example for the given prompt. The user question is not related to the provided tools.

There are different errors that come up, but most of them are about ValueError: No valid JSON found in output.
This happens when I run a query that is outside the scope of the vector store.
3 comments
This is happening because the LLM is not following instructions and is not writing proper JSON.

What LLM are you using?

Tbh you should probably file a GitHub issue (internally this could be made more robust) or just wrap it with a try/except.
The LLM I am using is zephyr-7b-alpha, and this is from the documentation:
import torch
from transformers import BitsAndBytesConfig
from llama_index.prompts import PromptTemplate
from llama_index.llms import HuggingFaceLLM

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)


def messages_to_prompt(messages):
    prompt = ""
    for message in messages:
        if message.role == 'system':
            prompt += f"<|system|>\n{message.content}</s>\n"
        elif message.role == 'user':
            prompt += f"<|user|>\n{message.content}</s>\n"
        elif message.role == 'assistant':
            prompt += f"<|assistant|>\n{message.content}</s>\n"

    # ensure we start with a system prompt, insert blank if needed
    if not prompt.startswith("<|system|>\n"):
        prompt = "<|system|>\n</s>\n" + prompt

    # add final assistant prompt
    prompt = prompt + "<|assistant|>\n"

    return prompt


llm = HuggingFaceLLM(
    model_name="HuggingFaceH4/zephyr-7b-alpha",
    tokenizer_name="HuggingFaceH4/zephyr-7b-alpha",
    query_wrapper_prompt=PromptTemplate("<|system|>\n</s>\n<|user|>\n{query_str}</s>\n<|assistant|>\n"),
    context_window=3900,
    max_new_tokens=256,
    model_kwargs={"quantization_config": quantization_config},
    # tokenizer_kwargs={},
    generate_kwargs={"temperature": 0.7, "top_k": 50, "top_p": 0.95},
    messages_to_prompt=messages_to_prompt,
    device_map="auto",
)
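
For context, a minimal sketch of how the sub-question query engine might be wired up with this LLM, assuming the legacy (pre-0.10) llama_index API that matches the imports above; the data directory, tool name, and tool description are placeholders, not from the original post:

from llama_index import ServiceContext, SimpleDirectoryReader, VectorStoreIndex
from llama_index.query_engine import SubQuestionQueryEngine
from llama_index.tools import QueryEngineTool, ToolMetadata

# route both answering and sub-question generation through the zephyr LLM
service_context = ServiceContext.from_defaults(llm=llm, embed_model="local")

documents = SimpleDirectoryReader("./data").load_data()  # placeholder path
index = VectorStoreIndex.from_documents(documents, service_context=service_context)

query_engine = SubQuestionQueryEngine.from_defaults(
    query_engine_tools=[
        QueryEngineTool(
            query_engine=index.as_query_engine(),
            metadata=ToolMetadata(
                name="docs",  # placeholder tool name
                description="Answers questions about the provided document",
            ),
        )
    ],
    service_context=service_context,
)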
Using a try/except is also one of the options I will try.
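
A minimal sketch of that try/except fallback, assuming the query_engine built above; the fallback message is a placeholder:

try:
    response = query_engine.query("a question unrelated to the provided document")
except ValueError:
    # zephyr-7b-alpha sometimes emits plain text instead of the JSON the
    # sub-question generator expects, which raises this ValueError
    response = "Sorry, I don't know the answer to that question."
print(response)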