Max tokens

At a glance

The community member is running into an issue where setting service_context=service_context on the query_engine prevents the ChatGPT model from accessing general knowledge, while not setting it causes longer responses to be cut off. The comments explain that the LLM always has access to external knowledge and that the issue is largely one of prompt engineering. The community members discuss modifying the text_qa_template and refine_template, as well as raising the maximum output tokens, as ways to get both access to general knowledge and uncut longer responses.

Hi! πŸ™‚

I am running into an issue where if I set service_context=service_context in the query_engine like so:
Plain Text
llm_predictor = ChatGPTLLMPredictor(llm=ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo", streaming=False))
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)
query_engine = index.as_query_engine(text_qa_template=CHAT_QA_PROMPT,
                                     refine_template=CHAT_REFINE_PROMPT,
                                     similarity_top_k=3,
                                     streaming=False,
                                     service_context=service_context)

then ChatGPT does NOT have access to General Knowledge.

However, when I do NOT set service_context=service_context in query_engine like so:
Plain Text
query_engine = index.as_query_engine(text_qa_template=CHAT_QA_PROMPT,
                                     refine_template=CHAT_REFINE_PROMPT,
                                     similarity_top_k=3)

then I do have access to General Knowledge, but the ChatGPT response gets cut off when it writes a longer text response.

How do I achieve both access to General Knowledge AND no cut off of longer text responses?

Thank you!
8 comments
The LLM technically always has access to external knowledge; it's just a matter of prompt engineering πŸ‘Œ

In any case, when you don't set the service context, it's actually using a completely different model (text-davinci-003)

All OpenAI models default to 256 max output tokens. You can change this by setting max_tokens:


https://gpt-index.readthedocs.io/en/latest/how_to/customization/custom_llms.html#example-changing-the-number-of-output-tokens-for-openai-cohere-ai21
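As a rough sketch (assuming the ChatGPTLLMPredictor / ChatOpenAI setup from the question above, and an older llama_index version where ServiceContext.from_defaults takes an llm_predictor), you would raise max_tokens on the chat model itself and build the service context from it:
Python
from langchain.chat_models import ChatOpenAI
from llama_index import ServiceContext

# ChatGPTLLMPredictor is assumed to be imported/defined as in the original snippet.
# max_tokens caps the output length; 512 here is just an example value.
llm_predictor = ChatGPTLLMPredictor(
    llm=ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo", max_tokens=512)
)
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)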
@Logan M @Maker I seem to have the same problem of not having access to external knowledge. And if I take the service_context out, I can't stream. What could be the solution here?
Yeah, as I mentioned above, the LLM always has access to external knowledge; it's just a matter of prompt engineering.

You'll get the best results if you create a custom text_qa_template and refine_template like @Maker has above
Here's a link to an example, where I added a system prompt for gpt-3.5

You can probably skip the system prompt and just modify the instructions

https://discord.com/channels/1059199217496772688/1109906051727364147/1109972300578693191

If you aren't using gpt-3.5, it will be slightly different, so let me know
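Roughly, a custom chat QA prompt looks like this (a minimal sketch based on how the built-in chat prompts were constructed in llama_index around that version; the system-prompt wording is illustrative, not the exact text from the linked message):
Python
from langchain.prompts.chat import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    SystemMessagePromptTemplate,
)
from llama_index.prompts.prompts import QuestionAnswerPrompt

# Illustrative system prompt: explicitly tell the model it may combine
# the retrieved context with its own general knowledge.
system_msg = SystemMessagePromptTemplate.from_template(
    "You are a helpful assistant. Use the provided context when it is relevant, "
    "but you may also draw on your general knowledge to answer."
)

human_msg = HumanMessagePromptTemplate.from_template(
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Answer the question: {query_str}\n"
)

CHAT_QA_PROMPT = QuestionAnswerPrompt.from_langchain_prompt(
    ChatPromptTemplate.from_messages([system_msg, human_msg])
)

# A custom refine_template (RefinePrompt) can be built the same way.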
I modified the text_qa_template (attachment: image.png) and also added a piece of text to the system prompt, and that fixed it for me. The GPT now has access to external knowledge.
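So the working setup ends up looking roughly like this (a sketch combining the pieces from the earlier comments; the exact prompt wording is whatever you put in your custom templates):
Python
query_engine = index.as_query_engine(
    text_qa_template=CHAT_QA_PROMPT,     # custom chat prompt with the system message
    refine_template=CHAT_REFINE_PROMPT,  # custom refine prompt, built the same way
    similarity_top_k=3,
    service_context=service_context,     # keeps gpt-3.5-turbo + the higher max_tokens
)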