
Mistral 7B

Hello everyone, I am using Mistral 7B Instruct as my LLM, but when plugging it into a chat engine I get incomplete responses compared to the Llama 2 model.
It could be because of the following reasons, I suppose:
  • What is the max_new_tokens value that you have set, and how much of the context window was left for the LLM to process and generate? (See the sketch after this list for raising the limit.)
  • Open-source LLMs smaller than 13B sometimes act weird.
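If the generation cap is the issue, you can raise it when constructing the LLM. A minimal sketch, assuming the legacy `llama_index` package layout and the `HuggingFaceLLM` wrapper (the model name and values are illustrative, not from this thread):

```python
from llama_index.llms import HuggingFaceLLM

# Hypothetical setup: Mistral 7B Instruct loaded locally via transformers.
llm = HuggingFaceLLM(
    model_name="mistralai/Mistral-7B-Instruct-v0.1",
    tokenizer_name="mistralai/Mistral-7B-Instruct-v0.1",
    context_window=4096,   # total budget for prompt + completion
    max_new_tokens=512,    # raised from 256 so answers are not cut off
    device_map="auto",
)
```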
@WhiteFang_Jr max_new_tokens=256
Could it be that the response has already hit the 256 max new tokens?
Did you check on that?
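One quick way to check is to count the tokens in the truncated reply; if it lands right at the cap, the limit is what is cutting the answer off. A rough sketch, assuming the `transformers` tokenizer for Mistral and a `chat_engine` already built from your index:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")

response = chat_engine.chat("Summarize the document.")  # illustrative query
n_tokens = len(tokenizer.encode(str(response)))
print(f"response length: {n_tokens} tokens")  # ~256 => truncated by max_new_tokens
```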
@WhiteFang_Jr can you please tell me why I am getting different responses from the chat engine and the query engine for the same query?
Which chat engine are you trying with? Condense mode?
@WhiteFang_Jr context mode
The context chat engine lets you set the system prompt. Do you have the same thing for the query engine as well, in the form of a template?

That could be causing the difference in the responses. Plus, the context mode maintains the chat history, which can also alter the response the LLM produces.
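For reference, this is roughly where the two paths diverge: the chat engine adds a system prompt and chat history on top of retrieval, while the query engine only sees its QA template. A sketch assuming the legacy `llama_index` API and an existing `index` (the prompt text is illustrative):

```python
from llama_index.prompts import PromptTemplate

# Context chat engine: system prompt + retrieved context + chat history.
chat_engine = index.as_chat_engine(
    chat_mode="context",
    system_prompt="You are a helpful assistant. Answer only from the given context.",
)

# Query engine: stateless; the answer is shaped only by the QA template.
qa_template = PromptTemplate(
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Answer the query using only the context above.\n"
    "Query: {query_str}\n"
    "Answer: "
)
query_engine = index.as_query_engine(text_qa_template=qa_template)

print(chat_engine.chat("What is the refund policy?"))    # history + system prompt
print(query_engine.query("What is the refund policy?"))  # template only
```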
@WhiteFang_Jr Regarding your 1st point, I am using the default template in both cases.

On the 2nd point: for the very first query, I think the chat history should be blank.
@WhiteFang_Jr can you suggest some ways to get better performance out of the chat engine?
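A few knobs that usually help, as a sketch assuming the legacy `llama_index` API (the values are illustrative starting points, not settings from this thread):

```python
from llama_index.memory import ChatMemoryBuffer

# Cap the chat history so it does not eat the context window the LLM
# needs for the retrieved chunks and the answer itself.
memory = ChatMemoryBuffer.from_defaults(token_limit=1500)

chat_engine = index.as_chat_engine(
    chat_mode="context",
    memory=memory,
    similarity_top_k=3,  # tune how many chunks are retrieved per turn
    system_prompt=(
        "You are a helpful assistant. Answer strictly from the provided "
        "context, and say you don't know if the context is insufficient."
    ),
)
```

Beyond that: a larger max_new_tokens (as above), better chunking of the source documents, and resetting the memory between unrelated conversations (`chat_engine.reset()`) are the usual first things to try.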