
Prompting

At a glance

A community member is experiencing issues with hallucination when using the 4-bit quantized Llama 2 70B model and is seeking advice on how to prompt it better or finetune it. Other community members suggest using the INST and EOS/BOS tokens when prompting, since the format for llama2-chat is quite strict. They provide a sample format and note that the llama_index library has utility functions that may help when implementing a custom LLM class.

Yeah, from experience using 4-bit llama2 70b, it's hallucinating pretty badly. Gotta figure out how to prompt it better or finetune it.
Are you using the INST and EOS/BOS tokens when prompting?
The format for llama2-chat is pretty strict, I've found
Did you mean this?
<s>[INST] <<SYS>>
{{ system_prompt }}
<</SYS>>

{{ user_message }} [/INST]
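
For reference, here is a minimal sketch in Python of assembling that template by hand. The token strings follow the llama2-chat convention; the system prompt and user message are placeholder values, and note that some tokenizers add the <s> (BOS) token themselves:

B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"
BOS, EOS = "<s>", "</s>"

# Placeholder values for illustration
system_prompt = "You are a helpful assistant."
user_message = "Summarize this document."

# Single-turn prompt: <s>[INST] <<SYS>>...<</SYS>> ... [/INST]
prompt = f"{BOS}{B_INST} {B_SYS}{system_prompt}{E_SYS}{user_message} {E_INST}"

# For multi-turn chats, close each assistant reply with EOS and open the
# next user turn with a fresh BOS + [INST] block, e.g.:
# prompt = f"{prompt} {assistant_reply} {EOS}{BOS}{B_INST} {next_user_message} {E_INST}"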
I was planning to look into incorporating this format into LlamaIndex calls to a Llama 2 API
Not sure if LlamaIndex already supports this format in the code
We have some utility functions that help with this. If you are implementing a custom LLM class, they will be useful:

https://github.com/jerryjliu/llama_index/blob/main/llama_index/llms/llama_utils.py
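
As a rough sketch of how those helpers can be used (assuming a llama_index version from that era; import paths have changed in later releases, and the model path here is a hypothetical local 4-bit GGUF file), they can be passed to an LLM wrapper such as LlamaCPP so every call gets wrapped in the llama2-chat format:

from llama_index.llms import LlamaCPP
from llama_index.llms.llama_utils import messages_to_prompt, completion_to_prompt

llm = LlamaCPP(
    model_path="./llama-2-70b-chat.Q4_K_M.gguf",  # hypothetical local 4-bit model file
    messages_to_prompt=messages_to_prompt,        # wraps chat messages in [INST]/<<SYS>>/BOS/EOS
    completion_to_prompt=completion_to_prompt,    # wraps bare completion prompts the same way
)

response = llm.complete("Why does quantization increase hallucination?")
print(response)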