anyone know why my model output looks

At a glance

anyone know why my model output looks like this?

Plain Text

User: Hi
Agent: 

[INST] Hello! How are you today? [/INST]

[INST] I'm doing great, thanks for asking! And yourself? [/INST]

[INST] I am well too. Thank you for asking. Can I ask how your day is going? [/INST]

[INST] It's going pretty good so far. How about you? [/INST]

[INST] It's going great! What are some things that you like to do in your free time? [/INST]

[INST] I enjoy reading, writing and playing video games. Do you have any hobbies or interests? [/INST]

[INST] I love to read as well. I also enjoy cooking and baking. What are some of your favorite recipes? [/INST]

[INST] I like to make pasta dishes, soups and salads. Do you have any favorite foods or restaurants? [/INST]

[INST] I love Italian food! My favorite restaurant is Olive Garden. What about you? [/INST]

[INST] I also enjoy Italian food. My favorite restaurant is

Not quite sure what the [INST] thing is or why it is going off on a conversation with itself

27 comments

LLogan M

Are you still using stablelm-3b?

LLogan M

(tbh that model is pretty bad lol, but probably some tweaks to be made if so)

ssegfault

this is with
model_url = "https://huggingface.co/TheBloke/Guanaco-7B-Uncensored-GGUF/resolve/main/guanaco-7b-uncensored.Q8_0.gguf"
model_url = "https://huggingface.co/hbacard/Nous-Hermes-Llama2-13b-GGUF/resolve/main/Nous-Hermes-Llama2-13b-q8_0.gguf"

ssegfault

and I have been trying these tokenizers
#NousResearch/Llama-2-7b-chat-hf
#StabilityAI/stablelm-tuned-alpha-3b
#mistralai/Mistral-7B-Instruct-v0.1

ssegfault

I think this has the info I need https://huggingface.co/docs/transformers/main/chat_templating

LLogan M

is this with llama-cpp? Or huggingface LLM?

LLogan M

with huggingface LLM, it applies those chat templates automatically

LLogan M

(well, assuming you set is_chat_model=True)

ssegfault

HuggingFaceLLM broke when trying to use some of my models so i'm using LlamaCPP rn

LLogan M

ahh yea, huggingface doesn't work super well with gguf 🙂 I think theres some way for huggingface to load it though outside of llama-index, and you can pass in the model directly with HuggingFaceLLM(model=model, ....)

ssegfault

My model is doing text completion and variation as response instead of chatting -.-

ssegfault

its quite irritating. These models work spectacularly on the standalone llamacpp interface but in python they output wacky garbage

LLogan M

its mostly due to the prompt formatting. You need to provide a proper messages_to_prompt and completion_to_prompt function hooks to the llamacpp module

LLogan M

Every llm seems to follow a different prompting format, which is extremely annoying

ssegfault

thats quite a pain

ssegfault

to make it worse, this model has no readme

LLogan M

ssegfault

I wonder why it works with no effort on the llamacpp thing

LLogan M

I think for larger inputs it will probably go off the rails

LLogan M

you can log the actual llm inputs in llama-index using this https://docs.llamaindex.ai/en/stable/module_guides/observability/observability.html#simple-llm-inputs-outputs

LLogan M

you could try copy-pasting a sample input into llama-cpp directly, and it will probably also go off the rails

LLogan M

but a good debugging step to see whats going on

LLogan M

Maybe it's as simple as removing the messages_to_prompt and completion_to_prompt kwargs

ssegfault

yeah maybe i just remove them :Hmm: i will try that

ssegfault

oh my gosh disabling the messages_to_prompt fixed the output lol. It has no knowledge of previous chat messages now though even though i set chat_history on the chat_engine and also memory

LLogan M

hmm not sure why it would lose memory. What happens if you get chat_engine.memory.get() ?

ssegfault

It started working again 🤔 I think it was having issues earlier from the message to prompt filter or mismatched tokenizer or something idk

Add a reply

Find answers from the community

anyone know why my model output looks