
Updated 4 months ago

I have a hard time using some different gguf models with llama-cpp in llama-index

I have a hard time using some different gguf models with llama-cpp in llama-index. Some work just fine and some only answer garbage. My first guess was that it's the prompting structure, and I played with that, but with little to no success... Now I found that when initializing the LLM with LlamaCPP there is this:

# transform inputs into Llama2 format
messages_to_prompt=messages_to_prompt,
completion_to_prompt=completion_to_prompt,

OK, so it is kinda "hard coded" to Llama 2 prompting, even though it still does not work with llama-2-chat...

Are there any other things I could add here to point to different types of models?
6 comments
On a separate note! Did you try Ollama? I hear it's pretty easy to set up and use.
No, I did not 🙂 I'm trying to get around that.
Yea, generally you need to write the messages_to_prompt and completion_to_prompt functions for each model.
There might also be sampling/generation parameters to tune per model.
Ollama automates all this, though.
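To illustrate the point above, here is a minimal sketch of what such custom functions might look like for a ChatML-style model (the `<|im_start|>`/`<|im_end|>` format some gguf chat models expect). The function names mirror the keyword arguments in the LlamaCPP snippet from the question; llama-index normally passes its own message objects, but plain dicts with assumed `role`/`content` keys are used here so the sketch runs standalone.

```python
# Sketch: custom prompt formatters for a ChatML-style model.
# In llama-index these would be passed as messages_to_prompt=... and
# completion_to_prompt=... when constructing LlamaCPP. The dict shape
# ({"role": ..., "content": ...}) is an assumption for this standalone demo.

def messages_to_prompt(messages):
    """Render a list of chat messages into a ChatML prompt string."""
    prompt = ""
    for m in messages:
        # Each turn is wrapped in <|im_start|>ROLE ... <|im_end|> markers.
        prompt += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    # Leave the assistant turn open so the model generates the reply.
    prompt += "<|im_start|>assistant\n"
    return prompt

def completion_to_prompt(completion):
    """Wrap a bare completion string as a single user turn."""
    return messages_to_prompt([{"role": "user", "content": completion}])
```

The same pattern applies to any other template (Alpaca, Vicuna, etc.): only the wrapper strings change, which is why these functions have to be rewritten per model family.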
I got it working by looking at how oobabooga prepares the prompts for the models I like to use, and now I do a direct LLM query instead of the query engine. That works quite well for now.
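The workaround described above can be sketched as follows: build the model-specific prompt yourself (here a ChatML-style template, as an assumed example of what a UI like oobabooga applies per model) and hand the finished string straight to the LLM instead of going through the query engine. `llm` stands for an already-constructed LlamaCPP instance and is not defined here.

```python
# Hypothetical sketch of the "direct LLM query" workaround: format the
# retrieved context into the model's own chat template by hand, then call
# the LLM directly, bypassing the query engine's built-in prompting.

def build_prompt(context, question):
    """Assemble a ChatML-style RAG prompt from retrieved context."""
    system = "Answer using only the provided context."
    user = f"Context:\n{context}\n\nQuestion: {question}"
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

# With a llama-index LLM instance this would then be something like:
# response = llm.complete(build_prompt(retrieved_text, "What does X do?"))
```

Doing the formatting by hand trades convenience for control: the query engine's abstractions no longer sit between you and the exact token sequence the model sees.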