
Updated 4 months ago

I have a hard time using some different gguf models with llama-cpp in llama-index

I have a hard time using some different gguf models with llama-cpp in llama-index. Some work just fine and some only answer garbage. My first guess was that it's the prompting structure, and I played with that, but with little to no success... Now I found that when initializing the LLM with LlamaCPP there is this:

# transform inputs into Llama2 format
messages_to_prompt=messages_to_prompt,
completion_to_prompt=completion_to_prompt,

OK, so it is kinda "hard coded" to Llama 2 prompting, even though it still does not work with llama-2-chat...

Are there any other things I could add here to point to different types of models?
6 comments
On a separate note! Did you try Ollama? I hear it's pretty easy to set up and use.
No, I did not 🙂 I'm trying to get around that.
Yea, generally you need to write the messages_to_prompt and completion_to_prompt functions for each model.
There might also be sampling/generation parameters to tune per model.
Ollama automates all this, though.
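To illustrate the point above, here is a minimal sketch of what such custom functions might look like for a ChatML-style model (the `<|im_start|>`/`<|im_end|>` format some gguf chat models expect). The function names mirror the keyword arguments in the LlamaCPP snippet from the question; llama-index normally passes its own message objects, but plain dicts with assumed `role`/`content` keys are used here so the sketch runs standalone.

```python
# Sketch: custom prompt formatters for a ChatML-style model.
# In llama-index these would be passed as messages_to_prompt=... and
# completion_to_prompt=... when constructing LlamaCPP. The dict shape
# ({"role": ..., "content": ...}) is an assumption for this standalone demo.

def messages_to_prompt(messages):
    """Render a list of chat messages into a ChatML prompt string."""
    prompt = ""
    for m in messages:
        # Each turn is wrapped in <|im_start|>ROLE ... <|im_end|> markers.
        prompt += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    # Leave the assistant turn open so the model generates the reply.
    prompt += "<|im_start|>assistant\n"
    return prompt

def completion_to_prompt(completion):
    """Wrap a bare completion string as a single user turn."""
    return messages_to_prompt([{"role": "user", "content": completion}])
```

The same pattern applies to any other template (Alpaca, Vicuna, etc.): only the wrapper strings change, which is why these functions have to be rewritten per model family.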
I got it working by looking at how oobabooga prepares the prompts for the models I like to use, and now I do a direct LLM query instead of the query engine. That works quite well for now.
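The workaround described above can be sketched as follows: build the model-specific prompt yourself (here a ChatML-style template, as an assumed example of what a UI like oobabooga applies per model) and hand the finished string straight to the LLM instead of going through the query engine. `llm` stands for an already-constructed LlamaCPP instance and is not defined here.

```python
# Hypothetical sketch of the "direct LLM query" workaround: format the
# retrieved context into the model's own chat template by hand, then call
# the LLM directly, bypassing the query engine's built-in prompting.

def build_prompt(context, question):
    """Assemble a ChatML-style RAG prompt from retrieved context."""
    system = "Answer using only the provided context."
    user = f"Context:\n{context}\n\nQuestion: {question}"
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

# With a llama-index LLM instance this would then be something like:
# response = llm.complete(build_prompt(retrieved_text, "What does X do?"))
```

Doing the formatting by hand trades convenience for control: the query engine's abstractions no longer sit between you and the exact token sequence the model sees.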