I am using llama-2 locally for a RAG

I am using llama-2 locally in a RAG pipeline via llama-cpp-python. I don't want to use the default system_prompt. How do I change it? I tried passing the system_prompt argument to LlamaCPP(), but it didn't work:

import torch

# imports assume the pre-0.10 llama_index package layout (ServiceContext era)
from llama_index import ServiceContext, set_global_service_context
from llama_index.embeddings import HuggingFaceEmbedding
from llama_index.llms import LlamaCPP
from llama_index.llms.llama_utils import messages_to_prompt, completion_to_prompt

device = torch.device("cuda:0") if torch.cuda.is_available() else torch.device("cpu")
embed_model = HuggingFaceEmbedding(
    model_name="BAAI/bge-small-en-v1.5", device=device, cache_folder=models_dir
)
n_gpu_layers = 0 if device.type == "cpu" else -1  # compare device.type, not the device object, to a string
llm = LlamaCPP(
    model_url=None,
    model_path=f'{models_dir}/llama-2-7b-chat.Q4_K_M.gguf',
    temperature=0.1,
    max_new_tokens=256,
    # llama2 has a context window of 4096 tokens, but we set it lower to allow for some wiggle room
    context_window=3900,
    generate_kwargs={},
    model_kwargs={"n_gpu_layers": n_gpu_layers, "offload_kqv": True},
    # transform inputs into Llama2 format
    messages_to_prompt=messages_to_prompt,
    completion_to_prompt=completion_to_prompt,
    verbose=False,
    system_prompt="",
)
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
set_global_service_context(service_context)
7 comments
Modify these functions:

messages_to_prompt=messages_to_prompt,
completion_to_prompt=completion_to_prompt,
(keep in mind this is specific to llama2-chat)
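
As a rough illustration (not from the thread): a custom pair of formatters can hard-code your own system prompt in the Llama-2 chat template and be passed to LlamaCPP in place of the stock llama_utils helpers. The my_* names and the prompt text below are made up for the example; messages are assumed to be llama_index ChatMessage objects with .role and .content:

my_system_prompt = "You are a helpful assistant. Answer using only the provided context."

def my_completion_to_prompt(completion: str) -> str:
    # Llama-2 chat template with a custom <<SYS>> block instead of the default one
    return f"<s>[INST] <<SYS>>\n{my_system_prompt}\n<</SYS>>\n\n{completion} [/INST]"

def my_messages_to_prompt(messages) -> str:
    # Simplified single-turn handling: fold any system message into the <<SYS>> block
    # and concatenate the remaining turns. The stock llama_utils version handles
    # multi-turn chats more carefully.
    system = my_system_prompt
    turns = []
    for message in messages:
        if message.role == "system":
            system = message.content
        else:
            turns.append(message.content)
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n" + "\n".join(turns) + " [/INST]"

llm = LlamaCPP(
    model_path=f'{models_dir}/llama-2-7b-chat.Q4_K_M.gguf',
    messages_to_prompt=my_messages_to_prompt,
    completion_to_prompt=my_completion_to_prompt,
    # ...remaining kwargs as in the snippet above
)

If your installed llama_index version's llama_utils helpers already accept a system_prompt keyword, functools.partial(messages_to_prompt, system_prompt=...) may be enough instead of writing your own; check the signatures in your version.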
Thanks @Logan M! I knew that, but I didn't want to add more code or change the llama_utils.py file. I thought there would be an easier way to just pass the system prompt; is there not?
Since those default utils insert the official default Llama-2 system prompt, you'll need to modify them
(imo llamacpp is so hard to use. I've found using ollama is 10000x better because it handles all the prompt formatting)
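
For comparison, a minimal sketch of that Ollama route with the same pre-0.10 llama_index API, assuming a local Ollama server with the llama2 model already pulled and reusing the embed_model from the question; the system prompt text is again just an example:

from llama_index import ServiceContext, set_global_service_context
from llama_index.llms import Ollama

# Ollama applies the model's own chat template itself, so no
# messages_to_prompt / completion_to_prompt plumbing is needed here.
llm = Ollama(
    model="llama2",
    temperature=0.1,
    request_timeout=120.0,
    # system_prompt relies on the field exposed by the base llama_index LLM class
    system_prompt="You are a helpful assistant. Answer using only the provided context.",
)

service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
set_global_service_context(service_context)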