I am using llama-2 locally for a RAG

At a glance

The community member is running llama-2 locally in a RAG pipeline built with llama-cpp-python and wants to change the default system_prompt, but setting it in the LlamaCPP() constructor did not work. Another community member suggests modifying the messages_to_prompt and completion_to_prompt functions, or copying and modifying the original code from the llama_index repository. The original poster, however, does not want to add more code or change the llama_utils.py file and is looking for an easier way to pass the system prompt. The same community member notes that the ollama library is easier to use than llamacpp because it handles the prompt formatting.

I am running llama-2 locally in a RAG pipeline with llama-cpp-python. I don't want to use the default system_prompt. How do I change it? I tried using the system_prompt argument in LlamaCPP() but it didn't work:

import torch

from llama_index import ServiceContext, set_global_service_context
from llama_index.embeddings import HuggingFaceEmbedding
from llama_index.llms import LlamaCPP
from llama_index.llms.llama_utils import completion_to_prompt, messages_to_prompt

device = torch.device("cuda:0") if torch.cuda.is_available() else torch.device("cpu")
embed_model = HuggingFaceEmbedding(
    model_name="BAAI/bge-small-en-v1.5", device=device, cache_folder=models_dir
)
n_gpu_layers = 0 if device.type == "cpu" else -1
llm = LlamaCPP(
    model_url=None,
    model_path=f"{models_dir}/llama-2-7b-chat.Q4_K_M.gguf",
    temperature=0.1,
    max_new_tokens=256,
    # llama2 has a context window of 4096 tokens, but we set it lower to allow for some wiggle room
    context_window=3900,
    generate_kwargs={},
    model_kwargs={"n_gpu_layers": n_gpu_layers, "offload_kqv": True},
    # transform inputs into Llama2 format
    messages_to_prompt=messages_to_prompt,
    completion_to_prompt=completion_to_prompt,
    verbose=False,
    system_prompt="",
)
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
set_global_service_context(service_context)
7 comments
modify these functions

Plain Text
messages_to_prompt=messages_to_prompt,
completion_to_prompt=completion_to_prompt,
(keep in mind this is specific to llama2-chat)
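For concreteness, here is a minimal sketch of what such custom functions could look like. It assumes the standard llama-2-chat [INST]/<<SYS>> template (the same one llama_utils.py produces); MY_SYSTEM_PROMPT and the simplified role handling are illustrative, not the library's exact code.

Python
MY_SYSTEM_PROMPT = "You are a concise assistant that answers strictly from the provided context."

def completion_to_prompt(completion: str) -> str:
    # Wrap a plain completion in the llama-2-chat template with the custom system prompt.
    return (
        f"<s> [INST] <<SYS>>\n{MY_SYSTEM_PROMPT}\n<</SYS>>\n\n"
        f"{completion.strip()} [/INST] "
    )

def messages_to_prompt(messages) -> str:
    # Simplified formatter: always injects MY_SYSTEM_PROMPT (ignoring any incoming
    # system message) and concatenates alternating user/assistant turns.
    prompt = f"<s> [INST] <<SYS>>\n{MY_SYSTEM_PROMPT}\n<</SYS>>\n\n"
    for message in messages:
        if message.role == "user":
            prompt += f"{message.content.strip()} [/INST] "
        elif message.role == "assistant":
            prompt += f"{message.content.strip()} </s><s> [INST] "
    return prompt

Passing these two functions to LlamaCPP(...) in place of the imported ones swaps in the custom system prompt without editing llama_utils.py.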
Thanks @Logan M! I knew that but didn't want to add more code or change the llama_utils.py file. I thought there would be an easier way just to pass the system prompt; is there not?
Since those default utils insert the default official llama2 prompt, you'll need to modify them
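A possible shortcut, assuming the llama_utils helpers in your installed version accept an optional system_prompt keyword argument (worth verifying in llama_utils.py), is to bind a custom prompt with functools.partial rather than rewriting the functions:

Python
from functools import partial

from llama_index.llms import LlamaCPP
from llama_index.llms.llama_utils import completion_to_prompt, messages_to_prompt

MY_SYSTEM_PROMPT = "You are a concise assistant that answers strictly from the provided context."

llm = LlamaCPP(
    model_path=f"{models_dir}/llama-2-7b-chat.Q4_K_M.gguf",
    # ... other arguments as in the question ...
    messages_to_prompt=partial(messages_to_prompt, system_prompt=MY_SYSTEM_PROMPT),
    completion_to_prompt=partial(completion_to_prompt, system_prompt=MY_SYSTEM_PROMPT),
)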
(imo llamacpp is so hard to use. I've found using ollama is 10000x better because it handles all the prompt formatting)
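For reference, a minimal sketch of that Ollama route, assuming a local ollama server is running with the llama2 model pulled, the llama_index Ollama wrapper is used in place of LlamaCPP, and embed_model is reused from the question:

Python
from llama_index import ServiceContext, set_global_service_context
from llama_index.llms import Ollama

# Ollama applies the model's own chat template server-side, so no custom
# messages_to_prompt / completion_to_prompt functions are needed here.
llm = Ollama(model="llama2", request_timeout=120.0)

service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
set_global_service_context(service_context)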
Thanks @Logan M!!