The post asks if anyone has tried using Llama.cpp with LlamaIndex to access quantized models on a CPU-only setup, and whether using CustomLLM would work. The comments suggest that the LlamaIndex team plans to add native support for Llama.cpp, but in the meantime, a community member can instantiate the LlamaCpp class from LangChain and wrap it in the LangChainLLM wrapper provided by LlamaIndex. Another community member notes that Llama2 is picky about prompt formatting and suggests using the LlamaIndex Prompt class to create custom system and prompt formats. The final comment recommends wrapping Llama.cpp with the CustomLLM class, which allows customizing both the completion and chat endpoints, and using the utility functions provided by LlamaIndex.
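For reference, here is a minimal sketch of the LangChain route described in the summary, not an official example. It assumes llama-cpp-python, langchain, and llama-index are installed; the model path, import locations, and the `embed_model="local"` shortcut are illustrative and may differ between versions.

```python
from langchain.llms import LlamaCpp
from llama_index.llms import LangChainLLM
from llama_index import ServiceContext

# Load a local quantized model on CPU via llama.cpp (path is illustrative)
lc_llm = LlamaCpp(model_path="./models/llama-2-7b-chat.q4_0.bin", n_ctx=2048)

# Wrap the LangChain LLM so LlamaIndex can treat it like any other LLM
llm = LangChainLLM(llm=lc_llm)

# Use a local embedding model too, since a CPU-only setup has no OpenAI key
service_context = ServiceContext.from_defaults(llm=llm, embed_model="local")
```

Any index or query engine built with this service context will then run its completions through llama.cpp.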
Curious if anyone has tried using Llama.cpp with LlamaIndex so they can easily access quantized models with a CPU-only setup. Will using CustomLLM do the trick?
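As a rough idea of how CustomLLM could do the trick, here is a sketch based on the generic custom-LLM pattern from the LlamaIndex docs. Class names, import paths, and the decorator location vary by version, and the model path is illustrative; llama-cpp-python is assumed for the actual inference.

```python
from typing import Any

from llama_cpp import Llama
from llama_index.llms import (
    CustomLLM,
    CompletionResponse,
    CompletionResponseGen,
    LLMMetadata,
)
from llama_index.llms.base import llm_completion_callback

# Kept at module level so the pydantic-based CustomLLM class does not
# need to declare it as a field (model path is illustrative)
_llama = Llama(model_path="./models/llama-2-7b-chat.q4_0.bin", n_ctx=2048)


class LlamaCppCustomLLM(CustomLLM):
    @property
    def metadata(self) -> LLMMetadata:
        return LLMMetadata(
            context_window=2048, num_output=256, model_name="llama-2-7b-chat"
        )

    @llm_completion_callback()
    def complete(self, prompt: str, **kwargs: Any) -> CompletionResponse:
        out = _llama(prompt, max_tokens=256)
        return CompletionResponse(text=out["choices"][0]["text"])

    @llm_completion_callback()
    def stream_complete(self, prompt: str, **kwargs: Any) -> CompletionResponseGen:
        # Simplest possible fallback: yield the full completion in one chunk
        yield self.complete(prompt, **kwargs)
```

Per the summary above, the chat endpoint can be customized in the same way if the model's chat format matters.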
Sorry for the late reply. I was able to load the model with LangChain, but Llama2 is very picky about prompt format; it needs things like [INST]. Do I just use the LlamaIndex Prompt class to create my own system and prompt formats? I know the HuggingFaceLLM class takes both as parameters.
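If it helps, here is one way the Prompt class could be used to add Llama2's [INST]/<<SYS>> wrapping. The template text is only an example, and in newer llama-index releases the class is called PromptTemplate.

```python
from llama_index import Prompt

# Wrap the system message, context, and question in Llama-2's chat markers
LLAMA2_QA_TEMPLATE = (
    "[INST] <<SYS>>\n"
    "You are a helpful assistant. Answer using only the provided context.\n"
    "<</SYS>>\n\n"
    "Context information:\n{context_str}\n\n"
    "Question: {query_str} [/INST]"
)
text_qa_template = Prompt(LLAMA2_QA_TEMPLATE)

# Passed to a query engine, every request to the model gets this wrapping,
# e.g. index.as_query_engine(text_qa_template=text_qa_template)
```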