
Updated 2 years ago

@Logan M Idk why, but I'm getting this

At a glance
@Logan M Idk why, but I'm getting this error when trying to use llama-cpp. I've provided the path to the ggml bin file, and yet I'm still getting this. Am I missing something? I'm on an M1 Mac (macOS).

Plain Text
---> 15 llm = LlamaCPP(
     16     # You can pass in the URL to a GGML model to download it automatically
     17    
     18     model_path='./llama-2-13b-chat.ggmlv3.q4_0.bin',
     19     temperature=1,
     20     max_new_tokens=4096,
     21     # llama2 has a context window of 4096 tokens, but we set it lower to allow for some wiggle room
     22     context_window=3900,
     23     # kwargs to pass to __call__()
     24     generate_kwargs={},
     25     # kwargs to pass to __init__()
     26     # set to at least 1 to use GPU
     27     model_kwargs={"n_gpu_layers": 1},
     28     # transform inputs into Llama2 format
     29     messages_to_prompt=messages_to_prompt,
     30     completion_to_prompt=completion_to_prompt,
     31     verbose=True,
     32 )

...


     97         raise ValueError(
     98             "Provided model path does not exist. "
     99             "Please check the path or provide a model_url to download."
    100         )
    101     else:
--> 102         self._model = Llama(model_path=model_path, **model_kwargs)
    103 else:
    104     cache_dir = get_cache_dir()
...
    320     with suppress_stdout_stderr():
    321         self.model = llama_cpp.llama_load_model_from_file(
    322             self.model_path.encode("utf-8"), self.params
    323         )
--> 324 assert self.model is not None
    326 if verbose:
    327     self.ctx = llama_cpp.llama_new_context_with_model(self.model, self.params)

AssertionError:
11 comments
Make sure your llama-cpp-python version is 0.1.78 or less to use ggml

After that, they stopped supporting ggml and switched to gguf

The latest versions of llama-index will detect your llama-cpp version and download the correct default model for you too
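
For reference, a minimal sketch of that check in code (assumes Python 3.8+ for importlib.metadata and a plain X.Y.Z version string; the 0.1.78 cutoff is from the comment above):

Plain Text
# Report whether the installed llama-cpp-python expects ggml or gguf
# model files, based on the 0.1.78 cutoff mentioned above.
from importlib.metadata import version

v = version("llama-cpp-python")
parts = tuple(int(x) for x in v.split("."))
fmt = "ggml" if parts <= (0, 1, 78) else "gguf"
print(f"llama-cpp-python {v}: use {fmt} model files")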
I tried letting llama index do that for me
But got the same error
Tried downloading it myself
Still got the same error
The latest version downloaded a ggml file for me, is that correct?
or will it download the gguf file?
Is that the version I should use with llama-index in general, or just the version needed to use ggml with llama-index?
The fix I mentioned there was only added in the very latest version of llama-index

you can see your current llama-cpp version with pip show llama-cpp-python

If it is 0.1.78 or less, you should be using ggml

Any newer, and you should be using gguf
So I think the solution here is either change your llama-index version, or downgrade your llama-cpp version
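
To make that concrete, here's a hedged sketch of both options (the gguf filename below is a placeholder, not from this thread, and the import path assumes a llama-index version from that era):

Plain Text
# Option 1: downgrade so ggml files keep working
#   pip install "llama-cpp-python==0.1.78"

# Option 2: keep a current llama-cpp-python and load a gguf model instead
from llama_index.llms import LlamaCPP  # import path for llama-index 0.8.x

llm = LlamaCPP(
    # Placeholder path -- substitute your own gguf file
    model_path="./llama-2-13b-chat.Q4_0.gguf",
    temperature=1,
    max_new_tokens=256,  # keep well below context_window
    context_window=3900,
    model_kwargs={"n_gpu_layers": 1},  # >= 1 to offload to the M1 GPU
    verbose=True,
)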