A community member reports trouble loading the 70B Llama 2 model with LlamaCPP, even though the 7B and 13B models load fine. Others suggest possible causes, such as an incorrect model path or the model failing to load into memory. The issue is ultimately resolved by setting the "n_gqa" parameter to 8, which the 70B model requires.
Anyone else having issues loading the 70B llama2 model on LlamaCPP? I was successful with the 7B and 13B models but I’m getting a vague error for 70B. (See attached image)
My cluster is CPU-only but has up to 96 workers and 768 GB of RAM.
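For reference, the fix that resolved this thread was passing `n_gqa=8` when constructing the model. The 70B Llama 2 model uses grouped-query attention, so older (pre-GGUF) llama-cpp-python builds need this hint, while 7B/13B load without it. A minimal sketch, assuming llama-cpp-python; the model path is a placeholder:

```python
# Sketch of loading Llama 2 70B with llama-cpp-python (pre-GGUF era).
# n_gqa=8 is the key setting: the 70B model uses grouped-query
# attention and fails to load without it; 7B/13B do not need it.
model_kwargs = {
    "n_gqa": 8,       # required for the 70B model only
    "n_ctx": 4096,    # context window (assumption; adjust as needed)
    "n_threads": 96,  # match the available CPU workers
}

try:
    from llama_cpp import Llama

    # Placeholder path: point this at your local 70B GGML file.
    llm = Llama(model_path="/path/to/llama-2-70b.ggmlv3.q4_0.bin",
                **model_kwargs)
except ImportError:
    # llama-cpp-python not installed; the kwargs above still show the fix.
    pass
```

Newer GGUF-format models embed this metadata in the file itself, so recent llama.cpp versions no longer need `n_gqa` at all.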