ggml

gptj_model_load: n_vocab = 50400
gptj_model_load: n_ctx   = 2048
gptj_model_load: n_embd  = 4096
gptj_model_load: n_head  = 16
gptj_model_load: n_layer = 28
gptj_model_load: n_rot   = 64
gptj_model_load: f16     = 2
gptj_model_load: ggml ctx size = 4505.45 MB
gptj_model_load: memory_size = 896.00 MB, n_mem = 57344
gptj_model_load: ................................... done
gptj_model_load: model size = 3609.38 MB / num tensors = 285
ggml_new_tensor_impl: not enough space in the context's memory pool (needed 11393620112, available 11390260208)

Any idea where this could come from?
Not enough RAM? Or VRAM?
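
For context: the last line of the log points at system RAM rather than VRAM. ggml runs inference on the CPU and reserves one fixed-size memory pool up front; every tensor is then carved out of that pool. Both numbers in the error are bytes, so here the pool came up roughly 3.4 MB short of the ~11.39 GB the load needed. A minimal sketch of the pattern (assuming the classic ggml C API; exact field names can differ between versions):

// Minimal sketch of ggml's context-pool allocation (a sketch under
// the assumption of the classic ggml C API, not the gpt-j loader itself).
#include <stdio.h>
#include "ggml.h"

int main(void) {
    // ggml reserves one contiguous block of *system RAM* up front;
    // all tensors created in this context are carved out of it.
    struct ggml_init_params params = {
        .mem_size   = 4505ull * 1024 * 1024, // pool size estimated at load time
        .mem_buffer = NULL,                  // let ggml allocate the block itself
        .no_alloc   = false,
    };
    struct ggml_context * ctx = ggml_init(params);
    if (!ctx) {
        fprintf(stderr, "failed to allocate context pool\n");
        return 1;
    }

    // Each ggml_new_tensor_* call takes bytes from the pool. If the pool
    // was sized too small, ggml_new_tensor_impl reports the
    // "not enough space in the context's memory pool" error seen above.
    struct ggml_tensor * t = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 4096);
    (void) t;

    ggml_free(ctx);
    return 0;
}

So the usual fix is to make sure enough free host RAM is available, or to enlarge the pool-size estimate in the loader.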
6 comments
I think ggml only runs on CPU, right? Seems like not enough RAM.
The GPT-J model is quite big: even the compact version, which stores the weights as 16-bit floating point, is still 12 GB. This means that to run inference on your computer, you would need a video card with at least 12 GB of video RAM. Alternatively, you can try running the Python implementations on the CPU, but that would probably not be very efficient, as they are primarily optimized for running on a GPU (or at least that's my guess; I don't have much experience with Python).
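
For reference, the 12 GB figure is just parameter count times bytes per weight. A back-of-the-envelope calculation (assuming GPT-J's roughly 6.05B parameters; this is not the loader's exact accounting):

// Rough weight-memory estimate for GPT-J (assumes ~6.05B parameters).
#include <stdio.h>

int main(void) {
    const double n_params = 6.05e9;               // GPT-J-6B parameter count (approx.)
    const double fp32_gb  = n_params * 4.0 / 1e9; // 4 bytes per weight
    const double fp16_gb  = n_params * 2.0 / 1e9; // 2 bytes per weight

    printf("fp32 weights: ~%.1f GB\n", fp32_gb);  // ~24.2 GB
    printf("fp16 weights: ~%.1f GB\n", fp16_gb);  // ~12.1 GB
    return 0;
}

The 3609.38 MB model size in the log above is well under the fp16 figure, which suggests the file being loaded is a quantized ggml build rather than the full 16-bit weights.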
So I think not.
Ah, it does support GPU, neat.
Well, I now have access to an A6000, so I'm gonna try some new things :)) ***soon