I'm following along with the Custom LLM guide

I'm following along with the Custom LLM guide here: https://gpt-index.readthedocs.io/en/latest/examples/customization/llms/SimpleIndexDemo-Huggingface_stablelm.html
I've replicated the code locally, but I keep getting the error:
"ValueError: The current device_map had weights offloaded to the disk. Please provide an offload_folder for them. Alternatively, make sure you have safetensors installed if the model you are using offers the weights in this format."

I'm having trouble troubleshooting this in this context. Any advice?
3 comments
This means you don't have enough memory to fit the model on either the GPU or the CPU, so some of the weights are being offloaded to disk
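If you just want to satisfy the error and accept the disk offload, the message is asking for an offload_folder to be passed through to transformers. A minimal sketch using transformers directly, assuming the stablelm model from the guide (the folder name is arbitrary, and if you load the model through LlamaIndex's wrapper you would likely need to forward this via its model kwargs):

```python
# Sketch: give accelerate a directory to spill weights into when they
# don't fit in GPU/CPU memory. Model name and folder are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "StabilityAI/stablelm-tuned-alpha-3b"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",         # let accelerate place layers on GPU/CPU/disk
    offload_folder="offload",  # where the offloaded weights get written
)
```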

Tbh even if you fixed this error, the model would be insanely slow (since it has to read model weights piece-by-piece from disk 🤮 )
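If a GPU is available, a common way to avoid the disk offload (and the slowness) is to quantize the model so it fits entirely in VRAM. A rough sketch using bitsandbytes 8-bit loading; whether this actually fits depends on the model size and the card:

```python
# Sketch: 8-bit loading so the model stays on the GPU instead of spilling
# to disk. Requires the bitsandbytes and accelerate packages.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "StabilityAI/stablelm-tuned-alpha-3b"  # placeholder model name

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    load_in_8bit=True,  # roughly halves memory use vs fp16
)
```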
Got it! I didn't realize that's what it was suggesting, but that's exactly what I needed to know. Thanks!
This was a step in my process of using LlamaIndex for a simple Streamlit app with a custom model. I'm hoping falcon-7b-instruct can do the job (either locally or on a Hugging Face Space) as a proof of concept before purchasing some hours on a larger machine and trying the 40B model.
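For reference, a rough sketch of what wiring falcon-7b-instruct into LlamaIndex's HuggingFaceLLM wrapper could look like. Treat it as an illustration under assumptions rather than a tested snippet: the import paths and ServiceContext arguments have changed across llama_index versions, falcon models needed trust_remote_code=True at the time, and the 8-bit flag is only there to help the model fit on a smaller GPU.

```python
# Sketch (assumes a ~0.7/0.8-era llama_index API; adjust imports for your version).
from llama_index import ServiceContext, SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms import HuggingFaceLLM

llm = HuggingFaceLLM(
    model_name="tiiuae/falcon-7b-instruct",
    tokenizer_name="tiiuae/falcon-7b-instruct",
    context_window=2048,
    max_new_tokens=256,
    device_map="auto",
    model_kwargs={"trust_remote_code": True, "load_in_8bit": True},
    tokenizer_kwargs={"trust_remote_code": True},
)

# embed_model="local" keeps embeddings local too (needs sentence-transformers);
# otherwise the default embedding model is OpenAI's.
service_context = ServiceContext.from_defaults(llm=llm, embed_model="local")

documents = SimpleDirectoryReader("data").load_data()  # hypothetical data folder
index = VectorStoreIndex.from_documents(documents, service_context=service_context)

query_engine = index.as_query_engine()
print(query_engine.query("What is this document about?"))
```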