I'm following along with the Custom LLM guide

I'm following along with the Custom LLM guide here: https://gpt-index.readthedocs.io/en/latest/examples/customization/llms/SimpleIndexDemo-Huggingface_stablelm.html
I've replicated the code locally, but I keep getting the error:
"ValueError: The current device_map had weights offloaded to the disk. Please provide an offload_folder for them. Alternatively, make sure you have safetensors installed if the model you are using offers the weights in this format."

I'm having trouble troubleshooting this in this context. Any advice?
3 comments
This means you don't have enough memory to fit the model on either the GPU or the CPU, so some of the weights are being offloaded to disk
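If you just want to satisfy the error and accept the disk offload, the message is asking for an offload_folder to be passed through to transformers. A minimal sketch using transformers directly, assuming the stablelm model from the guide (the folder name is arbitrary, and if you load the model through LlamaIndex's wrapper you would likely need to forward this via its model kwargs):

```python
# Sketch: give accelerate a directory to spill weights into when they
# don't fit in GPU/CPU memory. Model name and folder are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "StabilityAI/stablelm-tuned-alpha-3b"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",         # let accelerate place layers on GPU/CPU/disk
    offload_folder="offload",  # where the offloaded weights get written
)
```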

Tbh even if you fixed this error, the model would be insanely slow (since it has to read model weights piece-by-piece from disk 🤮 )
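If a GPU is available, a common way to avoid the disk offload (and the slowness) is to quantize the model so it fits entirely in VRAM. A rough sketch using bitsandbytes 8-bit loading; whether this actually fits depends on the model size and the card:

```python
# Sketch: 8-bit loading so the model stays on the GPU instead of spilling
# to disk. Requires the bitsandbytes and accelerate packages.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "StabilityAI/stablelm-tuned-alpha-3b"  # placeholder model name

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    load_in_8bit=True,  # roughly halves memory use vs fp16
)
```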
Got it! I didn't realize that's what it was suggesting, but that's exactly what I needed to know. Thanks!
This was a step in my process of using LlamaIndex for a simple Streamlit app with a custom model. I'm hoping falcon-7b-instruct can do the job (either locally or on a Hugging Face Space) as a proof of concept before purchasing some hours on a larger machine and trying the 40B model.
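For reference, a rough sketch of what wiring falcon-7b-instruct into LlamaIndex's HuggingFaceLLM wrapper could look like. Treat it as an illustration under assumptions rather than a tested snippet: the import paths and ServiceContext arguments have changed across llama_index versions, falcon models needed trust_remote_code=True at the time, and the 8-bit flag is only there to help the model fit on a smaller GPU.

```python
# Sketch (assumes a ~0.7/0.8-era llama_index API; adjust imports for your version).
from llama_index import ServiceContext, SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms import HuggingFaceLLM

llm = HuggingFaceLLM(
    model_name="tiiuae/falcon-7b-instruct",
    tokenizer_name="tiiuae/falcon-7b-instruct",
    context_window=2048,
    max_new_tokens=256,
    device_map="auto",
    model_kwargs={"trust_remote_code": True, "load_in_8bit": True},
    tokenizer_kwargs={"trust_remote_code": True},
)

# embed_model="local" keeps embeddings local too (needs sentence-transformers);
# otherwise the default embedding model is OpenAI's.
service_context = ServiceContext.from_defaults(llm=llm, embed_model="local")

documents = SimpleDirectoryReader("data").load_data()  # hypothetical data folder
index = VectorStoreIndex.from_documents(documents, service_context=service_context)

query_engine = index.as_query_engine()
print(query_engine.query("What is this document about?"))
```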