Updated 6 months ago

Has anyone else noticed HuggingFaceLLMs

Has anyone else noticed HuggingFaceLLMs just hanging on their machine once you get to query? I have been unable to get a response from them and ultimately have to do a KeyboardInterrupt

11 comments

Logan M

I've seen this happen if you don't configure the max input size correctly 🤔 What llm are you using?

ccmagorian

"StabilityAI/stablelm-tuned-alpha-3b"

ccmagorian

and im using an 8GB GPU, i know its not much

ccmagorian

just got the cuda.OutOfMemoryError

ccmagorian

where do i set the max_input_size? @Logan M

Logan M

ah wrong term, I meant context_window

Maybe since you have some limited memory, you can artificially lower it to 2048 (both the context_window and tokenizer kwargs)

https://gpt-index.readthedocs.io/en/latest/core_modules/model_modules/llms/usage_custom.html#example-using-a-huggingface-llm
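A minimal sketch of what lowering both values could look like, loosely following the example on the linked page. The import path `llama_index.llms`, the `max_new_tokens` value, and the helper names `build_llm_kwargs` and `load_llm` are assumptions for illustration; check them against the version you have installed.

```python
MODEL_ID = "StabilityAI/stablelm-tuned-alpha-3b"


def build_llm_kwargs(context_window: int = 2048, use_fp16: bool = False) -> dict:
    """Collect keyword arguments for HuggingFaceLLM with a reduced context window."""
    kwargs = {
        # Lower both values together so the prompt builder and the tokenizer agree.
        "context_window": context_window,
        "tokenizer_kwargs": {"max_length": context_window},
        "max_new_tokens": 256,  # assumed value, not taken from the thread
        "tokenizer_name": MODEL_ID,
        "model_name": MODEL_ID,
        "device_map": "auto",
    }
    if use_fp16:
        # Half precision roughly halves the memory taken by the weights on CUDA.
        import torch

        kwargs["model_kwargs"] = {"torch_dtype": torch.float16}
    return kwargs


def load_llm(context_window: int = 2048, use_fp16: bool = False):
    """Build the LLM; this downloads the model and needs llama_index installed."""
    # Assumed import path; it has moved between llama_index releases.
    from llama_index.llms import HuggingFaceLLM

    return HuggingFaceLLM(**build_llm_kwargs(context_window, use_fp16))
```

Calling `load_llm(2048, use_fp16=True)` would then apply the lower limit in both places and load the weights in half precision.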

Logan M

You probably already saw the demo from that page haha

ccmagorian

ah gotcha, i just tried 1024, but still had CUDA OOM, will try CPU, albeit slow, to see if that works

ccmagorian

Is there a way to reduce batch_size?

Logan M

Batch size should already be 1 😅

ngl 8GB is tough to work with. And tbh, open source LLMs in general are still not great, especially the smaller ones
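A rough back-of-the-envelope estimate of why 8GB is tight, assuming the model has about 3 billion parameters (inferred from the "3b" in its name, not confirmed in the thread) and counting only the weights. Activations, the KV cache, and CUDA overhead come on top, so the real requirement is higher.

```python
def weight_memory_gib(n_params: float, bytes_per_param: int) -> float:
    """Memory needed just to hold the model weights, in GiB."""
    return n_params * bytes_per_param / 1024**3


N_PARAMS = 3e9  # assumption: roughly 3 billion parameters

fp32_gib = weight_memory_gib(N_PARAMS, 4)  # 4 bytes per float32 weight
fp16_gib = weight_memory_gib(N_PARAMS, 2)  # 2 bytes per float16 weight

print(f"fp32 weights: {fp32_gib:.1f} GiB")
print(f"fp16 weights: {fp16_gib:.1f} GiB")
```

Under these assumptions the full-precision weights alone would not fit in 8GB, which may be why shrinking the context window did not help, while half precision would fit with only a little headroom left.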

ccmagorian

yeah i tried a much smaller llm and results werent great
