
Updated 3 months ago

Has anyone else noticed HuggingFaceLLMs

Has anyone else noticed HuggingFaceLLMs just hanging on their machine once you get to the query step? I have been unable to get a response from them and ultimately have to do a KeyboardInterrupt.
11 comments
I've seen this happen if you don't configure the max input size correctly 🤔 What LLM are you using?
"StabilityAI/stablelm-tuned-alpha-3b"
and I'm using an 8GB GPU, I know it's not much
just got a torch.cuda.OutOfMemoryError
where do I set the max_input_size? @Logan M
ah wrong term, I meant context_window

Since you have limited memory, maybe you can artificially lower it to 2048 (both the context_window and the max_length in the tokenizer kwargs)

https://gpt-index.readthedocs.io/en/latest/core_modules/model_modules/llms/usage_custom.html#example-using-a-huggingface-llm
You probably already saw the demo from that page haha
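For anyone landing here later, a minimal sketch of what lowering both values together could look like. The helper names `build_hf_llm_kwargs` and `load_llm` are made up for illustration, and the `llama_index.llms` import path and keyword names are assumed to match the demo on the linked page, so they may differ in your installed version.

```python
MODEL_NAME = "StabilityAI/stablelm-tuned-alpha-3b"


def build_hf_llm_kwargs(context_window=2048, max_new_tokens=256):
    """Collect keyword arguments for HuggingFaceLLM (hypothetical helper).

    The same number is used for context_window and the tokenizer's
    max_length so the two limits cannot drift apart.
    """
    if max_new_tokens >= context_window:
        raise ValueError("max_new_tokens must be smaller than context_window")
    return {
        "context_window": context_window,
        "max_new_tokens": max_new_tokens,
        "tokenizer_name": MODEL_NAME,
        "model_name": MODEL_NAME,
        "tokenizer_kwargs": {"max_length": context_window},
        "device_map": "auto",
    }


def load_llm(**overrides):
    """Build the LLM; needs llama_index installed plus enough memory."""
    # Assumption: this import path matches the version behind the linked docs.
    from llama_index.llms import HuggingFaceLLM

    return HuggingFaceLLM(**build_hf_llm_kwargs(**overrides))
```

Calling `load_llm(context_window=1024)` would then try the smaller window. Only `build_hf_llm_kwargs` runs without the library and a model download.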
ah gotcha, I just tried 1024, but still had CUDA OOM, will try CPU, albeit slow, to see if that works
Is there a way to reduce batch_size?
Batch size should already be 1 😅

ngl 8GB is tough to work with. And tbh, open source LLMs in general are still not great, especially the smaller ones
yeah I tried a much smaller LLM and the results weren't great
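A rough back-of-the-envelope check on why 8GB is tight, as a sketch only. It assumes a nominal 3 billion parameters (taken from the "3b" in the model name, the real count may differ) and counts just the weights, not activations, the attention cache, or CUDA overhead. The function names are hypothetical.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_gb(n_params, dtype):
    """Memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


def weights_fit(n_params, dtype, gpu_gb):
    """True if the weights alone are smaller than the GPU's memory."""
    return weight_memory_gb(n_params, dtype) < gpu_gb


N_PARAMS = 3_000_000_000  # assumption: nominal size from the model name

for dtype in BYTES_PER_PARAM:
    gb = weight_memory_gb(N_PARAMS, dtype)
    print(f"{dtype}: {gb:.1f} GB, fits in 8 GB: {weights_fit(N_PARAMS, dtype, 8)}")
```

Under these assumptions the full-precision weights alone need about 12 GB, which would explain an OOM no matter how small the context window is. Half precision brings that to about 6 GB, so loading the model with a float16 `torch_dtype` (if your setup supports it) may be worth trying before falling back to CPU.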