
Huggingface

Hi, does the HuggingFace wrapper support using Flash Attention 2 or RoPE scaling?
The LLM wrapper? It uses whatever you want.

It's probably easiest to load the model with whichever settings you want and pass it in directly:

HuggingFaceLLM(model=model, tokenizer=tokenizer, ...)
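For reference, a minimal sketch of that approach, assuming the llama-index HuggingFaceLLM wrapper; the model name, RoPE scaling factor, and prompt below are placeholders:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from llama_index.llms.huggingface import HuggingFaceLLM

model_name = "meta-llama/Llama-2-7b-chat-hf"  # placeholder model

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",        # needs the flash-attn package installed
    rope_scaling={"type": "linear", "factor": 2.0},  # example RoPE scaling config, forwarded to the model config
    device_map="auto",
)

# Pass the pre-configured model and tokenizer straight to the wrapper
llm = HuggingFaceLLM(model=model, tokenizer=tokenizer)
print(llm.complete("Hello"))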