You'll probably want to add two more parameters to control the input sizes. (Also, the tokenizer's `max_length` should generally match the context window, but that's up to you.)
llm = HuggingFaceLLM(
    context_window=2048,
    max_new_tokens=256,
    tokenizer_kwargs={"max_length": 2048},
    ...
)