Q: How do I use meta-llama/Llama-2-70b-chat-hf via the llama_index.llms.HuggingFaceLLM abstraction? Does HuggingFaceLLM take the Hugging Face hub token through model_kwargs, or will it be an env variable you have to set?

A: You can load the model and tokenizer yourself and pass them straight into HuggingFaceLLM:

```python
model = ...
llm = HuggingFaceLLM(model=model, tokenizer=tokenizer, ...)
```
For example:

```python
from llama_index.llms import HuggingFaceLLM
from transformers import LlamaForCausalLM, LlamaTokenizer

# I have no idea if this is exactly how you pass the hub token, just guessing;
# recent transformers accepts token= in from_pretrained (older versions: use_auth_token=)
tokenizer = LlamaTokenizer.from_pretrained("name", token="..")
model = LlamaForCausalLM.from_pretrained("name", token="..")
llm = HuggingFaceLLM(model=model, tokenizer=tokenizer, ...)
```
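If you'd rather not preload the model, the other route the question asks about should also work: authenticate once in the environment (`huggingface-cli login`, or the HUGGING_FACE_HUB_TOKEN variable), or forward the token through model_kwargs/tokenizer_kwargs, which HuggingFaceLLM hands on to from_pretrained. A minimal sketch, assuming that pass-through behavior:

```python
import os

from llama_index.llms import HuggingFaceLLM

# Option 1: set the token in the environment so from_pretrained picks it up
# automatically (equivalent to running `huggingface-cli login` once).
os.environ["HUGGING_FACE_HUB_TOKEN"] = "hf_..."  # placeholder token

# Option 2: forward the token through the kwargs HuggingFaceLLM passes to
# from_pretrained ("token" in recent transformers, "use_auth_token" in older ones).
llm = HuggingFaceLLM(
    model_name="meta-llama/Llama-2-70b-chat-hf",
    tokenizer_name="meta-llama/Llama-2-70b-chat-hf",
    model_kwargs={"token": "hf_..."},      # assumption: passed through unchanged
    tokenizer_kwargs={"token": "hf_..."},
)
```

Either option avoids baking the token into every from_pretrained call; the env-variable route is the least invasive if you load several gated models.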