hf_token = "your_huggingface_token" from transformers import AutoTokenizer tokenizer = AutoTokenizer.from_pretrained( "meta-llama/Meta-Llama-3-8B-Instruct", token=hf_token, )
"your_huggingface_token"
with your actual HuggingFace token. This will authorize your HuggingFace account and allow you to use the Meta-Llama-3-8B model.from llama_index.llms.huggingface import HuggingFaceLLM llm = HuggingFaceLLM( model_name="meta-llama/Meta-Llama-3-8B-Instruct", model_kwargs={ "token": hf_token, "torch_dtype": torch.bfloat16, # comment this line and uncomment below to use 4bit # "quantization_config": quantization_config }, generate_kwargs={ "do_sample": True, "temperature": 0.6, "top_p": 0.9, }, tokenizer_name="meta-llama/Meta-Llama-3-8B-Instruct", tokenizer_kwargs={"token": hf_token}, stopping_ids=stopping_ids, )
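Once `stopping_ids` is defined (explained next), you can sanity-check the configured LLM with a plain completion call. This is just an illustrative sketch; the prompt text is a placeholder:

```python
# Quick sanity check; `complete` is the standard LlamaIndex LLM call.
response = llm.complete("What is retrieval-augmented generation?")
print(response)
```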
`stopping_ids` is a list of token IDs at which the model should stop generating further tokens. You can define it as per your requirements.
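For example, for Llama 3 the usual choice is to stop at the end-of-sequence token and at the `<|eot_id|>` end-of-turn marker (the same values used in the full example further down):

```python
# Stop at EOS or at Llama 3's end-of-turn token.
stopping_ids = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]
```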
Before running the full example, install the required packages:

```
!pip install llama-index
!pip install llama-index-llms-huggingface
!pip install llama-index-embeddings-huggingface
!pip install llama-index-embeddings-huggingface-api
```
You also need to authenticate with HuggingFace, either by running `huggingface-cli login` or by passing `token=<your_token>` when loading the model and tokenizer.
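If you prefer to authenticate programmatically (for example in a notebook), the `huggingface_hub` library's `login` helper does the same thing; the token value below is a placeholder:

```python
from huggingface_hub import login

# Equivalent to running `huggingface-cli login`; replace with your real token.
login(token="your_huggingface_token")
```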
```python
import torch
from transformers import AutoTokenizer
from llama_index.core import Settings
from llama_index.llms.huggingface import HuggingFaceLLM

model_name = "meta-llama/Meta-Llama-3-8B-Instruct"
hf_token = my_token

tokenizer = AutoTokenizer.from_pretrained(model_name, token=hf_token)

# Stop at EOS or at Llama 3's end-of-turn token.
stopping_ids = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

llm = HuggingFaceLLM(
    context_window=8192,
    max_new_tokens=256,
    generate_kwargs={"temperature": 0.1, "do_sample": False},
    system_prompt=system_prompt,
    query_wrapper_prompt=query_wrapper_prompt,
    tokenizer_name=model_name,
    model_name=model_name,
    device_map="auto",
    stopping_ids=stopping_ids,
    tokenizer_kwargs={"max_length": 4096},
    # torch.float16 reduces memory usage when running on CUDA
    model_kwargs={"token": hf_token, "torch_dtype": torch.float16},
)

Settings.llm = llm
Settings.chunk_size = 512
```
Replace `my_token` with your actual HuggingFace token, and `system_prompt` and `query_wrapper_prompt` with the actual prompts you want to use.
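As an illustration only, not part of the original walkthrough: the prompts could be defined as below (before constructing the `HuggingFaceLLM`), and the configured `Settings` can then drive a minimal RAG pipeline. The embedding model, the local `data` folder, and the query text are assumptions of this sketch:

```python
from llama_index.core import PromptTemplate, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Example prompts; define these before constructing the HuggingFaceLLM above.
system_prompt = "You are a helpful assistant. Answer using only the provided context."
query_wrapper_prompt = PromptTemplate("{query_str}")

# Any HuggingFace embedding model works here; bge-small-en-v1.5 is just an example.
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

# Build a simple in-memory vector index over a local "data" folder and query it.
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine()
response = query_engine.query("What do these documents say about X?")
print(response)
```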