How can I give the HF token while trying to clone the Meta-Llama-3-8B?
Cloning into 'Meta-Llama-3-8B'...
fatal: could not read Username for 'https://huggingface.co': No such device or address
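
One way around this git error, sketched here with the huggingface_hub package (an assumption, since the thread itself never shows the git-side fix), is to download the repository with an access token instead of cloning it over git:

Plain Text
from huggingface_hub import snapshot_download

# Download the gated repo with an access token instead of cloning via git.
# Replace the token with your own; local_dir is just an example path.
snapshot_download(
    repo_id="meta-llama/Meta-Llama-3-8B",
    token="your_huggingface_token",
    local_dir="Meta-Llama-3-8B",
)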

Please stand by @Doğan as I search the knowledge base. I normally take about 20 seconds or so. :wave:
----------------------
To use the Meta-Llama-3-8B model from HuggingFace, you need to provide your HuggingFace token. This token authenticates your HuggingFace account. You can set it as an environment variable or pass it directly in your code. Here is an example of how you can use it:

Plain Text
hf_token = "your_huggingface_token"

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    token=hf_token,
)


In the above code, replace "your_huggingface_token" with your actual HuggingFace token. This will authorize your HuggingFace account and allow you to use the Meta-Llama-3-8B model.
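
If you prefer not to hard-code the token, a minimal sketch (assuming the huggingface_hub package is installed) is to log in programmatically or set the HF_TOKEN environment variable, which recent versions of huggingface_hub and transformers pick up automatically:

Plain Text
import os
from huggingface_hub import login

# Option 1: log in once; the token is cached for later downloads.
login(token="your_huggingface_token")

# Option 2: set the environment variable before loading the model.
os.environ["HF_TOKEN"] = "your_huggingface_token"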

If you want to use the model with LlamaIndex, you can do so as follows:

Plain Text
import torch
from llama_index.llms.huggingface import HuggingFaceLLM

llm = HuggingFaceLLM(
    model_name="meta-llama/Meta-Llama-3-8B-Instruct",
    model_kwargs={
        "token": hf_token,
        "torch_dtype": torch.bfloat16,  # comment this line and uncomment below to use 4bit
        # "quantization_config": quantization_config
    },
    generate_kwargs={
        "do_sample": True,
        "temperature": 0.6,
        "top_p": 0.9,
    },
    tokenizer_name="meta-llama/Meta-Llama-3-8B-Instruct",
    tokenizer_kwargs={"token": hf_token},
    stopping_ids=stopping_ids,
)


In this code, stopping_ids is a list of token ids at which the model should stop generating further tokens. You can define it as per your requirements.
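
For the Llama 3 models, a common choice (the same one used later in this thread) is to stop on the end-of-sequence token and the <|eot_id|> end-of-turn token:

Plain Text
from transformers import AutoTokenizer

# hf_token as defined in the earlier snippet
tokenizer = AutoTokenizer.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct", token=hf_token
)

# Stop generation at the end-of-sequence token or the end-of-turn token.
stopping_ids = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]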

Please note that you need to install the necessary packages before using the above code. You can install them using pip:

Plain Text
!pip install llama-index
!pip install llama-index-llms-huggingface
!pip install llama-index-embeddings-huggingface
!pip install llama-index-embeddings-huggingface-api
For more details, you can refer to the Llama3 Cookbook and the Hugging Face LLMs documentation.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
model_name = 'Meta-Llama-3-8B'
hf_token = my_token

tokenizer = AutoTokenizer.from_pretrained(model_name, token=hf_token)

stopping_ids = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]
llm = HuggingFaceLLM(
    context_window=8192,
    max_new_tokens=256,
    generate_kwargs={"temperature": 0.1, "do_sample": False},
    system_prompt=system_prompt,
    query_wrapper_prompt=query_wrapper_prompt,
    tokenizer_name=model_name,
    model_name=model_name,
    device_map="auto",
    stopping_ids=stopping_ids,
    tokenizer_kwargs={"max_length": 4096},
    # uncomment this if using CUDA to reduce memory usage
    model_kwargs={"token": hf_token, "torch_dtype": torch.float16}
)

Settings.llm = llm
Settings.chunk_size = 512

OSError: Meta-Llama-3-8B is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo either by logging in with huggingface-cli login or by passing token=<your_token>

@kapa.ai
Let me check the knowledge base for you. One moment! :books:
----------------------
The error message you're seeing suggests that the model identifier you're using, 'Meta-Llama-3-8B', is not recognized as a valid model on HuggingFace's model hub.

Model identifiers on the Hub include the organization prefix, so the 8-billion-parameter instruct model is 'meta-llama/Meta-Llama-3-8B-Instruct' (the base model is 'meta-llama/Meta-Llama-3-8B'). You should replace 'Meta-Llama-3-8B' with 'meta-llama/Meta-Llama-3-8B-Instruct' in your code.
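
If you want to double-check that the identifier and token are valid before loading anything, a small sketch (assuming the huggingface_hub package and the hf_token variable from your code) can query the Hub directly:

Plain Text
from huggingface_hub import model_info

# This call fails with a clear error if the repo id is wrong or the
# token does not have access to the gated repository.
model_info("meta-llama/Meta-Llama-3-8B-Instruct", token=hf_token)
print("Token and model identifier look valid.")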

Here's how your corrected code should look:

Plain Text
import torch
from transformers import AutoTokenizer
from llama_index.core import Settings
from llama_index.llms.huggingface import HuggingFaceLLM

model_name = 'meta-llama/Meta-Llama-3-8B-Instruct'
hf_token = my_token

tokenizer = AutoTokenizer.from_pretrained(model_name, token=hf_token)

stopping_ids = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]
llm = HuggingFaceLLM(
    context_window=8192,
    max_new_tokens=256,
    generate_kwargs={"temperature": 0.1, "do_sample": False},
    system_prompt=system_prompt,
    query_wrapper_prompt=query_wrapper_prompt,
    tokenizer_name=model_name,
    model_name=model_name,
    device_map="auto",
    stopping_ids=stopping_ids,
    tokenizer_kwargs={"max_length": 4096},
    # uncomment this if using CUDA to reduce memory usage
    model_kwargs={"token": hf_token, "torch_dtype": torch.float16}
)

Settings.llm = llm
Settings.chunk_size = 512


Please replace my_token with your actual HuggingFace token, and system_prompt and query_wrapper_prompt with the actual prompts you want to use.
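
As an illustration only (the prompt wording here is an assumption, not something from this thread), the two prompts could be defined with LlamaIndex's PromptTemplate before constructing the LLM:

Plain Text
from llama_index.core import PromptTemplate

# Hypothetical prompts -- adjust the wording to your own use case.
system_prompt = "You are a helpful assistant that answers questions concisely."
query_wrapper_prompt = PromptTemplate("<|user|>\n{query_str}\n<|assistant|>\n")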

For more details, you can refer to the Llama3 Cookbook and the Hugging Face LLMs documentation.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------