Tokenizer

At a glance

The post describes an error when changing the global tokenizer to match a GGUF LLM, since the Hugging Face repository contains only GGUF files and no config.json. Community members suggest loading the tokenizer from a non-GGUF version of the model, downloading the tokenizer files locally, and pointing AutoTokenizer at the local file path. The original poster runs into an error when passing the local path and is trying to work out the correct command and confirm that it works.

Error when changing the global tokenizer to match a GGUF LLM. The Hugging Face repo has only GGUF files and no config.json.
Attachments: Screenshot_2024-09-29_051559.png, Screenshot_2024-09-29_051610.png
Just load the tokenizer for a non-gguf version of the model
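A minimal sketch of that suggestion, assuming the GGUF quant was produced from an original (non-GGUF) repository such as mistralai/Mistral-7B-Instruct-v0.2 (an illustrative repo id, not the one from the post): load the tokenizer from that original repo while your runtime serves the GGUF file.

```python
from transformers import AutoTokenizer

# The repo id below is an assumed example. Point this at the original,
# non-GGUF repository, which ships tokenizer.json / tokenizer_config.json etc.
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")

print(tokenizer("Hello, world!"))
```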
1) When you say "tokenizer", which files are you referring to in the screenshot below? 2) I am building this code for an on-edge application. Is there any way to bundle the tokenizer file(s) with the software so the user doesn't have to download the tokenizer, since the customer's machine will not be connected to the internet? Thanks.
Attachment: image.png
You can just download it locally and point it towards the file path

AutoTokenizer will look for the tokenizer* files from there (and maybe also config.json)
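A sketch of the offline bundling flow, using the same assumed repo id and an illustrative folder name: download and save the tokenizer once on a connected machine, ship the folder with the software, then load it by path on the offline machine.

```python
from transformers import AutoTokenizer

# On a machine with internet access: fetch the tokenizer once and save
# its files (tokenizer.json, tokenizer_config.json, special_tokens_map.json, ...)
# into a folder you can bundle with the application.
# Repo id and folder name are assumptions, not from the post.
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
tokenizer.save_pretrained("./bundled_tokenizer")

# On the offline / on-edge machine: point AutoTokenizer at the bundled folder.
tokenizer = AutoTokenizer.from_pretrained("./bundled_tokenizer")
```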
Trying to provide the local path, but it is throwing an error. Can you provide the correct command? Thanks.
Attachment: image.png
I put quotes around the path directory strings and the error went away. But how do I make sure this command is working? Thanks.
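One way to sanity-check that the tokenizer loaded correctly from the local path (a sketch, reusing the assumed folder name from above): run a short encode/decode round trip and inspect the output.

```python
from transformers import AutoTokenizer

# "./bundled_tokenizer" is the assumed local folder from the earlier step.
tokenizer = AutoTokenizer.from_pretrained("./bundled_tokenizer")

sample = "Hello, world!"
ids = tokenizer.encode(sample)
print(ids)                    # token ids: a short list of integers
print(tokenizer.decode(ids))  # should reproduce the sample text (plus any special tokens)
```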