Tokenizer

At a glance

The post describes an error when changing the global tokenizer to match a GGUF LLM, since the Hugging Face repository contains only GGUF files and no config.json. Community members suggest loading the tokenizer from a non-GGUF version of the model, downloading the tokenizer files locally, and pointing AutoTokenizer at the local file path. The original poster runs into an error when passing the local path and is trying to work out the correct command and confirm that it works.

Error when changing the global tokenizer to match a GGUF LLM. The Hugging Face repo has only GGUF files and no config.json.
Attachments: Screenshot_2024-09-29_051559.png, Screenshot_2024-09-29_051610.png
Just load the tokenizer for a non-gguf version of the model
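A minimal sketch of that suggestion, assuming the GGUF quant was produced from an original (non-GGUF) repository such as mistralai/Mistral-7B-Instruct-v0.2 (an illustrative repo id, not the one from the post): load the tokenizer from that original repo while your runtime serves the GGUF file.

```python
from transformers import AutoTokenizer

# The repo id below is an assumed example. Point this at the original,
# non-GGUF repository, which ships tokenizer.json / tokenizer_config.json etc.
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")

print(tokenizer("Hello, world!"))
```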
1) When you say "tokenizer", which files are you referring to in the screenshot below? 2) I am building this code for an on-edge application. Is there any way to bundle the tokenizer file(s) with the software so the user doesn't have to download the tokenizer, since the customer's machine will not be connected to the internet? Thanks.
Attachment: image.png
You can just download it locally and point it towards the file path

AutoTokenizer will look for the tokenizer* files from there (and maybe also config.json)
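A sketch of the offline bundling flow, using the same assumed repo id and an illustrative folder name: download and save the tokenizer once on a connected machine, ship the folder with the software, then load it by path on the offline machine.

```python
from transformers import AutoTokenizer

# On a machine with internet access: fetch the tokenizer once and save
# its files (tokenizer.json, tokenizer_config.json, special_tokens_map.json, ...)
# into a folder you can bundle with the application.
# Repo id and folder name are assumptions, not from the post.
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
tokenizer.save_pretrained("./bundled_tokenizer")

# On the offline / on-edge machine: point AutoTokenizer at the bundled folder.
tokenizer = AutoTokenizer.from_pretrained("./bundled_tokenizer")
```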
Trying to provide the local path, but it is throwing an error. Can you provide the correct command? Thanks.
Attachment: image.png
I put quotes around the path directory strings and the error went away. But how do I make sure this command is working? Thanks.
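One way to sanity-check that the tokenizer loaded correctly from the local path (a sketch, reusing the assumed folder name from above): run a short encode/decode round trip and inspect the output.

```python
from transformers import AutoTokenizer

# "./bundled_tokenizer" is the assumed local folder from the earlier step.
tokenizer = AutoTokenizer.from_pretrained("./bundled_tokenizer")

sample = "Hello, world!"
ids = tokenizer.encode(sample)
print(ids)                    # token ids: a short list of integers
print(tokenizer.decode(ids))  # should reproduce the sample text (plus any special tokens)
```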