Hi all. If I'm using llamafile for the models, do I need to configure the tokenizer for them? The docs mention that the default is a tokenizer for the OpenAI models. But if I'm using a local model with llamafile, do I need to set it to something different, and how? The docs also mention using AutoTokenizer from the transformers package, but that seems to require downloading the model again, even though I'm already running it through llamafile. Also, llamafile exposes an OpenAI-compatible REST API that I think offers a tokenize endpoint. Can that be used instead somehow?

I'm new to this space, so please excuse my ignorance of some basic stuff. Thanks!
4 comments
The tokenizer is only used for token counting (for creating chunk sizes, for example). In most cases it's fine to leave it as the OpenAI default, but if you really want, you can set the tokenizer on the Settings object to be any function that takes a string and returns a list.
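A minimal sketch of what that could look like, wired to the llamafile server instead of a downloaded tokenizer. The port and the POST /tokenize endpoint (inherited from the llama.cpp server that llamafile bundles) are assumptions here, not something confirmed in this thread, so check them against your llamafile version:

```python
# Sketch: point LlamaIndex's global tokenizer at a local llamafile server.
# Assumes the server is reachable at http://localhost:8080 and exposes
# llama.cpp's POST /tokenize endpoint ({"content": ...} -> {"tokens": [...]}).
import requests
from llama_index.core import Settings

def llamafile_tokenizer(text: str) -> list[int]:
    # Send the text to the llamafile server and return the token ids it produces.
    resp = requests.post("http://localhost:8080/tokenize", json={"content": text})
    resp.raise_for_status()
    return resp.json()["tokens"]

# Any callable that maps a string to a list works; LlamaIndex only uses it
# for token counting (chunk sizing, usage tracking).
Settings.tokenizer = llamafile_tokenizer
```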
Thanks a lot. If you happen to know, is there a way to set one tokenizer for the LLM and a different one for the embedding model (which is a different model from the LLM)? From the Settings docs it looks like you can only set one globally.
Yeah, it's only one globally 😅. I know the two models will likely have different tokenizers, but in the grand scheme of things, I think it makes the most sense to use the LLM's tokenizer (the embedding model can just truncate input that's too long; the LLM can't do that).
Plus, if there were two and you set a chunk size based on the embedding model's tokenizer, it could lead to unexpected results when retrieving those chunks to pass to the LLM, just due to token count differences between the two.
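For a rough sense of how large those differences can be, here is a small comparison; the Hugging Face model name is only an illustrative choice, not something from this thread:

```python
# Count how many tokens the same text becomes under two different tokenizers.
# Requires tiktoken and transformers; the HF model below is just an example.
import tiktoken
from transformers import AutoTokenizer

text = "Retrieval-augmented generation splits documents into fixed-size chunks."

openai_tok = tiktoken.get_encoding("cl100k_base")
hf_tok = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

print("OpenAI tokens:", len(openai_tok.encode(text)))
print("HF tokens:    ", len(hf_tok.encode(text, add_special_tokens=False)))
```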