

When a local embedding model and a local language model are used through Ollama, tokenization is handled on the Ollama side; LlamaIndex's Ollama() wrapper does not require a tokenizer.

At a glance

The community members are discussing which tokenizer LlamaIndex uses when running a local embedding model and a local language model through Ollama. One community member suggests that, based on the linked code, tokenization is handled on the Ollama side rather than the LlamaIndex side. Another asks whether this means no tokenizer needs to be provided when using LlamaIndex's Ollama(), and a third believes that is the case.
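For context, here is a minimal sketch of the setup being discussed: a local embedding model plus an Ollama-hosted LLM, with no tokenizer supplied anywhere. The model names ("BAAI/bge-small-en-v1.5", "llama3") and the "./data" directory are illustrative placeholders, not taken from the thread.

```python
from llama_index.core import Settings, VectorStoreIndex, SimpleDirectoryReader
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama

# Local embedding model (placeholder name; any HF embedding model works).
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

# Local LLM served by Ollama; note that no tokenizer is passed to the wrapper.
Settings.llm = Ollama(model="llama3", request_timeout=120.0)

# Build an index and query it; prompt text is sent to Ollama as-is,
# and the model's own tokenizer is applied server-side.
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)
print(index.as_query_engine().query("What does this document cover?"))
```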

Useful resources
A) What tokenizer does LlamaIndex use when I am supplying a local embedding model and a local language model (through Ollama), like in this code? B) And how do I supply a tokenizer for an LLM that I am pulling from my own Ollama repo?
[Attachment: image.png]
3 comments
The Ollama wrapper only passes the text, in the required format, to your hosted LLM. I believe the tokenizer part is handled on the Ollama side, not the llama-index side.

https://github.com/run-llama/llama_index/blob/fd1edffd20cbf21085886b96b91c9b837f80a915/llama-index-integrations/llms/llama-index-llms-ollama/llama_index/llms/ollama/base.py#L306
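To illustrate the point: the wrapper ultimately boils down to an HTTP call carrying plain text, roughly like the sketch below (the endpoint assumes a default local Ollama install, and "llama3" is a placeholder model tag). Token IDs never appear on the client side.

```python
import requests

# Roughly what the wrapper sends: plain text to the local Ollama server.
# Tokenization happens inside Ollama, using the tokenizer bundled with the model.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3", "prompt": "Why is the sky blue?", "stream": False},
)
print(resp.json()["response"])
```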
So does that mean that whenever I am using LlamaIndex's Ollama(), there is no need to provide a tokenizer like in this screenshot?
[Attachment: image.png]
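The screenshot is not reproduced here, but question B) can be addressed with LlamaIndex's global token-counting hook: Settings.tokenizer accepts any callable that maps a string to a list of tokens, so it can be pointed at the tokenizer matching your own Ollama model. A sketch, with a placeholder model name:

```python
from llama_index.core import Settings
from transformers import AutoTokenizer

# Placeholder: use the HF repo that matches the model behind your Ollama tag.
Settings.tokenizer = AutoTokenizer.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2"
).encode
```

Note that this only affects LlamaIndex's client-side token counting (chunking, prompt-size bookkeeping); generation-time tokenization still happens inside Ollama.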