LlamaIndex uses the Hugging Face Tokenizers library for tokenization when using a local embedding model and local language model through Ollama.
At a glance
The community members are discussing which tokenizer LlamaIndex uses when running a local embedding model and a local language model through Ollama. One community member suggests that tokenization is handled on the Ollama side, not by LlamaIndex, based on the provided code reference. Another community member asks whether this means no tokenizer needs to be supplied when using LlamaIndex's Ollama(), and a third community member believes this is the case.
A) What tokenizer does LlamaIndex use when I am supplying a local embedding model and a local language model (through Ollama), like in this code? B) And how do I supply a tokenizer for an LLM that I am pulling from my own Ollama repo?
The Ollama wrapper only passes the text, in the required format, to your hosted LLM. I believe tokenization is handled on the Ollama side, not on the llama-index side.
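To illustrate the distinction: the model's own tokenization happens inside the Ollama server, but LlamaIndex still uses a client-side tokenizer for things like token counting and chunk sizing, and `Settings.tokenizer` in `llama_index.core` accepts any callable that maps a string to a list of tokens. The sketch below is a minimal, self-contained illustration of that callable shape; the `whitespace_tokenizer` name and the split-on-whitespace behavior are illustrative stand-ins, not LlamaIndex internals, and in practice you would pass something like a Hugging Face tokenizer's `encode` method instead:

```python
# Sketch: what a "tokenizer" means to LlamaIndex on the client side.
# Ollama tokenizes with the model's own tokenizer server-side; a callable
# like this only affects LlamaIndex's local token accounting (e.g. when
# deciding chunk sizes). Names below are illustrative, not library APIs.

def whitespace_tokenizer(text: str) -> list[str]:
    """Toy tokenizer: splits on whitespace. A real setup would use, e.g.,
    AutoTokenizer.from_pretrained(<model>).encode from Hugging Face."""
    return text.split()

def count_tokens(text: str, tokenizer=whitespace_tokenizer) -> int:
    """Count tokens the way LlamaIndex would with this tokenizer installed."""
    return len(tokenizer(text))

# In an actual LlamaIndex setup you would register the callable globally,
# roughly like this (requires llama-index to be installed):
#   from llama_index.core import Settings
#   Settings.tokenizer = whitespace_tokenizer

print(count_tokens("hello world from ollama"))
```

The key point is that supplying a tokenizer here changes only how LlamaIndex measures text locally; the model hosted by Ollama continues to tokenize prompts with its own built-in tokenizer regardless.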