@Logan M I'm currently trying to switch from llama.cpp to Ollama, but the same model gives me different responses. The output from llama.cpp is correct and in the right language; the output from Ollama is wrong and sometimes in the wrong language. I've also talked to the Ollama community, but we have no solution so far. Maybe it has to do with the implementation in LlamaIndex?
I have already compared every setting I could find (see the sketch below for how I'm lining them up). I can provide whatever info you need.
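For context, this is roughly what I mean by comparing the settings between the two LlamaIndex backends. It's a minimal sketch, not my exact code: the model path/name are placeholders, the sampling values are arbitrary, and I'm assuming `additional_kwargs` gets forwarded into the `options` field of the Ollama request:

```python
from llama_index.llms.llama_cpp import LlamaCPP
from llama_index.llms.ollama import Ollama

# Identical sampling settings for both backends (hypothetical values;
# the point is that both sides receive exactly the same numbers).
SAMPLING = {"top_p": 0.9, "top_k": 40, "repeat_penalty": 1.1}
TEMPERATURE = 0.2
NUM_CTX = 3900

llama_cpp_llm = LlamaCPP(
    model_path="./models/model.Q4_K_M.gguf",  # placeholder path
    temperature=TEMPERATURE,
    context_window=NUM_CTX,
    max_new_tokens=256,
    generate_kwargs=SAMPLING,  # forwarded to llama-cpp-python's completion call
)

ollama_llm = Ollama(
    model="model",  # placeholder: the name registered with Ollama
    temperature=TEMPERATURE,
    request_timeout=120.0,
    # num_ctx is Ollama's context-window option; I'm assuming these
    # additional_kwargs end up in the request's "options" field.
    additional_kwargs={"num_ctx": NUM_CTX, **SAMPLING},
)

prompt = "Why is the sky blue?"
print("llama.cpp:", llama_cpp_llm.complete(prompt))
print("ollama:   ", ollama_llm.complete(prompt))
```

Even with these pinned, the two backends still diverge, which is why I suspect something outside the sampler settings (e.g. how the prompt is templated or tokenized) differs between the two paths.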
From my viewpoint, we could greatly improve the output quality of Ollama if we could find out what is different.