Any idea why this is taking 18 minutes

Any idea why this is taking 18 minutes for a simple query? I'm using Mistral through Ollama locally.
[Attachment: image.png]
5 comments
Are you running the model on GPU?
After loading the model, how much memory is left? An LLM's memory usage grows while it generates tokens, so if little memory is free, generation will take much longer.
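As a quick way to measure this, here's a minimal sketch that checks free RAM and times generation through Ollama's local HTTP API (assuming the default endpoint on localhost:11434 and that you've already run `ollama pull mistral`):

```python
# Minimal sketch: check available RAM, then measure Ollama's generation speed.
import json
import urllib.request

import psutil  # pip install psutil

print(f"Available RAM: {psutil.virtual_memory().available / 2**30:.1f} GiB")

payload = json.dumps({
    "model": "mistral",
    "prompt": "Say hello in one sentence.",
    "stream": False,
}).encode()

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    result = json.load(resp)

# Ollama reports durations in nanoseconds.
tokens = result["eval_count"]
seconds = result["eval_duration"] / 1e9
print(f"{tokens} tokens in {seconds:.1f}s ({tokens / seconds:.2f} tok/s)")
```

If you see well under ~1 token/s here, the model is being starved for memory or running purely on CPU.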
Nope, I'm running it on a normal laptop with 16 GB RAM. It's using 100% of the RAM and GPU, so I'm assuming the laptop is the reason this is happening. Would running it on a GPU make the response time normal?
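Before blaming the hardware, you can ask Ollama where it actually loaded the model. A small sketch (assuming the `ollama` CLI is on your PATH; `ollama ps` prints a PROCESSOR column such as `100% CPU`):

```python
# Quick check: ask Ollama which processor(s) the loaded model is running on.
import subprocess

print(subprocess.run(["ollama", "ps"], capture_output=True, text=True).stdout)
```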
Yeah, a CPU is not good for running LLMs, and a big LLM on CPU is definitely a no.
Try running it on Colab.
But I'm using Ollama for the models. How would that work on Colab?
Ollama allows you to run models in Colab. You can find more on their GitHub repo or with a simple Google search.
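As a rough sketch of what that looks like (run in a Colab notebook with a GPU runtime selected; assumes Ollama's official install script at https://ollama.com/install.sh and that five seconds is enough for the server to start):

```python
# Minimal Colab sketch: install Ollama, start its server, pull and query Mistral.
import subprocess
import time

# Install Ollama via its official install script.
subprocess.run("curl -fsSL https://ollama.com/install.sh | sh", shell=True, check=True)

# Start the Ollama server in the background and give it a moment to boot.
server = subprocess.Popen(["ollama", "serve"])
time.sleep(5)

# Pull the model, then run a single prompt against it.
subprocess.run(["ollama", "pull", "mistral"], check=True)
out = subprocess.run(
    ["ollama", "run", "mistral", "Say hello in one sentence."],
    capture_output=True, text=True, check=True,
)
print(out.stdout)
```

With the Colab GPU attached, Ollama should offload the model automatically and a simple query should come back in seconds rather than minutes.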