The community member is trying to run a multimodal model (minicpm-v) through ollama in parallel, processing the same query over multiple images at once, but they hit an error indicating that ollama does not support async completion. The replies explain that async has not yet been implemented for the multimodal ollama class and discuss potential solutions, such as adding an async implementation to the ollama LLM class or switching to an alternative model that already supports async. One community member suggests opening a pull request to update the ollama client with the async implementation, and another reports that they have created such a pull request.
Hello, I am trying to run a multimodal model using ollama (minicpm-v). I want to run this model in parallel to process the same query over multiple images at the same time — is this possible? I know that ollama has some concurrency parameters for running multiple models, but I couldn't get it to work. I tried the "Parallel Execution of Same Event Example" cookbook workflow, but it failed with this error: Error during frame analysis: Ollama does not support async completion.
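For reference, the fan-out pattern the question describes would look roughly like the sketch below once async support lands. The `analyze_frame` helper here is a hypothetical stand-in that does no network I/O; the comment inside shows roughly what the real call would be with the official ollama Python package's `AsyncClient`, assuming a running Ollama server.

```python
import asyncio

async def analyze_frame(query: str, image_path: str) -> str:
    # Hypothetical stand-in for the real multimodal call. With the
    # official ollama Python client it would look roughly like:
    #   resp = await ollama.AsyncClient().chat(
    #       model="minicpm-v",
    #       messages=[{"role": "user", "content": query,
    #                  "images": [image_path]}])
    #   return resp["message"]["content"]
    await asyncio.sleep(0)  # placeholder for the network round trip
    return f"analysis of {image_path}"

async def analyze_all(query: str, image_paths: list[str]) -> list[str]:
    # Fan the same query out over every image concurrently;
    # gather() preserves the input order in its results.
    tasks = [analyze_frame(query, path) for path in image_paths]
    return await asyncio.gather(*tasks)

results = asyncio.run(
    analyze_all("describe this frame", ["frame1.png", "frame2.png"]))
```

Note that concurrent requests only help if the server accepts them: Ollama's server-side parallelism is controlled by the `OLLAMA_NUM_PARALLEL` environment variable, which defaults to handling requests one at a time per model.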
It looks like it's already using the official ollama client, and I know they have an async client, so it would be a straightforward PR if you want to give it a shot ❤️