why does it take 15-20 seconds to get an answer when I use ollama(llama3) with llama_index?
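
For reference, here is a minimal sketch of the kind of setup being described (an assumption, not the asker's actual code), querying a local llama3 model through llama_index's Ollama integration:

```python
# Minimal sketch (assumed setup): a local llama3 model queried through
# llama_index's Ollama integration. Requires a running `ollama serve`
# with the llama3 model already pulled.
from llama_index.llms.ollama import Ollama

llm = Ollama(model="llama3", request_timeout=120.0)
response = llm.complete("Why is the sky blue?")
print(response)
```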
4 comments
because running a local LLM (especially when using the full context window) takes a lot of resources and can be slow, depending on your hardware and the libraries you use to run the model
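
If the full context window is part of the problem, one possible mitigation (assuming llama_index's Ollama wrapper and its context_window parameter) is to cap it, so the server allocates a smaller KV cache and spends less time on prompt processing:

```python
# Hedged sketch: capping the context window. context_window is a
# llama_index Ollama parameter; 2048 is an arbitrary example value,
# not a recommendation.
from llama_index.llms.ollama import Ollama

llm = Ollama(model="llama3", context_window=2048, request_timeout=120.0)
print(llm.complete("Explain context windows in one sentence."))
```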
I expected that something that takes no time in the terminal shouldn't take any time in the .py file either
The bigger the input, the longer it takes
I doubt that in the terminal you are pasting in several paragraphs as input
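
One quick way to check the input-size effect is to time the same model on a short prompt versus a multi-paragraph one. This is a rough sketch with made-up prompts, not a proper benchmark:

```python
# Rough timing sketch (made-up prompts): the longer prompt should take
# noticeably more time, mostly spent in prompt processing.
import time

from llama_index.llms.ollama import Ollama

llm = Ollama(model="llama3", request_timeout=300.0)

short_prompt = "Hi."
long_prompt = "Summarize:\n" + ("Some paragraph of filler text. " * 200)

for prompt in (short_prompt, long_prompt):
    start = time.perf_counter()
    llm.complete(prompt)
    print(f"{len(prompt)} chars -> {time.perf_counter() - start:.1f}s")
```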