What's interesting is that I am able to run the model with the ollama command directly on the server, though.
Memory will grow until it reaches the max context limit. It's lazily allocated.
Setting a limit on the context window size is the way to limit memory usage, yes.
llm = Ollama(..., context_window=3000), for example, may help limit memory usage, but the lower you set it, the less context fits into the LLM, which may increase the number of LLM calls needed to run a query.
I will try this out. Thanks for your help!
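For reference, here is a minimal sketch of the suggestion above, assuming the LlamaIndex Ollama wrapper; the model name "llama3" is a placeholder for illustration, so adjust it to whatever model you run:

```python
# Minimal sketch, assuming the LlamaIndex Ollama wrapper.
from llama_index.llms.ollama import Ollama

# Capping context_window bounds how much memory Ollama can lazily
# allocate as the context fills. A smaller window uses less memory,
# but fits less context per call, which can mean more LLM calls
# per query.
llm = Ollama(
    model="llama3",       # placeholder model name
    context_window=3000,  # cap the context to limit memory growth
)

print(llm.complete("Hello!"))
```

The trade-off is between memory and call count, so a reasonable approach is to set context_window to the largest value your server's memory comfortably allows.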