A community member is experiencing an issue with an Ollama LLM running locally, where the context window is reportedly exceeded despite a small prompt and top_k=1. Other community members suggest the underlying problem may be that the LLM is running on a CPU, and recommend increasing the timeout on the LLM instance, since CPU-based inference can take a long time per request. They suggest raising the timeout to 600 seconds or more.
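Below is a minimal sketch of what the suggested fix might look like in LlamaIndex, assuming a recent version with the separate `llama-index-llms-ollama` package installed; the model name and data directory are placeholders, not values from the thread.

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.llms.ollama import Ollama

# Raise the request timeout: CPU-only inference can take several
# minutes per request, well past the default of 30 seconds.
Settings.llm = Ollama(
    model="llama3",          # assumed model name; use whichever model you pulled
    request_timeout=600.0,   # seconds, as suggested in the thread
)

documents = SimpleDirectoryReader("./data").load_data()  # assumed data directory
index = VectorStoreIndex.from_documents(documents)

# similarity_top_k=1 keeps the retrieved context minimal, as in the question.
query_engine = index.as_query_engine(similarity_top_k=1)
print(query_engine.query("What does this document say?"))
```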
When I use a query engine with an Ollama LLM locally, I sometimes get these errors. I don't know why it is exceeding the context window even though I'm only using top_k=1 and a small prompt.