What would be the ideal value for num_output in ServiceContext if I want the LLM responses to be fairly long? Should I let it use the default value, or should I set it manually? I use the Ollama and OpenAI classes for the LLM, chosen based on a user input parameter. So should num_output be set dynamically based on the model? (Since llama2 models have a smaller context window than something like gpt-4.)
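To illustrate, something roughly like this is what I have in mind — the helper name and the specific num_output values are just placeholders, not what I actually run:

```python
from llama_index import ServiceContext
from llama_index.llms import Ollama, OpenAI

def build_service_context(use_openai: bool) -> ServiceContext:
    # "use_openai" stands in for my real user input parameter
    if use_openai:
        llm = OpenAI(model="gpt-4")
        num_output = 1024  # gpt-4 has a large context window, so a longer budget seems fine
    else:
        llm = Ollama(model="llama2")
        num_output = 256   # llama2's context is much smaller, so leave more room for the prompt
    return ServiceContext.from_defaults(llm=llm, num_output=num_output)
```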
Yes, got it. Actually, I use the Ollama LLM class from llama-index with its default params and pass that as the llm when creating the ServiceContext. I usually used to pass in num_output there as well, but based on your recommendation I won't do that anymore. Thanks for the heads up!
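So, just to confirm, my setup is now roughly this — a sketch, with my understanding being that num_output then falls back to the LLM's / library's own defaults:

```python
from llama_index import ServiceContext
from llama_index.llms import Ollama

# Ollama with default params, and no explicit num_output on the ServiceContext
llm = Ollama(model="llama2")
service_context = ServiceContext.from_defaults(llm=llm)
```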