I'm using the default response mode, btw. My question is about num_output in ServiceContext: if I want the LLM responses to be fairly long, should I leave it at the default value or set it manually? I pick between the Ollama and OpenAI classes for the LLM based on a user input parameter, so should I compute num_output dynamically per model (since llama2 models have a smaller context window than something like gpt-4)?
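For concreteness, this is roughly what I have in mind (just a sketch; the model names, context windows, and token counts below are placeholder assumptions, not values I've settled on):

```python
from llama_index import ServiceContext
from llama_index.llms import Ollama, OpenAI


def build_service_context(model_choice: str) -> ServiceContext:
    """Pick the LLM from a user input parameter and size num_output per model."""
    if model_choice == "ollama":
        # llama2 has a ~4k context window, so reserve a smaller slice for output
        llm = Ollama(model="llama2")
        context_window, num_output = 4096, 512
    else:
        # gpt-4 has a larger context window, so a longer response fits comfortably
        llm = OpenAI(model="gpt-4")
        context_window, num_output = 8192, 1024

    # num_output tells the prompt helper how many tokens to reserve for the response
    return ServiceContext.from_defaults(
        llm=llm,
        context_window=context_window,
        num_output=num_output,
    )
```

I'd then call build_service_context(user_choice) and pass the result into the index / query engine. Is something like this the recommended pattern, or is it better to just rely on the defaults?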