What would be the ideal value for num_output in ServiceContext if I want the LLM responses to be fairly long? Should I let it use the default value, or should I set it manually? I use the Ollama and OpenAI classes for the LLM, chosen based on a user input parameter. So should num_output be set dynamically based on the model? (Since llama2 models have a smaller context window than something like gpt-4.)
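To illustrate, something roughly like this is what I have in mind — the helper name and the specific num_output values are just placeholders, not what I actually run:

```python
from llama_index import ServiceContext
from llama_index.llms import Ollama, OpenAI

def build_service_context(use_openai: bool) -> ServiceContext:
    # "use_openai" stands in for my real user input parameter
    if use_openai:
        llm = OpenAI(model="gpt-4")
        num_output = 1024  # gpt-4 has a large context window, so a longer budget seems fine
    else:
        llm = Ollama(model="llama2")
        num_output = 256   # llama2's context is much smaller, so leave more room for the prompt
    return ServiceContext.from_defaults(llm=llm, num_output=num_output)
```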
Yes, got it. Actually, I use the Ollama LLM class from llama-index with its default params and pass that as the llm when creating the ServiceContext. I usually used to pass in num_output there as well, but based on your recommendation I won't do that anymore. Thanks for the heads up!
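So, just to confirm, my setup is now roughly this — a sketch, with my understanding being that num_output then falls back to the LLM's / library's own defaults:

```python
from llama_index import ServiceContext
from llama_index.llms import Ollama

# Ollama with default params, and no explicit num_output on the ServiceContext
llm = Ollama(model="llama2")
service_context = ServiceContext.from_defaults(llm=llm)
```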