----------------------
Apologies for the confusion. To limit the output size, you can adjust the num_output parameter when defining a custom PromptHelper. This parameter sets the number of tokens reserved for the LLM's output. Here's an example of how to create a PromptHelper with a custom num_output:
from llama_index import PromptHelper

# Maximum size of the LLM input prompt, in tokens
max_input_size = 4096
# Number of tokens reserved for the generated output
num_output = your_desired_output_tokens
# Maximum token overlap between adjacent text chunks
max_chunk_overlap = 20
prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)
Replace your_desired_output_tokens with the desired number of output tokens.
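For instance, to cap the generated output at 256 tokens (the value 256 here is only illustrative, not a required setting):

# Same constructor as above, with a concrete output token budget
prompt_helper = PromptHelper(4096, 256, 20)  # max_input_size, num_output, max_chunk_overlap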
After creating the custom PromptHelper, you can use it when creating a ServiceContext, and then use that ServiceContext when constructing your index. Here's an example:
from llama_index import LLMPredictor, GPTSimpleVectorIndex, ServiceContext
from langchain import OpenAI

# Define the LLM used for response generation
llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="text-davinci-003"))
# Bundle the LLM predictor and the custom prompt_helper into a ServiceContext
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, prompt_helper=prompt_helper)
# Build the index over your loaded documents using that ServiceContext
index = GPTSimpleVectorIndex.from_documents(documents, service_context=service_context)
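Once the index is built, you can query it as usual, and the generated answer will be limited to roughly num_output tokens. A minimal sketch, assuming the same legacy llama_index release that exposes GPTSimpleVectorIndex and its query method (the query string is just an example):

# The response length is now bounded by num_output
response = index.query("What are the key points of these documents?")
print(response)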
For more information on PromptHelper, refer to the documentation (https://gpt-index.readthedocs.io/en/latest/reference/prompt_helper.html).
----------------------
Note: you can ask me a follow-up question by @ mentioning me again :speech_balloon:
----------------------