Yes, languages other than English tend to use more tokens per character, and OpenAI's default max_tokens is 256, so responses in those languages get cut off sooner.
You can increase the output limit by setting both
max_tokens
(on the LLM) and
num_output
(on the service context), as seen here:
from llama_index import ServiceContext, LLMPredictor, VectorStoreIndex
from langchain.chat_models import ChatOpenAI
from langchain.llms import OpenAI
# option 1: a completion model like text-davinci-003
llm = OpenAI(model_name="text-davinci-003", temperature=0, max_tokens=512)
# option 2: a chat model like gpt-3.5-turbo or gpt-4
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0, max_tokens=512)
llm_predictor = LLMPredictor(llm=llm)
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, num_output=512)
index = VectorStoreIndex.from_documents(documents, service_context=service_context)
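As a rough, dependency-free way to see why non-English text eats the token budget faster: OpenAI's tokenizers are byte-level BPEs over UTF-8, and non-Latin scripts take more bytes per character, which generally translates into more tokens per character. (For exact counts you'd use the tiktoken library; the sentences below are just illustrative examples, not from the original question.)

```python
# Compare UTF-8 bytes per character as a crude proxy for token cost.
english = "The quick brown fox jumps over the lazy dog."
japanese = "素早い茶色のキツネが犬を飛び越える。"  # roughly the same sentence

eng_ratio = len(english.encode("utf-8")) / len(english)   # ASCII: 1 byte/char
jap_ratio = len(japanese.encode("utf-8")) / len(japanese) # kana/kanji: 3 bytes/char

print(f"English: {eng_ratio:.1f} bytes/char")
print(f"Japanese: {jap_ratio:.1f} bytes/char")
```

So a 256-token budget covers far less Japanese (or Arabic, Thai, etc.) than English, which is why bumping max_tokens and num_output helps.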