I'm trying to break my text into chunks of 500 tokens, with a 20-token overlap between consecutive chunks. But the following doesn't seem to be working — the chunking isn't coming out the way I expect. Any suggestions on what I'm doing wrong here?
from langchain.llms import OpenAI
from llama_index import LLMPredictor, PromptHelper, ServiceContext, VectorStoreIndex

llm_predictor = LLMPredictor(llm=OpenAI(temperature=0,
                                        model_name=llama_model,
                                        max_tokens=512))

# prompt helper
context_window = 4096
num_output = 512            # number of output tokens
chunk_overlap_ratio = 0.04  # chunk overlap ratio: 0.04 * 500 = 20 tokens of overlap
chunk_size_limit = 500      # upper limit on chunk size
prompt_helper = PromptHelper(context_window, num_output, chunk_overlap_ratio, chunk_size_limit)

service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, prompt_helper=prompt_helper)
index = VectorStoreIndex.from_documents(documents, service_context=service_context, show_progress=True)
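In case it helps narrow things down, here is the alternative I was considering. My (possibly wrong) reading of the docs is that document chunking at index time is controlled by the chunk_size / chunk_overlap arguments to ServiceContext.from_defaults rather than by PromptHelper, which only shapes how text is packed into prompts at query time. This is a minimal sketch under that assumption, reusing the same llm_predictor and documents from above — I haven't confirmed it behaves differently on my setup:

# Assumption: chunk_size / chunk_overlap on ServiceContext.from_defaults
# drive how documents are split into nodes when the index is built.
service_context = ServiceContext.from_defaults(
    llm_predictor=llm_predictor,
    chunk_size=500,    # target chunk size in tokens
    chunk_overlap=20,  # tokens shared between consecutive chunks
)
index = VectorStoreIndex.from_documents(documents, service_context=service_context, show_progress=True)

Is that the right knob to turn, or should PromptHelper be handling this?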
Suggestions are welcome!