----------------------
To set the `chunk_size` and `overlap_size` when using `GPTPineconeIndex`, you need to create a custom `PromptHelper` and pass it to a `ServiceContext`. Here's an example based on the extracted information:
```python
import pinecone
from gpt_index import GPTPineconeIndex, SimpleDirectoryReader, LLMPredictor, PromptHelper, ServiceContext
from langchain import OpenAI

# Connect to an existing Pinecone index (assumes the index and your API key already exist)
pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENVIRONMENT")
pinecone_index = pinecone.Index("quickstart")

# Load documents
documents = SimpleDirectoryReader('../paul_graham_essay/data').load_data()

# Define LLM
llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="text-davinci-003"))

# Define PromptHelper: chunk_size_limit and max_chunk_overlap control chunking;
# max_input_size and num_output describe the LLM's context window and output budget
max_input_size = 4096
num_output = 256
chunk_size = 256
overlap_size = 20
prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap=overlap_size, chunk_size_limit=chunk_size)

# Create ServiceContext with the custom PromptHelper
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, prompt_helper=prompt_helper)

# Build the GPTPineconeIndex
index = GPTPineconeIndex.from_documents(documents, pinecone_index=pinecone_index, service_context=service_context)
```
To save the built index for future use, you can use the `save_to_disk` and `load_from_disk` methods. Make sure to pass the same `ServiceContext` when loading the index:
```python
# Save the index metadata to disk (the vectors themselves stay in Pinecone)
index.save_to_disk('index.json')

# Load the index from disk; the Pinecone client is not serialized,
# so pass pinecone_index again along with the same ServiceContext
index = GPTPineconeIndex.load_from_disk('index.json', pinecone_index=pinecone_index, service_context=service_context)
```
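Once loaded, you can query the index as usual. As a minimal sketch, assuming the old `gpt_index` query API where indices expose a `query` method (the question string is just an illustration):

```python
# Run a query against the loaded index and print the synthesized answer
response = index.query("What did the author do growing up?")
print(response)
```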
Keep in mind that the code snippets provided are based on the information extracted from the sources. You may need to adapt them to your specific use case and import the necessary libraries.
----------------------