
Is there a way to define the size of nodes that documents will be chunked into?
You can do something like this; to change the size, just adjust the chunk_size value passed to the node parser:
Python
from llama_index import SimpleDirectoryReader, VectorStoreIndex, ServiceContext
from llama_index.node_parser import SimpleNodeParser

# Load all documents from the ./data directory
documents = SimpleDirectoryReader("./data").load_data()

# Configure chunking: chunk_size is measured in tokens
node_parser = SimpleNodeParser.from_defaults(chunk_size=1024, chunk_overlap=20)
service_context = ServiceContext.from_defaults(node_parser=node_parser)

# The index will use the custom node parser when splitting documents into nodes
index = VectorStoreIndex.from_documents(documents, service_context=service_context)


For more details, see:
https://gpt-index.readthedocs.io/en/latest/core_modules/data_modules/node_parsers/usage_pattern.html
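If you want to verify how a given chunk_size affects the output, a minimal sketch (using the same legacy llama_index API as the snippet above) is to run the node parser directly and inspect the resulting nodes:
Python
from llama_index import SimpleDirectoryReader
from llama_index.node_parser import SimpleNodeParser

documents = SimpleDirectoryReader("./data").load_data()
node_parser = SimpleNodeParser.from_defaults(chunk_size=1024, chunk_overlap=20)

# Parse documents into nodes without building an index,
# just to inspect the resulting chunks
nodes = node_parser.get_nodes_from_documents(documents)
for node in nodes[:3]:
    print(len(node.get_content()), node.get_content()[:80])

Note that chunk_size is counted in tokens, so the character lengths printed here will vary around (and usually below) the configured value.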
Ah thank you