
Updated 4 weeks ago

How to Create a Custom Parser for Attached Data Sample

How do I create a custom parser for the attached data sample?

I am using this snippet to set up the text splitter and the sentence splitter.

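    # Import path assumes LlamaIndex 0.10 or later; older releases expose these elsewhere
    from llama_index.core.node_parser import SentenceSplitter, TokenTextSplitter

    # Set up the text splitter and the node parser
    splitter = TokenTextSplitter(
        chunk_size=1024,
        chunk_overlap=256,
        separator=" ",
        backup_separators=["\n"],
    )
    node_parser = SentenceSplitter(
        chunk_size=1024,
        chunk_overlap=256,
        include_prev_next_rel=True,
        include_metadata=True,
        paragraph_separator="-->",
    )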


I want the sentence splitter to split on "-->", the delimiter used in data.txt, so that each node holds a separate article. The articles are different from each other and independent, so they should not share a node. Can anybody help me with the configuration?
1 comment
Seems like you could just write some Python code to split the text how you want, and then send those chunks into any additional splitters to handle token limits?
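A minimal sketch of that suggestion, using plain Python only. The function names (split_articles, chunk_words, articles_to_chunks) are made up for illustration, and whitespace-separated words stand in for tokens, so the sizes only approximate what a real tokenizer would count. The idea is to split on "-->" first, then chunk each article on its own so no chunk ever spans two articles.

```python
from typing import List


def split_articles(text: str, delimiter: str = "-->") -> List[str]:
    """Split raw text into articles on the delimiter, dropping empty pieces."""
    return [part.strip() for part in text.split(delimiter) if part.strip()]


def chunk_words(article: str, chunk_size: int = 1024, overlap: int = 256) -> List[str]:
    """Break one article into overlapping word-based chunks.

    Words are a rough stand-in for tokens (an assumption of this sketch).
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = article.split()
    if not words:
        return []
    if len(words) <= chunk_size:
        # Short articles stay whole, with their original formatting
        return [article]
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the article
    return chunks


def articles_to_chunks(
    text: str, delimiter: str = "-->", chunk_size: int = 1024, overlap: int = 256
) -> List[List[str]]:
    """Return one list of chunks per article; chunks never cross article boundaries."""
    return [
        chunk_words(article, chunk_size, overlap)
        for article in split_articles(text, delimiter)
    ]


if __name__ == "__main__":
    # Hypothetical sample in the shape described for data.txt
    sample = "First article text.\n-->\nSecond article text.\n-->\nThird article text."
    for i, chunks in enumerate(articles_to_chunks(sample)):
        print(i, chunks)
```

If you want LlamaIndex nodes rather than strings, each article returned by split_articles could be wrapped in its own Document and passed to the existing node_parser (for example via get_nodes_from_documents), which should keep articles separate while still enforcing the chunk size. Check that against the version you have installed.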