
Updated 2 years ago

Parsing

Hey all, is it possible to limit the chunk size in the node parser to single sentences? I get much better results with my data when embedding individual sentences than when embedding larger chunks. My current process is to use spaCy to identify the sentences and then pass them to my embedding model. This is critical for the types of problems I'm trying to solve. Also, I wrote an API around this whole process, so it would be nice to integrate it with the node parser somehow.
1 comment
You could technically subclass the node parser and use the process you've developed.

We also have a sentence splitter for splitting text into chunks, but each chunk would still include many sentences. It uses NLTK, so it's not quite as sophisticated as spaCy would be.
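
For anyone finding this later, here is a rough sketch of the "one sentence per node" idea. It is deliberately library-agnostic: `SentenceNode`, `SentencePerNodeParser`, `get_nodes`, and `regex_sentences` are hypothetical names made up for illustration, not the library's actual node parser API, and the regex splitter is only a naive stand-in for whatever sentence segmentation you already have.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List


# Hypothetical stand-in for a library node type (not the real class).
@dataclass
class SentenceNode:
    text: str
    metadata: Dict[str, int] = field(default_factory=dict)


def regex_sentences(text: str) -> List[str]:
    """Naive fallback: split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


class SentencePerNodeParser:
    """Hypothetical parser that emits exactly one node per sentence."""

    def __init__(self, split_fn: Callable[[str], List[str]] = regex_sentences):
        # split_fn is pluggable so a stronger segmenter can replace the regex.
        self.split_fn = split_fn

    def get_nodes(self, documents: List[str]) -> List[SentenceNode]:
        nodes: List[SentenceNode] = []
        for doc_idx, doc in enumerate(documents):
            for sent_idx, sentence in enumerate(self.split_fn(doc)):
                # Record where the sentence came from so it can be traced back.
                nodes.append(
                    SentenceNode(sentence, {"doc": doc_idx, "sentence": sent_idx})
                )
        return nodes


parser = SentencePerNodeParser()
nodes = parser.get_nodes(
    ["First sentence. Second one! A third?", "Another document."]
)
```

To use spaCy instead of the regex, you could pass something like `lambda text: [s.text.strip() for s in nlp(text).sents]` as `split_fn`, assuming `nlp` is a loaded spaCy pipeline with sentence segmentation enabled. Each resulting node can then go to the embedding model on its own.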