Updated last year

text splitter

At a glance

The post asks how to customize the text splitter to use SpacyTextSplitter. Community members suggest using any LangChain splitter and share code. One community member hits a type error (Expected type 'TextSplitter | None', got 'SpacyTextSplitter'), and another advises passing the node parser into the service context and then the service context into the index. The error goes away once the member updates the llama-index library.

How do I customize the text splitter to use SpacyTextSplitter?
11 comments
you can use any LangChain splitter
@bmax I got this: Expected type 'TextSplitter | None', got 'SpacyTextSplitter' instead
can you send that portion of code @ChuanYue and imports
Plain Text
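# Imports were not in the original post; assumed for the llama-index/langchain releases of that era
from langchain.text_splitter import SpacyTextSplitter
from llama_index import SimpleDirectoryReader
from llama_index.node_parser import SimpleNodeParser
text_splitter = SpacyTextSplitter(chunk_size=512)
parser = SimpleNodeParser.from_defaults(text_splitter=text_splitter)
documents = SimpleDirectoryReader(file_path, filename_as_id=True).load_data()
nodes = parser.get_nodes_from_documents(documents)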
@bmax Is that right?
that looks mostly right @ChuanYue -- you'll have to pass the node parser into the service_context and then the service context into the index
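For readers following along, the advice above (node parser into the service context, service context into the index) might look roughly like the sketch below. It is not taken from the thread: the helper name build_index is hypothetical, and the imports assume a legacy llama-index release (before 0.10) alongside langchain and spaCy with its default English model. ServiceContext.from_defaults also sets up a default LLM and embedding model, so credentials for those are assumed to be configured.

```python
def build_index(documents, chunk_size=512):
    """Hypothetical helper: wire a spaCy-based splitter into an index.

    Assumes legacy llama-index (before 0.10), langchain, and spaCy are
    installed. Imports are deferred to call time so that merely defining
    this function does not require those packages.
    """
    from langchain.text_splitter import SpacyTextSplitter
    from llama_index import ServiceContext, VectorStoreIndex
    from llama_index.node_parser import SimpleNodeParser

    # 1. Build the splitter and hand it to the node parser.
    text_splitter = SpacyTextSplitter(chunk_size=chunk_size)
    node_parser = SimpleNodeParser.from_defaults(text_splitter=text_splitter)

    # 2. Pass the node parser into the service context.
    service_context = ServiceContext.from_defaults(node_parser=node_parser)

    # 3. Pass the service context into the index, which then chunks the
    #    documents with the custom splitter while building.
    return VectorStoreIndex.from_documents(
        documents, service_context=service_context
    )
```

With the documents loaded as in the snippet earlier in the thread, this would be called as index = build_index(documents). Newer llama-index releases reorganized these imports, so the exact module paths depend on the installed version.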
what is your error stack trace exactly?
@ChuanYue what version of llama-index do you have? I fixed this in a recent version
@Logan M It works now after updating, thanks