@kapa.ai Is it possible to specify a

At a glance

The community member is asking if it is possible to specify a node parser at the time of insertion into an index, and if not, what is the recommended way of handling the insertion of multiple different document types that require different node parsers. They have several different node parsers optimized for chunking different types of documents and need to be able to hot-swap them out depending on the document type without rebuilding the index from scratch.

The replies recommend parsing documents into nodes before insertion: run the appropriate node_parser over the documents, then pass the resulting nodes to the index. The original poster asks whether the node_parser should then be removed from the transformations array; the reply is that it can stay, but it would not be used.

There is no explicitly marked answer in the provided information.

alfredmadere

Is it possible to specify a node parser at the time of insertion into an index? If not, what is the recommended way of handling the insertion of multiple different document types that require different node parsers?

I have several different node parsers that are optimized for chunking different types of documents and I need to be able to hot-swap them out depending on which type of document I am inserting without rebuilding the index from scratch every time. I am assuming that creating an index is not cheap. What is the recommended way of doing this?

4 comments

Logan M

Parse the nodes before inserting imo

Logan M

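# Choose the parser that suits this batch of documents
node_parser = ...
# Calling the parser on the documents chunks them into nodes
nodes = node_parser(documents)

# Add the pre-parsed nodes to the existing index
index.insert_nodes(nodes)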

alfredmadere

Then I would remove the node_parser from the transformations array, correct?

transformations=[node_parser, embedding_component.embedding_model],

Logan M

You could keep it if you wanted, but it wouldn't be used
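
For anyone wanting to hot-swap parsers by document type, the parse-then-insert approach from this thread could be wrapped in a small dispatcher. The following is only a minimal sketch in plain Python: `Doc`, `ToyIndex`, `split_markdown`, `split_plain`, `PARSERS`, and `insert_document` are all hypothetical stand-ins made up for illustration, not LlamaIndex classes. With the real library, the registry values would presumably be your own node parser objects and the index would be your existing index with its `insert_nodes` method, as in the snippet above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Doc:
    """Hypothetical stand-in for a document; not a LlamaIndex class."""
    text: str
    doc_type: str


def split_markdown(doc: Doc) -> List[str]:
    """Toy parser: start a new chunk at every line that begins with '#'."""
    chunks: List[str] = []
    for line in doc.text.splitlines():
        if line.startswith("#") or not chunks:
            chunks.append(line)
        else:
            chunks[-1] += "\n" + line
    return chunks


def split_plain(doc: Doc) -> List[str]:
    """Toy parser: one chunk per blank-line-separated paragraph."""
    return [p.strip() for p in doc.text.split("\n\n") if p.strip()]


# Registry mapping a document type to the parser suited to it (assumed names).
PARSERS: Dict[str, Callable[[Doc], List[str]]] = {
    "markdown": split_markdown,
    "plain": split_plain,
}


class ToyIndex:
    """Hypothetical stand-in for an index that exposes insert_nodes()."""

    def __init__(self) -> None:
        self.nodes: List[str] = []

    def insert_nodes(self, nodes: List[str]) -> None:
        self.nodes.extend(nodes)


def insert_document(index: ToyIndex, doc: Doc) -> int:
    """Pick the parser by document type, parse, then insert the nodes."""
    parser = PARSERS[doc.doc_type]
    nodes = parser(doc)
    index.insert_nodes(nodes)
    return len(nodes)
```

The index is built once and only ever receives pre-parsed nodes, so adding a new document type means adding one entry to the registry rather than rebuilding anything.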
