Why pipeline transformations not executing in order

At a glance

A community member's IngestionPipeline is not splitting the document as expected. The transformations list adds TitleExtractor, QuestionsAnsweredExtractor, and Settings.embed_model first, and appends the splitter and node parser last.

The comments point out that the pipeline runs transformations in the order they are listed, and that the typical order is "splitter -> extractors -> embeddings". The original poster replies that they run a custom node parser first, then the splitter, then the extractors, but that the splitter does not start.

There is no explicitly marked answer in the post or comments.
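
The ordering point from the comments can be illustrated with a small, library-free sketch. Everything in it is a hypothetical stand-in: `splitter`, `title_extractor`, and `run_pipeline` are plain Python functions written for this example, not LlamaIndex APIs, and the only behavior assumed is the one stated in the thread (transformations run in list order). In the original post's terms, the suggested fix would be to append `node_parser` and `splitter` before `TitleExtractor`, `QuestionsAnsweredExtractor`, and `Settings.embed_model`.

```python
from typing import Callable, List

# Hypothetical stand-ins for pipeline components (NOT LlamaIndex classes).
# Each transformation takes a list of text "nodes" and returns a new list.
Transformation = Callable[[List[str]], List[str]]


def splitter(nodes: List[str], chunk_size: int = 20) -> List[str]:
    """Split every node into fixed-size character chunks."""
    return [
        node[i:i + chunk_size]
        for node in nodes
        for i in range(0, len(node), chunk_size)
    ]


def title_extractor(nodes: List[str]) -> List[str]:
    """Prefix each node it receives with a placeholder title tag."""
    return [f"[title] {node}" for node in nodes]


def run_pipeline(nodes: List[str], transformations: List[Transformation]) -> List[str]:
    """Apply the transformations one after another, in list order."""
    for transform in transformations:
        nodes = transform(nodes)
    return nodes


doc = ["x" * 50]

# Splitter first: the extractor sees every chunk.
split_first = run_pipeline(doc, [splitter, title_extractor])

# Splitter last: the extractor only ever sees the whole, unsplit document.
split_last = run_pipeline(doc, [title_extractor, splitter])

# An empty list applies nothing at all and returns the input unchanged.
untouched = run_pipeline(doc, [])
```

With the splitter first, each chunk is tagged; with the splitter last, the extractor has already run on the unsplit document by the time splitting happens. The last call is worth noting because the posted snippet reassigns `transformations = []` just before building the pipeline; if that line is present in the real code, the pipeline would receive an empty list.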

ooedemis

Why doesn't this work?

transformations = []
transformations.append(TitleExtractor(nodes=5))
transformations.append(QuestionsAnsweredExtractor(questions=3))
transformations.append(Settings.embed_model)
transformations.append(splitter)
transformations.append(node_parser)

transformations = []

pipeline = IngestionPipeline(
    transformations=transformations,
    vector_store=vector_store,
)
nodes = pipeline.run(documents=docs)

It should first parse, then split, then do the title extraction and so on, but it doesn't split the doc.

4 comments

Logan M

It runs the transformations in order

Logan M

If you want to split first, add the splitter/node parser first

Logan M

Normally the order would be splitter -> extractors -> embeddings

ooedemis

I have a special node parser. I run it first, then split, then try the extractors, but the splitter doesn't start.
