
Updated 5 months ago

Why are pipeline transformations not executing in order?

At a glance

A community member's IngestionPipeline is not splitting the document as expected. Their transformations list adds TitleExtractor, QuestionsAnsweredExtractor, and Settings.embed_model first, and the splitter and node parser last.

The comments point out that the pipeline runs transformations in list order, and that the typical order is "splitter -> extractors -> embeddings". One community member replies that they run a special node parser first, then the splitter, then the extractors, but that the splitter does not start.

There is no explicitly marked answer in the post or comments.

Why doesn't this work?

transformations = []
transformations.append(TitleExtractor(nodes=5))
transformations.append(QuestionsAnsweredExtractor(questions=3))
transformations.append(Settings.embed_model)
transformations.append(splitter)
transformations.append(node_parser)

transformations = []

pipeline = IngestionPipeline(
    transformations=transformations,
    vector_store=vector_store,
)
nodes = pipeline.run(documents=docs)

It should parse first, then split, then do the title extraction and so on, but it doesn't split the document.
4 comments
It runs the transformations in order
If you want to split first, add the splitter/node parser first
Normally the order would be splitter -> extractors -> embeddings
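To illustrate why the order matters, here is a minimal toy sketch in plain Python. It does not use LlamaIndex: Node, split, extract_title, embed, and run_pipeline are hypothetical stand-ins, and the only behavior borrowed from the thread is that steps run in list order. Under those assumptions, putting the splitter last leaves the final chunks without embeddings.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Node:
    """Toy stand-in for a document chunk (not the LlamaIndex class)."""
    text: str
    metadata: Dict[str, str] = field(default_factory=dict)
    embedding: Optional[List[float]] = None


Transformation = Callable[[List[Node]], List[Node]]


def split(nodes: List[Node], chunk_size: int = 20) -> List[Node]:
    """Cut each node into fixed-size chunks; new chunks start without an embedding."""
    chunks = []
    for node in nodes:
        for start in range(0, len(node.text), chunk_size):
            chunks.append(
                Node(text=node.text[start:start + chunk_size],
                     metadata=dict(node.metadata))
            )
    return chunks


def extract_title(nodes: List[Node]) -> List[Node]:
    """Fake extractor: use the first characters of each node as its title."""
    for node in nodes:
        node.metadata["title"] = node.text[:10]
    return nodes


def embed(nodes: List[Node]) -> List[Node]:
    """Fake embedding: a one-element vector holding the text length."""
    for node in nodes:
        node.embedding = [float(len(node.text))]
    return nodes


def run_pipeline(nodes: List[Node], transformations: List[Transformation]) -> List[Node]:
    """Apply each transformation to the output of the previous one, in list order."""
    for transform in transformations:
        nodes = transform(nodes)
    return nodes


document_text = "a" * 50  # 50 characters, so a chunk size of 20 gives 3 chunks

# Splitter first: every chunk gets its own title and embedding.
good = run_pipeline([Node(document_text)], [split, extract_title, embed])

# Splitter last: the title and embedding are computed on the whole document,
# and the chunks produced afterwards carry no embedding.
bad = run_pipeline([Node(document_text)], [extract_title, embed, split])

print(len(good), all(n.embedding is not None for n in good))
print(len(bad), all(n.embedding is None for n in bad))
```

In a real pipeline the equivalent change would be to append the node parser and splitter before the extractors and the embedding model.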
I have a special node parser that I run first, then I split, then try the extractors, but the splitter doesn't start
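One more thing that may be worth checking, assuming the second "transformations = []" in the posted snippet is really in the code and not a paste artifact: reassigning the name to a new empty list discards everything appended before it, so the pipeline would receive no transformations at all. A small plain-Python sketch, with placeholder strings standing in for the real transformation objects:

```python
# Placeholder strings stand in for the real transformation objects.
steps = []
steps.append("node_parser")
steps.append("splitter")
steps.append("extractors")

steps_before_reset = list(steps)  # copy of what was built so far
steps = []                        # same kind of reassignment as in the posted snippet

print(steps_before_reset)  # the three entries appended above
print(steps)               # empty: a pipeline given this list has nothing to run
```

If that line is present, removing it (or building the list once, in the intended order) would be the first thing to try.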