Is it possible to use an ingestion

Is it possible to use an ingestion pipeline with a node parser or are they mutually exclusive?

5 comments

A node parser is typically a step in an ingestion pipeline 👀 (although a select few node parsers are more complex and don't necessarily work with a linear sequence of actions yet)

Logan M

Which one are you trying to use?

aelita

Currently my ingestion pipeline uses SentenceSplitter but I'm interested in using MarkdownElementNodeParser with LlamaParse using the markdown mode. Should I just replace the SentenceSplitter with the MarkdownElementNodeParser?

Logan M

mmm that one is a bit more complicated, I need to make a PR to leverage that properly!

But you can do it now with a quick hack 😉

from llama_index.core.schema import TransformComponent
from llama_index.core.node_parser import MarkdownElementNodeParser


class CustomTransform(TransformComponent):
  # __call__ needs self, and the pipeline may pass extra keyword arguments
  def __call__(self, nodes, **kwargs):
    # llm is assumed to be defined elsewhere (any LlamaIndex LLM instance)
    node_parser = MarkdownElementNodeParser(llm=llm, num_workers=4)
    nodes = node_parser.get_nodes_from_documents(nodes)
    base_nodes, objects = node_parser.get_nodes_and_objects(nodes)
    return base_nodes + objects


transformations=[..., CustomTransform()]

Logan M

very untested haha but should work?
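
For anyone wondering why the wrapper is needed at all, here is a rough sketch of the idea in plain Python, with no llama_index imports. It assumes a pipeline is just a list of steps that each take a list of nodes and return a list of nodes, and that the "complex" parser returns two separate lists instead of one. Every name here (Node, split_lines, TwoStageParser, WrappedParser, run_pipeline) is made up for illustration and is not part of the LlamaIndex API.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Node:
    """Hypothetical stand-in for a pipeline node; not a llama_index class."""
    text: str
    kind: str = "text"


# Assumed contract for a pipeline step: list of nodes in, list of nodes out.
Transform = Callable[[List[Node]], List[Node]]


def split_lines(nodes: List[Node]) -> List[Node]:
    """A simple linear step: one output node per non-empty line."""
    return [
        Node(line.strip())
        for node in nodes
        for line in node.text.splitlines()
        if line.strip()
    ]


class TwoStageParser:
    """Toy parser that returns two lists, so it does not fit the
    list-in, list-out contract on its own."""

    def parse(self, nodes: List[Node]) -> Tuple[List[Node], List[Node]]:
        base = [n for n in nodes if not n.text.startswith("|")]
        tables = [Node(n.text, kind="table") for n in nodes if n.text.startswith("|")]
        return base, tables


class WrappedParser:
    """Adapter that makes TwoStageParser behave like an ordinary step."""

    def __init__(self, parser: TwoStageParser) -> None:
        self.parser = parser

    def __call__(self, nodes: List[Node]) -> List[Node]:
        base, tables = self.parser.parse(nodes)
        # Merge both outputs into one list so the next step can consume it.
        return base + tables


def run_pipeline(nodes: List[Node], transformations: List[Transform]) -> List[Node]:
    """Apply each step in order, feeding its output to the next one."""
    for step in transformations:
        nodes = step(nodes)
    return nodes


result = run_pipeline(
    [Node("Intro text\n| a | b |\nMore text")],
    [split_lines, WrappedParser(TwoStageParser())],
)
```

The wrapper is the only piece that knows about the two-part result; the rest of the pipeline keeps seeing a flat list, which is the same trick the CustomTransform hack relies on.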
