Updated 4 months ago

Is it possible to use an ingestion pipeline with a node parser or are they mutually exclusive?
A node parser is typically a step in an ingestion pipeline 👀 (although a select few node parsers are more complex, and don't necessarily work with a linear sequence of actions yet)
Which one are you trying to use?
Currently my ingestion pipeline uses SentenceSplitter but I'm interested in using MarkdownElementNodeParser with LlamaParse using the markdown mode. Should I just replace the SentenceSplitter with the MarkdownElementNodeParser?
mmm that one is a bit more complicated, I need to make a PR to leverage that properly!

But you can do it now with a quick hack 😉

from llama_index.core.schema import TransformComponent
from llama_index.core.node_parser import MarkdownElementNodeParser


class CustomTransform(TransformComponent):
  # Pipeline transformations are called as transform(nodes, **kwargs)
  def __call__(self, nodes, **kwargs):
    # `llm` is assumed to be defined elsewhere in your code
    node_parser = MarkdownElementNodeParser(llm=llm, num_workers=4)
    nodes = node_parser.get_nodes_from_documents(nodes)
    base_nodes, objects = node_parser.get_nodes_and_objects(nodes)
    # Return one flat list so the next pipeline step can consume it
    return base_nodes + objects


transformations=[..., CustomTransform()]
very untested haha but should work?
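For anyone wondering why the wrapper is needed: here is a rough, library-free sketch of the idea, not the actual llama_index implementation. A linear pipeline expects each step to take a list of nodes and return a list of nodes, while this kind of parser needs two calls and hands back two lists. All names below (`Node`, `TwoStepParser`, `WrappedParser`, `run_pipeline`, `strip_whitespace`) are made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Node:
    """Hypothetical stand-in for a pipeline node (not the llama_index class)."""
    text: str
    kind: str = "text"


class TwoStepParser:
    """Toy parser whose API needs two calls, loosely like the hack above."""

    def parse(self, nodes: List[Node]) -> List[Node]:
        # Split each input on blank lines and tag chunks that look like tables.
        out = []
        for node in nodes:
            for chunk in node.text.split("\n\n"):
                kind = "table" if chunk.lstrip().startswith("|") else "text"
                out.append(Node(chunk, kind))
        return out

    def split(self, nodes: List[Node]) -> Tuple[List[Node], List[Node]]:
        # Second step: separate plain text nodes from table "objects".
        base = [n for n in nodes if n.kind == "text"]
        objects = [n for n in nodes if n.kind == "table"]
        return base, objects


class WrappedParser:
    """Adapter exposing the two-step parser as one list-in, list-out step."""

    def __init__(self, parser: TwoStepParser) -> None:
        self.parser = parser

    def __call__(self, nodes: List[Node], **kwargs) -> List[Node]:
        parsed = self.parser.parse(nodes)
        base, objects = self.parser.split(parsed)
        return base + objects  # flatten so the next step gets a single list


def strip_whitespace(nodes: List[Node], **kwargs) -> List[Node]:
    """An ordinary single-call step, for comparison."""
    return [Node(n.text.strip(), n.kind) for n in nodes]


def run_pipeline(nodes: List[Node], transformations: List[Callable]) -> List[Node]:
    # Linear sequence: each step's output is the next step's input.
    for transform in transformations:
        nodes = transform(nodes)
    return nodes


doc = Node("Intro paragraph.\n\n| a | b |\n| 1 | 2 |\n\nClosing paragraph.")
result = run_pipeline([doc], [WrappedParser(TwoStepParser()), strip_whitespace])
```

The adapter is the only part that knows about the two-step API, so the rest of the pipeline stays a plain chain of callables.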