Updated 5 months ago

Graph

@Logan M A couple of quick questions for you. I looked for an example of combining Neo4j and KnowledgeGraphIndex with HierarchicalNodeParser to extract hierarchical nodes from markdown documents containing headers, tables, etc. I couldn't find one, and after some trial and error I was unable to get it working.

  1. Does HierarchicalNodeParser extract nodes based on headers and other document structure?
  2. Is there a way to accomplish the above, or something like it?
I'm going off the recommendation that, since I have legal-like structured documents, extracting nodes while keeping the structure should give better-than-average results from a Q&A RAG app. I'm also assuming that Neo4j, as an established graph database, will perform better than simply storing the nodes on disk and querying from there.
6 comments
Hierarchical node parser is just parsing hierarchies of chunks, not semantic hierarchies.

You could identify semantic hierarchies yourself, but the graph index isn't really designed with this purpose in mind anyway. You'd be better off with some kind of custom retriever at that point.
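For anyone landing here later, a minimal sketch of what "identifying the semantic hierarchy yourself" could look like. It uses only the Python standard library and is not a LlamaIndex API: `Section`, `parse_markdown_sections`, and `iter_sections` are hypothetical names made up for illustration. It assumes ATX-style headers (`#` through `######`) and does not handle headers inside fenced code blocks.

```python
import re
from dataclasses import dataclass, field

# Matches ATX-style markdown headers such as "## Terms".
HEADER_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


@dataclass
class Section:
    """Hypothetical container for one header and the text directly under it."""
    title: str
    level: int
    path: list[str]  # header titles from the top level down to this one
    lines: list[str] = field(default_factory=list)
    children: list["Section"] = field(default_factory=list)

    @property
    def text(self) -> str:
        return "\n".join(self.lines).strip()


def parse_markdown_sections(markdown: str) -> Section:
    """Build a tree of sections from the header levels in a markdown string."""
    root = Section(title="", level=0, path=[])
    stack = [root]  # chain of currently open sections, outermost first
    for line in markdown.splitlines():
        match = HEADER_RE.match(line)
        if match is None:
            # Body text belongs to the innermost open section.
            stack[-1].lines.append(line)
            continue
        level = len(match.group(1))
        title = match.group(2)
        # Close every open section at the same or a deeper level.
        while stack[-1].level >= level:
            stack.pop()
        parent = stack[-1]
        section = Section(title=title, level=level, path=parent.path + [title])
        parent.children.append(section)
        stack.append(section)
    return root


def iter_sections(section: Section):
    """Yield all descendant sections in document order."""
    for child in section.children:
        yield child
        yield from iter_sections(child)


if __name__ == "__main__":
    doc = "# Contract\nintro\n## Terms\nterm text\n### Payment\npay\n## Termination\nend"
    for s in iter_sections(parse_markdown_sections(doc)):
        print(" > ".join(s.path), "|", s.text)
```

Each section's `path` could then be attached as metadata to whatever node or chunk object you index, so a custom retriever can filter or expand results by section. How that hooks into a particular index or graph store is left open here.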
Ah, I see, thank you @Logan M
@Logan M Just to clarify, there isn't an out-of-the-box way to accomplish the above by some other means, even for semi-structured documents like HTML, Markdown, etc.? https://medium.com/@clappy.ai/hierarchical-trees-in-data-indexing-algorithms-10b21fbd69d5
Technically it's kind of a tree index, or raptor, but that's less specific to the document structure and more about the content in general
@Logan M am I mistaken, or is MarkdownElementNodeParser what I am looking for to extract nodes from the structure of markdown? I admit, I have trouble tracing the inner workings of the library as it has gotten larger.
I mean, it extracts elements, but doesn't do anything with the structure 👀 it's mostly for pulling out tables imo