Updated 3 months ago

What should be a pretty silly question,

What should be a pretty silly question, but looking at the nodes returned from a PropertyGraphIndex retriever.retrieve call, I don't see the immediate path back from the returned nodes to the underlying entity nodes I added in the PropertyGraphStore (which I indexed via from_existing)
13 comments
good question. Retrievers have to return text nodes, so the text nodes are based on the retrieved kg nodes

include_text=False makes each triple a text node: kgnode1 -> rel -> kgnode2

include_text=True kind of merges the triples into the original text chunk
Probably the kg nodes could be added as metadata to these text nodes, but currently not implemented
This is a case where I want to get back to data from the relational DB entities that I constructed the graph from in the first place. They're definitely in the upserted EntityNode objects. If the triples came back as source_id -> rel_id -> target_id I could at least parse that.
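For reference, parsing that kind of triple text could look roughly like the sketch below. It assumes the node text has exactly the source -> rel -> target shape described above; the Triple and parse_triple names are made up for illustration and are not part of llamaindex.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """Hypothetical container for one parsed triple string."""
    source_id: str
    relation: str
    target_id: str


def parse_triple(text: str, sep: str = "->") -> Triple:
    """Split 'source -> relation -> target' into its three parts."""
    parts = [part.strip() for part in text.split(sep)]
    # Reject anything that is not exactly three non-empty parts.
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected 'source -> relation -> target', got {text!r}")
    return Triple(*parts)


triple = parse_triple("customer:42 -> PLACED -> order:7")
print(triple.source_id, triple.relation, triple.target_id)
```

This breaks if an ID or relation label itself contains the separator, so it is only as reliable as the IDs you control.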
I'm assuming that all of a node's properties and MENTIONS text nodes are used for the query that returns them
Does it have to be a TextNode that comes back? Is there any way to decorate the node with some application-defined properties?
I mean, at the moment the TextNode is being constructed, whatever EntityNodes or ChunkNodes that were involved are available; would be nice to add an annotation pipeline to the results so we could render some app-data specific UI for them.
Due to the BaseRetriever base class requiring a node as the return type, yeah, it has to be a text node (because this is what every other component in llamaindex knows how to use)
Now, that doesn't mean we can't add some other method though, like retrieve_raw_nodes_and_relations or something
(That name is way too long)
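Until something like that exists, one possible workaround is to post-process the retrieved nodes and stash the entity data in their metadata. The sketch below only shows the idea with a simplified stand-in class; RetrievedText, annotate_triples and the lookup callable are all hypothetical names, and in a real setup the lookup would presumably wrap a call to the graph store.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass
class RetrievedText:
    """Simplified stand-in for a retrieved text node (not the llamaindex class)."""
    text: str
    metadata: Dict[str, Any] = field(default_factory=dict)


# Maps an entity ID to its properties, or None if the ID is unknown.
Lookup = Callable[[str], Optional[Dict[str, Any]]]


def annotate_triples(nodes: List[RetrievedText], lookup: Lookup) -> List[RetrievedText]:
    """Attach source/target entity properties to nodes whose text is a bare triple."""
    for node in nodes:
        parts = [part.strip() for part in node.text.split("->")]
        if len(parts) != 3:
            continue  # not a bare triple, so leave the node untouched
        source_id, relation, target_id = parts
        node.metadata["kg_triple"] = {
            "source_id": source_id,
            "relation": relation,
            "target_id": target_id,
            "source_properties": lookup(source_id),
            "target_properties": lookup(target_id),
        }
    return nodes


# Toy data standing in for the entities upserted into the graph store.
entities = {"customer:42": {"name": "Ada"}, "order:7": {"total": 19.99}}
results = annotate_triples(
    [RetrievedText("customer:42 -> PLACED -> order:7")], entities.get
)
print(results[0].metadata["kg_triple"]["source_properties"])
```

Putting the annotation under a single metadata key keeps the nodes usable by everything downstream that only expects text.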
In this case it's a list of NodeWithScore objects whose node member points to a TextNode
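  # Fragment of a wrapper class. Needs `import json`, NodeWithScore from
  # llama_index.core.schema and PropertyGraphStore from
  # llama_index.core.graph_stores.types.
  @classmethod
  def from_node(cls, node: NodeWithScore, store: PropertyGraphStore):
    # With include_text=False the text is "source_id -> relation -> target_id".
    source_id, relation, target_id = (
        part.strip() for part in node.node.text.split(" -> "))
    # from_global_id is defined elsewhere; it decodes the datatype:model_id strings.
    source_node_type, source_node_id = from_global_id(source_id)
    target_node_type, target_node_id = from_global_id(target_id)
    # Look the entities up in the graph store again (raises IndexError if missing).
    source_node = store.get(properties={'id': source_id})[0]
    target_node = store.get(properties={'id': target_id})[0]
    debug_text = f'{source_node_type}:{json.dumps(source_node.properties)} -> {relation} -> {target_node_type}:{json.dumps(target_node.properties)}'
    return cls(source_node=source_node,
               target_node=target_node,
               text=debug_text)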


where the IDs in question are encoded datatype:model_id strings
so, this works for now but maybe I can make a custom retriever that does this for me with the additional method you propose (or I can make you a ticket or a PR)
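The from_global_id helper used above isn't shown in the thread. A minimal sketch of what such a helper pair might look like, assuming the datatype:model_id format with a single colon separating the two parts (to_global_id is an invented counterpart, added only to show the round trip):

```python
from typing import Tuple


def to_global_id(datatype: str, model_id: object) -> str:
    """Encode a datatype and a model ID as 'datatype:model_id'."""
    if ":" in datatype:
        raise ValueError(f"datatype must not contain ':': {datatype!r}")
    return f"{datatype}:{model_id}"


def from_global_id(global_id: str) -> Tuple[str, str]:
    """Decode 'datatype:model_id' into (datatype, model_id), both as strings."""
    # Split on the first colon only, so the model ID may itself contain colons.
    datatype, sep, model_id = global_id.partition(":")
    if not sep or not datatype or not model_id:
        raise ValueError(f"not a datatype:model_id string: {global_id!r}")
    return datatype, model_id


print(from_global_id(to_global_id("customer", 42)))
```

Note that the decoded model ID comes back as a string, so it would need converting before being used as an integer key in the relational DB.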