What should be a pretty silly question,

At a glance

The community member is having trouble understanding the relationship between the nodes returned from a PropertyGraphIndex retriever.retrieve call and the underlying entity nodes they added to the PropertyGraphStore. The comments suggest that retrievers can only return text nodes, which are based on the retrieved knowledge graph nodes. The community members discuss ways to add metadata to these text nodes or create a custom retriever that can return the raw nodes and relations. There is no explicitly marked answer, but the community members are exploring solutions to the issue.

gavindoughtie

What should be a pretty silly question, but looking at the nodes returned from a PropertyGraphIndex retriever.retrieve call, I don't see the immediate path back from the returned nodes to the underlying entity nodes I added in the PropertyGraphStore (which I indexed via from_existing)

13 comments

Logan M

Good question. Retrievers have to return text nodes, so the text nodes are based on the retrieved kg nodes

include_text=False makes each triple a text node: kgnode1 -> rel -> kgnode2

include_text=True kind of merges the triples into the original text chunk

Logan M

Probably the kg nodes could be added as metadata to these text nodes, but currently not implemented

gavindoughtie

This is a case where I want to get back to data from the relational DB entities that I constructed the graph from in the first place. They're definitely in the upserted EntityNode objects. If the triples came back as source_id -> rel_id -> target_id I could at least parse that.

gavindoughtie

I'm assuming that all of a node's properties and MENTIONS text nodes are used for the query that returns them

gavindoughtie

Does it have to be a TextNode that comes back? Is there any way to decorate the node with some application-defined properties?

gavindoughtie

I mean, at the moment the TextNode is being constructed, whatever EntityNodes or ChunkNodes that were involved are available; would be nice to add an annotation pipeline to the results so we could render some app-data specific UI for them.
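One way to approximate that annotation step today is to post-process the retrieved results yourself. The sketch below is only an illustration under stated assumptions: `StubTextNode` and `StubNodeWithScore` are hypothetical stand-ins for the real node classes (they model only `text`, `metadata`, `node` and `score`), `annotate_results` and the `kg_triple` key are made-up names, and the text is assumed to be a bare `source -> relation -> target` triple as produced with include_text=False.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass
class StubTextNode:
    """Hypothetical stand-in for a retrieved text node: text plus a metadata dict."""
    text: str
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class StubNodeWithScore:
    """Hypothetical stand-in for a scored result whose `node` holds the text node."""
    node: StubTextNode
    score: Optional[float] = None


def annotate_results(
    results: List[StubNodeWithScore],
    lookup: Callable[[str], Any],
) -> List[StubNodeWithScore]:
    """Attach app data for each triple's endpoints to the node's metadata."""
    for result in results:
        parts = [part.strip() for part in result.node.text.split(" -> ")]
        if len(parts) != 3:
            # Not a bare triple (e.g. a merged text chunk); leave it untouched.
            continue
        source_id, relation, target_id = parts
        # "kg_triple" is an arbitrary key chosen for this sketch.
        result.node.metadata["kg_triple"] = {
            "source": lookup(source_id),
            "relation": relation,
            "target": lookup(target_id),
        }
    return results
```

Here `lookup` is whatever maps an entity ID back to application data, for example a dict's `get` method or a query against the relational DB. If real nodes are annotated this way, it is worth checking whether the added metadata also ends up in the text passed to the LLM.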

Logan M

Because the BaseRetriever base class requires a node as the return type, yeah, it has to be a text node (because this is what every other component in LlamaIndex knows how to use)

Logan M

Now, that doesn't mean we can't add some other method though, like retrieve_raw_nodes_and_relations or something

Logan M

(That name is way too long)

gavindoughtie

In this case it's a list of NodeWithScore objects whose node member points to a TextNode

Logan M

Right right

gavindoughtie

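  # Needs: import json; from llama_index.core.schema import NodeWithScore;
  # from llama_index.core.graph_stores.types import PropertyGraphStore
  @classmethod
  def from_node(cls, node: NodeWithScore, store: PropertyGraphStore):
    # With include_text=False the text is "source_id -> relation -> target_id".
    triple_str = node.node.text.split(" -> ")
    source_id = triple_str[0].strip()
    relation = triple_str[1].strip()
    target_id = triple_str[2].strip()
    # from_global_id is an app-specific helper that splits "datatype:model_id".
    source_node_type, source_node_id = from_global_id(source_id)
    target_node_type, target_node_id = from_global_id(target_id)
    # Look the entities back up by the 'id' property stored on each one.
    source_node = store.get(properties={'id': source_id})[0]
    target_node = store.get(properties={'id': target_id})[0]
    debug_text = f'{source_node_type}:{json.dumps(source_node.properties)} -> {relation} -> {target_node_type}:{json.dumps(target_node.properties)}'
    return cls(source_node=source_node,
               target_node=target_node,
               text=debug_text)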

where the IDs in question are encoded datatype:model_id strings
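The thread does not show how `from_global_id` is implemented, so the following is only a guess at what a `datatype:model_id` codec could look like. `to_global_id` is a hypothetical companion added for illustration, and splitting on the first colon (so a model ID may itself contain colons) is an assumption.

```python
from typing import Tuple


def to_global_id(datatype: str, model_id: str) -> str:
    """Encode a (datatype, model_id) pair as a single 'datatype:model_id' string."""
    return f"{datatype}:{model_id}"


def from_global_id(global_id: str) -> Tuple[str, str]:
    """Split 'datatype:model_id' on the first colon only."""
    datatype, sep, model_id = global_id.partition(":")
    if not sep or not datatype or not model_id:
        raise ValueError(f"not a datatype:model_id string: {global_id!r}")
    return datatype, model_id
```

Because the triple text is split on " -> ", IDs encoded this way should not contain that sequence themselves.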

gavindoughtie

so, this works for now but maybe I can make a custom retriever that does this for me with the additional method you propose (or I can make you a ticket or a PR)
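A rough sketch of what such a wrapper might look like, written against duck-typed objects rather than the real classes so it stays self-contained. Everything named here (`TripleResolvingRetriever`, `RawTriple`, `retrieve_triples`) is hypothetical and not part of LlamaIndex. It assumes the wrapped retriever's `retrieve` returns results with `node.text` and `score`, that the text is a bare `source_id -> relation -> target_id` triple, and that the store's `get` accepts a `properties` filter, as in the snippet above.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class RawTriple:
    """Resolved endpoints of one retrieved triple, plus the retrieval score."""
    source: Any
    relation: str
    target: Any
    score: Optional[float] = None


class TripleResolvingRetriever:
    """Wraps a retriever and a graph store; adds a method returning raw triples."""

    def __init__(self, retriever: Any, store: Any) -> None:
        self._retriever = retriever
        self._store = store

    def retrieve(self, query: str) -> List[Any]:
        # Unchanged behaviour: pass straight through to the wrapped retriever.
        return self._retriever.retrieve(query)

    def retrieve_triples(self, query: str) -> List[RawTriple]:
        triples: List[RawTriple] = []
        for result in self._retriever.retrieve(query):
            parts = [part.strip() for part in result.node.text.split(" -> ")]
            if len(parts) != 3:
                continue  # not a bare triple, e.g. a merged text chunk
            source_id, relation, target_id = parts
            sources = self._store.get(properties={"id": source_id})
            targets = self._store.get(properties={"id": target_id})
            if not sources or not targets:
                continue  # endpoint missing from the store; skip instead of raising
            triples.append(
                RawTriple(sources[0], relation, targets[0], getattr(result, "score", None))
            )
        return triples
```

Keeping `retrieve` as a pass-through means the wrapper can still stand in wherever only text nodes are expected, while application code calls `retrieve_triples` to get at the entities.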
