Updated 9 months ago

Has anyone tried defining a custom `kg_schema_cls` for `SchemaLLMPathExtractor`?

At a glance

The post asks if anyone has tried defining a custom kg_schema_cls for SchemaLLMPathExtractor, or if there is an example that could be reviewed. In the comments, a community member explains that it is a complex task, and provides an example implementation of a custom schema using pydantic. The example includes definitions for Entity, Relation, and Triplet models, as well as a Triplets model that represents the knowledge graph schema. Another community member notes that the provided example is very close to the default implementation under the hood. The original poster thanks the community member and indicates they will review the example to understand it better.

Has anyone tried defining a custom kg_schema_cls for SchemaLLMPathExtractor? Or is there an example that we could go through?
5 comments
It's pretty complex lol

I left the option open, but didn't give an example because you really need to know what you are doing to do it (and I figured those people would just read the source code)
But basically, it's something like this:
from typing import Literal

from pydantic.v1 import BaseModel, Field, validator


class Entity(BaseModel):
    """An entity in a graph."""

    type: Literal["PERSON", "PLACE", "THING"] = Field(
        description="Entity in a knowledge graph. Only extract entities with types that are listed as valid: PERSON, PLACE, or THING."
    )
    # Assumption: the extractor reads a name for each entity, as the default schema does.
    name: str


class Relation(BaseModel):
    """A relation connecting two entities in a graph."""

    type: Literal["HAS", "PART_OF"] = Field(
        description="Relation in a knowledge graph. Only extract relations with types that are listed as valid: HAS, PART_OF."
    )


class Triplet(BaseModel):
    """A triplet of two entities and a relation."""

    subject: Entity
    relation: Relation
    object: Entity


class Triplets(BaseModel):
    """Knowledge Graph Schema."""

    triplets: list[Triplet]

    @validator("triplets", pre=True)
    def validate_triplets(cls, v, values):
        passing_triplets = []
        for i, triplet in enumerate(v):
            try:
                # cleanup: normalize type labels, e.g. "part of" -> "PART_OF"
                for key in triplet:
                    triplet[key]["type"] = triplet[key]["type"].replace(" ", "_")
                    triplet[key]["type"] = triplet[key]["type"].upper()

                # validate the whole triplet once, skip it if invalid
                _ = Triplet(**triplet)
                passing_triplets.append(v[i])
            except (KeyError, ValueError):
                continue

        # return only after every triplet has been checked
        return passing_triplets


kg_schema_cls = Triplets
That is very nearly what the default is under the hood
Thanks @Logan M, will go through this and try to wrap my head around it
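
For anyone who wants to see what the `triplets` validator in the example above does before wiring it into an extractor, here is a minimal stdlib-only sketch of the same cleanup-and-filter idea. It does not use pydantic or LlamaIndex. `normalize_type`, `filter_triplets`, and the two `VALID_*` sets are hypothetical names made up for this illustration, and the sample triplets are invented.

```python
from typing import Any, Dict, List

# Hypothetical stand-ins for the Literal types in the schema above.
VALID_ENTITY_TYPES = {"PERSON", "PLACE", "THING"}
VALID_RELATION_TYPES = {"HAS", "PART_OF"}


def normalize_type(label: str) -> str:
    """Turn a loose label such as 'part of' into 'PART_OF'."""
    return label.replace(" ", "_").upper()


def filter_triplets(raw_triplets: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Normalize type labels and keep only triplets that fit the schema."""
    passing = []
    for triplet in raw_triplets:
        try:
            subject_type = normalize_type(triplet["subject"]["type"])
            relation_type = normalize_type(triplet["relation"]["type"])
            object_type = normalize_type(triplet["object"]["type"])
        except (KeyError, TypeError, AttributeError):
            # Malformed triplet (missing key or wrong shape): skip it.
            continue

        if (
            subject_type in VALID_ENTITY_TYPES
            and relation_type in VALID_RELATION_TYPES
            and object_type in VALID_ENTITY_TYPES
        ):
            # Build a normalized copy rather than mutating the input.
            passing.append(
                {
                    "subject": {**triplet["subject"], "type": subject_type},
                    "relation": {**triplet["relation"], "type": relation_type},
                    "object": {**triplet["object"], "type": object_type},
                }
            )
    return passing


# Invented sample input, shaped like raw LLM output might be.
raw = [
    {
        "subject": {"type": "person", "name": "Ada"},
        "relation": {"type": "part of"},
        "object": {"type": "place", "name": "London"},
    },
    {
        # "ANIMAL" is not an allowed entity type, so this one is dropped.
        "subject": {"type": "animal", "name": "Rex"},
        "relation": {"type": "has"},
        "object": {"type": "thing", "name": "bone"},
    },
    {"subject": {"type": "person", "name": "Bob"}},  # missing keys, dropped
]

kept = filter_triplets(raw)
print(kept)
```

Returning normalized copies is a choice made for this sketch. The pydantic validator above edits the incoming dicts in place and relies on `Triplet(**triplet)` to do the actual type checking.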