Hi all, I have been playing around with the GPTKnowledgeGraphIndex to extract triples from unstructured text, and I was wondering whether it is possible to rely on a user-defined knowledge base rather than performing the task in a completely open fashion. Say you have a knowledge graph with defined entities and a closed set of relation types: can you extract knowledge triples that stick to this prior knowledge base, i.e. that involve only the original entities and relation types?
Thanks! Indeed, that is what I am doing right now to manually build the knowledge base, but my question is: once you have manually built your knowledge base, can you use it as the foundation for extracting the knowledge triples contained in new sentences? Put differently: given a sentence and a knowledge base, find the subject and object of the sentence, map them to your set of knowledge base entities, and do the same with the relation. In short, map the relation expressed by the sentence to a triple of the knowledge base.
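To make the mapping step concrete, here is a minimal sketch that assumes an open (subject, relation, object) triple has already been extracted from the sentence. It only shows the normalization against a closed vocabulary. The toy knowledge base, the `link` and `normalize_triple` helpers, and the 0.6 similarity cutoff are all hypothetical, and plain string similarity from the standard library stands in for a real entity linker.

```python
from difflib import get_close_matches

# Hypothetical toy knowledge base: closed entity and relation vocabularies.
KB_ENTITIES = [
    "Marie Curie",
    "Pierre Curie",
    "University of Paris",
    "Nobel Prize in Physics",
]
KB_RELATIONS = ["married_to", "worked_at", "awarded"]


def link(mention, vocabulary, cutoff=0.6):
    """Return the vocabulary item closest to `mention`, or None if none is close enough."""
    lowered = {item.lower(): item for item in vocabulary}
    hits = get_close_matches(mention.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[hits[0]] if hits else None


def normalize_triple(triple):
    """Map an open (subject, relation, object) triple onto the KB, or return None."""
    subj, rel, obj = triple
    mapped = (
        link(subj, KB_ENTITIES),
        link(rel.replace(" ", "_"), KB_RELATIONS),
        link(obj, KB_ENTITIES),
    )
    # Reject the triple unless all three parts landed on a known KB item.
    return mapped if all(mapped) else None


print(normalize_triple(("marie curie", "worked at", "university of paris")))
```

Triples whose subject, relation, or object cannot be linked are dropped rather than forced onto the nearest item, which keeps the output inside the closed vocabulary at the cost of some recall.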
Ohhhh... hmm, that's not currently supported. But you could customize the internal prompt that's used to extract triplets to reflect the kind of data you want to extract?
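As a rough illustration of what such a customized prompt could look like, the sketch below builds the prompt text with a closed list of relation types. The wording of the template, the relation names, and the `build_extraction_prompt` helper are assumptions made for this example, not the library's internal prompt, and how the text is handed to the index depends on the installed version, so that part is deliberately left out.

```python
# Hypothetical closed set of relation types from the user's knowledge base.
ALLOWED_RELATIONS = ["married_to", "worked_at", "awarded"]

# Assumed template wording; not the prompt the library ships with.
TEMPLATE = (
    "Extract up to {max_triplets} knowledge triplets from the text below, "
    "in the form (subject, relation, object).\n"
    "Use only these relation types: {relations}.\n"
    "Skip any fact that does not fit one of them.\n"
    "---------------------\n"
    "Text: {text}\n"
    "Triplets:\n"
)


def build_extraction_prompt(text, relations=ALLOWED_RELATIONS, max_triplets=5):
    """Fill the template with the input text and the closed relation vocabulary."""
    return TEMPLATE.format(
        max_triplets=max_triplets,
        relations=", ".join(relations),
        text=text,
    )


print(build_extraction_prompt("Marie Curie taught at the University of Paris."))
```

A closed set of relation types is usually small enough to list in full; the entity list is the part that tends not to fit.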
Yeah, I was thinking along those lines too; for example, placing the triples that make up your knowledge base in the prompt might work. The problem is that knowledge bases usually have thousands of nodes and triples, to say the least, which makes it hard to fit them in the prompt...
I was thinking: what if I created an index where each node is a different knowledge triple of my knowledge base (or the collection of the triples involving a particular entity) and then, given a sentence, I ask the model to first retrieve the top k relevant nodes (i.e. top k relevant triples) and then include them in the prompt passed to the KnowledgeGraphIndex for knowledge triplet extraction?
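The retrieve-then-prompt idea can be sketched without any particular library. The version below is a toy under stated assumptions: the knowledge base, the helper names, and the prompt wording are hypothetical, and simple word overlap stands in for the embedding-based retrieval a real index would perform.

```python
import re
from collections import Counter

# Hypothetical toy knowledge base; a real one would have thousands of triples.
KB_TRIPLES = [
    ("Marie Curie", "married_to", "Pierre Curie"),
    ("Marie Curie", "worked_at", "University of Paris"),
    ("Marie Curie", "awarded", "Nobel Prize in Physics"),
    ("Alan Turing", "worked_at", "Bletchley Park"),
]


def tokens(text):
    """Lowercased word counts; underscores in relation names act as separators."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve_top_k(sentence, triples, k=2):
    """Rank triples by word overlap with the sentence and keep the best k."""
    query = tokens(sentence)
    scored = []
    for triple in triples:
        overlap = sum((query & tokens(" ".join(triple))).values())
        scored.append((overlap, triple))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Triples sharing no word with the sentence are never returned.
    return [triple for overlap, triple in scored[:k] if overlap > 0]


def build_prompt(sentence, triples, k=2):
    """Put only the retrieved triples into the extraction prompt."""
    context = "\n".join(
        f"({s}, {r}, {o})" for s, r, o in retrieve_top_k(sentence, triples, k)
    )
    return (
        "Known triples from the knowledge base:\n"
        f"{context}\n"
        "Using only entities and relation types that appear above, "
        "extract the triples expressed by this sentence:\n"
        f"{sentence}\n"
    )


print(build_prompt("Turing did his wartime work at Bletchley Park.", KB_TRIPLES))
```

Because only the top k triples reach the prompt, its size stays bounded no matter how large the knowledge base grows; the trade-off is that a poor retrieval step hides the entities and relations the extraction would have needed.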