Hi all I am playing a little bit around

BrunoLiegiBastonLiegi

Hi all, I am playing around a bit with GPTKnowledgeGraphIndex to extract triples from unstructured text. However, I was wondering whether it is possible to rely on a user-defined knowledge base rather than performing the task in a completely open fashion. Say you have a knowledge graph with defined entities and a closed set of relation types: can you extract knowledge triples that stick to this prior knowledge base, i.e. that involve only the original entities and relation types?

Logan M

Definitely you can use an existing set of triplets!

You just need to map triplets to a specific text node that they came from.

See the bottom of this notebook:
https://github.com/jerryjliu/llama_index/blob/main/examples/knowledge_graph/KnowledgeGraphDemo.ipynb

BrunoLiegiBastonLiegi

Thanks! Indeed, that is what I am doing right now to manually build the knowledge base, but my question is: once you have manually built your knowledge base, can you use it as a foundation for extracting the knowledge triples contained in new sentences? If you wish, this can be paraphrased as: given a sentence and a knowledge base, find the subject and object of the sentence, map them to your set of knowledge base entities, and do the same with the relation. In summary, map the relation expressed by the sentence to a triple of the knowledge base.
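To make the mapping step concrete, here is a minimal sketch of what it could look like if done as post-processing on an already extracted triple. It uses plain fuzzy string matching from the standard library as a stand-in for a real entity linker; the entity and relation names, `link_to_kb`, and `normalize_triple` are all hypothetical and not part of llama_index.

```python
import difflib

# Hypothetical closed knowledge base: known entities and relation types.
KB_ENTITIES = ["Albert Einstein", "Princeton University", "Ulm"]
KB_RELATIONS = ["born_in", "worked_at"]


def link_to_kb(mention, candidates, cutoff=0.6):
    """Return the closest KB name for a surface mention, or None if nothing is close enough."""
    lowered = {c.lower(): c for c in candidates}
    matches = difflib.get_close_matches(
        mention.lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[matches[0]] if matches else None


def normalize_triple(triple):
    """Map an open (subject, relation, object) triple onto the KB, or drop it."""
    subj, rel, obj = triple
    linked = (
        link_to_kb(subj, KB_ENTITIES),
        link_to_kb(rel.replace(" ", "_"), KB_RELATIONS),
        link_to_kb(obj, KB_ENTITIES),
    )
    # Keep the triple only if every element was matched to the KB.
    return linked if all(linked) else None
```

A triple whose subject, relation, or object cannot be matched is discarded rather than added as a new entity, which is what keeps the extraction closed.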

Logan M

Ohhhh... hmm, that's not currently supported. But you could customize the internal prompt that's used to extract triplets to reflect the kind of data you want to extract? 🤔
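As a rough illustration of what such a customized prompt might look like, the sketch below builds an extraction prompt that lists a closed set of relation types. The relation names and `build_constrained_prompt` are hypothetical, the wording is not the library's built-in template, and how the resulting prompt is handed to the index depends on the llama_index version, so that part is left out.

```python
# Hypothetical closed set of relation types taken from the knowledge base.
ALLOWED_RELATIONS = ["born_in", "worked_at", "located_in"]


def build_constrained_prompt(text, relations, max_triplets=5):
    """Build a triplet-extraction prompt restricted to the given relation types."""
    relation_list = "\n".join(f"- {r}" for r in relations)
    return (
        f"Extract up to {max_triplets} knowledge triplets from the text below, "
        "formatted as (subject, relation, object).\n"
        "Use ONLY the following relation types; skip any fact that does not fit:\n"
        f"{relation_list}\n"
        "---------------------\n"
        f"Text: {text}\n"
        "Triplets:\n"
    )


prompt = build_constrained_prompt("Einstein was born in Ulm.", ALLOWED_RELATIONS)
```

Listing only the relation types keeps the prompt short, but it does nothing to constrain the entities.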

BrunoLiegiBastonLiegi

Yeah, I was thinking about something along those lines too. For example, if you place the triples composing your knowledge base in the prompt, it might work. The problem is that knowledge bases usually have thousands of nodes and triples, to say the least, which makes it hard to fit them in the prompt...

Logan M

Yeah, you need a set of rules more than a set of examples 🤔

BrunoLiegiBastonLiegi

I was thinking: what if I create an index where each node is a different knowledge triple of my knowledge base (or the collection of triples involving a particular entity)? Then, given a sentence, I first retrieve the top k relevant nodes (i.e. the top k relevant triples) and include them in the prompt passed to the KnowledgeGraphIndex for knowledge triplet extraction.
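A minimal sketch of that retrieve-then-prompt idea, assuming a simple bag-of-words cosine similarity in place of a real embedding-based index. The toy triples and every function name here are hypothetical, and the final prompt would still have to be wired into the index's extraction step.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercased bag of words; underscores in relation names act as separators."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_k_triples(sentence, kb_triples, k=2):
    """Return the k KB triples most similar to the sentence (zero scores dropped)."""
    query = tokens(sentence)
    scored = [(cosine(query, tokens(" ".join(t))), t) for t in kb_triples]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [t for score, t in scored[:k] if score > 0]


def prompt_with_examples(sentence, kb_triples, k=2):
    """Build an extraction prompt that shows only the retrieved KB triples."""
    examples = "\n".join(f"({s}, {r}, {o})" for s, r, o in top_k_triples(sentence, kb_triples, k))
    return (
        "Known triplets from the knowledge base:\n"
        f"{examples}\n"
        "Reuse these entities and relation types where possible.\n"
        f"Text: {sentence}\n"
        "Triplets:\n"
    )


# Hypothetical toy knowledge base.
KB = [
    ("Albert Einstein", "born_in", "Ulm"),
    ("Marie Curie", "born_in", "Warsaw"),
    ("Ulm", "located_in", "Germany"),
]
SENTENCE = "Einstein grew up in Ulm before moving away."
```

Only the k retrieved triples end up in the prompt, so its size stays bounded no matter how large the knowledge base is.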

Logan M

That's not a bad idea 👀 could work! It would make for better examples than what the current template has.
