The knowledge graph building in Llama-Index is pretty interesting.
Currently, when you build a knowledge graph index, it asks the LLM to extract triplets of the form `(subject, relationship, object)` from each text chunk. There is also the option to customize the prompt that extracts the triplets.
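As a rough sketch of the idea (not the library's actual internals), you can picture the extracted triplets as tuples grouped by subject, which is what makes the keyword lookup at query time cheap:

```python
# Toy illustration of a (subject, relationship, object) triplet store.
# This is a sketch of the concept, not LlamaIndex's real data structures.
from collections import defaultdict

def add_triplets(store, triplets):
    """Group each extracted triplet under its subject for later lookup."""
    for subj, rel, obj in triplets:
        store[subj].append((subj, rel, obj))
    return store

store = add_triplets(defaultdict(list), [
    ("Llama-Index", "is a", "data framework"),
    ("Llama-Index", "supports", "knowledge graphs"),
])
```

In the real index, the triplets come back from the LLM per chunk; here they are hard-coded to keep the example self-contained.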
Then, at query time, the library extracts keywords from the query and uses them to find triplets whose subject matches a keyword. Setting `include_text=True` or `False` controls whether the text chunk where each triplet was found is also included to help answer the query.
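The query-time step can be sketched like this, with `retrieve`, `store`, and `source_text` as illustrative names of my own, not the library's API; the point is just how keyword-to-subject matching and the `include_text` switch interact:

```python
# Sketch of query-time retrieval: match query keywords against triplet
# subjects; include_text controls whether the source chunk comes along.
def retrieve(store, source_text, keywords, include_text=True):
    results = []
    for kw in keywords:
        # Only triplets whose subject equals the keyword are returned.
        for triplet in store.get(kw, []):
            chunk = source_text.get(triplet) if include_text else None
            results.append((triplet, chunk))
    return results

store = {"Paris": [("Paris", "is the capital of", "France")]}
source_text = {
    ("Paris", "is the capital of", "France"): "Paris is the capital of France.",
}
hits = retrieve(store, source_text, ["Paris"], include_text=True)
```

With `include_text=False` the same call returns the triplet with `None` in place of the chunk, mirroring the behavior described above.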
Alternatively, you can augment this process with embeddings by setting `include_embeddings=True` when constructing the index.
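The embedding-augmented path boils down to similarity search over stored triplet embeddings instead of exact keyword matching. A minimal sketch with toy two-dimensional vectors (the function names and vectors are mine, for illustration only):

```python
# Sketch of embedding-based triplet retrieval: score each stored triplet
# embedding against the query embedding and keep the top-k matches.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_emb, triplet_embs, k=1):
    """Return the k triplets whose embeddings are closest to the query."""
    scored = sorted(triplet_embs.items(),
                    key=lambda item: cosine(query_emb, item[1]),
                    reverse=True)
    return [triplet for triplet, _ in scored[:k]]

triplet_embs = {
    ("Paris", "capital of", "France"): [1.0, 0.0],
    ("Tokyo", "capital of", "Japan"): [0.0, 1.0],
}
best = top_k([1.0, 0.0], triplet_embs, k=1)
```

This is why the embedding option helps: a query phrased without the exact subject string can still land on the right triplet, as long as the embeddings are close.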
The example notebook covers this all pretty well:
https://github.com/jerryjliu/gpt_index/blob/main/examples/knowledge_graph/KnowledgeGraphDemo.ipynb