A community member is having trouble loading RDF data into LlamaIndex. They have tried GPTVectorStoreIndex, which runs quickly but doesn't give the best results, and KnowledgeGraphIndex, which is slow and resource-intensive, and whose resulting knowledge graph appears to include only some of the entities.
Another community member suggests the RDF Loader, which is the recommended way to load a pre-made knowledge graph into LlamaIndex; KnowledgeGraphIndex, by contrast, is the recommended way to create a knowledge graph from unstructured data.
The community members discuss a notebook that combines a knowledge graph with a vector index. The original poster clarifies that they are trying to use structured data (RDF) to build a knowledge graph and then use it as context for LlamaIndex, and they believe the correct approach is to use the RDFReader and skip the KGIndex functionality entirely.
Another community member agrees this may be the best approach, and the original poster shares a blog post they wrote about their experience. A final community member suggests that the poor performance of the initial index approach may stem from the default CSV reader not being great, and encourages the original poster to contribute any ideas for improving the RDFReader.
Anyone have experience loading RDF data into LlamaIndex? I can use GPTVectorStoreIndex to index the raw RDF (or TTL or JSON-LD) file, and that works quickly but doesn't give the best results. KnowledgeGraphIndex takes super long and a lot of resources, and I don't know that the results are much better. Also, when I visualize the knowledge graph that KnowledgeGraphIndex creates, it seems to have picked up only some of the entities in the KG as nodes. Any help/guidance would be greatly appreciated.
Am I right in assuming that this is the recommended way to load a pre-made knowledge graph into LlamaIndex, and that KnowledgeGraphIndex is the recommended way of CREATING a KG with LlamaIndex from unstructured data?
It looks like that notebook is focused on building a KG from unstructured data, is that correct? I am trying to use structured data to create a KG (using RDFLib/Python, not LlamaIndex) and then use that RDF data as context for LlamaIndex. I believe the correct approach is then to just use the RDFReader and ignore the KGIndex functionality in LlamaIndex entirely, is that correct?
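The RDFReader approach described above essentially flattens each triple in the graph into a plain-text statement that a vector index can consume as a document. Here's a minimal, stdlib-only sketch of that idea, assuming simple N-Triples input; it is an illustration of the flattening step, not the actual RDFReader implementation (which uses rdflib and resolves labels):

```python
def triples_to_statements(ntriples: str) -> list[str]:
    """Turn simple N-Triples lines into 'subject predicate object' sentences
    suitable for feeding to a text index as documents."""

    def clean(term: str) -> str:
        # Drop angle brackets / quotes and keep the last URI segment
        # so the statement reads like natural language.
        return term.strip("<>").strip('"').rsplit("/", 1)[-1]

    statements = []
    for line in ntriples.strip().splitlines():
        line = line.strip().rstrip(".").strip()
        if not line or line.startswith("#"):
            continue
        subj, pred, obj = line.split(None, 2)
        statements.append(f"{clean(subj)} {clean(pred)} {clean(obj)}")
    return statements


data = """
<http://example.org/Alice> <http://example.org/knows> <http://example.org/Bob> .
<http://example.org/Bob> <http://example.org/age> "42" .
"""
print(triples_to_statements(data))
# -> ['Alice knows Bob', 'Bob age 42']
```

The resulting strings could then be wrapped as LlamaIndex `Document` objects and indexed with a vector index, sidestepping the expensive LLM-driven triplet extraction that KnowledgeGraphIndex performs.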
I think one reason for the poor performance of the initial index approach is that our default CSV reader is not great. I'm pretty sure it just splits each row into a document/node.
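The row-per-document behavior described above can be sketched with the standard library; this is a simplified stand-in for what a naive CSV reader does, not the actual LlamaIndex implementation. The point is that each row becomes an isolated blob with no cross-row context:

```python
import csv
import io


def naive_csv_to_documents(text: str) -> list[str]:
    """Split a CSV into one text 'document' per row, the way a naive
    reader would -- losing any relationships that span rows."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    # Each row becomes a standalone "column: value" string.
    return [", ".join(f"{h}: {v}" for h, v in zip(header, row)) for row in body]


sample = "name,knows\nAlice,Bob\nBob,Carol\n"
print(naive_csv_to_documents(sample))
# -> ['name: Alice, knows: Bob', 'name: Bob, knows: Carol']
```

With documents this fragmented, retrieval can only ever surface one row at a time, which would explain why the initial index approach gave weak answers on relational data.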