The post inquires about any insights being incorporated into the PropertyGraphIndex now that the GraphRAG source is available. The community members discuss the GraphRAG codebase, noting that it is "extremely rough" and used 180k tokens to answer one question, suggesting it may not be efficient. They also mention that GraphRAG lacks some interesting features like entity resolution and does not provide much detail on its ranking approach. One community member had to implement their own app-specific entity resolution. Another community member points out that GraphRAG's relevance mechanism involves listing the top 5 most relevant record IDs and adding "+more" to indicate there are more. Overall, the community members seem to view GraphRAG as not yet useful or production-ready.
I see this in their system prompt: "Do not list more than 5 record ids in a single reference. Instead, list the top 5 most relevant record ids and add '+more' to indicate that there are more." So... THAT's the relevance mechanism.
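For anyone wondering what that instruction amounts to, here is a minimal sketch of the same truncation done in code rather than by the LLM. It assumes you already have a relevance score per record id, which GraphRAG itself does not give you (the prompt leaves the ranking to the model). The function name `format_reference`, the scores, and the `[Data: ...]` output shape are all illustrative assumptions, not GraphRAG's API.

```python
from typing import Mapping


def format_reference(dataset: str, relevance: Mapping[int, float], limit: int = 5) -> str:
    """Build a citation that lists at most `limit` record ids.

    Hypothetical helper: `relevance` maps record id -> score, higher is better.
    """
    # Sort ids by descending score; break ties by id so output is deterministic.
    ranked = sorted(relevance, key=lambda rid: (-relevance[rid], rid))
    shown = [str(rid) for rid in ranked[:limit]]
    # Mirror the prompt's rule: flag that ids were dropped instead of listing them.
    if len(ranked) > limit:
        shown.append("+more")
    joined = ", ".join(shown)
    return f"[Data: {dataset} ({joined})]"


if __name__ == "__main__":
    scores = {2: 0.9, 7: 0.8, 64: 0.7, 46: 0.6, 34: 0.5, 11: 0.1}
    print(format_reference("Reports", scores))
```

The point of the comparison: in code the "top 5" is backed by an explicit score, whereas in the prompt it is whatever the model decides is most relevant.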
I think that high token count applies to the index creation step, the first time you ingest a RAG doc set. Once that is set up you get something like 10 parquet files, and from there it's fairly DIY how you use the graph as metadata, so the token count is up to you and how you set up the YAML config or edit the prompts to fit your data. Entities (raw and cleaned), nodes, and relationships each have their own parquet output. I think this is super useful, especially if it becomes possible to shift to a local model at some point. Hierarchy like GraphRAG or RAPTOR or similar is important metadata for hybrid search (the graph can also be part of ranking, etc.), so please do keep this possibility open for LlamaIndex.
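To make the "graph as part of ranking" idea a bit more concrete, here is a rough sketch under some assumptions: you already have a vector similarity score per chunk, a mapping from chunk to the entities extracted from it, and a list of relationship pairs (for example loaded from the relationship output). Everything here, including `hybrid_rank`, the one-hop expansion, and the `alpha` weight, is a hypothetical illustration and not a LlamaIndex or GraphRAG API.

```python
from typing import Dict, Iterable, List, Set, Tuple


def hybrid_rank(
    vector_scores: Dict[str, float],
    chunk_entities: Dict[str, Set[str]],
    query_entities: Set[str],
    relationships: Iterable[Tuple[str, str]],
    alpha: float = 0.7,
) -> List[Tuple[str, float]]:
    """Blend vector similarity with a simple graph-overlap score."""
    # Build an undirected adjacency map from (source, target) pairs.
    neighbors: Dict[str, Set[str]] = {}
    for source, target in relationships:
        neighbors.setdefault(source, set()).add(target)
        neighbors.setdefault(target, set()).add(source)

    # Expand the query entities by one hop in the graph.
    expanded = set(query_entities)
    for entity in query_entities:
        expanded |= neighbors.get(entity, set())

    ranked = []
    for chunk_id, vector_score in vector_scores.items():
        entities = chunk_entities.get(chunk_id, set())
        # Fraction of the chunk's entities that are on or next to the query.
        graph_score = len(entities & expanded) / len(entities) if entities else 0.0
        ranked.append((chunk_id, alpha * vector_score + (1 - alpha) * graph_score))

    # Highest combined score first; chunk id breaks ties.
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked
```

With `alpha=1.0` this degrades to plain vector search, so the graph signal is something you can dial in gradually rather than an all-or-nothing switch.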