
Updated 4 months ago

Hello,
I am trying to implement GraphRAG using my own knowledge graph in Neo4j. I have used Neo4jPropertyGraphStore and then the PropertyGraphIndex object with the from_existing() function to work with my own KG.

I noticed that I need to set all my nodes as __Entity__ and __Node__ for it to work correctly. However, I'm encountering the following error:

Plain Text
ValidationError: 1 validation error for EntityNode
name
  Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.8/v/string_type


I checked the provided URL, but I couldn't find any useful information to resolve this issue. Could you please provide guidance or help me understand what might be causing this error?

Thank you!
20 comments
The name from_existing might be misleading: it means an existing graph created by llama-index 😅 It requires entities and relations to have a certain structure.
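The validation error in the question says an EntityNode was built with name=None, so one plausible reading is that at least one node in the graph has no string `name` property. Here is a minimal sketch for spotting such nodes before calling from_existing. It works on plain dicts rather than a live database; the record shape, the `find_invalid_nodes` helper, and the sample data are all hypothetical, and whether `name` and the two labels are the only requirements is an assumption.

```python
from typing import Any, Dict, List

# Labels mentioned in the question; treating them as required is an assumption.
REQUIRED_LABELS = {"__Entity__", "__Node__"}


def find_invalid_nodes(records: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return records that lack a string `name` or one of the required labels."""
    invalid = []
    for record in records:
        name = record.get("properties", {}).get("name")
        labels = set(record.get("labels", []))
        if not isinstance(name, str) or not REQUIRED_LABELS <= labels:
            invalid.append(record)
    return invalid


# Hypothetical rows, shaped like the result of a query such as:
# MATCH (n) RETURN elementId(n) AS id, labels(n) AS labels, properties(n) AS properties
sample_records = [
    {"id": "1", "labels": ["__Entity__", "__Node__", "Person"], "properties": {"name": "Ada"}},
    {"id": "2", "labels": ["__Entity__", "__Node__", "Paper"], "properties": {"title": "On GraphRAG"}},
    {"id": "3", "labels": ["Person"], "properties": {"name": "Grace"}},
]

bad_ids = [record["id"] for record in find_invalid_nodes(sample_records)]
print(bad_ids)
```

Node 2 is flagged because it only has a `title`, and node 3 because it is missing the labels. If a check like this turns up nodes in your graph, copying another property into `name` might be enough to get past the validation error, but I haven't confirmed that.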
Wdym? So if I have a neo4j graph store I still need to use from_documents? Can't I use from_existing and then manually insert the documents using index.insert?
@Arthur if the graph was created with the property graph index already, go ahead and use from_existing as is 👍

You technically can use from_existing and then insert() as well, but querying will only query the stuff you inserted I think (unless you are using some of the cypher retrievers)
Can I choose which community my nodes are inserted into? For example, separate communities: one for one kind of document, another for a different kind, etc.
What I found weird about this nomenclature is that from_documents seems to create a temporary index, and after it was created I cannot add more docs.

With from_existing, it seems I connect to a persistent index and then use the insert method to add docs to it.
from_documents does not create a temporary index. You can still call insert() after from_documents or from_existing

The intention is that from_documents creates a new/fresh index (or at least, that's the intended usage)

from_existing is for loading a graph you created
this is the exact same as VectorStoreIndex.from_documents() vs. VectorStoreIndex.from_vector_store() if you've used that class before
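To make the distinction concrete, here is a toy model of the two constructors. This is not LlamaIndex code: `ToyStore` and `ToyIndex` are hypothetical stand-ins, and the sketch only mirrors the case described above where the store was filled by the same index class. It shows that both constructors end up with an index over the same store, and that insert works after either one.

```python
from typing import Dict, List


class ToyStore:
    """Stand-in for a persistent store such as a database."""

    def __init__(self) -> None:
        self.docs: Dict[str, str] = {}


class ToyIndex:
    """Toy index over a ToyStore; NOT the LlamaIndex implementation."""

    def __init__(self, store: ToyStore) -> None:
        self.store = store

    @classmethod
    def from_documents(cls, docs: Dict[str, str], store: ToyStore) -> "ToyIndex":
        # Build a fresh index by ingesting an initial batch of documents.
        index = cls(store)
        for doc_id, text in docs.items():
            index.insert(doc_id, text)
        return index

    @classmethod
    def from_existing(cls, store: ToyStore) -> "ToyIndex":
        # Connect to data that is already in the store; nothing is ingested here.
        return cls(store)

    def insert(self, doc_id: str, text: str) -> None:
        self.store.docs[doc_id] = text

    def search(self, term: str) -> List[str]:
        return sorted(doc_id for doc_id, text in self.store.docs.items() if term in text)


store = ToyStore()
first = ToyIndex.from_documents({"a": "graphs and nodes"}, store)
first.insert("b", "more nodes")          # insert still works after from_documents

second = ToyIndex.from_existing(store)   # later, e.g. in another process
second.insert("c", "edges")              # insert also works after from_existing
print(second.search("nodes"))
```

In the toy, the only difference between the two constructors is whether an initial batch gets ingested. How closely the real classes follow this is something to check against the docs.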
I see. So if I build an endpoint that lets users insert data into my existing graph store by passing the entities, relationships and validation schema in the request, I can use either from_documents or from_existing? The user will submit the file too.
Yea I think so?
I think this is confusing lol
Doesn't seem to be persistent by the naming conventions
I think it's fine? I'm not really sure where the confusion is

from_documents() is for creating a new index

from_existing() is for loading/connecting to something existing
Seems straightforward?
Most examples in the documentation for Langchain and LlamaIndex tend to demonstrate implementations using methods like from_documents, jumping straight into the retrieval phase. This gives the impression that ingestion and retrieval occur together throughout the application's lifecycle, which is often not the case in production environments. Typically, there are distinct phases for ingestion and retrieval. The from_documents method seems to assume that the location of the documents is already known (e.g., in a folder) and, based on its name, doesn't appear to support adding more documents after the initial ingestion. Additionally, if I were to set up a separate ingestion endpoint and use from_documents to load documents into a property graph, it seems like this would create a new property graph index each time I consume the endpoint.
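One way to keep ingestion and retrieval separate without rebuilding the index on every request is to create the index handle once per process and reuse it from both endpoints. A rough sketch of that pattern follows, assuming a from_existing-style connection would go inside `get_index`. `FakeIndex`, `get_index`, `ingest_endpoint` and `retrieve_endpoint` are hypothetical names for illustration, not part of any library.

```python
from functools import lru_cache
from typing import List


class FakeIndex:
    """Hypothetical stand-in for an index connected to a persistent store."""

    connections = 0  # counts how many times an index handle was created

    def __init__(self) -> None:
        FakeIndex.connections += 1
        self.texts: List[str] = []

    def insert(self, text: str) -> None:
        self.texts.append(text)


@lru_cache(maxsize=1)
def get_index() -> FakeIndex:
    # In a real service, a from_existing-style call would go here (assumption).
    return FakeIndex()


def ingest_endpoint(text: str) -> int:
    """Add one document through the shared handle; return the document count."""
    index = get_index()
    index.insert(text)
    return len(index.texts)


def retrieve_endpoint(term: str) -> List[str]:
    """Return the ingested texts that contain the term."""
    return [text for text in get_index().texts if term in text]


ingest_endpoint("alpha report")
ingest_endpoint("beta report")
print(FakeIndex.connections, retrieve_endpoint("beta"))
```

The cached `get_index` means the handle is built on first use and every later request reuses it, so calling the ingestion endpoint repeatedly does not create a new index each time.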
I think that is not persistent if you don't call the function storage_context.persist()
So, is there no way to converse through my custom graph using llamaindex?
Based on what you told me, I tried to simulate my graph using PropertyGraphIndex with SchemaLLMPathExtractor to create a knowledge graph from certain documents.
The graph has been created perfectly and seems somewhat meaningful, however, when performing a retrieval, it returns an empty list.

index_schema = PropertyGraphIndex.from_documents(
    documents,
    kg_extractors=[kg_extractor],
    property_graph_store=pg_store,
    vec_store=vec_store,
    show_progress=True
)

retriever = index_schema.as_retriever().retrieve(query)
And if I try to retrieve from the same KG after loading it with .from_existing(), I get the following error with an empty message:

Plain Text
AssertionError:
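An AssertionError with no message says very little on its own; the traceback is what shows which assert failed. Below is a small sketch for pulling out the last frame of the traceback. `failing_step` is a hypothetical stand-in for the call that raised (here it would be the retrieve call), and `describe_assertion` is a helper made up for this example.

```python
import traceback
from typing import Callable


def failing_step() -> None:
    # Stand-in for the library call that raised; the bare assert has no message.
    items = []
    assert items


def describe_assertion(func: Callable[[], None]) -> str:
    """Run func and report where a bare AssertionError was raised."""
    try:
        func()
    except AssertionError as exc:
        # The last traceback frame is the one containing the failing assert.
        frame = traceback.extract_tb(exc.__traceback__)[-1]
        return f"{frame.name}:{frame.lineno}: {frame.line}"
    return "no assertion raised"


report = describe_assertion(failing_step)
print(report)
```

Posting the function name and source line from that last frame (or the full traceback) usually makes it possible to tell which precondition the library was checking.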