Error

Has anyone faced this error while using the GraphRAG Colab notebook (https://docs.llamaindex.ai/en/stable/examples/property_graph/property_graph_neo4j/)?
[Attachments: image.png, image.png]
Oh, I just fixed this:
pip install -U llama-index-core
It's still hit or miss. Sometimes it works, and other times it throws the same error as below. But this time it got stuck at 59% after I restarted it following a hang at 22%. This is after running "pip install -U llama-index-core". Thanks.
[Attachment: Screenshot_2025-01-06_015820.png]
Did you restart the notebook? The latest versions of llama-index-core completely removed that assert, so you are still running old code.
It's working now, but I had to load the packages as below (removing llama-index from the notebook's pip command):
[Attachment: image.png]
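For anyone hitting the same thing, the install probably looks something like this; the exact package list is a guess based on the Neo4j property-graph example's imports, since the attachment isn't legible here:
Plain Text
pip install -U llama-index-core llama-index-llms-openai llama-index-embeddings-openai llama-index-graph-stores-neo4j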
A few points as I'm closing this thread: A) Why does it take such a long time to create the Neo4j graph? I tested on two PDF documents (300 + 40 pages) and it took 8 minutes. What's the recommended way to create a graph from a database of 2,000 scientific papers (35 million tokens)?
B) I am making this tool to be used offline at a customer site. Can I create the graph using a language model and embedding model from OpenAI, and then query the graph using a locally hosted language model and embedding model at the customer site?
It's a lot of LLM calls. 8 minutes sounds reasonable for data that size; it's making LLM calls for every node (which likely works out to every page). It can only run so much in parallel before you hit rate limits. There is a way to increase the concurrency, but tbh I wouldn't recommend touching it too much:
Python
# passed as an argument to PropertyGraphIndex.from_documents(...)
from llama_index.core.indices.property_graph import SchemaLLMPathExtractor
from llama_index.llms.openai import OpenAI

kg_extractors=[
    SchemaLLMPathExtractor(
        llm=OpenAI(model="gpt-3.5-turbo", temperature=0.0),
        num_workers=4,  # default is 4; raise for more concurrency
    )
],
You can change the LLM at any time, but for embeddings, you need to use the same model during creation/indexing and querying.
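A minimal sketch of that split; the model names and data path are placeholders, and Settings is used so the same embedding model applies at both index time and query time:
Python
# build time: strong hosted LLM for extraction, local embeddings throughout
from llama_index.core import PropertyGraphIndex, Settings, SimpleDirectoryReader
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama
from llama_index.llms.openai import OpenAI

embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
Settings.embed_model = embed_model  # same model for indexing and querying

documents = SimpleDirectoryReader("./data").load_data()
index = PropertyGraphIndex.from_documents(
    documents,
    llm=OpenAI(model="gpt-4o"),  # extraction LLM; swappable later
)

# at the customer site: swap in a local LLM, keep the same embed_model
query_engine = index.as_query_engine(llm=Ollama(model="llama3.1"))
print(query_engine.query("What does the corpus say about X?"))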
Thanks for the above info. Given the rate limits on all API-hosted models, I was thinking of using local language and embedding models. When I use them, it throws the following "ReadTimeout" error:
[Attachment: image.png]
Maybe you need a longer request_timeout? I'm sure the full stack trace shows which module is timing out; I just can't see it in the screenshot.
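If the local model is served through Ollama, for instance, the timeout lives on the client (the model name here is a placeholder):
Python
from llama_index.llms.ollama import Ollama

# give slow local extraction calls more headroom than the default timeout
llm = Ollama(model="llama3.1", request_timeout=300.0)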
I increased request_timeout by 200 and it does go through to 100% now, but it returns an empty string when I query the resulting graph index, and nothing prints in the retrieved nodes.
[Attachments: image.png, image.png]
Same empty response with PropertyGraphIndex.from_existing().
[Attachment: image.png]
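For reference, a minimal from_existing sketch; the connection details are placeholders, and embed_model must be the same model used at build time:
Python
from llama_index.core import PropertyGraphIndex
from llama_index.graph_stores.neo4j import Neo4jPropertyGraphStore

graph_store = Neo4jPropertyGraphStore(
    username="neo4j", password="password", url="bolt://localhost:7687"
)
index = PropertyGraphIndex.from_existing(
    property_graph_store=graph_store,
    embed_model=embed_model,  # same embedding model used when building
)
# quick check: an empty list here means no graph relations are being retrieved
print(index.as_retriever(include_text=False).retrieve("test query"))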
Apart from your next suggestions, how do I get the full stack trace to see which module is timing out?
To me this says no graph relations were retrieved (which is possible, especially with open-source models building the index; I suggest not using the schema extractor with small open-source models).
So just remove the schema extractor from the code below and replace it with just "llm=llm"?
[Attachment: image.png]
I think so, the default is a lot less demanding
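If I'm reading the defaults right, that amounts to the following sketch; with no kg_extractors given, llama-index falls back to SimpleLLMPathExtractor plus ImplicitPathExtractor:
Python
index = PropertyGraphIndex.from_documents(
    documents,
    llm=llm,  # defaults: SimpleLLMPathExtractor + ImplicitPathExtractor
    embed_model=embed_model,
)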
Okay, it looks like I have two ways to go from here: A) scrap the schema extractor so that small LLMs can create the graph index, though the answer quality at inference time won't be as good as the next option; or B) use a bigger LLM, ideally one of the best (OpenAI, Anthropic, etc.), to create the index, then switch back to small local LLMs for inference. No matter which option I pick, I have to use the same local embedding model for both index creation and inference, since A) I can't use an API embedding model during inference and B) both steps (index creation + inference) must use the same embedding model. Please correct me if I got anything wrong, thanks.
yea that about sums it up
Got it. One last thing as I'm learning LlamaIndex: how do I see the full stack trace, per your comment "I'm sure the full stack trace shows what module is timing out, I just can't see it in the screenshot"?
Your screenshot just cut off the bottom of the trace lol

Sometimes notebooks also truncate the middle of the traceback (very annoying; you'll see three dots somewhere in the middle). Usually there's a button at the bottom of the log saying "view full" or "view as scrollable element".
Oh, I see what you meant; I thought there was a piece of code I was missing to produce the full traceback. Thanks.
I changed the default OpenAI model to o1 and ran into the following error:
[Attachments: image.png, image.png]
o1 doesn't support function calling. Use the DynamicLLMPathExtractor (but also, using o1 for this is mega overkill).
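A sketch of swapping it in; DynamicLLMPathExtractor prompts for triplets as plain text rather than relying on function calling (the parameter values shown are illustrative):
Python
from llama_index.core.indices.property_graph import DynamicLLMPathExtractor

kg_extractor = DynamicLLMPathExtractor(
    llm=llm,  # works with models that lack function calling, e.g. o1
    max_triplets_per_chunk=20,
    num_workers=4,
)
index = PropertyGraphIndex.from_documents(
    documents,
    kg_extractors=[kg_extractor],
    embed_model=embed_model,
)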
Okay. I'm running into a new error; I guess I should contact the Neo4j devs?
Maybe? If I had to guess, some name or entity or relation got extracted as a blank string.
Hot take: I dislike graphs, and I don't think the effort and compute are worth it (in a majority of use-cases) 😁 The hype-sphere really over-promises and under-delivers here.

Anyways, I can probably fix this by filtering out blanks in the extractor source code.
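Something along these lines, as a sketch of the idea rather than the actual patch; drop_blank_triplets is a hypothetical helper applied to whatever (subject, relation, object) triplets an extractor parses out:
Python
# hypothetical post-filter: discard triplets where any name parsed as blank
def drop_blank_triplets(
    triplets: list[tuple[str, str, str]],
) -> list[tuple[str, str, str]]:
    return [
        (subj, rel, obj)
        for subj, rel, obj in triplets
        if subj.strip() and rel.strip() and obj.strip()
    ]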