
Updated 6 months ago

Hi everybody, I'm learning with this

At a glance

The community member is learning with a notebook and encountering errors in both Colab and Jupyter Lab. The errors include a RuntimeError related to asyncio, a LocalProtocolError related to an OpenAI API key, and a NameError related to the IndexNode class. The community member also encounters an ImportError when trying to use the FlatReader class from the llama_index.readers.file module.

The community members provide suggestions to resolve the issues, such as using nest_asyncio.apply(), checking the OpenAI API key, and importing the IndexNode class. They also suggest that the community member's environment might be corrupted and recommend starting with a fresh virtual environment or installing the llama-index-readers-file package.

The community member reports that the suggestions helped resolve the issues, and they express gratitude. They also suggest creating a section for "codes from ultra novices" to help other beginners learn from their experiences.

Useful resources
Hi everybody, I'm learning with this notebook, but it gives me the same error in Colab and JupyterLab. Any help?
The original doc: https://docs.llamaindex.ai/en/latest/examples/query_engine/sec_tables/tesla_10q_table.html
the problematic cell:
import os
import pickle

if not os.path.exists("2021_nodes.pkl"):
    raw_nodes_2021 = node_parser.get_nodes_from_documents(docs_2021)
    pickle.dump(raw_nodes_2021, open("2021_nodes.pkl", "wb"))
else:
    raw_nodes_2021 = pickle.load(open("2021_nodes.pkl", "rb"))


Also, I had to add the cell "!pip install unstructured" before:

from pydantic import BaseModel
from unstructured.partition.html import partition_html
import pandas as pd

pd.set_option("display.max_rows", None)
pd.set_option("display.max_columns", None)
pd.set_option("display.width", None)
pd.set_option("display.max_colwidth", None)

Otherwise it doesn't work.


Question: Which system do you use for your notebooks so that they work perfectly? I'm on Windows 10 for JupyterLab. I assumed there would be no problem in Colab, but the error persists there too.

Thank you in advance!!!
20 comments
what is the error?
Embeddings have been explicitly disabled. Using MockEmbedding.
31it [00:00, 18469.24it/s]
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-17-e580cebc1a5b> in <cell line: 4>()
3
4 if not os.path.exists("2021_nodes.pkl"):
----> 5 raw_nodes_2021 = node_parser.get_nodes_from_documents(docs_2021)
6 pickle.dump(raw_nodes_2021, open("2021_nodes.pkl", "wb"))
7 else:

4 frames
/usr/lib/python3.10/asyncio/runners.py in run(main, debug)
31 """
32 if events._get_running_loop() is not None:
---> 33 raise RuntimeError(
34 "asyncio.run() cannot be called from a running event loop")
35

RuntimeError: asyncio.run() cannot be called from a running event loop
import nest_asyncio
nest_asyncio.apply()


Put that at the top of your nb
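
For context: a notebook kernel already runs an event loop, and asyncio.run() refuses to start inside one. The sketch below reproduces the same message in a plain Python script using only the standard library (the names inner and outer are illustrative, not from the notebook). nest_asyncio.apply() patches asyncio so that this kind of nested call is allowed.

```python
import asyncio


async def inner():
    return "done"


async def outer():
    # We are inside a running event loop here, much like code in a notebook cell.
    coro = inner()
    try:
        asyncio.run(coro)  # nested call: rejected by asyncio
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return str(exc)
    return "no error"


# Run this as a plain script; the outer asyncio.run() starts the loop.
message = asyncio.run(outer())
print(message)
```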
ok, now a different error appears:

Embeddings have been explicitly disabled. Using MockEmbedding.


31it [00:00, 38377.63it/s]

0%| | 0/31 [00:00<?, ?it/s]

---------------------------------------------------------------------------

LocalProtocolError Traceback (most recent call last)

/usr/local/lib/python3.10/dist-packages/httpx/_transports/default.py in map_httpcore_exceptions()
68 try:
---> 69 yield
70 except Exception as exc:

56 frames

LocalProtocolError: Illegal header value b'Bearer '


The above exception was the direct cause of the following exception:

LocalProtocolError Traceback (most recent call last)

LocalProtocolError: Illegal header value b'Bearer '


The above exception was the direct cause of the following exception:

APIConnectionError Traceback (most recent call last)

/usr/local/lib/python3.10/dist-packages/openai/_base_client.py in _request(self, cast_to, options, stream, stream_cls, remaining_retries)
1489
1490 log.debug("Raising connection error")
-> 1491 raise APIConnectionError(request=request) from err
1492
1493 log.debug(

APIConnectionError: Connection error.
(at the same point)
this means you are missing an openai api key
LocalProtocolError: Illegal header value b'Bearer ' -- empty key
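
A quick way to confirm the empty-key diagnosis before running the cell is to look at what would end up in the Authorization header. This is only a sketch: OPENAI_API_KEY is the environment variable the openai client reads by default, while bearer_header and key_is_usable are hypothetical helpers written for illustration.

```python
import os


def bearer_header(api_key):
    """Build the Authorization header value in the usual 'Bearer <key>' form."""
    return f"Bearer {api_key}"


def key_is_usable(api_key):
    """Return True only for a non-empty, non-whitespace key."""
    return bool(api_key and api_key.strip())


key = os.environ.get("OPENAI_API_KEY", "")
if key_is_usable(key):
    print("OPENAI_API_KEY is set.")
else:
    # An empty key yields 'Bearer ' with a trailing space, as in the error above.
    print("OPENAI_API_KEY is empty or unset; header would be", repr(bearer_header(key)))
```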
Do you mean the OpenAI key? I'm using Azure OpenAI. I set it up and checked it with:
response = llm.complete("The sky is a beautiful blue and")
print(response)
and it works well.

But the code still returns the same error:

Embeddings have been explicitly disabled. Using MockEmbedding.


31it [00:00, 25539.86it/s]

0%| | 0/31 [00:00<?, ?it/s]

---------------------------------------------------------------------------

LocalProtocolError Traceback (most recent call last)

/usr/local/lib/python3.10/dist-packages/httpx/_transports/default.py in map_httpcore_exceptions()
68 try:
---> 69 yield
70 except Exception as exc:

56 frames

LocalProtocolError: Illegal header value b'Bearer '


The above exception was the direct cause of the following exception:

LocalProtocolError Traceback (most recent call last)

LocalProtocolError: Illegal header value b'Bearer '


The above exception was the direct cause of the following exception:

APIConnectionError Traceback (most recent call last)

/usr/local/lib/python3.10/dist-packages/openai/_base_client.py in _request(self, cast_to, options, stream, stream_cls, remaining_retries)
1489
1490 log.debug("Raising connection error")
-> 1491 raise APIConnectionError(request=request) from err
1492
1493 log.debug(

APIConnectionError: Connection error.
Try node_parser = UnstructuredElementNodeParser(llm=llm)
works! 🙂

now, the next error was for the cell:

example_index_node = [b for b in base_nodes_2021 if isinstance(b, IndexNode)][
    20
]

# Index Node
print(
    f"\n--------\n{example_index_node.get_content(metadata_mode='all')}\n--------\n"
)

# Index Node ID
print(f"\n--------\nIndex ID: {example_index_node.index_id}\n--------\n")

# Referenced Table
print(
    f"\n--------\n{node_mappings_2021[example_index_node.index_id].get_content()}\n--------\n"
)


it gives:
"NameError Traceback (most recent call last)

<ipython-input-81-bf54a5583d53> in <cell line: 1>()
----> 1 example_index_node = [b for b in base_nodes_2021 if isinstance(b, IndexNode)][
2 20
3 ]
4
5 # Index Node

<ipython-input-81-bf54a5583d53> in <listcomp>(.0)
----> 1 example_index_node = [b for b in base_nodes_2021 if isinstance(b, IndexNode)][
2 20
3 ]
4
5 # Index Node

NameError: name 'IndexNode' is not defined"

Any doc to study that?
from llama_index.core.schema import IndexNode -- just have to import it
solved, thank you! 🙂

Another question. I got this notebook working in Colab. In Jupyter on Windows 10, the cell:

"from llama_index.readers.file import FlatReader
from pathlib import Path

reader = FlatReader()
docs_2021 = reader.load_data(Path("tesla_2021_10k.htm"))
docs_2020 = reader.load_data(Path("tesla_2020_10k.htm"))"

returns:

"--------------------------------------------------------------
ImportError Traceback (most recent call last)
Cell In[13], line 1
----> 1 from llama_index.readers.file import FlatReader
2 from pathlib import Path
4 reader = FlatReader()

ImportError: cannot import name 'FlatReader' from 'llama_index.readers.file' (unknown location)"
unknown location makes me think that your env might be corrupted 😅 Might have to start with a fresh venv?
or pip install llama-index-readers-file
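
To check whether a name is actually importable in the current environment before or after reinstalling, a small standard-library probe can help. This is a sketch: has_name is a hypothetical helper, and the module and class names in the last lines are the ones from the failing cell.

```python
import importlib


def has_name(module_name, attr):
    """Return True if module_name imports cleanly and exposes attr."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Covers ModuleNotFoundError too (package missing or broken).
        return False
    return hasattr(module, attr)


# Sanity checks against the standard library.
print(has_name("pathlib", "Path"))        # a name that exists
print(has_name("pathlib", "FlatReader"))  # module exists, name does not

# The check relevant to this thread; False suggests the package is not installed properly.
print(has_name("llama_index.readers.file", "FlatReader"))
```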
😮 Now I understand a lot of the errors on my system 😄. Now it's working, it's fantastic, thank you! 🙂
A suggestion: maybe you could make a section called "codes from ultra novices", where we post what we did, what we learned, and how we solved it. That way, novices like me can learn.
Using Jupyter on Windows 10, this cell gives me a problem:

cell:
from llama_index.core.schema import IndexNode
example_index_node = [b for b in base_nodes_2021 if isinstance(b, IndexNode)][
    20
]

# Index Node
print(
    f"\n--------\n{example_index_node.get_content(metadata_mode='all')}\n--------\n"
)

# Index Node ID
print(f"\n--------\nIndex ID: {example_index_node.index_id}\n--------\n")

# Referenced Table
print(
    f"\n--------\n{node_mappings_2021[example_index_node.index_id].get_content()}\n--------\n"
)

returns:
"---------------------------------------------------
IndexError Traceback (most recent call last)
Cell In[34], line 2
1 from llama_index.core.schema import IndexNode
----> 2 example_index_node = [b for b in base_nodes_2021 if isinstance(b, IndexNode)][
3 20
4 ]
6 # Index Node
7 print(
8 f"\n--------\n{example_index_node.get_content(metadata_mode='all')}\n--------\n"
9 )

IndexError: list index out of range"

I used the following to check:
"from llama_index.core.schema import IndexNode

# Check if base_nodes_2021 is a list and print its length
if isinstance(base_nodes_2021, list):
    print(f"base_nodes_2021 is a list with {len(base_nodes_2021)} elements.")
else:
    print("base_nodes_2021 is not a list.")

# Check if node_mappings_2021 is a dictionary and print some keys
if isinstance(node_mappings_2021, dict):
    print(f"node_mappings_2021 is a dictionary with {len(node_mappings_2021)} keys.")
    print(list(node_mappings_2021.keys())[:5])  # Print the first 5 keys
else:
    print("node_mappings_2021 is not a dictionary.")"

and gives:
"base_nodes_2021 is a list with 168 elements.
node_mappings_2021 is a dictionary with 0 keys.
[]"
Just change the index to something smaller, it's just for an example output 🙂
20 could be 0 for example
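
One caveat: if the filtered list is empty, even index 0 raises IndexError. A defensive variant might clamp the index and handle the empty case explicitly. This is a sketch with stand-in classes; Node, IndexNode, and pick_example here are illustrative and are not the llama_index types.

```python
class Node:
    """Stand-in for a generic node (illustrative only)."""


class IndexNode(Node):
    """Stand-in for an index node (illustrative only)."""


def pick_example(nodes, cls, preferred=20):
    """Return the preferred match of type cls, the last match if there are fewer, or None."""
    matches = [n for n in nodes if isinstance(n, cls)]
    if not matches:
        return None
    return matches[min(preferred, len(matches) - 1)]


nodes = [Node(), IndexNode(), Node(), IndexNode()]
example = pick_example(nodes, IndexNode)       # only 2 matches, so the last one is used
nothing = pick_example([Node(), Node()], IndexNode)  # no matches at all

print(example is nodes[3])
print(nothing)
```

With the real objects, a None result would mean no IndexNode was produced at all, which is worth checking before printing anything.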