
Updated 2 months ago

Hi everybody, I'm learning with this

Hi everybody, I'm learning with this notebook, but it gives me the same error in Colab and JupyterLab. Any help?
The original doc: https://docs.llamaindex.ai/en/latest/examples/query_engine/sec_tables/tesla_10q_table.html
the problematic cell:
import os
import pickle

if not os.path.exists("2021_nodes.pkl"):
    raw_nodes_2021 = node_parser.get_nodes_from_documents(docs_2021)
    pickle.dump(raw_nodes_2021, open("2021_nodes.pkl", "wb"))
else:
    raw_nodes_2021 = pickle.load(open("2021_nodes.pkl", "rb"))


Also, I had to add a cell with !pip install unstructured before this one:

from pydantic import BaseModel
from unstructured.partition.html import partition_html
import pandas as pd

pd.set_option("display.max_rows", None)
pd.set_option("display.max_columns", None)
pd.set_option("display.width", None)
pd.set_option("display.max_colwidth", None)

Otherwise it doesn't work.


Question: Which system are you using for your notebooks so that everything works perfectly? I'm on Windows 10 with JupyterLab. I assumed there would be no problem in Colab, but the error persists there too.

Thank you in advance!!!
what is the error?
Embeddings have been explicitly disabled. Using MockEmbedding.
31it [00:00, 18469.24it/s]
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-17-e580cebc1a5b> in <cell line: 4>()
3
4 if not os.path.exists("2021_nodes.pkl"):
----> 5 raw_nodes_2021 = node_parser.get_nodes_from_documents(docs_2021)
6 pickle.dump(raw_nodes_2021, open("2021_nodes.pkl", "wb"))
7 else:

4 frames
/usr/lib/python3.10/asyncio/runners.py in run(main, debug)
31 """
32 if events._get_running_loop() is not None:
---> 33 raise RuntimeError(
34 "asyncio.run() cannot be called from a running event loop")
35

RuntimeError: asyncio.run() cannot be called from a running event loop
import nest_asyncio
nest_asyncio.apply()


Put that at the top of your nb
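
For context, a minimal sketch of why this error shows up in notebooks: Jupyter and Colab already run an asyncio event loop, and asyncio.run() refuses to start while a loop is running. The sketch uses only the standard library; fetch_value and outer are hypothetical names standing in for the library's async work and for the notebook cell.

```python
import asyncio


async def fetch_value() -> int:
    """Hypothetical stand-in for the async work done during node parsing."""
    await asyncio.sleep(0)
    return 42


async def outer() -> str:
    """Runs inside an event loop, the way a notebook cell does."""
    coro = fetch_value()
    try:
        # A loop is already running here, so asyncio.run() raises RuntimeError.
        asyncio.run(coro)
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return str(exc)
    return "no error"


message = asyncio.run(outer())
print(message)
```

nest_asyncio.apply() patches the running loop so that it can be re-entered, which is why adding it at the top of the notebook makes the nested call work.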
ok, now a different error appears:

Embeddings have been explicitly disabled. Using MockEmbedding.


31it [00:00, 38377.63it/s]

0%| | 0/31 [00:00<?, ?it/s]

---------------------------------------------------------------------------

LocalProtocolError Traceback (most recent call last)

/usr/local/lib/python3.10/dist-packages/httpx/_transports/default.py in map_httpcore_exceptions()
68 try:
---> 69 yield
70 except Exception as exc:

56 frames

LocalProtocolError: Illegal header value b'Bearer '


The above exception was the direct cause of the following exception:

LocalProtocolError Traceback (most recent call last)

LocalProtocolError: Illegal header value b'Bearer '


The above exception was the direct cause of the following exception:

APIConnectionError Traceback (most recent call last)

/usr/local/lib/python3.10/dist-packages/openai/_base_client.py in _request(self, cast_to, options, stream, stream_cls, remaining_retries)
1489
1490 log.debug("Raising connection error")
-> 1491 raise APIConnectionError(request=request) from err
1492
1493 log.debug(

APIConnectionError: Connection error.
(in the same point)
this means you are missing an openai api key
LocalProtocolError: Illegal header value b'Bearer ' -- empty key
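
A small sketch of where the b'Bearer ' value comes from, assuming the header is built as "Bearer " plus the key: with an empty key, nothing follows the space, and the HTTP layer appears to reject that value. build_auth_header and require_api_key are hypothetical helpers for illustration, not part of the openai or llama_index packages; OPENAI_API_KEY is the environment variable the OpenAI client reads by default.

```python
import os


def build_auth_header(api_key: str) -> str:
    """Hypothetical helper mirroring how a bearer-token header is formed."""
    return f"Bearer {api_key}"


def require_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the key from the environment, or fail early with a clear message."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise ValueError(f"{env_var} is empty or not set")
    return key


# An empty key leaves only the scheme and a trailing space.
empty_header = build_auth_header("")
print(repr(empty_header))
```

Calling something like require_api_key() before building any client turns the long connection traceback into a one-line message about the missing key.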
Do you mean an OpenAI key? I'm using Azure OpenAI. I set it up and checked it with "response = llm.complete("The sky is a beautiful blue and")
print(response)", and it works well.

But the code still returns the same error:

Embeddings have been explicitly disabled. Using MockEmbedding.


31it [00:00, 25539.86it/s]

0%| | 0/31 [00:00<?, ?it/s]

---------------------------------------------------------------------------

LocalProtocolError Traceback (most recent call last)

/usr/local/lib/python3.10/dist-packages/httpx/_transports/default.py in map_httpcore_exceptions()
68 try:
---> 69 yield
70 except Exception as exc:

56 frames

LocalProtocolError: Illegal header value b'Bearer '


The above exception was the direct cause of the following exception:

LocalProtocolError Traceback (most recent call last)

LocalProtocolError: Illegal header value b'Bearer '


The above exception was the direct cause of the following exception:

APIConnectionError Traceback (most recent call last)

/usr/local/lib/python3.10/dist-packages/openai/_base_client.py in _request(self, cast_to, options, stream, stream_cls, remaining_retries)
1489
1490 log.debug("Raising connection error")
-> 1491 raise APIConnectionError(request=request) from err
1492
1493 log.debug(

APIConnectionError: Connection error.
Try node_parser = UnstructuredElementNodeParser(llm=llm)
works! 🙂

now, the next error was for the cell:

example_index_node = [b for b in base_nodes_2021 if isinstance(b, IndexNode)][
    20
]

# Index Node
print(
    f"\n--------\n{example_index_node.get_content(metadata_mode='all')}\n--------\n"
)

# Index Node ID
print(f"\n--------\nIndex ID: {example_index_node.index_id}\n--------\n")

# Referenced Table
print(
    f"\n--------\n{node_mappings_2021[example_index_node.index_id].get_content()}\n--------\n"
)


it gives:
"NameError Traceback (most recent call last)

<ipython-input-81-bf54a5583d53> in <cell line: 1>()
----> 1 example_index_node = [b for b in base_nodes_2021 if isinstance(b, IndexNode)][
2 20
3 ]
4
5 # Index Node

<ipython-input-81-bf54a5583d53> in <listcomp>(.0)
----> 1 example_index_node = [b for b in base_nodes_2021 if isinstance(b, IndexNode)][
2 20
3 ]
4
5 # Index Node

NameError: name 'IndexNode' is not defined"

Any doc to study that?
from llama_index.core.schema import IndexNode -- just have to import it
solved, thank you! 🙂

Another question. I have been working with this notebook in Colab. In Jupyter on Windows 10, the cell:

"from llama_index.readers.file import FlatReader
from pathlib import Path

reader = FlatReader()
docs_2021 = reader.load_data(Path("tesla_2021_10k.htm"))
docs_2020 = reader.load_data(Path("tesla_2020_10k.htm"))"

returns:

"--------------------------------------------------------------
ImportError Traceback (most recent call last)
Cell In[13], line 1
----> 1 from llama_index.readers.file import FlatReader
2 from pathlib import Path
4 reader = FlatReader()

ImportError: cannot import name 'FlatReader' from 'llama_index.readers.file' (unknown location)"
unknown location makes me think that your env might be corrupted 😅 Might have to start with a fresh venv?
or pip install llama-index-readers-file
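
A quick way to check this kind of ImportError before running the cell, sketched with the standard library only. can_import_name is a hypothetical helper (not part of llama_index) that mirrors what "from module import name" needs: the module must import, and the name must exist in it.

```python
import importlib


def can_import_name(module_name: str, attr: str) -> bool:
    """Return True if `from module_name import attr` would succeed."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Covers ModuleNotFoundError for a missing package or parent package.
        return False
    return hasattr(module, attr)


# Report on the two imports that caused trouble in this thread.
checks = [
    ("llama_index.readers.file", "FlatReader"),
    ("unstructured.partition.html", "partition_html"),
]
for module_name, attr in checks:
    status = "ok" if can_import_name(module_name, attr) else "missing"
    print(f"{module_name}.{attr}: {status}")
```

If a name shows as missing, installing the matching package (for example pip install llama-index-readers-file) in the same environment the notebook kernel uses is the usual next step.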
😮 Now I understand a lot of the errors on my system 😄. It's working now, it's fantastic, thank you! 🙂
A suggestion: maybe you could make a section, "code from ultra novices", where we post what we did, what we learned, and how we solved it, so novices like me can learn.
Using Jupyter on Windows 10, this cell gives me a problem:

cell:
from llama_index.core.schema import IndexNode
example_index_node = [b for b in base_nodes_2021 if isinstance(b, IndexNode)][
    20
]

# Index Node
print(
    f"\n--------\n{example_index_node.get_content(metadata_mode='all')}\n--------\n"
)

# Index Node ID
print(f"\n--------\nIndex ID: {example_index_node.index_id}\n--------\n")

# Referenced Table
print(
    f"\n--------\n{node_mappings_2021[example_index_node.index_id].get_content()}\n--------\n"
)

returns:
"---------------------------------------------------
IndexError Traceback (most recent call last)
Cell In[34], line 2
1 from llama_index.core.schema import IndexNode
----> 2 example_index_node = [b for b in base_nodes_2021 if isinstance(b, IndexNode)][
3 20
4 ]
6 # Index Node
7 print(
8 f"\n--------\n{example_index_node.get_content(metadata_mode='all')}\n--------\n"
9 )

IndexError: list index out of range"

I used the following to check:
"from llama_index.core.schema import IndexNode

# Check if base_nodes_2021 is a list and print its length
if isinstance(base_nodes_2021, list):
    print(f"base_nodes_2021 is a list with {len(base_nodes_2021)} elements.")
else:
    print("base_nodes_2021 is not a list.")

# Check if node_mappings_2021 is a dictionary and print some keys
if isinstance(node_mappings_2021, dict):
    print(f"node_mappings_2021 is a dictionary with {len(node_mappings_2021)} keys.")
    print(list(node_mappings_2021.keys())[:5])  # Print the first 5 keys
else:
    print("node_mappings_2021 is not a dictionary.")"

and gives:
"base_nodes_2021 is a list with 168 elements.
node_mappings_2021 is a dictionary with 0 keys.
[]"
just change the index to something smaller, it's just for an example output 🙂
20 could be 0 for example
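
A guarded version of that lookup, as a sketch. The check output above reports 0 keys in node_mappings_2021, which may mean there are no IndexNode entries at all on that machine, in which case index 0 would fail too. FakeIndexNode and pick_example are hypothetical names so the snippet runs on its own; in the notebook the type would be IndexNode and the list would be base_nodes_2021.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeIndexNode:
    """Hypothetical stand-in for llama_index's IndexNode."""
    index_id: str


def pick_example(nodes: list, node_type: type, preferred: int = 20) -> Optional[object]:
    """Pick the node at `preferred` among nodes of node_type.

    The position is clamped to the last match; None is returned when
    there are no matches at all.
    """
    matches = [n for n in nodes if isinstance(n, node_type)]
    if not matches:
        return None
    return matches[min(preferred, len(matches) - 1)]


nodes = ["plain text node", FakeIndexNode("table-0"), FakeIndexNode("table-1")]
example = pick_example(nodes, FakeIndexNode)
print(example)
```

A None result here would point at the parsing step (no tables were turned into index nodes) rather than at the choice of index.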