Chris
Joined September 25, 2024
Hello, when I call the get_nodes_from_documents method of UnstructuredElementNodeParser with an OpenAI LLM, I get this output:
Plain Text
Embeddings have been explicitly disabled. Using MockEmbedding. 0it [00:00, ?it/s]

After that, the node_mappings dictionary is always empty, no matter which HTML file I use.
Below is the full code and the output:
Plain Text
from llama_index.readers.file.flat_reader import FlatReader
from llama_index.node_parser import UnstructuredElementNodeParser
from llama_index.llms import OpenAI
from pathlib import Path

llm = OpenAI(model="gpt-3.5-turbo", api_key="sk-")

# !wget "https://www.dropbox.com/scl/fi/mlaymdy1ni1ovyeykhhuk/tesla_2021_10k.htm?rlkey=qf9k4zn0ejrbm716j0gg7r802&dl=1" -O tesla_2021_10k.htm

reader = FlatReader()
docs_2021 = reader.load_data(Path("tesla_2021_10k.htm"))

node_parser = UnstructuredElementNodeParser(llm=llm)
raw_nodes_2021 = node_parser.get_nodes_from_documents(docs_2021)
base_nodes_2021, node_mappings_2021 = node_parser.get_base_nodes_and_mappings(raw_nodes_2021)
print(len(node_mappings_2021))

Plain Text
Embeddings have been explicitly disabled. Using MockEmbedding. 0it [00:00, ?it/s]
0
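
One sanity check that may help narrow this down. This is a sketch built on an unconfirmed guess: that the "0it" progress bar means the parser found no table elements to summarize, which would leave the mappings empty. The helper below uses only the Python standard library to count the <table> tags in the raw HTML, so it shows whether the input file has any tables at all. The names TableCounter and count_tables are made up for this example and are not part of llama_index.

```python
from html.parser import HTMLParser


class TableCounter(HTMLParser):
    """Counts <table> start tags in an HTML document (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.tables = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this hook.
        if tag == "table":
            self.tables += 1


def count_tables(html_text: str) -> int:
    """Return the number of <table> start tags found in html_text."""
    counter = TableCounter()
    counter.feed(html_text)
    counter.close()
    return counter.tables
```

To try it on the file from the post: count_tables(Path("tesla_2021_10k.htm").read_text(errors="ignore")), with Path imported from pathlib. If this returns a large number while the parser still reports 0it, the tables are in the file but are not reaching the parser as table elements, and the problem is more likely in how the document is loaded or in the installed parsing dependencies than in the HTML itself.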
12 comments