Hello, when I call the get_nodes_from_documents method of UnstructuredElementNodeParser with an OpenAI LLM, I get the following message:
Embeddings have been explicitly disabled. Using MockEmbedding. 0it [00:00, ?it/s]
After that, the node_mappings dictionary returned by get_base_nodes_and_mappings is always empty, no matter which HTML file I use.
Below is the full code, followed by its output:
from llama_index.readers.file.flat_reader import FlatReader
from llama_index.node_parser import UnstructuredElementNodeParser
from llama_index.llms import OpenAI
from pathlib import Path
llm = OpenAI(model="gpt-3.5-turbo", api_key="sk-")
# !wget "https://www.dropbox.com/scl/fi/mlaymdy1ni1ovyeykhhuk/tesla_2021_10k.htm?rlkey=qf9k4zn0ejrbm716j0gg7r802&dl=1" -O tesla_2021_10k.htm
reader = FlatReader()
docs_2021 = reader.load_data(Path("tesla_2021_10k.htm"))
node_parser = UnstructuredElementNodeParser(llm=llm)
raw_nodes_2021 = node_parser.get_nodes_from_documents(docs_2021)
base_nodes_2021, node_mappings_2021 = node_parser.get_base_nodes_and_mappings(raw_nodes_2021)
print(len(node_mappings_2021))
Output:
Embeddings have been explicitly disabled. Using MockEmbedding. 0it [00:00, ?it/s]
0
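A possible sanity check, independent of llama_index: the `0it` progress bar indicates that a loop ran over zero items, which may mean that no table elements were found in the document (this is an assumption, not something confirmed by the output above). The sketch below uses only the Python standard library to count `<table>` tags in the same file. The names `TableCounter` and `count_tables` are hypothetical helpers made up for this illustration.

```python
from html.parser import HTMLParser
from pathlib import Path


class TableCounter(HTMLParser):
    """Counts opening <table> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.tables = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if tag == "table":
            self.tables += 1


def count_tables(html_text: str) -> int:
    """Return the number of <table> start tags found in html_text."""
    counter = TableCounter()
    counter.feed(html_text)
    counter.close()
    return counter.tables


if __name__ == "__main__":
    path = Path("tesla_2021_10k.htm")  # same file as in the code above
    if path.exists():
        text = path.read_text(encoding="utf-8", errors="replace")
        print(f"<table> tags found: {count_tables(text)}")
    else:
        print(f"{path} not found")
```

If this reports zero tables, the downloaded file itself may be the problem (for example, an error page instead of the filing). If it reports many tables, the raw HTML is not the cause, and the parsing step or its dependencies would be the next place to look.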