Updated 3 months ago

How to Analyze Tables In Large Financial...

Hi everyone! I'm following this tutorial: https://www.youtube.com/watch?v=xT6JpDELKPg, but when I call get_nodes_from_documents, I get a message saying, "LLM is explicitly disabled. Using MockLLM. Embeddings have been explicitly disabled. Using MockEmbedding." Also, when I extract the base nodes, no tables are being extracted. My code is exactly the same as in the tutorial. Is this expected behavior?
11 comments
Can you provide some of the code you are running?
camelot isn't guaranteed to find tables in every PDF, if that's what you are using
the other two warnings are probably benign? It really depends on your code though
This is my code - it's the second part of the tutorial where Jerry uses the UnstructuredElementNodeParser, so I'm not using camelot
  1. The message about the embeddings being disabled is normal
  2. The message about the LLM being disabled is a small bug -- try passing in an LLM explicitly
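from llama_index.llms import OpenAI
from llama_index.node_parser import UnstructuredElementNodeParser

# Pass the LLM explicitly so the parser does not fall back to MockLLM
llm = OpenAI()
node_parser = UnstructuredElementNodeParser(llm=llm)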

  3. Unstructured was updated recently, and it seems like their table parsing got worse. Pinning an older release with pip install "unstructured<0.11.0" seemed to work for me
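To confirm the pin took effect, here is a minimal sketch that checks the installed version against the 0.11.0 ceiling from the tip above. parse_version and is_older_than are hypothetical helper names, not part of Unstructured or LlamaIndex, and the parsing only looks at the leading digits of the first three version components.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn '0.10.30' into (0, 10, 30), padding short versions with zeros."""
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_older_than(installed, ceiling="0.11.0"):
    """Return True when the installed version is strictly below the ceiling."""
    return parse_version(installed) < parse_version(ceiling)


try:
    installed = version("unstructured")
    print(installed, "is below 0.11.0:", is_older_than(installed))
except PackageNotFoundError:
    print("unstructured is not installed")
```

If this reports that the version is not below 0.11.0, the pin was probably installed into a different environment than the one running the notebook.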
Thank you - all your changes helped. Passing in the LLM parameter got rid of the LLM warning, and installing the older Unstructured library allowed the RAG to identify many more tables. Appreciate it!
Does the root retriever of the Recursive Retriever have access to the data in the tables, or just the summaries of the tables? I'm trying to find the value for "Land" in the Asset sheet, but it seems like the Asset sheet table summary is vague, so I don't think the retriever knows to look within the Asset sheet to find the land value
And if vague table summaries are indeed the issue, does LlamaIndex have a way to adjust how the tables are summarized? How would I go about this?
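A toy sketch of why a vague summary could hide a table from the top-level retriever. It assumes, as in the tutorial's setup, that the root index matches queries against the table summaries and only follows a link to the full table once a summary has been picked. None of this is LlamaIndex code: SummaryNode, tokens, score, and retrieve are hypothetical names, keyword overlap stands in for embedding similarity, and the table contents are made up.

```python
import re
from dataclasses import dataclass


@dataclass
class SummaryNode:
    summary: str   # the only text the root retriever matches against
    table_id: str  # link that is followed to fetch the full table


def tokens(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def score(query, text):
    """Keyword overlap, a crude stand-in for embedding similarity."""
    return len(tokens(query) & tokens(text))


def retrieve(query, summaries, tables):
    """Pick the best-matching summary, then follow its link to the table."""
    best = max(summaries, key=lambda node: score(query, node.summary))
    return tables[best.table_id]


# Made-up table contents, for illustration only
tables = {
    "assets": "Cash 120\nLand 450\nEquipment 80",
    "revenue": "Product 900\nServices 300",
}
revenue = SummaryNode("Revenue value by segment", "revenue")
vague = [revenue, SummaryNode("A table of financial figures", "assets")]
specific = [
    revenue,
    SummaryNode("Asset sheet listing cash, land and equipment values", "assets"),
]

query = "land value in the asset sheet"
print("vague summary finds Land:", "Land" in retrieve(query, vague, tables))
print("specific summary finds Land:", "Land" in retrieve(query, specific, tables))
```

With the vague summary, the query lands on the revenue table and never sees the Land row; with the specific summary, it reaches the asset table. If that matches what is happening here, making the summaries more specific should help. The element node parser appears to take a summary_query_str argument for the summarization prompt, but treat that as an assumption and check it against your installed version.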
Hey, I faced the same issue today, and passing llm into UnstructuredElementNodeParser fixed it. Could you add this to the example notebook?
in the latest version of llama-index, this should be fixed
Cool. Thank you