Updated 5 months ago

Hi Everyone,

I'm new to LlamaParse and am trying to use it in a small Llama 3.1-based RAG app built with LangChain. I'm calling Document.to_langchain_format() to make the document object usable in LangChain, but I keep hitting "AttributeError: 'tuple' object has no attribute 'metadata'", even though I pass in a function that returns a metadata dict. The Kapa.ai bot hasn't been much help. Can anyone give me a hand here?

def get_meta(file):
    filename, extension = os.path.splitext(file)
    metadata_dict = {
        'filepath': {"filename": extension}
    }
    return metadata_dict

def load_document(file):
    import os
    name, extension = os.path.splitext(file)
    os.environ['LLAMA_CLOUD_API_KEY'] = 'llx-<your-api-key>'  # key redacted

    if extension == '.pdf':
        from llama_parse import LlamaParse
        from llama_index.core import SimpleDirectoryReader
        from llama_index.core.schema import Document

        parser = LlamaParse(result_type="markdown")  # "markdown" and "text" are available
        file_extractor = {".pdf": parser}

        llama_parse_documents = SimpleDirectoryReader(input_files=[file], file_extractor=file_extractor, file_metadata={}).load_data(),

        loader = Document.to_langchain_format(llama_parse_documents)
10 comments
file_metadata is supposed to be a function, not a dict
llama_parse_documents = SimpleDirectoryReader(
  input_files=[file], 
  file_extractor=file_extractor, 
  file_metadata=lambda filename: {}
).load_data()
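For reference, a minimal sketch of what a `file_metadata` callable could look like, assuming it is called with a file path string and should return a dict. The key names and the example path are illustrative assumptions, not something the library requires.

```python
import os


def get_meta(file_path: str) -> dict:
    """Build a flat metadata dict for a single file path."""
    file_name = os.path.basename(file_path)
    _, extension = os.path.splitext(file_name)
    # Key names are illustrative; pick whatever your app needs downstream.
    return {"file_name": file_name, "extension": extension}


# The reader would receive the function itself, not its result, e.g.
#   SimpleDirectoryReader(input_files=[file], file_metadata=get_meta)
print(get_meta("reports/heart_disease.pdf"))
```

Passing `get_meta` without parentheses hands over the function, so it can be called once per input file.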
@Logan M no luck :(. I tried that, and also tried passing in this function:
filename_fn = lambda filename: {"file_name": file}

Now I get the error as 'list object...' instead
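A note on why the error moved from 'tuple' to 'list': the line in the original post ends the `.load_data()` call with a trailing comma, which makes Python wrap the result in a one-element tuple. The suggested snippet has no trailing comma, so the variable now holds the list itself. The sketch below uses a hypothetical stand-in `load_data()` function rather than the real reader, just to show the effect.

```python
def load_data() -> list:
    """Stand-in for SimpleDirectoryReader(...).load_data(), which returns a list."""
    return ["doc-1", "doc-2"]


with_comma = load_data(),    # trailing comma: a 1-tuple wrapping the list
without_comma = load_data()  # no comma: the list itself

print(type(with_comma).__name__, len(with_comma))
print(type(without_comma).__name__, len(without_comma))
```

So the change in the message means the first problem is gone; the remaining one is how `to_langchain_format` is called.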
What's the full traceback? That would probably help narrow down the issue
loader = Document.to_langchain_format(llama_parse_documents) this seems wrong
I would expect something like

documents = [x.to_langchain_format() for x in llama_parse_documents]
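To illustrate why the per-document call is needed, here is a small dependency-free sketch. `Doc` is a hypothetical stand-in for the LlamaIndex `Document` class, and it returns a plain dict instead of a LangChain document; the point is only how Python binds `self`.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a LlamaIndex Document."""
    text: str
    metadata: dict = field(default_factory=dict)

    def to_langchain_format(self) -> dict:
        # Stand-in body: a dict instead of a real LangChain document.
        metadata = self.metadata or {}
        return {"page_content": self.text, "metadata": metadata}


docs = [Doc("first page"), Doc("second page", {"file_name": "a.pdf"})]

# Calling the method on the class passes the whole list in as `self`.
try:
    Doc.to_langchain_format(docs)
except AttributeError as err:
    error_message = str(err)
    print(error_message)

# Calling it on each instance converts the documents one by one.
converted = [d.to_langchain_format() for d in docs]
print(converted[1]["metadata"])
```

The class-level call fails because a list has no `metadata` attribute, while the list comprehension calls the method on each document in turn.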
@Logan M here's the traceback:
Traceback (most recent call last):
  File "/Users/mottzerella/Documents/Coding_Practice/ztm_milestone_projects/heart_disease_project/QA_LLM_APP/.conda/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/exec_code.py", line 88, in exec_func_with_error_handling
    result = func()
             ^^^^^^
  File "/Users/mottzerella/Documents/Coding_Practice/ztm_milestone_projects/heart_disease_project/QA_LLM_APP/.conda/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 590, in code_to_exec
    exec(code, module.__dict__)
  File "/Users/mottzerella/Documents/Coding_Practice/ztm_milestone_projects/heart_disease_project/QA_LLM_APP/Project - Streamlit Front-End for Question-Answering App/QA_LLM_Pinecone.py", line 195, in <module>
    data = load_document(file_name)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mottzerella/Documents/Coding_Practice/ztm_milestone_projects/heart_disease_project/QA_LLM_APP/Project - Streamlit Front-End for Question-Answering App/QA_LLM_Pinecone.py", line 33, in load_document
    loader = Document.to_langchain_format(llama_parse_documents)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mottzerella/Documents/Coding_Practice/ztm_milestone_projects/heart_disease_project/QA_LLM_APP/.conda/lib/python3.11/site-packages/llama_index/core/schema.py", line 717, in to_langchain_format
    metadata = self.metadata or {}
               ^^^^^^^^^^^^^
AttributeError: 'list' object has no attribute 'metadata'
yea, my fix above will fix that one
@Logan M hell yeah thank you