Updated 2 months ago

Does TitleExtractor work with version 0.8.43?
8 comments
It should. What issue are you seeing?
I can't get this to run:

from llama_index.extractors import (
    TitleExtractor,
    QuestionsAnsweredExtractor,
)
from llama_index.text_splitter import TokenTextSplitter
the imports?
actually, think I got it working, bro:
# Import the necessary modules for metadata extraction
from llama_index.node_parser.extractors import (
    MetadataExtractor,
    QuestionsAnsweredExtractor,
    TitleExtractor,
)
from llama_index.llms import OpenAI

# Initialize the LLM and metadata extractor
llm = OpenAI(model="gpt-3.5-turbo")
metadata_extractor = MetadataExtractor(
    extractors=[
        TitleExtractor(nodes=5, llm=llm),
        QuestionsAnsweredExtractor(questions=3, llm=llm),
    ],
    in_place=False,
)

# Process nodes to add additional metadata
nodes = metadata_extractor.process_nodes(nodes)
print(f"Debug: Processed {len(nodes)} nodes.")  # Debugging
for node in nodes:
    print(f"Processed Node Metadata: {node.metadata}")
But the output is this:

Extracted Title: Order and Agreement for Energy Efficiency Project: Utility Consumption and Savings Analysis, AT&T Site Information, Billing Details, Installation and Site Term Dates, Supplier's System Equipment and Software, and Subcontractor Information


whereas the actual title is this (in the PDF):

Order for Saved Utility Service
Why the discrepancy?
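The name TitleExtractor may be misleading. It doesn't pull the printed title out of the PDF; it just gives the LLM some text from the nodes and asks it to write a title, so the result is generated rather than extracted.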
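To make that difference concrete, here is a rough sketch of the idea, not LlamaIndex's actual implementation. The names `generate_title`, `literal_title`, and `fake_llm` are hypothetical, the prompts are made up for illustration, and `fake_llm` stands in for a real model call so the snippet runs offline. It assumes the extractor asks the model for one candidate title per chunk and then merges the candidates, which would be consistent with the long, comma-joined title in the output above.

```python
from typing import Callable, List


def generate_title(
    chunks: List[str],
    complete: Callable[[str], str],
    max_chunks: int = 5,
) -> str:
    """Ask an LLM to write a title; nothing here reads the document's real heading."""
    # One candidate title per chunk, written by the model (assumed behavior).
    candidates = [
        complete(f"Give a title that summarizes this text:\n{chunk}\nTitle: ").strip()
        for chunk in chunks[:max_chunks]
    ]
    if not candidates:
        return ""
    if len(candidates) == 1:
        return candidates[0]
    # Merge the candidates into a single title with a second model call.
    joined = ", ".join(candidates)
    return complete(
        f"Candidate titles: {joined}\nGive one title that covers all of them.\nTitle: "
    ).strip()


def literal_title(first_page_text: str) -> str:
    """Hypothetical heuristic: treat the first non-empty line as the printed title."""
    for line in first_page_text.splitlines():
        if line.strip():
            return line.strip()
    return ""


def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call so the sketch runs without an API key."""
    return f"Generated title ({len(prompt)} chars of prompt)"


chunks = [
    "Order for Saved Utility Service\nSite information ...",
    "Billing details ...",
]
print(generate_title(chunks, fake_llm))  # whatever the model writes
print(literal_title(chunks[0]))          # the first line of the text itself
```

If you need the title exactly as printed in the PDF, something like the `literal_title` heuristic (or the PDF's own metadata, where present) is probably closer to what you want than an LLM-written summary title.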