Hey everyone,
Keyword extractor question:
I am setting up a keyword extractor pipeline like so:
from llama_index.extractors import KeywordExtractor
from llama_index.ingestion import IngestionPipeline

extractors = [
    KeywordExtractor(keywords=5, llm=llm),
]
pipeline = IngestionPipeline(transformations=extractors)
nodes = pipeline.run(documents=d[0:5])
nodes[1].metadata
Which ends up printing:
{'excerpt_keywords': "I'm sorry, but your request doesn't match the context provided. Could you please provide more information or clarify your question?"}
or
{'excerpt_keywords': "I'm sorry, but I don't have enough information to answer your question."}
I have tried changing the prompt in the KeywordExtractor definition, but it still works for some documents and not for others.
I know for sure that the documents producing the responses above have content, and they are not meaningfully different from the ones that return actual keywords.
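In case it helps with reproducing this, here is a rough sketch of how the failing nodes could be picked out. The helper names (`looks_like_refusal`, `failed_indices`) and the marker phrases are my own assumptions, not part of llama_index, and the first sample entry is made up for illustration. It only does plain string matching on the 'excerpt_keywords' value.

```python
# Hypothetical helper, not part of llama_index: flags metadata values that
# read like an LLM apology instead of a keyword list.
# The marker phrases are guesses based on the two responses quoted above.
REFUSAL_MARKERS = (
    "i'm sorry",
    "i don't have enough information",
    "please provide more information",
)


def looks_like_refusal(value: str) -> bool:
    """Return True if the text contains any of the refusal marker phrases."""
    text = value.strip().lower()
    return any(marker in text for marker in REFUSAL_MARKERS)


def failed_indices(metadata_list, key="excerpt_keywords"):
    """Return the positions whose `key` value looks like a refusal."""
    return [
        i
        for i, meta in enumerate(metadata_list)
        if looks_like_refusal(str(meta.get(key, "")))
    ]


# Illustrative data only: the first entry is invented, the second is one of
# the responses quoted above.
sample = [
    {"excerpt_keywords": "ingestion, pipeline, metadata, nodes, keywords"},
    {"excerpt_keywords": "I'm sorry, but I don't have enough information to answer your question."},
]
print(failed_indices(sample))
```

With the pipeline above, this would presumably be called as failed_indices([n.metadata for n in nodes]) to see which of the five documents are affected.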
I'm not familiar enough with the code to know where the prompt might be going wrong here.
Any ideas / suggestions?
I will be glad to submit a PR if I can get some direction on where this might be fixed. I also noticed a TODO for the KeywordExtractor prompt that I'll look into as well.