Updated 3 months ago

Hi, I have an error with pipeline.run of

Hi, I get an error when calling pipeline.run with my extractors: "object list can't be used in 'await' expression". Here is the code:
class CustomExtractor(BaseExtractor):
    def aextract(self, nodes):
        metadata_list = [
            {
                "custom": node.metadata["file_name"]
            }
            for node in nodes
        ]
        return metadata_list

short_summarizer_model = (
    BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn").to("cuda").half()
)
short_summarizer = HuggingFaceLLM(
    tokenizer_name="facebook/bart-large-cnn",
    model_name="facebook/bart-large-cnn",
    model=short_summarizer_model,
)

keyword_extractor_model = (
    T5ForConditionalGeneration.from_pretrained("google-t5/t5-base").to("cuda").half()
)
keyword_extractor = HuggingFaceLLM(
    tokenizer_name="google-t5/t5-base",
    model_name="google-t5/t5-base",
    model=keyword_extractor_model,
)

extractors = [
    # QuestionsAnsweredExtractor(questions=3, llm=chatgpt),
    EntityExtractor(prediction_threshold=0.5),  # only for intervention paper
    SummaryExtractor(summaries=["prev", "self", "next"], llm=short_summarizer),
    KeywordExtractor(keywords=10, llm=keyword_extractor),
    CustomExtractor(),
]

transformations = [text_splitter] + extractors
pipeline = IngestionPipeline(transformations=transformations)

def add_meta_info(docs):
    nodes = pipeline.run(documents=docs)
    print(nodes[1].metadata)
4 comments
Should aextract be declared with async def? Like this:

class CustomExtractor(BaseExtractor):
    async def aextract(self, nodes):
        metadata_list = [
            {
                "custom": node.metadata["file_name"]
            }
            for node in nodes
        ]
        return metadata_list
If that doesn't fix it, do share the whole traceback
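For anyone wondering why the missing async matters: the error message suggests the caller awaits whatever aextract returns, and a plain method hands back a list, which can't be awaited. Here is a minimal sketch of that behavior using only asyncio. SyncExtractor, AsyncExtractor, and run are hypothetical stand-ins written for this illustration, not LlamaIndex classes, and the nodes are plain dicts rather than real node objects.

```python
import asyncio


class SyncExtractor:
    # Plain method: calling it returns a list right away, not an awaitable.
    def aextract(self, nodes):
        return [{"custom": node["file_name"]} for node in nodes]


class AsyncExtractor:
    # Coroutine method: calling it returns a coroutine that can be awaited.
    async def aextract(self, nodes):
        return [{"custom": node["file_name"]} for node in nodes]


async def run(extractor, nodes):
    # Hypothetical stand-in for a caller that awaits aextract.
    return await extractor.aextract(nodes)


nodes = [{"file_name": "a.pdf"}, {"file_name": "b.pdf"}]

try:
    asyncio.run(run(SyncExtractor(), nodes))
    sync_error = None
except TypeError as exc:
    # Awaiting a list raises TypeError; the exact wording depends on the Python version.
    sync_error = str(exc)

async_result = asyncio.run(run(AsyncExtractor(), nodes))

print("sync version failed with:", sync_error)
print("async version returned:", async_result)
```

The only difference between the two classes is the async keyword, which is the same change suggested above.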
It works! Thank you 😀
Sorry, I have another question. The node metadata produced by the code above is shown in the attached image. The keywords and summary are not being extracted with the Hugging Face models. Is there anything I'm missing? @Logan M
Attachment
image.png