Hi, I have an error with pipeline.run of

At a glance

The community member hits an error when calling pipeline.run with extractors: "object list can't be used in 'await' expression". They share the code for a custom extractor and their pipeline setup. Another community member suggests declaring the custom extractor's aextract method with the async keyword, and the original poster confirms that this fixes the issue.
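
The error message comes from Python itself rather than from any particular library: awaiting the return value of a plain def method that returns a list raises a TypeError. The sketch below reproduces this without LlamaIndex. BrokenExtractor, FixedExtractor, and run are hypothetical stand-ins, and the assumption (not confirmed in the thread) is that the pipeline awaits aextract internally in a similar way.

```python
import asyncio


class BrokenExtractor:
    # Hypothetical stand-in: a plain "def", so calling it returns a list directly.
    def aextract(self, nodes):
        return [{"custom": node["file_name"]} for node in nodes]


class FixedExtractor:
    # Declared with "async def", so calling it returns an awaitable coroutine.
    async def aextract(self, nodes):
        return [{"custom": node["file_name"]} for node in nodes]


async def run(extractor, nodes):
    # Assumed caller behaviour: the result of aextract is awaited.
    return await extractor.aextract(nodes)


# Plain dicts stand in for nodes and their metadata.
nodes = [{"file_name": "a.pdf"}, {"file_name": "b.pdf"}]

try:
    asyncio.run(run(BrokenExtractor(), nodes))
    broken_error = None
except TypeError as exc:
    # Awaiting a list is not allowed; the exact wording depends on the Python version.
    broken_error = str(exc)

fixed_result = asyncio.run(run(FixedExtractor(), nodes))
print(broken_error)
print(fixed_result)
```

Only the method declaration differs between the two classes, which matches the one-word fix suggested in the comments.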

The community member then has another question about the node metainfo output, stating that they cannot extract keywords and summary with the Hugging Face models. They ask if they are missing anything, tagging another user, Logan M.

Utine

Hi, I have an error with pipeline.run of extractors: "object list can't be used in 'await' expression". Here is the code:
class CustomExtractor(BaseExtractor):
    def aextract(self, nodes):
        metadata_list = [
            {
                "custom": node.metadata["file_name"]
            }
            for node in nodes
        ]
        return metadata_list

short_summarizer_model = (BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn").to("cuda").half())
short_summarizer = HuggingFaceLLM(
    tokenizer_name="facebook/bart-large-cnn",
    model_name="facebook/bart-large-cnn",
    model=short_summarizer_model
)

keyword_extractor_model = (T5ForConditionalGeneration.from_pretrained("google-t5/t5-base").to("cuda").half())
keyword_extractor = HuggingFaceLLM(
    tokenizer_name="google-t5/t5-base",
    model_name="google-t5/t5-base",
    model=keyword_extractor_model
)

extractors = [
    # QuestionsAnsweredExtractor(questions=3, llm=chatgpt),
    EntityExtractor(prediction_threshold=0.5),  # only for intervention paper
    SummaryExtractor(summaries=["prev", "self", "next"], llm=short_summarizer),
    KeywordExtractor(keywords=10, llm=keyword_extractor),
    CustomExtractor()
]

transformations = [text_splitter] + extractors
pipeline = IngestionPipeline(transformations=transformations)

def add_meta_info(docs):
    nodes = pipeline.run(documents=docs)
    print(nodes[1].metadata)

4 comments

Logan M

Should it be

class CustomExtractor(BaseExtractor):
    async def aextract(self, nodes):
        metadata_list = [
            {
                "custom": node.metadata["file_name"]
            }
            for node in nodes
        ]
        return metadata_list

Logan M

If that doesn't fix it, do share the whole traceback

Utine

It works! Thank you 😀

Utine

Sorry, I have another question. The node metadata output of the above code is shown in the image. It cannot extract keywords and summaries with the Hugging Face models. Is there anything I'm missing? @Logan M

Attachment
