Updated 6 months ago

Hi, I have an error with pipeline.run of

At a glance

The community member is experiencing an error with the pipeline.run of extractors, specifically the error "object list can't be used in 'await' expression". They have provided the code for a custom extractor and the setup of their pipeline. One community member suggests changing the custom extractor to use the async keyword, which the original poster confirms fixes the issue.

The community member then has another question about the node metainfo output, stating that they cannot extract keywords and summary with the Hugging Face models. They ask if they are missing anything, tagging another user, Logan M.

Hi, I have an error with pipeline.run of extractors: "object list can't be used in 'await' expression". Here is the code:
class CustomExtractor(BaseExtractor):
    def aextract(self, nodes):
        metadata_list = [
            {
                "custom": node.metadata["file_name"]
            }
            for node in nodes
        ]
        return metadata_list

short_summarizer_model = (
    BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")
    .to("cuda")
    .half()
)
short_summarizer = HuggingFaceLLM(
    tokenizer_name="facebook/bart-large-cnn",
    model_name="facebook/bart-large-cnn",
    model=short_summarizer_model,
)

keyword_extractor_model = (
    T5ForConditionalGeneration.from_pretrained("google-t5/t5-base")
    .to("cuda")
    .half()
)
keyword_extractor = HuggingFaceLLM(
    tokenizer_name="google-t5/t5-base",
    model_name="google-t5/t5-base",
    model=keyword_extractor_model,
)

extractors = [
    # QuestionsAnsweredExtractor(questions=3, llm=chatgpt),
    EntityExtractor(prediction_threshold=0.5),  # only for intervention paper
    SummaryExtractor(summaries=["prev", "self", "next"], llm=short_summarizer),
    KeywordExtractor(keywords=10, llm=keyword_extractor),
    CustomExtractor(),
]

transformations = [text_splitter] + extractors
pipeline = IngestionPipeline(transformations=transformations)

def add_meta_info(docs):
    nodes = pipeline.run(documents=docs)
    print(nodes[1].metadata)
4 comments
Should it be async def? Like this:

Plain Text
class CustomExtractor(BaseExtractor):
    async def aextract(self, nodes):
        metadata_list = [
            {
                "custom": node.metadata["file_name"]
            }
            for node in nodes
        ]
        return metadata_list
If that doesn't fix it, do share the whole traceback.
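For anyone wondering why the async keyword matters here, a minimal sketch without LlamaIndex may help. It assumes the pipeline awaits whatever aextract returns; run_extractor, BrokenExtractor, FixedExtractor, and the dict-based nodes are hypothetical stand-ins, not LlamaIndex classes.

```python
import asyncio


class BrokenExtractor:
    # Plain `def`: calling it returns a list, which cannot be awaited.
    def aextract(self, nodes):
        return [{"custom": node["file_name"]} for node in nodes]


class FixedExtractor:
    # `async def`: calling it returns a coroutine, which can be awaited.
    async def aextract(self, nodes):
        return [{"custom": node["file_name"]} for node in nodes]


async def run_extractor(extractor, nodes):
    # Hypothetical stand-in for framework code that awaits aextract().
    return await extractor.aextract(nodes)


# Hypothetical nodes, modeled as plain dicts for this sketch.
nodes = [{"file_name": "a.pdf"}, {"file_name": "b.pdf"}]

try:
    asyncio.run(run_extractor(BrokenExtractor(), nodes))
    broken_error = None
except TypeError as exc:
    # Awaiting a plain list raises TypeError; the wording varies by Python version.
    broken_error = exc

fixed_result = asyncio.run(run_extractor(FixedExtractor(), nodes))

print(type(broken_error).__name__, broken_error)
print(fixed_result)
```

The broken version fails only at the await, not when aextract is defined or called, which is why the error surfaces inside pipeline.run rather than in the extractor class itself.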
It works! Thank you 😀
Sorry, I have another question. The node metainfo output of the above code is shown in the image. It does not extract keywords or a summary with the Hugging Face models. Is there anything I am missing? @Logan M
Attachment
image.png