Find answers from the community

Utine
Hi, I use DocumentSummaryIndex as follows:

doc_summary_index = DocumentSummaryIndex(
    nodes=nodes,
    service_context=service_context,
    storage_context=storage_context,
    response_synthesizer=response_synthesizer,
    show_progress=True,
)
But I got this error:
Exception has occurred: ValueError
One of nodes, objects, or index_struct must be provided.
File "/home/csgrad/yhu/projs/SLPChatbot/docSummaryIndex.py", line 156, in getDocumentIndex
index_store=SimpleIndexStore.from_persist_dir(persist_dir=persistent_path)
FileNotFoundError: [Errno 2] No such file or directory: '/home/csgrad/yhu/projs/SLPChatbot/documentSumStore/index_storage/slp_doc_sum/index_store.json'

During handling of the above exception, another exception occurred:

File "/home/csgrad/yhu/projs/SLPChatbot/docSummaryIndex.py", line 168, in getDocumentIndex
doc_summary_index = DocumentSummaryIndex(
File "/home/csgrad/yhu/projs/SLPChatbot/docSummaryIndex.py", line 183, in <module>
doc_summary_index = getDocumentIndex(data_path, persistent_path)
ValueError: One of nodes, objects, or index_struct must be provided.
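Reading the traceback bottom-up: the load from the persist directory failed (`index_store.json` does not exist yet), and the fallback branch then called the `DocumentSummaryIndex(...)` constructor with an empty or `None` `nodes` argument, which raises the `ValueError`. The usual cure is a load-or-build guard: only try to load when the persisted files exist, and otherwise build from nodes and persist. Below is a minimal, self-contained sketch of that pattern using a plain JSON file in place of the LlamaIndex storage context; the `build_index` callback is a hypothetical stand-in for constructing `DocumentSummaryIndex(nodes=...)`:

```python
import json
import os

def load_or_build_index(persist_path, build_index):
    """Load a cached index if the persisted file exists;
    otherwise build it fresh and persist it for next time."""
    if os.path.exists(persist_path):
        with open(persist_path) as f:
            return json.load(f)
    # Cache miss: build from scratch (e.g. DocumentSummaryIndex(nodes=...))
    index = build_index()
    os.makedirs(os.path.dirname(persist_path) or ".", exist_ok=True)
    with open(persist_path, "w") as f:
        json.dump(index, f)
    return index
```

With LlamaIndex itself, this corresponds to attempting `load_index_from_storage(StorageContext.from_defaults(persist_dir=...))` only when the directory exists, and otherwise building the index and calling `index.storage_context.persist(persist_dir=...)`. Either way, make sure `nodes` is actually non-empty when the constructor path runs — the `ValueError` means it was not.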
Hi, I have an error when running my extractors with pipeline.run: "object list can't be used in 'await' expression". Here is the code:
class CustomExtractor(BaseExtractor):
    def aextract(self, nodes):
        metadata_list = [
            {"custom": node.metadata["file_name"]}
            for node in nodes
        ]
        return metadata_list
short_summarizer_model = (
    BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")
    .to("cuda")
    .half()
)
short_summarizer = HuggingFaceLLM(
    tokenizer_name="facebook/bart-large-cnn",
    model_name="facebook/bart-large-cnn",
    model=short_summarizer_model,
)
keyword_extractor_model = (
    T5ForConditionalGeneration.from_pretrained("google-t5/t5-base")
    .to("cuda")
    .half()
)
keyword_extractor = HuggingFaceLLM(
    tokenizer_name="google-t5/t5-base",
    model_name="google-t5/t5-base",
    model=keyword_extractor_model,
)
extractors = [
    # QuestionsAnsweredExtractor(questions=3, llm=chatgpt),
    EntityExtractor(prediction_threshold=0.5),  # only for intervention paper
    SummaryExtractor(summaries=["prev", "self", "next"], llm=short_summarizer),
    KeywordExtractor(keywords=10, llm=keyword_extractor),
    CustomExtractor(),
]
transformations = [text_splitter] + extractors
pipeline = IngestionPipeline(transformations=transformations)

def add_meta_info(docs):
    nodes = pipeline.run(documents=docs)
    print(nodes[1].metadata)
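The error comes from `aextract` being defined as a plain method: the pipeline awaits the result of `aextract(nodes)`, and a plain method hands back a list, which is not awaitable. Declaring it `async def aextract(...)` makes it return a coroutine the pipeline can await. A minimal self-contained sketch of the difference (plain classes stand in for `BaseExtractor` and the pipeline here):

```python
import asyncio

class BadExtractor:
    def aextract(self, nodes):        # plain method: returns a bare list
        return [{"custom": n} for n in nodes]

class GoodExtractor:
    async def aextract(self, nodes):  # coroutine: result can be awaited
        return [{"custom": n} for n in nodes]

async def run(extractor):
    # The ingestion pipeline awaits aextract(); a bare list is not awaitable.
    return await extractor.aextract(["a.txt", "b.txt"])

try:
    asyncio.run(run(BadExtractor()))
except TypeError as exc:
    # Reproduces the reported "... can't be used in 'await' expression" error.
    print("plain method fails:", exc)

print("async method works:", asyncio.run(run(GoodExtractor())))
```

So the one-line fix in the post above is changing `def aextract(self, nodes):` to `async def aextract(self, nodes):`.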
For the document summary index, can I use another model instead of GPT to generate summaries, like "pszemraj/long-t5-tglobal-base-16384-book-summary" from Hugging Face?
My code is simple, just following the tutorials, but I get an empty response. This is my code: