I just wanted an opinion on possible approaches. One approach is the following:
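# import paths assume llama-index 0.10 or later; older releases import from `llama_index` directly
from llama_index.core import DocumentSummaryIndex, SimpleDirectoryReader
from llama_index.core.retrievers import VectorIndexRetriever

# assume a vector index of nodes is already created
retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=2,
)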
nodes = retriever.retrieve("Name the customers that have Gold plan support")
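# extract the doc_paths from the nodes; several nodes can come from the same file,
# so keep each path once (dict.fromkeys preserves retrieval order)
doc_paths = list(dict.fromkeys(node.metadata['file_path'] for node in nodes))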
# create summary index for those docs
relevant_docs = SimpleDirectoryReader(input_files=doc_paths).load_data()
doc_summary_index = DocumentSummaryIndex.from_documents(relevant_docs)
query_engine = doc_summary_index.as_query_engine()
response = query_engine.query('my query')
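
In case it helps when weighing this approach, here is a minimal sketch of the path-extraction step on its own, without any LlamaIndex dependency. The helper name `unique_doc_paths` and the `fake_nodes` stand-ins are hypothetical; they only assume that each retrieved node exposes a `metadata` dict that may contain a `file_path` key, as in the snippet above.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List


def unique_doc_paths(nodes: Iterable[Any], key: str = "file_path") -> List[str]:
    """Return each node's file path once, in retrieval order.

    Hypothetical helper: nodes without the metadata key are skipped
    instead of raising KeyError.
    """
    seen = {}
    for node in nodes:
        path = (node.metadata or {}).get(key)
        if path is not None:
            # dicts keep insertion order, so the first occurrence wins
            seen.setdefault(path, None)
    return list(seen)


# Stand-in objects that only mimic the `.metadata` attribute of retrieved nodes.
fake_nodes = [
    SimpleNamespace(metadata={"file_path": "docs/a.txt"}),
    SimpleNamespace(metadata={"file_path": "docs/b.txt"}),
    SimpleNamespace(metadata={"file_path": "docs/a.txt"}),  # same file, second chunk
    SimpleNamespace(metadata={}),                            # no path recorded
]

print(unique_doc_paths(fake_nodes))  # ['docs/a.txt', 'docs/b.txt']
```

The point of the sketch is that two chunks retrieved from the same file should lead to that file being loaded only once, and that a node missing the metadata key should not abort the whole run.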