Anyone able to elaborate on the difference between the use cases of the SummaryIndex and the DocumentSummaryIndex? It looks like the SummaryIndex is a linked list over the document nodes and then uses the refine query to run multiple LLM calls to summarise the document on the fly?
My RAG is poor at summaries at the moment because it's just chunked and vector-based. I was thinking of pre-summarising all the documents with something like a 7B model and then putting them into a keyword lookup store.
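A rough sketch of that pre-summarise-then-keyword-lookup idea, in plain Python with no LlamaIndex calls. Everything here is hypothetical: `summarize` is a placeholder for the 7B model call (it just keeps the first sentence), and `build_keyword_store`, `lookup`, and the tiny stopword list are made-up names for illustration only.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption, not a real resource).
STOPWORDS = {"the", "a", "an", "of", "and", "is", "in", "to", "for", "on"}


def summarize(text: str) -> str:
    """Placeholder for the small-model call: keeps only the first sentence."""
    return text.split(".")[0].strip() + "."


def keywords(text: str) -> set[str]:
    """Lowercased word tokens with the stopwords removed."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def build_keyword_store(docs: dict[str, str]):
    """Summarise every document once, then index doc ids by summary keyword."""
    summaries = {doc_id: summarize(text) for doc_id, text in docs.items()}
    store = defaultdict(set)
    for doc_id, summary in summaries.items():
        for word in keywords(summary):
            store[word].add(doc_id)
    return summaries, store


def lookup(query: str, summaries, store) -> list[tuple[str, str]]:
    """Rank documents by how many query keywords appear in their summary."""
    hits = defaultdict(int)
    for word in keywords(query):
        for doc_id in store.get(word, ()):
            hits[doc_id] += 1
    ranked = sorted(hits, key=lambda d: (-hits[d], d))
    return [(doc_id, summaries[doc_id]) for doc_id in ranked]


docs = {
    "paris": "Paris is the capital of France. It sits on the Seine.",
    "tokyo": "Tokyo is the capital of Japan. It is a very large city.",
}
summaries, store = build_keyword_store(docs)
results = lookup("capital of Japan", summaries, store)
```

The point is that the expensive summarisation happens once at build time, so a query only touches the stored summaries.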
What is the default store for these indexes? The document store?
Yeah, I think if you want to synthesize the response across all your nodes you'll want the SummaryIndex, and if you want separate summaries from each document stored for retrieval you'd use the DocumentSummaryIndex. So I guess unless you need everything in your index to be read at query time, the DocumentSummaryIndex might be more appropriate for you? And yeah, the default store should be the docstore.
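To make the trade-off concrete, here is a toy sketch of the two patterns as described above. It is not the LlamaIndex implementation: `CountingLLM` is a fake model that only counts calls, all function names are hypothetical, and the word-overlap selection is a simplification swapped in for whatever retrieval the real DocumentSummaryIndex uses.

```python
class CountingLLM:
    """Stand-in for a real LLM; it only counts calls and trims the prompt."""

    def __init__(self):
        self.calls = 0

    def complete(self, prompt: str) -> str:
        self.calls += 1
        return prompt[:60]


def list_style_query(nodes, question, llm):
    """SummaryIndex-like: visit every node at query time, refining one answer."""
    answer = ""
    for node in nodes:
        answer = llm.complete(f"{question} | so far: {answer} | new: {node}")
    return answer


def build_doc_summaries(docs, llm):
    """DocumentSummaryIndex-like: one summary per document, made at build time."""
    return {doc_id: llm.complete(text) for doc_id, text in docs.items()}


def summary_style_query(summaries, docs, question, llm):
    """Pick the doc whose stored summary overlaps the question most, then answer from it."""
    q = set(question.lower().split())
    best = max(sorted(summaries), key=lambda d: len(q & set(summaries[d].lower().split())))
    return best, llm.complete(f"{question} | context: {docs[best]}")


docs = {
    "paris": "paris is the capital of france",
    "tokyo": "tokyo is the capital of japan",
}
question = "what is the capital of japan"

# List style: every node is read on each query.
llm_a = CountingLLM()
list_style_query(list(docs.values()), question, llm_a)

# Summary style: pay once per document up front, then one call per query.
llm_b = CountingLLM()
summaries = build_doc_summaries(docs, llm_b)
build_calls = llm_b.calls
best, _ = summary_style_query(summaries, docs, question, llm_b)
query_calls = llm_b.calls - build_calls
```

In this toy, the list-style query costs one call per node on every query, while the summary style moves that cost to build time and spends a single call per query.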
How can I change the response language in the docstore.json? I've changed the system_prompt, but it didn't work:

    city_docs = []
    for file in pdf_files:
        docs = SimpleDirectoryReader(input_files=[file]).load_data()
        title = file.split(':')[0]
        docs[0].doc_id = title
        city_docs.extend(docs)