Optimization suggestions for a Q&A system

Does anyone have any insight on MMR vs. standard retrieval? For context, I have a large set of data files that I build my index from. The index uses the tree_summarize response mode with the text_qa_prompt, plus some custom prompt engineering (the data is rough and overlaps a lot, and this lets me keep it all in without manually scrubbing it). The problem is that the response from my server back to my application takes too long, and I'm trying to reduce my query time.
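To make the comparison concrete, here is a minimal sketch of how I understand the two strategies to differ. It is plain Python and not the library's implementation: the function names (`top_k`, `mmr`), the `lam` trade-off parameter, and the toy 2-D "embeddings" are all made up for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, docs, k):
    """Standard retrieval: the k chunks most similar to the query."""
    ranked = sorted(
        range(len(docs)),
        key=lambda i: cosine(query, docs[i]),
        reverse=True,
    )
    return ranked[:k]


def mmr(query, docs, k, lam=0.5):
    """Maximal marginal relevance (hypothetical helper, not a library call).

    Greedily picks the chunk that maximizes
        lam * similarity_to_query - (1 - lam) * max_similarity_to_already_picked
    so near-duplicates of an already selected chunk are penalized.
    """
    selected = []
    candidates = list(range(len(docs)))
    while candidates and len(selected) < k:

        def score(i):
            relevance = cosine(query, docs[i])
            redundancy = max(
                (cosine(docs[i], docs[j]) for j in selected), default=0.0
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data (assumed): chunks 0 and 1 are exact duplicates, chunk 2 is
# slightly less relevant but carries different information.
query = [1.0, 1.0]
docs = [[1.0, 0.8], [1.0, 0.8], [0.7, 1.0]]

standard_hits = top_k(query, docs, 2)
mmr_hits = mmr(query, docs, 2, lam=0.5)
```

On this toy data, standard retrieval returns both duplicate chunks, while MMR keeps one of them and picks the distinct chunk instead. My guess is that MMR by itself does not make a query faster; it would only help if the reduced overlap lets me retrieve fewer chunks, so that less text reaches the tree_summarize step.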
