Many of the more advanced RAG techniques (e.g. those outlined in https://gpt-index.readthedocs.io/en/latest/end_to_end_tutorials/dev_practices/production_rag.html) rely on generating summaries of the data that is being embedded. What do you use to generate these summaries? I find that GPT-4 is not usable with larger datasets because of its poor performance and rate limits that are too low, and GPT-3.5 sometimes does not generate good enough summaries. Are there any alternatives?
Compared with the open-source LLMs out there, OpenAI's GPT-4 and GPT-3.5 still seem like the best options to go with. There are other paid LLMs as well, such as Claude and PaLM. You can try them for your RAG use case too.
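If you want to try several models on the same data, it can help to keep the summarization step independent of any one provider. Below is a rough sketch of that idea, not a definitive implementation: `summarize_chunks`, `RateLimitError`, and `fake_summarize` are hypothetical names made up for illustration, and the backoff defaults are arbitrary assumptions. You would replace `fake_summarize` with a function that calls whichever model you are testing, and map that client's own rate-limit exception onto `RateLimitError`.

```python
import time
from typing import Callable, Dict, List


class RateLimitError(Exception):
    """Hypothetical stand-in for whatever rate-limit error your client raises."""


def summarize_chunks(
    chunks: List[str],
    summarize: Callable[[str], str],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Dict[str, str]]:
    """Summarize each chunk, retrying with exponential backoff on rate limits."""
    results = []
    for chunk in chunks:
        for attempt in range(max_retries + 1):
            try:
                summary = summarize(chunk)
                break
            except RateLimitError:
                # Give up after the last allowed retry, otherwise wait
                # 1x, 2x, 4x, ... the base delay before trying again.
                if attempt == max_retries:
                    raise
                sleep(base_delay * 2 ** attempt)
        # Keep the original text next to its summary so both can be embedded.
        results.append({"text": chunk, "summary": summary})
    return results


def fake_summarize(text: str) -> str:
    # Placeholder "model" (assumption): returns the first sentence only.
    # Swap in a real call to GPT-3.5, Claude, PaLM, or a local model here.
    return text.split(".")[0].strip() + "."


if __name__ == "__main__":
    docs = ["RAG retrieves context first. Then the LLM answers with it."]
    print(summarize_chunks(docs, fake_summarize))
```

Because the model call is just a function argument, you can run the same chunks through different providers and compare summary quality and throughput side by side.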