Cache-Augmented Generation (CAG) with Gemini or other LLMs integrated with LlamaIndex

Hi everyone,
Is there any implementation of Cache-Augmented Generation (CAG) with Gemini or other LLMs integrated with LlamaIndex?
5 comments
Correct me if I'm wrong, but doesn't CAG require direct model access (e.g. via PyTorch)? I don't think you can implement it over an API.
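For context on why direct model access matters: CAG precomputes the model's KV cache over a fixed knowledge prefix once, then reuses that cache for every query, so only the query tokens need a fresh forward pass. Hosted APIs generally don't expose the raw KV cache, which is the concern raised here. The stub below is a toy sketch of that pattern (no real LLM; `StubModel` and its token counting are made up for illustration) showing where the savings come from:

```python
# Toy sketch of the Cache-Augmented Generation (CAG) pattern.
# A real implementation would run an open-weights model (e.g. via
# Hugging Face transformers) and reuse its `past_key_values`; here a
# stub just counts "processed" tokens so the savings are visible.

class StubModel:
    def __init__(self):
        self.tokens_processed = 0

    def forward(self, tokens, past=None):
        # Only tokens not already covered by the cache are processed,
        # mirroring how a KV cache skips recomputing the prefix.
        self.tokens_processed += len(tokens)
        return (past or []) + list(tokens)  # stand-in for past_key_values

model = StubModel()
knowledge = ["doc_tok"] * 1000          # long, fixed knowledge prefix

# CAG step 1: pay the prefix cost once, up front.
prefix_cache = model.forward(knowledge)

# CAG step 2: each query reuses the cached prefix.
for query in (["q1"], ["q2"], ["q3"]):
    model.forward(query, past=prefix_cache)

# 1000 prefix tokens + 3 query tokens, vs. 3 * 1001 without caching.
print(model.tokens_processed)  # → 1003
```

This is why the technique is usually shown with local models: the cache object lives in your process. (Some hosted APIs offer server-side context caching as a managed alternative, but that's a different mechanism from holding the KV tensors yourself.)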
Thanks for sharing that with me.
I'm unclear about direct model access too, which is why I'm looking into how it might work with Gemini. It might not be doable!
As far as I know, Gemini has the longest context window of any LLM, so CAG is really only meaningful with Gemini!
There are lots of local models with large context windows too, though of course not as big as Gemini's.