Is there any kind of prompt caching in place? How can I intercept LLM calls to put a cache layer in front of them? Is there any built-in mechanism for this, instead of implementing it by hand?
I want a cache layer so that every prompt gets intercepted and run through the cache before hitting the LLM, and I want to avoid doing this manually for every request. Say I have an agent that makes multiple requests I don't have manual access to; I would have to rewrite all of those classes to support caching. Likewise, if I use a query engine for agent tools, I would have to rewrite the query engine so that it intercepts the prompts and checks the cache first. The sketch below shows the kind of interception I mean, and what I'd like to avoid wiring up by hand for every component.
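To make the idea concrete, here is a minimal, framework-agnostic sketch of the behavior I'm after: a wrapper that sits in front of any "prompt in, completion out" callable and short-circuits repeated prompts. The names (`CachingLLMWrapper`, `llm_complete`) are hypothetical, not from any library; the question is whether something like this can be hooked in globally rather than rewritten per class.

```python
import hashlib
from typing import Callable, Dict


class CachingLLMWrapper:
    """Wraps any callable LLM interface and short-circuits repeated prompts.

    `llm_complete` is a hypothetical callable (prompt -> completion text);
    in practice it would be whatever method the framework exposes.
    """

    def __init__(self, llm_complete: Callable[[str], str]):
        self._llm_complete = llm_complete
        self._cache: Dict[str, str] = {}

    def complete(self, prompt: str) -> str:
        # Hash the prompt so arbitrarily long prompts make a compact cache key.
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key in self._cache:
            return self._cache[key]          # cache hit: skip the LLM call
        result = self._llm_complete(prompt)  # cache miss: call the real LLM
        self._cache[key] = result
        return result
```

Wrapping one LLM object like this is easy; the problem is that agents, query engines, and tools each call the LLM internally, so without a central interception point I'd have to thread a wrapper like this through every one of those classes.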