Is there any kind of prompt caching in place?

Is there any kind of prompt caching in place? How can I intercept LLM calls to put a cache layer in front of them? Is there any mechanism in place to do this, instead of implementing it by hand?
8 comments
You can create a retriever, extract all the nodes, and then make the LLM call in your own format, adding all of the node content (rough sketch below).
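For reference, something like this sketch is what that could look like, assuming LlamaIndex-style imports and an LLM with a `complete()` method; the `./data` path, model name, and prompt template are just placeholders:

```python
# Sketch: retrieve nodes yourself and build the prompt manually,
# instead of going through a query engine.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms.openai import OpenAI  # any LLM exposing .complete() works

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)
retriever = index.as_retriever(similarity_top_k=3)

llm = OpenAI(model="gpt-4o-mini")

def answer(question: str) -> str:
    nodes = retriever.retrieve(question)
    # Concatenate the retrieved node contents into the context block.
    context = "\n\n".join(n.node.get_content() for n in nodes)
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    return str(llm.complete(prompt))
```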
yeah, but implicitly
not by explicitly creating a vector store
isn't there anything in the LLM abstraction?
Let me see if I got your query right:
  • You want to use the LLM without creating a vector store, but with prompt-caching capability?
So I want to have a cache layer so that every prompt gets intercepted and run through the caching layer.
I want to avoid doing it manually for every request. Let's say we have an agent that makes multiple requests that I don't have manual access to; I would have to rewrite all of those classes to be able to have a cache.
Let's say I use a query engine for agent tools: I would have to rewrite the query engine so that it intercepts the prompts and checks the cache.
Yeah, currently there is no built-in feature for this; you'll have to write it on your own (see the sketch below).
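Something along these lines could be a starting point. This is a minimal do-it-yourself sketch, not a library feature: it assumes the LLM object you hand to your query engines/agents exposes a `complete(prompt)` method, and the `CachedLLM` name and in-memory dict cache are illustrative placeholders.

```python
import hashlib


class CachedLLM:
    """Wraps an LLM and checks a cache before delegating to it.

    Sketch only: assumes the wrapped object has a .complete(prompt) method.
    Swap the dict for Redis/SQLite/etc. if you need persistence.
    """

    def __init__(self, llm, cache=None):
        self._llm = llm
        self._cache = {} if cache is None else cache

    def _key(self, prompt: str) -> str:
        # Hash the prompt so long prompts still produce fixed-size cache keys.
        return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

    def complete(self, prompt: str, **kwargs):
        key = self._key(prompt)
        if key in self._cache:
            return self._cache[key]                      # cache hit: no LLM call
        response = self._llm.complete(prompt, **kwargs)  # cache miss: call through
        self._cache[key] = response
        return response

    def __getattr__(self, name):
        # Anything not overridden here falls through to the real LLM.
        return getattr(self._llm, name)
```

If you pass `CachedLLM(your_llm)` everywhere an LLM is expected (query engines, agent tools, etc.), every prompt routed through that object gets intercepted without rewriting those classes. One caveat: this wrapper is not a subclass of the library's LLM base class, so components that do strict type checks may reject it; in that case you'd subclass the library's custom-LLM base and put the same cache logic there.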