I assume this comes up time and time again when models release new context windows, but I'm curious where RAG, vector DBs, etc. come into play with these attention-focused models?
You are referring to models with huge context windows, right? (e.g. Gemini's 1M-token context window)

I think RAG will always play a role. It's similar to how computers have L1, L2, L3 cache, separate from RAM, separate from hard drives.

Smaller input sizes mean faster speeds and lower costs.
RAG keeps the input to an LLM small (but still accurate).
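To make that concrete, here's a minimal sketch of the idea: embed the query, pull only the top-k most similar chunks from a vector index, and build the prompt from just those. Everything here is illustrative; `embed()` is a hypothetical stand-in for a real embedding model, and the chunks are made up.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Hypothetical stand-in: a real system would call an embedding model here.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(384)

# Tiny in-memory "vector db": pre-embedded document chunks (made-up examples)
chunks = [
    "Gemini offers a 1M-token context window.",
    "RAG retrieves only the chunks relevant to the query.",
    "L1/L2/L3 caches trade capacity for latency.",
]
index = np.stack([embed(c) for c in chunks])

def retrieve(query: str, k: int = 2) -> list[str]:
    q = embed(query)
    # Cosine similarity of the query against every stored chunk
    sims = index @ q / (np.linalg.norm(index, axis=1) * np.linalg.norm(q))
    return [chunks[i] for i in np.argsort(sims)[::-1][:k]]

# Only the top-k chunks go into the prompt, so the LLM input stays small
context = "\n".join(retrieve("How does RAG relate to long context windows?"))
prompt = f"Answer using this context:\n{context}\n\nQuestion: ..."
```

A real setup swaps the stand-in for an actual embedding model and vector store, but the shape is the same: the prompt only ever carries the top few chunks, never the whole corpus.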
Sure, cost and speed are concerns now, but time has shown that these things converge.
Has the speed of an L3 cache converged with RAM yet? Or hard drives? 👀

Idk, there is always going to be a cost to input size. People will make models more efficient, but that just means people will also make inputs bigger.