Managing large context in an agent-based workflow for YouTube lecture analysis

Hi everyone, I'm building a tool to analyze long YouTube lectures using an agent-based workflow, but I'm running into issues with managing the large context. What would be the best approach or tools to handle this efficiently without losing important information?

What I'm currently doing is splitting the transcript into fragments and passing each fragment through GPT-4o-mini, but the combined result is still too long to be processed in the agent workflow.
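Roughly, what I'm doing looks like this (a minimal sketch, assuming the official OpenAI Python SDK; the chunk size, prompt, and transcript file name are illustrative placeholders):

```python
# Minimal sketch of the chunk-and-summarize step described above.
# Assumes the official OpenAI Python SDK; chunk size, prompt, and
# the transcript file name are illustrative placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def chunk_text(text: str, max_chars: int = 8000) -> list[str]:
    """Split the transcript into roughly equal fragments."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def summarize_fragment(fragment: str) -> str:
    """Summarize one fragment with gpt-4o-mini."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Summarize this lecture fragment concisely."},
            {"role": "user", "content": fragment},
        ],
    )
    return response.choices[0].message.content

with open("lecture_transcript.txt") as f:  # placeholder transcript file
    transcript = f.read()

summaries = [summarize_fragment(c) for c in chunk_text(transcript)]
combined = "\n".join(summaries)  # this is what's still too long downstream
```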
4 comments
I mean, it feels like the two obvious ideas are
  • summarizing the context when it gets too long
  • dynamically retrieving what you need from the total context (i.e. this is basically RAG over an arbitrarily large context; see the sketch below)
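For the second idea, here's a minimal sketch of retrieval over the transcript chunks, assuming OpenAI embeddings and numpy; the embedding model name and top_k value are illustrative:

```python
# Minimal sketch of RAG over transcript chunks: embed every chunk once,
# then retrieve only the most relevant chunks for each agent step.
# Assumes the OpenAI Python SDK and numpy; model name and top_k are illustrative.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts: list[str]) -> np.ndarray:
    """Embed a batch of texts into one vector per text."""
    response = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([item.embedding for item in response.data])

chunks = ["...transcript fragment 1...", "...transcript fragment 2..."]  # from chunking
chunk_vectors = embed(chunks)  # computed once, can be cached

def retrieve(query: str, top_k: int = 3) -> list[str]:
    """Return the top_k chunks most similar to the query by cosine similarity."""
    q = embed([query])[0]
    sims = chunk_vectors @ q / (np.linalg.norm(chunk_vectors, axis=1) * np.linalg.norm(q))
    return [chunks[i] for i in np.argsort(sims)[::-1][:top_k]]

# The agent then only sees retrieve("What did the lecturer say about X?")
# instead of the full transcript.
```

The nice property is that the agent's context only ever contains the handful of chunks relevant to the current step, regardless of how long the lecture is.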
Okay, I really like the second idea. I'm going to try it out a bit, thank you very much.
Gemini. 2M context window. https://developers.googleblog.com/en/new-features-for-the-gemini-api-and-google-ai-studio/ . IIRC GPT-4o has 128K, which is awesome but sounds like it's not enough.
Also, I think Gemini can do it over the video itself. Google says you can reason over 1 hour of video, so...
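If that route sounds appealing, here's a minimal sketch of sending the video itself to Gemini, assuming the google-generativeai Python SDK; the file path, model name, and prompt are illustrative:

```python
# Minimal sketch of reasoning over the lecture video directly with Gemini.
# Assumes the google-generativeai SDK; the file path, model name, and
# prompt are illustrative placeholders.
import time
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key

# Upload the video, then wait for server-side processing to finish.
video = genai.upload_file(path="lecture.mp4")
while video.state.name == "PROCESSING":
    time.sleep(10)
    video = genai.get_file(video.name)

model = genai.GenerativeModel("gemini-1.5-pro")
response = model.generate_content(
    [video, "Summarize the key points of this lecture with timestamps."]
)
print(response.text)
```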