Managing large context in agent-based workflow for youtube lecture analysis

At a glance

A community member is building a tool to analyze long YouTube lectures with an agent-based workflow and is running into trouble managing the large context. Their current approach, splitting the transcript into fragments and passing each one through GPT-4o mini, still produces output that is too long for the agent workflow. Replies suggest two approaches: summarizing the context when it gets too long, or dynamically retrieving only what is needed from the full context (essentially RAG). The original poster plans to try the second. Another member points to the Gemini API, whose 2M-token context window is far larger than the 128K they recall for GPT-4o, and adds that Gemini may also be able to reason over the video itself.

Saúl

Hi everyone, I'm building a tool to analyze long YouTube lectures using an agent-based workflow, but I'm running into issues with managing the large context. What would be the best approach or tools to handle this efficiently without losing important information?

What I'm currently doing is splitting the text into fragments and passing each fragment through GPT-4o mini, but the result is still too long to be processed in the agent workflow.
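
For reference, the splitting step described here could look roughly like the sketch below. It is only an illustration, not the poster's actual code: the function name `split_into_chunks`, the word-based sizing, and the default sizes are all assumptions. Consecutive fragments share a few words so that a sentence cut at a boundary is still seen whole in one of them.

```python
def split_into_chunks(text: str, max_words: int = 200, overlap: int = 20) -> list[str]:
    """Split text into word-based chunks; consecutive chunks share `overlap` words."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    chunks = []
    step = max_words - overlap  # how far the window advances each time
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


if __name__ == "__main__":
    transcript = " ".join(f"word{i}" for i in range(1000))  # stand-in for a real transcript
    for i, chunk in enumerate(split_into_chunks(transcript)):
        print(i, len(chunk.split()))
```

Each fragment would then go through the model separately; the sizes would need tuning against the model's actual token limits.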

4 comments

Logan M

I mean, it feels like the two obvious ideas are:

1. summarizing the context when it gets too long
2. dynamically retrieving what you need from the total context (i.e. this is basically RAG over some infinitely sized context)
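
A minimal sketch of the first idea, summarizing once the context grows past a budget, might look like this. Everything here is hypothetical: `compact_context`, the character-based budget, and `placeholder_summarizer` (which only truncates, where a real workflow would call a model) are assumptions for illustration.

```python
from typing import Callable


def compact_context(
    items: list[str],
    summarize: Callable[[str], str],
    max_chars: int,
    keep_recent: int = 2,
) -> list[str]:
    """Collapse older items into a single summary once the total size exceeds max_chars."""
    total = sum(len(item) for item in items)
    split = len(items) - keep_recent  # index where the "recent" items begin
    if total <= max_chars or split <= 0:
        return items  # still within budget, or nothing old enough to summarize
    older, recent = items[:split], items[split:]
    summary = summarize("\n".join(older))
    return [summary] + recent


def placeholder_summarizer(text: str) -> str:
    """Placeholder only: truncates instead of summarizing. Swap in a real model call."""
    return text[:200]


if __name__ == "__main__":
    notes = [f"Notes for lecture section {i}: " + "details " * 50 for i in range(6)]
    compacted = compact_context(notes, placeholder_summarizer, max_chars=1000)
    print(len(notes), "->", len(compacted), "items")
```

The most recent items are kept verbatim so the agent still sees exact wording for whatever it is currently working on; only older material is compressed.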

Saúl

Okay, I really like the second idea, I'm going to try it out a bit, thank you very much.
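
As a rough illustration of that second idea, the sketch below ranks transcript chunks against a question and returns only the best matches. It assumes plain word-count vectors and cosine similarity as a stand-in for the embedding model a real RAG setup would normally use; the names `tokenize`, `cosine`, and `retrieve` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, chunks: list[str], top_k: int = 3) -> list[str]:
    """Return up to top_k chunks most similar to the query, dropping zero-score chunks."""
    query_vec = Counter(tokenize(query))
    scored = [(cosine(query_vec, Counter(tokenize(c))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored[:top_k] if score > 0]


if __name__ == "__main__":
    chunks = [
        "Gradient descent updates the weights using the learning rate.",
        "The lecturer introduces the course schedule and grading.",
        "Backpropagation computes gradients layer by layer.",
    ]
    print(retrieve("how does the learning rate affect gradient descent", chunks, top_k=1))
```

Only the retrieved chunks are passed to the agent, so the prompt size depends on `top_k` rather than on the length of the lecture.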

Anna

Gemini. 2M context window: https://developers.googleblog.com/en/new-features-for-the-gemini-api-and-google-ai-studio/ IIRC GPT-4o has 128K, which is awesome, but it sounds like it's not enough.
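
To get a feel for whether a transcript fits in either window, a quick back-of-the-envelope check could look like the sketch below. The figure of about 4 characters per token is only a rough heuristic for English text, not an exact tokenizer count, and `estimate_tokens`, `fits_in_window`, and the output reserve are assumptions for illustration.

```python
import math

# Window sizes as quoted in the thread, in tokens.
WINDOW_128K = 128_000
WINDOW_2M = 2_000_000


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; ~4 characters per token is a heuristic, not a real tokenizer."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_window(text: str, window_tokens: int, reserve_for_output: int = 4_000) -> bool:
    """Check whether the text plus a reserve for the model's reply fits in the window."""
    return estimate_tokens(text) + reserve_for_output <= window_tokens


if __name__ == "__main__":
    transcript = "x" * 600_000  # stand-in for a long lecture transcript
    print("estimated tokens:", estimate_tokens(transcript))
    print("fits in 128K:", fits_in_window(transcript, WINDOW_128K))
    print("fits in 2M:", fits_in_window(transcript, WINDOW_2M))
```

For a real decision, the count should come from the tokenizer of the model in question, since the ratio varies by language and content.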

Anna

Also I think Gemini can do it over the video itself. Google says that you can reason over 1 hour of video so...
