
Updated 3 weeks ago

Monitoring Runtime Context and Memory Usage

@Logan M Can I find out the currently available context size, the memory used, and the maximum tokens used at runtime? That way, when usage approaches the limit, I can reset the variables and the chat engine so that it doesn't hit the limit and fail with an error.
1 comment
You can check the total context length available for each OpenAI model here: https://github.com/run-llama/llama_index/blob/aa1f5776787b8b435f89d2c261fd7ca8002c1f19/llama-index-integrations/llms/llama-index-llms-openai/llama_index/llms/openai/utils.py#L39


To check how many tokens remain, you can add the instrumentation module: https://docs.llamaindex.ai/en/stable/examples/instrumentation/instrumentation_observability_rundown/

Take the LLM event, extract the token counts from it, and then update the running token total based on the new values.
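The bookkeeping part of that could look roughly like the sketch below. It is plain Python and does not call LlamaIndex itself: `TokenBudget`, `reset_ratio`, and the 8192 window are hypothetical names and values for illustration, and the token counts passed to `record` are assumed to come from whatever usage data your LLM event exposes. It also assumes each chat call resends the conversation history, so the latest call's prompt plus completion is treated as the current context size.

```python
from dataclasses import dataclass


@dataclass
class TokenBudget:
    """Tracks the latest token usage against a model's context window."""

    context_window: int        # total context length, e.g. taken from the linked table
    reset_ratio: float = 0.8   # hypothetical threshold: reset at 80% of the window
    used: int = 0

    def record(self, prompt_tokens: int, completion_tokens: int) -> None:
        # Assumption: the prompt already contains the chat history, so the
        # most recent call reflects how much context is currently in use.
        self.used = prompt_tokens + completion_tokens

    @property
    def remaining(self) -> int:
        # Tokens left before the window is full (never negative).
        return max(self.context_window - self.used, 0)

    def should_reset(self) -> bool:
        # True once usage reaches the chosen fraction of the window.
        return self.used >= self.context_window * self.reset_ratio

    def reset(self) -> None:
        # Clear the counter; reset your chat engine at the same time.
        self.used = 0


# Example usage with made-up numbers.
budget = TokenBudget(context_window=8192)
budget.record(prompt_tokens=6000, completion_tokens=700)
if budget.should_reset():
    budget.reset()  # this is where you would also reset the chat engine
```

In your event handler you would call `record` each time an LLM call finishes, then check `should_reset` before sending the next message. Tune the threshold to leave room for the next prompt and reply.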