luiluilui8582
Offline, last seen 3 months ago
Joined September 25, 2024

VRAM usage


Hi, I just started learning LlamaIndex and I'm going through this tutorial. I'm using an RTX 3070 with 8 GB of VRAM, and loading the mistral-instruct-7B-Q4_K_M LLM (4.37 GB) with LM Studio.

Everything was fine at first. I kept asking questions about my document until, at some point, it returned an insufficient-memory error.

So I printed out the VRAM usage at different points in the code, and realised that LlamaIndex uses much more VRAM than I expected, especially since my test text is only 5.4 KB.
From loading the small text file to building the VectorStoreIndex(), 2.2 GB was needed.
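For context, this is roughly how I measure it (a minimal sketch; `print_vram` and `parse_vram_mib` are my own helper names, shelling out to nvidia-smi):

```python
import shutil
import subprocess

def parse_vram_mib(nvidia_smi_line: str) -> int:
    """Parse one line of `nvidia-smi --query-gpu=memory.used --format=csv,noheader`,
    e.g. "2210 MiB" -> 2210."""
    return int(nvidia_smi_line.strip().split()[0])

def print_vram(label: str) -> None:
    """Print current GPU memory usage, if nvidia-smi is on PATH."""
    if shutil.which("nvidia-smi") is None:
        print(f"{label}: nvidia-smi not found")
        return
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    for line in out.strip().splitlines():
        print(f"{label}: {parse_vram_mib(line)} MiB used")

# I sprinkle these calls around the indexing steps, e.g.:
# print_vram("before VectorStoreIndex")
# index = VectorStoreIndex.from_documents(documents)
# print_vram("after VectorStoreIndex")
```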
I have 3 questions:
  1. Why is it using so much memory?
  2. Is there any way to mitigate this?
  3. How can I estimate how much memory is needed for my document?
This is my code, along with the output from nvidia-smi.

Thank you very much 🙏
3 comments