luiluilui8582
Offline, last seen 3 months ago
Joined September 25, 2024

VRAM usage


Hi, I just started learning LlamaIndex and I'm going through this tutorial. I'm using an RTX 3070 with 8 GB of VRAM, and loading the mistral-instruct-7B-Q4_K_M LLM (4.37 GB) with LM Studio.

Everything was fine at first. I kept asking questions about my document until, at some point, it returned an insufficient-memory error.

So I printed out the VRAM usage at different points in the code, and realised that LlamaIndex uses much more VRAM than I expected, especially since my test text is only 5.4 KB.
From loading the small text file to building the VectorStoreIndex(), 2.2 GB was needed.
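For context, this is roughly how I measure it (a minimal sketch; `print_vram` and `parse_vram_mib` are my own helper names, shelling out to nvidia-smi):

```python
import shutil
import subprocess

def parse_vram_mib(nvidia_smi_line: str) -> int:
    """Parse one line of `nvidia-smi --query-gpu=memory.used --format=csv,noheader`,
    e.g. "2210 MiB" -> 2210."""
    return int(nvidia_smi_line.strip().split()[0])

def print_vram(label: str) -> None:
    """Print current GPU memory usage, if nvidia-smi is on PATH."""
    if shutil.which("nvidia-smi") is None:
        print(f"{label}: nvidia-smi not found")
        return
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    for line in out.strip().splitlines():
        print(f"{label}: {parse_vram_mib(line)} MiB used")

# I sprinkle these calls around the indexing steps, e.g.:
# print_vram("before VectorStoreIndex")
# index = VectorStoreIndex.from_documents(documents)
# print_vram("after VectorStoreIndex")
```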
I have 3 questions:
  1. Why is it using so much memory?
  2. Is there any way to mitigate this?
  3. How can I estimate how much memory is needed for my document?
This is my code, along with the output from nvidia-smi.

Thank you very much 🙏
3 comments