
Updated 9 months ago

Is it possible to offload GPU VRAM?

I load an embedding model onto the CUDA GPU like this:
Plain Text
# llama-index >= 0.10 style imports
from llama_index.core import Settings, VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Split documents into 512-token chunks with a 64-token overlap
Settings.node_parser = SentenceSplitter(chunk_size=512, chunk_overlap=64)
Settings.chunk_size = 512
Settings.chunk_overlap = 64

# https://huggingface.co/OrdalieTech/Solon-embeddings-large-0.1
embed_model_name = "OrdalieTech/Solon-embeddings-large-0.1"
embed_model = HuggingFaceEmbedding(model_name=embed_model_name)
Settings.embed_model = embed_model

.....
vector_store_index = VectorStoreIndex.from_documents(documents=documents, show_progress=True)

Then, for another computation, I have to load a second embedding model, but I get a CUDA out-of-memory error because the previous model is still resident in GPU VRAM.
1 comment
I always find it hard to clear CUDA memory without exiting the program.

My best guess is something like

Plain Text
import gc
import torch

# Drop the Python reference, let the garbage collector reclaim the weights,
# then ask PyTorch to release its cached CUDA blocks back to the driver
del embed_model
gc.collect()
torch.cuda.empty_cache()
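
Note that in the question above, Settings.embed_model also holds a reference to the model, so deleting only the local embed_model name may not be enough for the garbage collector to reclaim it. Below is a minimal sketch of a full release-then-reload sequence, assuming llama-index >= 0.10 (where Settings.embed_model is the global setting) and using "BAAI/bge-m3" purely as a placeholder for the second model; any other live object that still references the first model (an index, a query engine) will likewise keep it in VRAM.

Plain Text
import gc
import torch

from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# 1. Drop every reference to the first model. Settings keeps one too, so
#    clear it as well (recent llama-index versions substitute a mock
#    embedding when you assign None).
Settings.embed_model = None
del embed_model

# 2. Reclaim the unreferenced weights, then return PyTorch's cached CUDA
#    blocks to the driver.
gc.collect()
torch.cuda.empty_cache()

# Optional sanity check: allocated VRAM should drop once the old model is gone.
print(f"{torch.cuda.memory_allocated() / 1024**2:.0f} MiB still allocated")

# 3. Only now load the second model ("BAAI/bge-m3" is just a placeholder).
second_model = HuggingFaceEmbedding(model_name="BAAI/bge-m3")
Settings.embed_model = second_model

Loading the second model only after the cleanup matters: if both models are resident at the same time, you reproduce exactly the out-of-memory error described in the question.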