
Updated 2 years ago

Hello what is the proper way to unload

Hello, what is the proper way to unload HuggingFaceLLMPredictor, LangchainEmbedding and ServiceContext to release GPU memory? Thanks
3 comments
I think I kinda figured it out. I can first call llm_predictor.model.to('cpu'), then del llm_predictor and call torch.cuda.empty_cache().
Hmm, I think I still need to unload embed_model as well.
I guess I can delete embed_model._langchain_embedding.client
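
For anyone landing here later, the steps above can be sketched roughly as follows. This is only a sketch under assumptions, not an official LlamaIndex recipe. The helper names move_to_cpu and free_cuda_cache are made up for illustration, and the attribute paths (llm_predictor.model, embed_model._langchain_embedding.client) are taken from the comments above rather than checked against any particular library version. PyTorch can only release memory once nothing references the model any more, so the ServiceContext (and any index built from it) likely has to be deleted too.

```python
import gc

try:
    import torch
except ImportError:  # allows reading/running the sketch without PyTorch installed
    torch = None


def move_to_cpu(module):
    """Hypothetical helper: move a module-like object to CPU if it has .to()."""
    if module is not None and hasattr(module, "to"):
        module.to("cpu")


def free_cuda_cache():
    """Hypothetical helper: collect garbage, then release cached GPU blocks."""
    gc.collect()  # drop unreachable Python objects that still hold tensors
    if torch is not None and torch.cuda.is_available():
        torch.cuda.empty_cache()  # return unused cached memory to the driver


# Intended use with the objects from this thread (attribute paths are
# assumptions taken from the comments above):
#
#   move_to_cpu(getattr(llm_predictor, "model", None))
#   lc = getattr(embed_model, "_langchain_embedding", None)
#   move_to_cpu(getattr(lc, "client", None))
#   del llm_predictor, embed_model, service_context
#   free_cuda_cache()
```

The del has to happen in the scope that owns the variables; deleting a name inside a helper function would not remove the caller's reference. Note that torch.cuda.empty_cache() only returns cached blocks that are no longer in use, so memory still referenced from somewhere (a notebook output cell, an index object, and so on) will stay allocated.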