3.5GB memory usage [100% of my Azure setup] when a VectorIndex is connected to Qdrant with just 3,333 chunks and only 1 user query. Is this expected? This is my code: I inserted the chunks on my local machine, deployed the Python code to Azure, and was just retrieving based on metadata filters. I'm baffled why the memory goes up to 100%.
The only difference in this query pipeline is that I am NOT using ChatMemoryBuffer, and my ColBERT reranker is part of my source code (within the container itself).
The only thing I'm seeing that uses memory is the reranker. Unless you have enable_hybrid=True on Qdrant, that will also run a local model and eat memory.
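For reference, hybrid mode is set when the vector store is constructed. A minimal sketch of that configuration (the client URL and collection name are placeholders, not taken from this thread):

```python
from qdrant_client import QdrantClient
from llama_index.vector_stores.qdrant import QdrantVectorStore

client = QdrantClient(url="http://localhost:6333")  # placeholder URL

# enable_hybrid=False (the default) means no local sparse-embedding model
# gets loaded into the process; enable_hybrid=True pulls one in, and with
# multiple workers that cost is paid once per worker.
vector_store = QdrantVectorStore(
    client=client,
    collection_name="my_collection",
    enable_hybrid=False,
)
```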
The LLM is Llama 3 via Groq, and hybrid = false for Qdrant. But ColBERT gets downloaded as model.safetensors anyway, right? So I figured, why download it every time, especially with Gunicorn multi-workers. If you say that I don't need the Azure Table docstore, then do I need to release memory somehow? Any new node retrieved for a different query seems to stay in memory.
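A rough back-of-the-envelope for the multi-worker effect. Each Gunicorn worker is a separate process, so a model loaded after the fork is duplicated once per worker; the sizes below are illustrative assumptions, not measurements from this deployment:

```python
# Illustrative only: assumed sizes, not measured values.
model_mb = 440           # assumed ColBERT checkpoint size in MB
workers = 4              # assumed Gunicorn worker count
other_overhead_mb = 300  # assumed per-worker Python + library overhead

# Without sharing, every worker holds its own copy of the model.
per_worker_mb = model_mb + other_overhead_mb
total_mb = per_worker_mb * workers
print(f"~{total_mb} MB resident across {workers} workers")  # ~2960 MB
```

With Gunicorn's `--preload` flag the app (and anything loaded at import time) is loaded once before the fork, so read-only pages like model weights can be shared between workers via copy-on-write instead of duplicated.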
Thanks a bunch @Logan M, I completely removed ColBERT and it works like a charm now. Will need to look for a paid API for ColBERT, but that's a different pain. Really appreciate your help in this regard.