Memory leak?

At a glance

The community member is experiencing a memory leak issue when indexing multiple-page documents. They have tried using the insert_nodes() method, but the memory is not freed after the nodes are added to the database. The memory is only freed when the entire operation is completed.

Another community member suggests that the issue could be related to the token counter, as it may be holding data after each call to the LLM and embedding. They recommend resetting the token counter to clear the data it is holding.

The community members have not found a definitive solution yet, but they are actively investigating the issue and discussing potential fixes, such as overriding the insert_nodes() method in the vector store class.

Memory leak?

Hi, I found that when I index multiple-page documents, my server stops working at some point. After investigation, I found that this code consumes memory, which probably causes a huge memory leak and, as a result, brings the server down (see THE LAST LINE):
Plain Text
    vector_store = storage_service.get_vector_store(collection_name, db_name)
    storage_context = StorageContext.from_defaults(vector_store=vector_store)
    embed_model = OpenAIEmbedding(mode='similarity', embed_batch_size=2000, api_key=user_settings_data.item.get('openai_key'))

    service_context = ServiceContext.from_defaults(chunk_size=chunk_size, embed_model=embed_model,
                                                    llm=None,
                                                    callback_manager=token_counter_callback_manager)
    node_parser = SimpleNodeParser.from_defaults(chunk_size=chunk_size, chunk_overlap=20)
    VectorStoreIndex(nodes, storage_context=storage_context, service_context=service_context) # <== THIS

I was thinking that maybe I can create a vector store index just once and only add nodes to it, but it's not working:

Plain Text
index = VectorStoreIndex.from_vector_store(vector_store=vector_store, service_context=service_context)
index._add_nodes_to_index(nodes=content_nodes) # <== EXCEPTION: index structure is not provided

Please help, thanks!
26 comments
have you tried index.insert_nodes()?
Yes, I found this method too and it looks like it works. But I still see something that could be considered a memory leak. I've added code to watch the memory consumption and printed the current data:
Plain Text
update_vector_index: memory before: 539,373,568, after: 539,381,760, consumed: 8,192; exec time: 00:00:05
update_vector_index: memory before: 539,381,760, after: 539,402,240, consumed: 20,480; exec time: 00:00:04
update_vector_index: memory before: 539,402,240, after: 539,480,064, consumed: 77,824; exec time: 00:00:08
update_vector_index: memory before: 539,484,160, after: 539,648,000, consumed: 163,840; exec time: 00:00:06
update_vector_index: memory before: 539,648,000, after: 539,648,000, consumed: 0; exec time: 00:00:04
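(For reference, a rough sketch of how such before/after numbers could be captured. The actual measurement code wasn't shared; using psutil here is an assumption, and log_memory is a made-up helper name:)
Plain Text
import psutil, time
def log_memory(label, fn, *args, **kwargs):
    # Measure resident memory (RSS) of the current process around a single call
    proc = psutil.Process()
    before = proc.memory_info().rss
    start = time.monotonic()
    result = fn(*args, **kwargs)
    after = proc.memory_info().rss
    print(f"{label}: memory before: {before:,}, after: {after:,}, "
          f"consumed: {after - before:,}; exec time: {time.monotonic() - start:.0f}s")
    return result
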
hmmm I saw an issue on our github related to memory this morning
Haven't had time to investigate yet
index.insert(document) or index.insert_nodes(nodes) would be the correct methods to use though btw
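(Roughly, a sketch of that pattern using the names from the original snippet, vector_store, service_context, and content_nodes; it is not a verified fix:)
Plain Text
# Build the index object once on top of the existing vector store,
# then use the public insert methods instead of _add_nodes_to_index()
index = VectorStoreIndex.from_vector_store(
    vector_store=vector_store, service_context=service_context
)
index.insert_nodes(content_nodes)  # for pre-parsed nodes
# index.insert(document)           # or, for a whole Document object
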
What vector store are you using? Is it hosted on the same machine?
technically when you ingest documents, the vectors will be in memory until the code puts them in the vector store
I'm using the Supabase (or actually, Postgres) vector store. No, the app and the database are hosted on different machines (in AWS). So, does the "insert_nodes" method put data into the database?
I did more tests and found that the memory is not freed after nodes are put into the database. I added a breakpoint right after the line with the "insert_nodes" call and checked whether the database was affected. It was, but the memory wasn't freed. It is only freed when the whole operation is done.
Plain Text
update_vector_index: memory before: 488,484,864, after: 491,327,488, consumed: 2,842,624; exec time: 00:00:02
update_vector_index: memory before: 491,470,848, after: 492,474,368, consumed: 1,003,520; exec time: 00:00:00

And here is the data between 2 loops:
Plain Text
update_vector_index: memory before: 528,179,200, after: 528,179,200, consumed: 0; exec time: 00:00:00 <== The last update of loop 1

prepare_index_objects: memory before: 486,137,856, after: 486,297,600, consumed: 159,744; exec time: 00:00:00 <== Before calling insert_nodes

Which means LlamaIndex waits until the whole cycle is finished.
Sorry, just to clarify, the memory is or isn't freed after calling insert_nodes()?
No, it's not
Okay, let me describe it more precisely. I have N documents to be inserted (I call this one loop). When m nodes are put into the database, the memory is not freed. It's only freed when all N documents are finished.
OK-- that should be reproducible then.

Let me see if I can figure something out in a bit
Thanks a lot! That would be very, very helpful. I noticed that sometimes the memory usage is about 95% or so, and very often, when the number of pages to be indexed is huge (like hundreds), it just crashes the server.
My expectation is that insert_nodes will create and store vectors in memory, which could be a lot if you call it on a ton of nodes.

BUT it should be freed after that function finishes, and apparently it isn't, so that would be issue #1 to figure out
I feed insert_nodes no more than 10 documents at once, and the amount of memory used per call is not huge, but since the memory is not freed after each call, it becomes a problem.
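(For illustration, a hedged sketch of that per-batch loop. all_nodes and the batch size are placeholders, and the gc.collect() is only a diagnostic to rule out lingering Python-level references, not a confirmed fix:)
Plain Text
import gc
BATCH_SIZE = 10
for i in range(0, len(all_nodes), BATCH_SIZE):       # all_nodes is a placeholder list of nodes
    index.insert_nodes(all_nodes[i:i + BATCH_SIZE])
    gc.collect()  # check whether anything is actually reclaimable between calls
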
Any updates?
So for your case, I think I do know the issue.

For the github issue I linked above, I have no idea.
For you, calling insert_nodes() automatically updates the docstore, even if you aren't using it. And the docstore is held in memory.
So that I can fix by overriding insert_nodes in the VectorStoreIndex class
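(One quick way to check whether the in-memory docstore is what's growing, using index and content_nodes from the earlier snippets; this is just a diagnostic sketch:)
Plain Text
print(len(index.docstore.docs))   # nodes currently held in the in-memory docstore
index.insert_nodes(content_nodes)
print(len(index.docstore.docs))   # if this keeps climbing across batches, that's the growth
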
Hi, thanks for your thoughts, but can you please elaborate a bit? I'm not sure what you mean by "I can fix by overriding insert_nodes in the VectorStoreIndex class". And what do you mean by the docstore being updated in memory?
I meant I need to make a PR to fix this in the VectorStoreIndex class

But actually looking at the code again, I was wrong. I think this is probably a similar issue to the github ticket, which tbh has proved very hard to debug.

So, back to square zero
perhaps it is the token counter, if you have that attached
Do you mean that the token counter could be the reason for the memory leak?
yea, because it's holding data after every LLM and embedding call

token_counter.reset_counts() will clear the data it is holding
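(A minimal sketch of that, assuming the TokenCountingHandler from the legacy llama_index callbacks is the counter attached via the callback manager in the original snippet; node_batches is a made-up name:)
Plain Text
import tiktoken
from llama_index.callbacks import CallbackManager, TokenCountingHandler
token_counter = TokenCountingHandler(
    tokenizer=tiktoken.encoding_for_model("gpt-3.5-turbo").encode
)
token_counter_callback_manager = CallbackManager([token_counter])
# ... build service_context / index with this callback manager, as in the original snippet ...
for batch in node_batches:        # hypothetical iterable of node batches
    index.insert_nodes(batch)
    token_counter.reset_counts()  # drop the per-call payloads the counter accumulates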