hmmm, does the auto-merging query engine load something into VRAM that wasn't loaded before? I'm hosting both the LLM and the embedding model in a different service. However, since the new llama-index upgrade, the RAG pipeline has started loading things into VRAM 🤔
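For reference, my setup is roughly like this (simplified sketch; the model names and URLs are placeholders, and I'm showing OpenAI-compatible endpoints here just for illustration):

```python
from llama_index.core import Settings
from llama_index.llms.openai_like import OpenAILike
from llama_index.embeddings.openai import OpenAIEmbedding

# Remote LLM -- nothing should be loaded locally for this.
Settings.llm = OpenAILike(
    model="my-hosted-llm",               # placeholder model name
    api_base="http://llm-host:8000/v1",  # placeholder endpoint
    api_key="unused",
)

# Remote embeddings -- pinned explicitly so the pipeline shouldn't
# resolve to any locally loaded embedding model.
Settings.embed_model = OpenAIEmbedding(
    model="my-hosted-embeddings",          # placeholder model name
    api_base="http://embed-host:8001/v1",  # placeholder endpoint
    api_key="unused",
)
```

With both `Settings.llm` and `Settings.embed_model` pointing at remote services like this, I'd expect nothing in the pipeline to allocate VRAM on the machine running llama-index.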