import chromadb
from llama_index.core import Settings, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama
from llama_index.vector_stores.chroma import ChromaVectorStore

# Load the persisted Chroma collection and wrap it as a LlamaIndex vector store
chroma_client = chromadb.PersistentClient(path=chroma_persistent_dir)
chroma_collection = chroma_client.get_collection(collection_name)
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)

llm = Ollama(model="llama3:70b-instruct", request_timeout=3000.0)
embed_model = HuggingFaceEmbedding(model_name="Alibaba-NLP/gte-large-en-v1.5", trust_remote_code=True, embed_batch_size=2)
Settings.llm = llm
Settings.embed_model = embed_model

# Build an index over the already-embedded vectors (no re-embedding happens here)
index = VectorStoreIndex.from_vector_store(vector_store, embed_model=embed_model)
The ChromaDB collection is already saved to disk; all I'm doing is loading it and creating an index with VectorStoreIndex.from_vector_store. Does this actually load the index into the GPU's VRAM? I don't have a lot of VRAM left after loading a big model plus the embedding model.
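One way to check this empirically (a minimal sketch, assuming PyTorch with CUDA is installed, which HuggingFaceEmbedding already requires for GPU use) is to compare allocated GPU memory before and after building the index. Note that torch.cuda.memory_allocated only tracks tensors allocated by PyTorch in the current process, so Ollama's VRAM usage won't show up here:

import torch

torch.cuda.synchronize()
before = torch.cuda.memory_allocated()

# Rebuild the index and measure how much PyTorch-managed VRAM it adds
index = VectorStoreIndex.from_vector_store(vector_store, embed_model=embed_model)

torch.cuda.synchronize()
after = torch.cuda.memory_allocated()
print(f"VRAM delta from building the index: {(after - before) / 1024**2:.1f} MiB")

Watching nvidia-smi before and after also works, and catches allocations made outside PyTorch.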