RuntimeError: MPS backend out of memory at the get_query_embedding step (specifically, https://github.com/run-llama/llama_index/blob/main/llama_index/embeddings/huggingface.py#L139). What's the difference between using get_query_embedding and just getting the embeddings from a Hugging Face model?