A community member is experiencing slow performance when using the instructor-xl model to build a vector database with LlamaIndex: processing 23 vectors takes about 8 minutes on Colab via the HF-LangChain wrapper. Another community member suggests the model is too large to run quickly on a CPU, and the original poster agrees this is the likely cause.
Hey! Has anyone faced speed issues with custom embedding models? I use the instructor-xl model to create a vector DB with LlamaIndex, but it is extremely slow: 23 vectors take around 8 minutes. Running on Colab and using the HF-LangChain wrapper.
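A minimal sketch of the usual fix, assuming the setup implied in the thread: the older pre-split `llama-index` and `langchain` packages (where `LangchainEmbedding` and `HuggingFaceInstructEmbeddings` lived at these import paths; newer releases have moved them) and the `hkunlp/instructor-xl` checkpoint. The key step is selecting a Colab GPU runtime and passing `device` through `model_kwargs` so the model doesn't silently run on CPU:

```python
import torch
from langchain.embeddings import HuggingFaceInstructEmbeddings
from llama_index import LangchainEmbedding, ServiceContext

# Use the Colab GPU if one is attached (Runtime > Change runtime type > GPU);
# instructor-xl is a ~5 GB model and is very slow on CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

embed_model = LangchainEmbedding(
    HuggingFaceInstructEmbeddings(
        model_name="hkunlp/instructor-xl",
        model_kwargs={"device": device},  # without this, it defaults to CPU
    )
)

service_context = ServiceContext.from_defaults(embed_model=embed_model)
# Then build the index as usual, e.g.:
# index = VectorStoreIndex.from_documents(documents, service_context=service_context)
```

If a GPU isn't an option, a smaller checkpoint such as instructor-large or instructor-base trades some embedding quality for a much more tolerable CPU runtime.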