Hello, I have been following this tutorial:
https://gpt-index.readthedocs.io/en/latest/examples/llm/llama_2_llama_cpp.html. My problem is that the query function takes an extremely long time (around 8-10 minutes). I know this is a common problem with llamacpp, but llamacpp works reasonably well for plain prompting and answering. The problem starts with QA, indexing, embedding, and so on. I can share my code as well if needed. Any help is appreciated.