These are the numbers:

llama_print_timings: load time = 16589.32 ms
llama_print_timings: sample time = 47.42 ms / 199 runs ( 0.24 ms per token, 4196.28 tokens per second)
llama_print_timings: prompt eval time = 81969.71 ms / 1843 tokens ( 44.48 ms per token, 22.48 tokens per second)
llama_print_timings: eval time = 26429.32 ms / 198 runs ( 133.48 ms per token, 7.49 tokens per second)
llama_print_timings: total time = 108885.07 ms / 2041 tokens
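
In case it helps anyone compare against their own runs, here is a minimal sketch that recomputes the per-token figures from the raw times and counts in the log above. The `throughput` helper and the regex are my own illustration, not part of llama.cpp, and they assume the log lines keep the `<stage> time = <ms> ms / <count> runs|tokens` shape shown here.

```python
import re

# Timing lines copied from the log above (raw milliseconds and counts only).
LOG = """\
llama_print_timings: sample time = 47.42 ms / 199 runs
llama_print_timings: prompt eval time = 81969.71 ms / 1843 tokens
llama_print_timings: eval time = 26429.32 ms / 198 runs
"""

# Hypothetical pattern: "<stage> time = <ms> ms / <count> runs|tokens"
PATTERN = re.compile(
    r"llama_print_timings:\s+(.+?) time\s*=\s*([\d.]+) ms\s*/\s*(\d+) (?:runs|tokens)"
)

def throughput(log):
    """Return {stage: (ms_per_token, tokens_per_second)} for each matched line."""
    result = {}
    for stage, ms, count in PATTERN.findall(log):
        ms, count = float(ms), int(count)
        result[stage] = (ms / count, count / (ms / 1000.0))
    return result

for stage, (ms_tok, tok_s) in throughput(LOG).items():
    print(f"{stage}: {ms_tok:.2f} ms per token, {tok_s:.2f} tokens per second")
```

Going by the raw numbers, prompt eval (about 82 s for 1843 tokens) accounts for most of the roughly 109 s total, while generation itself runs at about 7.5 tokens per second.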