On LlamaCpp, which parameters affect model loading and inference time? I'm using a VectorStoreIndex with an embedding model and chunk_size_limit=300, and I create my query engine as follows: cur_index.as_query_engine(streaming=True, similarity_top_k=3)