How can we execute batch inference optimally with query_pipeline? The documentation covers asynchronous, parallel, and multi-root execution, but it says little about batch inference specifically. Would asynchronous execution be the most effective approach?
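
For context, here is a minimal, library-agnostic sketch of what I mean by asynchronous batch inference: fan a list of queries out with `asyncio.gather`, capped by a semaphore. `run_query` is a hypothetical stand-in for a pipeline's async run method, not the actual query_pipeline API:

```python
import asyncio

async def run_query(query: str) -> str:
    # Hypothetical stand-in for an async pipeline call
    # (e.g. an awaitable equivalent of query_pipeline's run).
    await asyncio.sleep(0)  # simulate non-blocking I/O
    return f"answer:{query}"

async def batch_infer(queries, max_concurrency=8):
    # Bound concurrency so the backend isn't overwhelmed.
    sem = asyncio.Semaphore(max_concurrency)

    async def bounded(q):
        async with sem:
            return await run_query(q)

    # Results come back in the same order as the input queries.
    return await asyncio.gather(*(bounded(q) for q in queries))

results = asyncio.run(batch_infer(["q1", "q2", "q3"]))
print(results)
```

Is this roughly the pattern the async support is intended for, or is there a dedicated batch entry point I'm missing?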