|
| 1 | +--- |
| 2 | +layout: post |
| 3 | +title: "Tutorial: Using the MLBench Commandline Interface" |
| 4 | +author: r_grubenmann |
| 5 | +published: false |
| 6 | +tags: [tutorial,guide] |
| 7 | +excerpt_separator: <!--more--> |
| 8 | +--- |
| 9 | + |
| 10 | +We recently released MLBench version 2.1.0, which contains a new commandline interface, making it even easier to run our benchmarks. |
| 11 | + |
| 12 | +In this post we'll introduce the CLI and show you how easy it is to get it up and running. |
| 13 | + |
| 14 | +<!--more--> |
| 15 | + |
| 16 | +**Please be aware of any costs that might be incurred by running this tutorial on Google Cloud. Usually costs should only be on the order of 5-10 USD. We don't take any responsibility for costs incurred.** |
| 17 | + |
| 18 | +Install the [mlbench-core](https://github.com/mlbench/mlbench-core/tree/master) Python package by running: |
| 19 | + |
| 20 | +```shell |
| 21 | +$ pip install mlbench-core |
| 22 | +``` |
| 23 | + |
| 24 | +After installation, mlbench is usable by calling the ``mlbench`` command. |
| 25 | + |
| 26 | +To create a new Google Cloud cluster, simply run (this might take a couple of minutes): |
| 27 | + |
| 28 | +```shell |
| 29 | +$ mlbench create-cluster gcloud 3 my-cluster |
| 30 | +[...] |
| 31 | +MLBench successfully deployed |
| 32 | +``` |
| 33 | + |
| 34 | +This creates a cluster with 3 nodes called ``my-cluster-3`` and sets up the mlbench deployment in that cluster. Note that the number of nodes should always be 1 higher than the maximum number of workers you want to run. |
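
As a small sketch of the node-count rule above, the following snippet derives the number of nodes from the largest number of workers you plan to use. The variable names are our own and purely illustrative, and the snippet only prints the `create-cluster` command instead of running it, so no cluster is created by accident.

```shell
# Illustrative variables, not part of mlbench
max_workers=2                    # largest worker count you plan to benchmark
num_nodes=$((max_workers + 1))   # one node more than the number of workers
cluster_name="my-cluster"

# Print the command for review; copy and run it once it looks right
echo "mlbench create-cluster gcloud $num_nodes $cluster_name"
```

With `max_workers=2` this prints the same command used in this tutorial.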
| 35 | + |
| 36 | +To start an experiment, simply run: |
| 37 | + |
| 38 | +```shell |
| 39 | +$ mlbench run my-run 2 |
| 40 | + |
| 41 | +Benchmark: |
| 42 | + |
| 43 | +[0] PyTorch Cifar-10 ResNet-20 Open-MPI |
| 44 | +[1] PyTorch Cifar-10 ResNet-20 Open-MPI (SCaling LR) |
| 45 | +[2] PyTorch Linear Logistic Regrssion Open-MPI |
| 46 | +[3] Tensorflow Cifar-10 ResNet-20 Open-MPI |
| 47 | +[4] Custom Image |
| 48 | + |
| 49 | +Selection [0]: 1 |
| 50 | + |
| 51 | +[...] |
| 52 | + |
| 53 | +Run started with name my-run-2 |
| 54 | +``` |
| 55 | + |
| 56 | +You will be prompted to select the benchmark image you want to run (or to specify a custom image). Afterwards, a new benchmark run will be started in the cluster with 2 workers. |
| 57 | + |
| 58 | +To see the status of this run, execute: |
| 59 | + |
| 60 | +```shell |
| 61 | +$ mlbench status my-run-2 |
| 62 | +[...] |
| 63 | +id name created_at finished_at state |
| 64 | +--- ------ ----------- ----------- ----- |
| 65 | +1 my-run-2 2019-11-11T13:35:06 started |
| 66 | +No Validation Loss Data yet |
| 67 | +No Validation Precision Data yet |
| 68 | +``` |
| 69 | + |
| 70 | +After the first round of validation, this command also outputs the current validation loss and precision. |
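
If you'd rather not retype the command while waiting for the first validation round, a small wrapper can repeat it for you. The `watch_run` function below is a hypothetical helper of our own, not part of mlbench; it simply calls `mlbench status` a fixed number of times with a pause in between and does not try to detect when the run has finished.

```shell
# Hypothetical helper (not part of mlbench): print the status of a run
# a fixed number of times, pausing between calls.
# Usage: watch_run <run-name> <repetitions> <pause-seconds>
watch_run() {
  run_name=$1
  repetitions=$2
  pause=$3
  i=0
  while [ "$i" -lt "$repetitions" ]; do
    mlbench status "$run_name"
    i=$((i + 1))
    # Skip the pause after the last call
    if [ "$i" -lt "$repetitions" ]; then
      sleep "$pause"
    fi
  done
}

# Example: check my-run-2 five times, one minute apart
# watch_run my-run-2 5 60
```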
| 71 | + |
| 72 | +To download the results of a current or finished run, use: |
| 73 | + |
| 74 | +```shell |
| 75 | +$ mlbench download my-run-2 |
| 76 | +``` |
| 77 | + |
| 78 | +This downloads all the metrics of the run as a zip file. The file also contains the official benchmark result once the run finishes. |
| 79 | + |
| 80 | +You can also access all the information of the run in the dashboard. To get the dashboard URL, simply run: |
| 81 | + |
| 82 | +```shell |
| 83 | +$ mlbench get-dashboard-url |
| 84 | +[...] |
| 85 | +http://34.76.223.123:32535 |
| 86 | +``` |
| 87 | + |
| 88 | +Don't forget to delete the cluster once you're done! |
| 89 | + |
| 90 | +```shell |
| 91 | +$ mlbench delete-cluster gcloud my-cluster-3 |
| 92 | +[...] |
| 93 | +``` |
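
Since a forgotten cluster keeps costing money, it can help to tie the cleanup to the last command you run. The `with_cleanup` function below is a hypothetical wrapper of our own, not an mlbench feature: it runs any command and then calls `mlbench delete-cluster` whether or not that command succeeded. It is only a sketch and does not cover interruptions such as Ctrl-C.

```shell
# Hypothetical wrapper (not part of mlbench): run a command, then always
# delete the given cluster, and pass on the command's exit status.
# Usage: with_cleanup <cluster-name> <command> [args...]
with_cleanup() {
  cluster=$1
  shift
  status=0
  # Remember a failure instead of stopping, so the cleanup still runs
  "$@" || status=$?
  mlbench delete-cluster gcloud "$cluster"
  return "$status"
}

# Example: download the results, then delete the cluster from this tutorial
# with_cleanup my-cluster-3 mlbench download my-run-2
```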