Commit 698e945

Merge pull request #757 from mlcommons/dev
Dev -> Main

2 parents 5b4914f + 04ea6e1

24 files changed: 99 additions & 20 deletions

DOCUMENTATION.md

Lines changed: 9 additions & 1 deletion

```diff
@@ -360,6 +360,14 @@ Valid submissions must rely on new algorithmic or mathematical ideas and should
 
 </details>
 
+##### Submissions vs. Baselines
+
+Submitters may also submit algorithms marked as *baselines*. These baseline algorithms are not eligible for winning the competition or prize money, but they are also not required to be "substantially different" from other submissions by the same submitters. Baseline algorithms will still appear on the leaderboard but will be clearly marked as such. We highly encourage the submission of baselines for educational purposes.
+
+Baseline algorithms might, for example, include existing algorithms with different search spaces or learning rate schedules.
+Another example involves porting submissions to different frameworks. For instance, a participant may wish to assess their algorithm in both JAX and PyTorch to demonstrate the impact of the framework. In such cases, one of these submissions must be designated as eligible for prize consideration, while the other is marked as a baseline. This prevents circumventing the tuning rules and the spirit of the benchmark by creating additional "lottery tickets".
+Baselines might not be prioritized when the benchmark sponsors allocate compute resources.
+
 ##### Software dependencies
 
 We require submissions to use specific versions of `PyTorch`/`JAX` as well as additional dependencies in order to facilitate fair comparisons. Submitters must build on top of these provided software packages, which might be provided as a `Docker` container. Additional dependencies can be added as long as they include a comment describing what was added and why. Submitters are free to add dependencies that support new algorithmic and mathematical ideas, but they should not circumvent the intention of the benchmark to measure training speedups due to new training methods. For example, software engineering techniques that merely lead to faster implementations of existing software, e.g. using newer versions of `PyTorch` or `JAX`, are not allowed; these are described in more detail in the [Disallowed submissions](#disallowed-submissions) section.
@@ -545,7 +553,7 @@ new Compute Instance with the "Deep Learning on Linux" Image in Boot disk option
 
 Our benchmark allows multiple submissions by the same team of submitters as long as they are substantially different. We disallow submitters from circumventing the purpose of the benchmark by, for example, submitting dozens of copies of the same submission with slightly different hyperparameters. Such a bulk submission would result in an unfair advantage on the randomized workloads and is not in the spirit of the benchmark.
 
-Submitters may submit algorithms marked as *baselines*. These might include existing algorithms with different search spaces or learning rate schedules. These baseline algorithms are not eligible for winning the competition or prize money, but they are also not required to be "substantially different" from other submissions by the same submitters.
+Submitters may submit algorithms marked as *baselines*. These might include existing algorithms with different search spaces or learning rate schedules. These baseline algorithms are not eligible for winning the competition or prize money, but they are also not required to be "substantially different" from other submissions by the same submitters. See the [Submissions vs. Baselines](#submissions-vs-baselines) section.
 
 #### Can my submission be structured using multiple files?
```

algorithmic_efficiency/workloads/wmt/wmt_jax/workload.py

Lines changed: 2 additions & 2 deletions

```diff
@@ -315,7 +315,7 @@ class WmtWorkloadAttentionTemp(WmtWorkload):
 
   @property
   def validation_target_value(self) -> float:
-    return 29.8611
+    return 29.3379
 
   @property
   def test_target_value(self) -> float:
@@ -331,7 +331,7 @@ class WmtWorkloadGLUTanH(WmtWorkload):
 
   @property
   def validation_target_value(self) -> float:
-    return 29.6517
+    return 29.5779
 
   @property
   def test_target_value(self) -> float:
```
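The target updates above follow the workload override pattern: each WMT variant exposes its BLEU targets as `@property` methods, and a variant class overrides only the values that change. A minimal sketch of that pattern (the base class body and its target value here are illustrative placeholders, not the repository's actual implementation):

```python
class WmtWorkload:
    """Stand-in base workload; the real class defines many more properties."""

    @property
    def validation_target_value(self) -> float:
        return 30.8491  # illustrative base BLEU target, not from this commit


class WmtWorkloadAttentionTemp(WmtWorkload):
    """Attention-temperature variant with its own, re-measured target."""

    @property
    def validation_target_value(self) -> float:
        return 29.3379  # updated validation target from this commit
```

Callers read `workload.validation_target_value` without knowing which variant they hold, so retuning a target is a one-line change in the variant class.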

algorithmic_efficiency/workloads/wmt/wmt_pytorch/workload.py

Lines changed: 2 additions & 2 deletions

```diff
@@ -371,7 +371,7 @@ class WmtWorkloadAttentionTemp(WmtWorkload):
 
   @property
   def validation_target_value(self) -> float:
-    return 29.8611
+    return 29.3379
 
   @property
   def test_target_value(self) -> float:
@@ -387,7 +387,7 @@ class WmtWorkloadGLUTanH(WmtWorkload):
 
   @property
   def validation_target_value(self) -> float:
-    return 29.6517
+    return 29.5779
 
   @property
   def test_target_value(self) -> float:
```

prize_qualification_baselines/external_tuning/jax_nadamw_full_budget.py

Lines changed: 4 additions & 0 deletions

```diff
@@ -307,6 +307,10 @@ def get_batch_size(workload_name):
     return 32
   elif workload_name == 'imagenet_resnet':
     return 1024
+  elif workload_name == 'imagenet_resnet_silu':
+    return 512
+  elif workload_name == 'imagenet_resnet_gelu':
+    return 512
   elif workload_name == 'imagenet_vit':
     return 1024
   elif workload_name == 'librispeech_conformer':
```

prize_qualification_baselines/external_tuning/jax_nadamw_target_setting.py

Lines changed: 4 additions & 0 deletions

```diff
@@ -307,6 +307,10 @@ def get_batch_size(workload_name):
     return 32
   elif workload_name == 'imagenet_resnet':
     return 1024
+  elif workload_name == 'imagenet_resnet_silu':
+    return 512
+  elif workload_name == 'imagenet_resnet_gelu':
+    return 512
   elif workload_name == 'imagenet_vit':
     return 1024
   elif workload_name == 'librispeech_conformer':
```

prize_qualification_baselines/external_tuning/pytorch_nadamw_full_budget.py

Lines changed: 4 additions & 0 deletions

```diff
@@ -309,6 +309,10 @@ def get_batch_size(workload_name):
     return 32
   elif workload_name == 'imagenet_resnet':
     return 1024
+  elif workload_name == 'imagenet_resnet_silu':
+    return 512
+  elif workload_name == 'imagenet_resnet_gelu':
+    return 512
   elif workload_name == 'imagenet_vit':
     return 1024
   elif workload_name == 'librispeech_conformer':
```

prize_qualification_baselines/external_tuning/pytorch_nadamw_target_setting.py

Lines changed: 4 additions & 0 deletions

```diff
@@ -309,6 +309,10 @@ def get_batch_size(workload_name):
     return 32
   elif workload_name == 'imagenet_resnet':
     return 1024
+  elif workload_name == 'imagenet_resnet_silu':
+    return 512
+  elif workload_name == 'imagenet_resnet_gelu':
+    return 512
   elif workload_name == 'imagenet_vit':
     return 1024
   elif workload_name == 'librispeech_conformer':
```

prize_qualification_baselines/self_tuning/jax_nadamw_full_budget.py

Lines changed: 4 additions & 0 deletions

```diff
@@ -322,6 +322,10 @@ def get_batch_size(workload_name):
     return 32
   elif workload_name == 'imagenet_resnet':
     return 1024
+  elif workload_name == 'imagenet_resnet_silu':
+    return 512
+  elif workload_name == 'imagenet_resnet_gelu':
+    return 512
   elif workload_name == 'imagenet_vit':
     return 1024
   elif workload_name == 'librispeech_conformer':
```

prize_qualification_baselines/self_tuning/jax_nadamw_target_setting.py

Lines changed: 4 additions & 0 deletions

```diff
@@ -322,6 +322,10 @@ def get_batch_size(workload_name):
     return 32
   elif workload_name == 'imagenet_resnet':
     return 1024
+  elif workload_name == 'imagenet_resnet_silu':
+    return 512
+  elif workload_name == 'imagenet_resnet_gelu':
+    return 512
   elif workload_name == 'imagenet_vit':
     return 1024
   elif workload_name == 'librispeech_conformer':
```

prize_qualification_baselines/self_tuning/pytorch_nadamw_full_budget.py

Lines changed: 4 additions & 0 deletions

```diff
@@ -324,6 +324,10 @@ def get_batch_size(workload_name):
     return 32
   elif workload_name == 'imagenet_resnet':
     return 1024
+  elif workload_name == 'imagenet_resnet_silu':
+    return 512
+  elif workload_name == 'imagenet_resnet_gelu':
+    return 512
   elif workload_name == 'imagenet_vit':
     return 1024
   elif workload_name == 'librispeech_conformer':
```
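The same four-line addition is applied to every baseline file above: the two new ResNet activation variants (SiLU, GELU) get their own batch size of 512, half the base ResNet's 1024. A compact sketch of the resulting dispatch; the dict-based form is my own restructuring (the baselines use an `elif` chain), and only workloads appearing in this diff are included:

```python
# Per-workload batch sizes after this commit (subset shown in the diff).
_BATCH_SIZES = {
    'imagenet_resnet': 1024,
    'imagenet_resnet_silu': 512,  # added in this commit
    'imagenet_resnet_gelu': 512,  # added in this commit
    'imagenet_vit': 1024,
}


def get_batch_size(workload_name):
    """Dict-lookup equivalent of the baselines' elif chain."""
    try:
        return _BATCH_SIZES[workload_name]
    except KeyError:
        raise ValueError(f'Unsupported workload: {workload_name}')
```

A lookup table keeps the eight copies of this function easy to diff and makes an unknown workload name fail loudly instead of falling through the chain.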
