Commit e29a2d4: Merge branch 'dev' into num_workers_fix

2 parents: 135c56a + 9365996

29 files changed: 248 additions & 128 deletions


CHANGELOG.md

Lines changed: 18 additions & 0 deletions

```diff
@@ -1,5 +1,23 @@
 # Change Log
 
+## algoperf-benchmark-0.1.4 (2024-03-26)
+
+Upgrade CUDA version to CUDA 12.1:
+- Upgrade CUDA version in Dockerfiles that will be used for scoring.
+- Update Jax and PyTorch package version tags to use local CUDA installation.
+
+Add flag for completely disabling checkpointing.
+- Note that we will run with checkpointing off at scoring time.
+
+Update Deepspeech and Conformer variant target setting configurations.
+- Note that variant targets are not final.
+
+Fixed bug in scoring code to take the best trial in a study for the external-tuning ruleset.
+
+Added instructions for submission.
+
+Changed the default number of workers for PyTorch data loaders to 0. Running with >0 may lead to incorrect eval results; see https://github.com/mlcommons/algorithmic-efficiency/issues/732.
+
 ## algoperf-benchmark-0.1.2 (2024-03-04)
 Workload variant additions and fixes:
 - Add Deepspeech workload variant
```
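The checkpoint-disabling flag described in the changelog could be exposed as a simple boolean switch; the sketch below is illustrative only (the flag name `--save_checkpoints` is an assumption, not necessarily what the benchmark code defines):

```python
import argparse

# Sketch of a checkpointing kill-switch like the one described above.
# The flag name --save_checkpoints is hypothetical.
parser = argparse.ArgumentParser(description="Training entry point (sketch).")
parser.add_argument(
    "--save_checkpoints",
    type=lambda v: v.lower() == "true",
    default=True,
    help="Pass 'false' to disable checkpointing entirely (as at scoring time).")

args = parser.parse_args(["--save_checkpoints", "false"])
print(args.save_checkpoints)  # → False
```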

GETTING_STARTED.md

Lines changed: 36 additions & 1 deletion

````diff
@@ -388,7 +388,42 @@ python score_submissions.py --submission_directory <directory_with_submissions>
 
 We provide the scores and performance profiles for the [paper baseline algorithms](/reference_algorithms/paper_baselines/) in the "Baseline Results" section in [Benchmarking Neural Network Training Algorithms](https://arxiv.org/abs/2306.07179).
 
-## Package Submission for Self-Reporting
+## Package your Submission code
+
+If you have registered for the AlgoPerf competition you will receive
+an email on 3/27/2024 with a link to a UI to upload a compressed submission folder.
+
+To package your submission modules please make sure your submission folder is structured as follows:
+
+```bash
+submission_folder/
+├── external_tuning
+│   ├── algorithm_name
+│   │   ├── helper_module.py
+│   │   ├── requirements.txt
+│   │   ├── submission.py
+│   │   └── tuning_search_space.json
+│   └── other_algorithm_name
+│       ├── requirements.txt
+│       ├── submission.py
+│       └── tuning_search_space.json
+└── self_tuning
+    └── algorithm_name
+        ├── requirements.txt
+        └── submission.py
+```
+
+Specifically we require that:
+
+1. There exist subdirectories in the submission folder named after the ruleset: `external_tuning` or `self_tuning`.
+2. The ruleset subdirectories contain directories named according to some identifier of the algorithm.
+3. Each algorithm subdirectory contains a `submission.py` module. Additional helper modules are allowed if you prefer to organize your code into multiple files. If there are additional python packages that have to be installed for the algorithm, also include a `requirements.txt` with package names and versions in the algorithm subdirectory.
+4. For `external_tuning` algorithms the algorithm subdirectory should contain a `tuning_search_space.json`.
+
+To check that your submission folder meets the above requirements you can run the `submissions/repo_checker.py` script.
+
 ## Package Logs for Self-Reporting Submissions
 To prepare your submission for self reporting run:
 
 ```
````
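The folder-structure requirements above lend themselves to a small validation script. The following is a minimal sketch of the kind of check `submissions/repo_checker.py` performs; the function name `check_submission_folder` and its exact messages are hypothetical, not the script's actual API:

```python
import tempfile
from pathlib import Path

RULESETS = ("external_tuning", "self_tuning")


def check_submission_folder(root: str) -> list:
  """Return a list of problems found in a submission folder (empty = OK)."""
  problems = []
  root = Path(root)
  ruleset_dirs = [d for d in root.iterdir() if d.is_dir() and d.name in RULESETS]
  if not ruleset_dirs:
    problems.append("no external_tuning/ or self_tuning/ subdirectory found")
  for ruleset in ruleset_dirs:
    for algo in sorted(d for d in ruleset.iterdir() if d.is_dir()):
      # Every algorithm directory needs a submission.py module.
      if not (algo / "submission.py").is_file():
        problems.append(f"{algo.name}: missing submission.py")
      # External tuning additionally requires a tuning search space.
      if ruleset.name == "external_tuning" and not (
          algo / "tuning_search_space.json").is_file():
        problems.append(f"{algo.name}: missing tuning_search_space.json")
  return problems


# Build a minimal valid self_tuning submission and check it.
with tempfile.TemporaryDirectory() as tmp:
  algo_dir = Path(tmp) / "self_tuning" / "my_algorithm"
  algo_dir.mkdir(parents=True)
  (algo_dir / "submission.py").touch()
  print(check_submission_folder(tmp))  # → []
```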

README.md

Lines changed: 3 additions & 2 deletions

```diff
@@ -28,8 +28,9 @@
 
 > [!IMPORTANT]
 > Upcoming Deadline:
-> Submission deadline: **April 04th, 2024** (*moved by a week*) \
-> For other key dates please see [Call for Submissions](/CALL_FOR_SUBMISSIONS.md).
+> Submission deadline: **April 04th, 2024** (*moved by a week*). \
+> For submission instructions please see [Packaging your Submission Code](/GETTING_STARTED.md#package-your-submission-code) section in the Getting Started document.\
+> For other key dates please see [Call for Submissions](CALL_FOR_SUBMISSIONS.md).
 
 ## Table of Contents <!-- omit from toc -->
```

algorithmic_efficiency/random_utils.py

Lines changed: 3 additions & 3 deletions

```diff
@@ -26,11 +26,11 @@
 
 def _signed_to_unsigned(seed: SeedType) -> SeedType:
   if isinstance(seed, int):
-    return seed + 2**32 if seed < 0 else seed
+    return seed % 2**32
   if isinstance(seed, list):
-    return [s + 2**32 if s < 0 else s for s in seed]
+    return [s % 2**32 for s in seed]
   if isinstance(seed, np.ndarray):
-    return np.array([s + 2**32 if s < 0 else s for s in seed.tolist()])
+    return np.array([s % 2**32 for s in seed.tolist()])
 
 
 def _fold_in(seed: SeedType, data: Any) -> List[Union[SeedType, Any]]:
```
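The modulo form in this change is a strict improvement: `seed + 2**32 if seed < 0 else seed` only maps values in `[-2**32, 2**32)` into the unsigned 32-bit range, while `seed % 2**32` wraps any Python integer, since `%` with a positive modulus always yields a non-negative result in Python. A standalone comparison (the function names here are illustrative, mirroring `_signed_to_unsigned`):

```python
def old_signed_to_unsigned(seed: int) -> int:
  # Pre-fix behavior: only shifts negatives, assumes |seed| < 2**32.
  return seed + 2**32 if seed < 0 else seed


def new_signed_to_unsigned(seed: int) -> int:
  # Post-fix behavior: wraps any int into [0, 2**32).
  return seed % 2**32


# Both agree on the range the old code was written for:
assert old_signed_to_unsigned(-1) == new_signed_to_unsigned(-1) == 2**32 - 1
assert old_signed_to_unsigned(42) == new_signed_to_unsigned(42) == 42

# Only the modulo version stays in range for wider inputs:
print(new_signed_to_unsigned(2**40))  # → 0
print(old_signed_to_unsigned(2**40))  # → 1099511627776 (out of uint32 range)
```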

algorithmic_efficiency/workloads/criteo1tb/criteo1tb_jax/workload.py

Lines changed: 5 additions & 5 deletions

```diff
@@ -173,7 +173,7 @@ def use_layer_norm(self) -> bool:
 
   @property
   def validation_target_value(self) -> float:
-    return 0.123744
+    return 0.123757
 
   @property
   def test_target_value(self) -> float:
@@ -191,23 +191,23 @@ def use_resnet(self) -> bool:
 
   @property
   def validation_target_value(self) -> float:
-    return 0.124027
+    return 0.12415
 
   @property
   def test_target_value(self) -> float:
-    return 0.126468
+    return 0.12648
 
 
 class Criteo1TbDlrmSmallEmbedInitWorkload(Criteo1TbDlrmSmallWorkload):
 
   @property
   def validation_target_value(self) -> float:
-    return 0.124286
+    return 0.129657
 
   @property
   def test_target_value(self) -> float:
     # Todo
-    return 0.126725
+    return 0.131967
 
   @property
   def embedding_init_multiplier(self) -> float:
```

algorithmic_efficiency/workloads/criteo1tb/criteo1tb_pytorch/workload.py

Lines changed: 5 additions & 5 deletions

```diff
@@ -254,7 +254,7 @@ def use_layer_norm(self) -> bool:
 
   @property
   def validation_target_value(self) -> float:
-    return 0.123744
+    return 0.123757
 
   @property
   def test_target_value(self) -> float:
@@ -272,23 +272,23 @@ def use_resnet(self) -> bool:
 
   @property
   def validation_target_value(self) -> float:
-    return 0.124027
+    return 0.12415
 
   @property
   def test_target_value(self) -> float:
-    return 0.126468
+    return 0.12648
 
 
 class Criteo1TbDlrmSmallEmbedInitWorkload(Criteo1TbDlrmSmallWorkload):
 
   @property
   def validation_target_value(self) -> float:
-    return 0.124286
+    return 0.129657
 
   @property
   def test_target_value(self) -> float:
     # Todo
-    return 0.126725
+    return 0.131967
 
   @property
   def embedding_init_multiplier(self) -> float:
```

algorithmic_efficiency/workloads/imagenet_resnet/imagenet_jax/workload.py

Lines changed: 6 additions & 6 deletions

```diff
@@ -272,11 +272,11 @@ def use_silu(self) -> bool:
 
   @property
   def validation_target_value(self) -> float:
-    return 1 - 0.22009
+    return 0.75445
 
   @property
   def test_target_value(self) -> float:
-    return 1 - 0.3426
+    return 0.6323
 
 
 class ImagenetResNetGELUWorkload(ImagenetResNetWorkload):
@@ -287,11 +287,11 @@ def use_gelu(self) -> bool:
 
   @property
   def validation_target_value(self) -> float:
-    return 1 - 0.22077
+    return 0.76765
 
   @property
   def test_target_value(self) -> float:
-    return 1 - 0.3402
+    return 0.6519
 
 
 class ImagenetResNetLargeBNScaleWorkload(ImagenetResNetWorkload):
@@ -302,8 +302,8 @@ def bn_init_scale(self) -> float:
 
   @property
   def validation_target_value(self) -> float:
-    return 1 - 0.23474
+    return 0.76526
 
   @property
   def test_target_value(self) -> float:
-    return 1 - 0.3577
+    return 0.6423
```

algorithmic_efficiency/workloads/imagenet_resnet/imagenet_pytorch/workload.py

Lines changed: 6 additions & 6 deletions

```diff
@@ -326,11 +326,11 @@ def use_silu(self) -> bool:
 
   @property
   def validation_target_value(self) -> float:
-    return 1 - 0.22009
+    return 0.75445
 
   @property
   def test_target_value(self) -> float:
-    return 1 - 0.342
+    return 0.6323
 
 
 class ImagenetResNetGELUWorkload(ImagenetResNetWorkload):
@@ -341,11 +341,11 @@ def use_gelu(self) -> bool:
 
   @property
   def validation_target_value(self) -> float:
-    return 1 - 0.22077
+    return 0.76765
 
   @property
   def test_target_value(self) -> float:
-    return 1 - 0.3402
+    return 0.6519
 
 
 class ImagenetResNetLargeBNScaleWorkload(ImagenetResNetWorkload):
@@ -356,8 +356,8 @@ def bn_init_scale(self) -> float:
 
   @property
   def validation_target_value(self) -> float:
-    return 1 - 0.23474
+    return 0.76526
 
   @property
   def test_target_value(self) -> float:
-    return 1 - 0.3577
+    return 0.6423
```

algorithmic_efficiency/workloads/imagenet_vit/imagenet_jax/workload.py

Lines changed: 6 additions & 6 deletions

```diff
@@ -99,11 +99,11 @@ def use_glu(self) -> bool:
 
   @property
   def validation_target_value(self) -> float:
-    return 1 - 0.2233
+    return 0.75738
 
   @property
   def test_target_value(self) -> float:
-    return 1 - 0.3455
+    return 0.6359
 
 
 class ImagenetVitPostLNWorkload(ImagenetVitWorkload):
@@ -114,11 +114,11 @@ def use_post_layer_norm(self) -> bool:
 
   @property
   def validation_target_value(self) -> float:
-    return 1 - 0.24688
+    return 0.75312
 
   @property
   def test_target_value(self) -> float:
-    return 1 - 0.3714
+    return 0.6286
 
 
 class ImagenetVitMapWorkload(ImagenetVitWorkload):
@@ -129,8 +129,8 @@ def use_map(self) -> bool:
 
   @property
   def validation_target_value(self) -> float:
-    return 1 - 0.22886
+    return 0.77113
 
   @property
   def test_target_value(self) -> float:
-    return 1 - 0.3477
+    return 0.6523
```

algorithmic_efficiency/workloads/imagenet_vit/imagenet_pytorch/workload.py

Lines changed: 6 additions & 6 deletions

```diff
@@ -90,11 +90,11 @@ def use_glu(self) -> bool:
 
   @property
   def validation_target_value(self) -> float:
-    return 1 - 0.2233
+    return 0.75738
 
   @property
   def test_target_value(self) -> float:
-    return 1 - 0.3455
+    return 0.6359
 
 
 class ImagenetVitPostLNWorkload(ImagenetVitWorkload):
@@ -105,11 +105,11 @@ def use_post_layer_norm(self) -> bool:
 
   @property
   def validation_target_value(self) -> float:
-    return 1 - 0.24688
+    return 0.75312
 
   @property
   def test_target_value(self) -> float:
-    return 1 - 0.3714
+    return 0.6286
 
 
 class ImagenetVitMapWorkload(ImagenetVitWorkload):
@@ -120,8 +120,8 @@ def use_map(self) -> bool:
 
   @property
   def validation_target_value(self) -> float:
-    return 1 - 0.22886
+    return 0.77113
 
   @property
   def test_target_value(self) -> float:
-    return 1 - 0.3477
+    return 0.6523
```
