MLPerf Training

Overview

MLPerf Training is a benchmark suite for measuring how fast systems can train models to a target quality metric.

Benchmarks

Each MLPerf Training benchmark is defined by a Dataset and Quality Target, and comes with a reference implementation of a specific model. The following table summarizes the seven benchmarks in version v0.6 of the suite.

Benchmark | Dataset | Quality Target | Reference Implementation Model
Image classification | ImageNet (224x224) | 75.9% Top-1 accuracy | ResNet-50 v1.5
Object detection (lightweight) | COCO 2017 | 23% mAP | SSD-ResNet34
Object detection (heavyweight) | COCO 2017 | 0.377 box min AP, 0.339 mask min AP | Mask R-CNN
Translation (recurrent) | WMT English-German | 24.0 BLEU | GNMT
Translation (non-recurrent) | WMT English-German | 25.0 BLEU | Transformer
Recommendation | Undergoing modification | |
Reinforcement learning | N/A | Pre-trained checkpoint | Mini Go

Metric

Each MLPerf Training benchmark measures the wallclock time required to train a model on the specified dataset to achieve the specified quality target.
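In practice this means timing a run that stops as soon as the model's evaluation metric reaches the target. The sketch below illustrates the idea only; train_one_epoch, evaluate, and quality_target are hypothetical placeholders, not part of any MLPerf reference implementation.

    import time

    def time_to_train(train_one_epoch, evaluate, quality_target, max_epochs=100):
        """Illustrative only: wallclock time to train to a quality target.

        train_one_epoch and evaluate stand in for a real training loop and
        its evaluation step; they are not MLPerf APIs.
        """
        start = time.time()
        for _ in range(max_epochs):
            train_one_epoch()
            quality = evaluate()           # e.g. Top-1 accuracy, mAP, or BLEU
            if quality >= quality_target:  # stop as soon as the target is reached
                return time.time() - start
        raise RuntimeError("quality target not reached within max_epochs")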

To account for the substantial variance in ML training times, final MLPerf Training results are obtained by running each benchmark a benchmark-specific number of times, discarding the lowest and highest results, and averaging the remaining results. Even this multi-run average does not eliminate all variance: MLPerf imaging benchmark results are reproducible only to very roughly +/- 2.5%, and other MLPerf benchmark results to very roughly +/- 5%.
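A minimal sketch of this scoring scheme, assuming a list of per-run times; the five runs in the example are illustrative, since the actual run count is benchmark-specific.

    def score(run_times):
        """Drop the fastest and slowest run, then average the rest."""
        if len(run_times) < 3:
            raise ValueError("need at least three runs to drop min and max")
        trimmed = sorted(run_times)[1:-1]
        return sum(trimmed) / len(trimmed)

    # Example: five runs of one benchmark, times in minutes.
    print(score([82.0, 85.5, 79.8, 90.2, 84.1]))  # averages the middle three runs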

Divisions

MLPerf aims to encourage innovation in software as well as hardware by allowing submitters to reimplement the reference implementations. MLPerf has two Divisions that allow different levels of flexibility during reimplementation. The Closed division is intended to compare hardware platforms or software frameworks “apples-to-apples” and requires using the same model and optimizer as the reference implementation. The Open division is intended to foster faster models and optimizers and allows any ML approach that can reach the target quality.

Rules

The rules are here.

Reference implementations

The reference implementations for the benchmarks are here.

How to submit

If you intend to submit results, please read the submission rules carefully and join the training submitters working group before you start work. In particular, you must notify the chair of the training submitters working group five weeks ahead of the submission deadline as described in the submission rules.

Results

The results are here.

Use results

MLPerf is a trademark. If you use the MLPerf name or refer to MLPerf results, you must follow the terms of use. MLPerf reserves the right to solely determine if uses of its trademark are appropriate.

If you use MLPerf in a publication, please cite this website or the MLPerf papers (forthcoming).