HULK

Save the world, one flop at a time.
An Energy Efficiency Benchmark Platform for Responsible Natural Language Processing


Pretraining Phase - Compare Time & Cost


| Submission Time | Time (hours) | Cost ($) | Parameters (M) | Model | Source | Details |
|---|---|---|---|---|---|---|
| Nov 2019 | 96 | 1,728 | 108 | BERT-Base | HULK Baseline | 4 TPU Pods, TensorFlow |
| Nov 2019 | 96 | 6,912 | 334 | BERT-Large | HULK Baseline | 16 TPU Pods, TensorFlow |
| Nov 2019 | 60 | 61,440 | 361 | XLNet-Large | HULK Baseline | 512 TPU v3 |
| Nov 2019 | 24 | 75,203 | 125 | RoBERTa Base | HULK Baseline | 1024 V100 GPUs |
| Nov 2019 | 24 | 75,203 | 356 | RoBERTa Large | HULK Baseline | 1024 V100 GPUs |
| Nov 2019 | 36 | 65,536 | 223 | ALBERT-XXLarge | HULK Baseline | 1024 TPU v3 |
| Nov 2019 | 90 | 2,203.2 | 66 | DistilBERT | HULK Baseline | 8×16G V100 GPU |


Fine-Tuning Phase - Compare Time


| Submission Time | NER (sec) | SST-2 (sec) | MNLI (sec) | Score | Model | Source | Details |
|---|---|---|---|---|---|---|---|
| Nov 2019 | 43.43 | 207.15 | N/R | 2.52 | BERT Base | HULK Baseline | GTX-2080 Ti |
| Nov 2019 | 90.26 | 92.45 | 9,106.72 | 3.00 | BERT Large | HULK Baseline | GTX-2080 Ti |
| Nov 2019 | 67.14 | 102.45 | 7,704.71 | 3.42 | XLNet Base | HULK Baseline | GTX-2080 Ti |
| Nov 2019 | 243.00 | 367.11 | 939.62 | 10.31 | XLNet Large | HULK Baseline | GTX-2080 Ti |
| Nov 2019 | 70.57 | 38.45 | 274.87 | 10.82 | RoBERTa Base | HULK Baseline | GTX-2080 Ti |
| Nov 2019 | 155.43 | 57.65 | 397.12 | 25.11 | RoBERTa Large | HULK Baseline | GTX-2080 Ti |
| Nov 2019 | 340.64 | 2,767.90 | 16,995.35 | 0.83 | ALBERT v1 Base | HULK Baseline | GTX-2080 Ti |
| Nov 2019 | 844.85 | 3,708.49 | N/R | 0.13 | ALBERT v1 Large | HULK Baseline | GTX-2080 Ti |

* N/R means that the model did not reach the required performance within a reasonable time.



Fine-Tuning Phase - Compare Cost


| Submission Time | NER ($) | SST-2 ($) | MNLI ($) | Score | Model | Source | Details |
|---|---|---|---|---|---|---|---|
| Nov 2019 | 0.04 | 0.18 | N/R | 2.52 | BERT Base | HULK Baseline | GTX-2080 Ti |
| Nov 2019 | 0.08 | 0.08 | 7.74 | 3.00 | BERT Large | HULK Baseline | GTX-2080 Ti |
| Nov 2019 | 0.06 | 0.09 | 6.55 | 3.42 | XLNet Base | HULK Baseline | GTX-2080 Ti |
| Nov 2019 | 0.21 | 0.31 | 0.80 | 10.31 | XLNet Large | HULK Baseline | GTX-2080 Ti |
| Nov 2019 | 0.06 | 0.03 | 0.23 | 10.82 | RoBERTa Base | HULK Baseline | GTX-2080 Ti |
| Nov 2019 | 0.13 | 0.05 | 0.34 | 25.11 | RoBERTa Large | HULK Baseline | GTX-2080 Ti |
| Nov 2019 | 0.29 | 2.35 | 14.45 | 0.83 | ALBERT v1 Base | HULK Baseline | GTX-2080 Ti |
| Nov 2019 | 0.72 | 3.15 | N/R | 0.13 | ALBERT v1 Large | HULK Baseline | GTX-2080 Ti |

* GTX 2080 Ti results are estimated using p3.2xlarge on AWS at $3.06/h.
* N/R means that the model did not reach the required performance within a reasonable time.
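
The dollar figures in this table can be reproduced from the fine-tuning times and the hourly rate quoted in the footnote. The sketch below assumes the cost is simply elapsed time multiplied by that rate and rounded to cents; the function name `finetune_cost_usd` is illustrative and not part of the HULK platform.

```python
HOURLY_RATE_USD = 3.06  # hourly rate quoted in the footnote


def finetune_cost_usd(seconds, hourly_rate=HOURLY_RATE_USD):
    """Estimate the dollar cost of a fine-tuning run that took `seconds`."""
    hours = seconds / 3600
    return round(hours * hourly_rate, 2)


# BERT Large fine-tuning times (seconds) from the time table above.
bert_large_times = {"NER": 90.26, "SST-2": 92.45, "MNLI": 9106.72}

bert_large_costs = {
    task: finetune_cost_usd(t) for task, t in bert_large_times.items()
}

for task, cost in bert_large_costs.items():
    print(f"{task}: ${cost:.2f}")
# NER: $0.08
# SST-2: $0.08
# MNLI: $7.74
```

These values match the BERT Large row of the cost table, which suggests the simple time-times-rate assumption is close to how the estimates were produced.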



Inference Phase - Compare Time per single input


| Submission Time | NER (ms) | SST-2 (ms) | MNLI (ms) | Score | Model | Source | Details |
|---|---|---|---|---|---|---|---|
| Nov 2019 | 2.68 | 2.70 | 2.67 | 9.5 | BERT Base | HULK Baseline | GTX-2080 Ti |
| Nov 2019 | 8.51 | 8.46 | 8.53 | 3.00 | BERT Large | HULK Baseline | GTX-2080 Ti |
| Nov 2019 | 5.16 | 5.01 | 5.10 | 5.01 | XLNet Base | HULK Baseline | GTX-2080 Ti |
| Nov 2019 | 14.84 | 14.69 | 15.27 | 1.71 | XLNet Large | HULK Baseline | GTX-2080 Ti |
| Nov 2019 | 2.65 | 2.68 | 2.70 | 9.53 | RoBERTa Base | HULK Baseline | GTX-2080 Ti |
| Nov 2019 | 8.35 | 8.36 | 8.70 | 3.01 | RoBERTa Large | HULK Baseline | GTX-2080 Ti |
| Nov 2019 | 2.65 | 2.68 | 2.72 | 9.53 | ALBERT v1 Base | HULK Baseline | GTX-2080 Ti |
| Nov 2019 | 8.49 | 8.44 | 8.78 | 2.97 | ALBERT v1 Large | HULK Baseline | GTX-2080 Ti |


Inference Phase - Compare Cost per 100,000 inputs


| Submission Time | NER ($) | SST-2 ($) | MNLI ($) | Score | Model | Source | Details |
|---|---|---|---|---|---|---|---|
| Nov 2019 | 0.23 | 0.23 | 0.23 | 9.5 | BERT Base | HULK Baseline | GTX-2080 Ti |
| Nov 2019 | 0.72 | 0.72 | 0.73 | 3.00 | BERT Large | HULK Baseline | GTX-2080 Ti |
| Nov 2019 | 0.44 | 0.43 | 0.43 | 5.01 | XLNet Base | HULK Baseline | GTX-2080 Ti |
| Nov 2019 | 1.26 | 1.25 | 1.30 | 1.71 | XLNet Large | HULK Baseline | GTX-2080 Ti |
| Nov 2019 | 0.23 | 0.23 | 0.23 | 9.53 | RoBERTa Base | HULK Baseline | GTX-2080 Ti |
| Nov 2019 | 0.71 | 0.71 | 0.74 | 3.01 | RoBERTa Large | HULK Baseline | GTX-2080 Ti |
| Nov 2019 | 0.23 | 0.23 | 0.23 | 9.53 | ALBERT v1 Base | HULK Baseline | GTX-2080 Ti |
| Nov 2019 | 0.72 | 0.72 | 0.75 | 2.97 | ALBERT v1 Large | HULK Baseline | GTX-2080 Ti |

* GTX 2080 Ti results are estimated using p3.2xlarge on AWS at $3.06/h.
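
The per-100,000-input costs can likewise be derived from the per-input latencies in the inference time table. The sketch below assumes inputs are processed one after another at the measured latency and billed at the hourly rate from the footnote; `inference_cost_per_100k_usd` is an illustrative name, not a HULK API.

```python
def inference_cost_per_100k_usd(ms_per_input, hourly_rate=3.06,
                                n_inputs=100_000):
    """Estimate the dollar cost of running `n_inputs` sequential inferences."""
    total_seconds = ms_per_input * n_inputs / 1000  # ms -> s
    total_hours = total_seconds / 3600              # s -> h
    return round(total_hours * hourly_rate, 2)


# XLNet Large per-input latencies (ms) from the inference time table.
xlnet_large_ms = {"NER": 14.84, "SST-2": 14.69, "MNLI": 15.27}

xlnet_large_costs = {
    task: inference_cost_per_100k_usd(ms) for task, ms in xlnet_large_ms.items()
}

for task, cost in xlnet_large_costs.items():
    print(f"{task}: ${cost:.2f}")
# NER: $1.26
# SST-2: $1.25
# MNLI: $1.30
```

The output agrees with the XLNet Large row above. Batching would change real-world throughput, so this is only a rough model of single-input inference cost.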



Sponsor


We thank the Institute for Energy Efficiency at UC Santa Barbara for their support.


Paper


Please cite our paper as below if you use the HULK platform.

@misc{zhou2020hulk,
    title={HULK: An Energy Efficiency Benchmark Platform for Responsible Natural Language Processing},
    author={Xiyou Zhou and Zhiyu Chen and Xiaoyong Jin and William Yang Wang},
    year={2020},
    eprint={2002.05829},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
                        

Contact



Have any questions or suggestions? Feel free to contact us!