TY - GEN
T1 - AUTOTRAINER
T2 - 43rd IEEE/ACM International Conference on Software Engineering, ICSE 2021
AU - Zhang, Xiaoyu
AU - Zhai, Juan
AU - Ma, Shiqing
AU - Shen, Chao
N1 - Funding Information: Acknowledgement: The authors thank the anonymous reviewers for their insightful feedback and constructive comments. We also thank Jiong Li for his efforts and feedback on this project. This work is supported in part by the National Science Foundation of China (No. 61802166, 61822309, 61773310, U1736205) and the National Key R&D Program of China under Grant No. 2020AAA0107700. Chao Shen is the corresponding author. The views, opinions and/or findings expressed are only those of the authors. Publisher Copyright: © 2021 IEEE.
PY - 2021/5
Y1 - 2021/5
N2 - With machine learning models, especially Deep Neural Network (DNN) models, becoming an integral part of new intelligent software, new tools to support their engineering process are in high demand. Existing DNN debugging tools are either post-training, which wastes a lot of time training a buggy model and requires expertise, or limited to collecting training logs without analyzing the problems, let alone fixing them. In this paper, we propose AUTOTRAINER, a DNN training monitoring and automatic repair tool that supports detecting and automatically repairing five commonly seen training problems. During training, it periodically checks the training status and detects potential problems. Once a problem is found, AUTOTRAINER tries to fix it using built-in state-of-the-art solutions. It supports various model structures and input data types, such as Convolutional Neural Networks (CNNs) for images and Recurrent Neural Networks (RNNs) for text. Our evaluation on 6 datasets and 495 models shows that AUTOTRAINER can effectively detect all potential problems with a 100% detection rate and no false positives. Among all models with problems, it can fix 97.33% of them, increasing accuracy by 47.08% on average.
AB - With machine learning models, especially Deep Neural Network (DNN) models, becoming an integral part of new intelligent software, new tools to support their engineering process are in high demand. Existing DNN debugging tools are either post-training, which wastes a lot of time training a buggy model and requires expertise, or limited to collecting training logs without analyzing the problems, let alone fixing them. In this paper, we propose AUTOTRAINER, a DNN training monitoring and automatic repair tool that supports detecting and automatically repairing five commonly seen training problems. During training, it periodically checks the training status and detects potential problems. Once a problem is found, AUTOTRAINER tries to fix it using built-in state-of-the-art solutions. It supports various model structures and input data types, such as Convolutional Neural Networks (CNNs) for images and Recurrent Neural Networks (RNNs) for text. Our evaluation on 6 datasets and 495 models shows that AUTOTRAINER can effectively detect all potential problems with a 100% detection rate and no false positives. Among all models with problems, it can fix 97.33% of them, increasing accuracy by 47.08% on average.
KW - Deep learning training
KW - Software engineering
KW - Software tools
UR - http://www.scopus.com/inward/record.url?scp=85115669113&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85115669113&partnerID=8YFLogxK
U2 - https://doi.org/10.1109/ICSE43902.2021.00043
DO - https://doi.org/10.1109/ICSE43902.2021.00043
M3 - Conference contribution
T3 - Proceedings - International Conference on Software Engineering
SP - 359
EP - 371
BT - Proceedings - 2021 IEEE/ACM 43rd International Conference on Software Engineering, ICSE 2021
PB - IEEE Computer Society
Y2 - 22 May 2021 through 30 May 2021
ER -