Active defect discovery: A human-in-the-loop learning method

Bo Shen, Zhenyu Kong

Research output: Contribution to journal › Article › peer-review

Abstract

Unsupervised defect detection methods rank the instances of an unlabeled dataset by defect score. Unfortunately, many of the instances that unsupervised algorithms rank highest are not defects, which leads to high false-positive rates. Active Defect Discovery (ADD) is proposed to overcome this deficiency by sequentially selecting instances and querying their labels (defect or not). Because labeling is often costly, balancing detection accuracy against labeling cost is essential. This article proposes a novel ADD method to achieve that balance. Our approach uses a state-of-the-art unsupervised defect detection method, Isolation Forest, as the baseline defect detector to extract features. The sparsity of the extracted features is then exploited to adjust the defect detector so that it focuses on the features most important for defect detection. To enforce this sparsity and thereby improve detection accuracy, a new algorithm based on online gradient descent, Sparse Approximated Linear Defect Discovery (SALDD), is proposed together with a theoretical regret analysis. Extensive experiments are conducted on real-world datasets from healthcare, manufacturing, security, and other domains. The results demonstrate that the proposed algorithm significantly outperforms state-of-the-art algorithms for defect detection.
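
The query-and-update loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' SALDD algorithm, whose details are not given here. It is a generic active-discovery loop under stated assumptions: each instance has a non-negative feature vector (a stand-in for features extracted by an Isolation Forest), instances are scored linearly, the analyst's label is supplied by a hypothetical `oracle` callback, and sparsity is encouraged by a soft-thresholding step after each online gradient update. All function names, the linear loss, and the parameters `lr` and `tau` are illustrative choices.

```python
def soft_threshold(value, tau):
    """Shrink value toward zero by tau; anything within [-tau, tau] becomes 0."""
    if value > tau:
        return value - tau
    if value < -tau:
        return value + tau
    return 0.0


def score(weights, x):
    """Linear defect score: higher means more likely to be a defect."""
    return sum(w * xi for w, xi in zip(weights, x))


def active_discovery(features, oracle, budget, lr=0.1, tau=0.01):
    """Query `budget` instances one at a time, updating weights after each label.

    features: list of equal-length feature vectors (assumed stand-in for
              Isolation Forest derived features).
    oracle:   hypothetical callback, oracle(i) -> True if instance i is a defect.
    Returns (queried indices in order, number of defects found, final weights).
    """
    dim = len(features[0])
    weights = [1.0 / dim] * dim          # start from a uniform detector
    unlabeled = list(range(len(features)))
    queried, found = [], 0

    for _ in range(budget):
        # Select the currently top-ranked unlabeled instance.
        best = max(unlabeled, key=lambda j: score(weights, features[j]))
        unlabeled.remove(best)
        queried.append(best)

        # Ask the analyst for a label: +1 for defect, -1 for nominal.
        y = 1 if oracle(best) else -1
        if y == 1:
            found += 1

        # Online gradient step on the linear loss -y * <w, x>,
        # followed by soft-thresholding to push small weights to zero.
        weights = [
            soft_threshold(w + lr * y * xi, lr * tau)
            for w, xi in zip(weights, features[best])
        ]

    return queried, found, weights
```

In this sketch, a confirmed defect raises the weights of the features active in that instance, and a false positive lowers them, so later queries shift toward features that have actually indicated defects. The `budget` argument makes the labeling cost explicit.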

Original language: American English
Journal: IISE Transactions
DOIs
State: Accepted/In press - 2023

ASJC Scopus subject areas

  • Industrial and Manufacturing Engineering

Keywords

  • Isolation forest
  • active defect discovery
  • measurement feedback
  • online gradient descent
  • sparsity
