Failure Prediction in Crowdsourced Software Development

Abdullah Khanfor, Ye Yang, Gregg Vesonder, Guenther Ruhe, Dave Messinger

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Background: Despite the increasingly reported benefits of software crowdsourcing, one of the major practical concerns is the limited visibility into, and control over, task progress. Aim: This paper reports an empirical study that develops a framework for failure prediction in software crowdsourcing. Method: The process begins by identifying 13 factors that influence software crowdsourcing failures, across four categories: task characteristics, technology popularity, competition network, and worker reliability. An algorithm is then presented to construct the worker competition network and extract its network metric features. The proposed framework was evaluated on 4,872 software crowdsourcing tasks extracted from the TopCoder platform, using five machine learners, and compared with the in-house TopCoder predictor. Results: 1) Worker reliability, links in the task description, number of registered workers, number of required technologies, and task-worker network modularity are the most influential factors for predicting crowdsourcing failure; 2) the top three learners for task failure are Naïve Bayes, Random Forest, and StackingC, with precision above 98.8%, recall above 81.2%, and F-measure above 91.2%; and 3) the best proposed learners significantly outperform the two baseline models in our evaluation. Conclusions: The proposed framework performs better than both baseline models, and the paper offers practical recommendations for managing task failure risks.
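
The abstract mentions an algorithm for constructing the worker competition network and extracting network metric features such as modularity, but the page does not reproduce it. The Python sketch below illustrates one plausible reading, under the assumption, not stated in the paper, that two workers compete when they register for the same task; the task structure and function names are hypothetical, and networkx is used for the graph metrics.

# Minimal sketch only; not the paper's algorithm. Assumes each task record
# carries a "registrants" list, and that co-registration implies competition.
import itertools
import networkx as nx
from networkx.algorithms import community

def build_competition_network(tasks):
    # Link every pair of workers who registered for the same task;
    # the edge weight counts repeated co-registrations.
    g = nx.Graph()
    for task in tasks:
        for a, b in itertools.combinations(task["registrants"], 2):
            w = g[a][b]["weight"] if g.has_edge(a, b) else 0
            g.add_edge(a, b, weight=w + 1)
    return g

def network_features(g):
    # A few of the network metrics the abstract names (e.g., modularity).
    parts = community.greedy_modularity_communities(g)
    return {
        "modularity": community.modularity(g, parts),
        "density": nx.density(g),
    }

# Toy usage with two hypothetical tasks.
tasks = [{"registrants": ["w1", "w2", "w3"]},
         {"registrants": ["w2", "w3", "w4"]}]
print(network_features(build_competition_network(tasks)))

In the paper's framework, features like these would be combined with the task-level factors and fed to the five machine learners for failure prediction.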

Original language: English (US)
Title of host publication: Proceedings - 24th Asia-Pacific Software Engineering Conference, APSEC 2017
Publisher: IEEE Computer Society
Pages: 495-504
Number of pages: 10
Volume: 2017-December
ISBN (Electronic): 9781538636817
DOI: 10.1109/APSEC.2017.56
State: Published - Mar 1 2018
Event: 24th Asia-Pacific Software Engineering Conference, APSEC 2017 - Nanjing, Jiangsu, China
Duration: Dec 4 2017 - Dec 8 2017



All Science Journal Classification (ASJC) codes

  • Software

Cite this

Khanfor, A., Yang, Y., Vesonder, G., Ruhe, G., & Messinger, D. (2018). Failure Prediction in Crowdsourced Software Development. In Proceedings - 24th Asia-Pacific Software Engineering Conference, APSEC 2017 (Vol. 2017-December, pp. 495-504). IEEE Computer Society. https://doi.org/10.1109/APSEC.2017.56