Platform-Related Factors in Repeatability and Reproducibility of Crowdsourcing Tasks

Rehab Qarout; Alessandro Checco; Gianluca Demartini; Kalina Bontcheva

doi:10.1609/hcomp.v7i1.5264

Authors

Rehab Qarout, The University of Sheffield
Alessandro Checco, The University of Sheffield
Gianluca Demartini, The University of Queensland
Kalina Bontcheva, The University of Sheffield


Abstract

Crowdsourcing platforms provide a convenient and scalable way to collect human-generated labels on demand. These labels can be used to train Artificial Intelligence (AI) systems or to evaluate the effectiveness of algorithms. The quality of crowdsourced datasets, however, depends on many factors. These include, among others, the population sample bias introduced by aspects like task reward, requester reputation, and other filters introduced by the task design.

In this paper, we analyse platform-related factors and study how they affect dataset characteristics through a longitudinal study in which we compare the reliability of results collected with repeated experiments over time and across crowdsourcing platforms. Results show that, under certain conditions: 1) experiments replicated across different platforms result in significantly different data quality levels, while 2) the quality of data from repeated experiments over time is stable within the same platform. We identify some key task design variables that cause such variations and propose an experimentally validated set of actions to counteract these effects, thus achieving reliable and repeatable crowdsourced data collection experiments.
