AAAI Publications, Workshops at the Twenty-Fifth AAAI Conference on Artificial Intelligence

Improving Consensus Accuracy via Z-Score and Weighted Voting
Hyun Joon Jung, Matthew Lease

Last modified: 2011-08-24


Using supervised and unsupervised features, individually or together, we (a) detect and filter out noisy workers via Z-score and (b) weight worker votes for consensus labeling. We evaluate on noisy labels from Amazon Mechanical Turk, in which workers judge the Web search relevance of query/document pairs. Compared to a majority-vote baseline, results show a 6% error reduction for graded accuracy (accuracy rising from 48.83% to 51.91%) and a 5% error reduction for binary accuracy (accuracy rising from 64.88% to 68.33%).
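The two steps named in the abstract, Z-score filtering and weighted voting, can be illustrated with a minimal sketch. The abstract does not specify the features, the Z-score cutoff, or the weighting scheme, so everything concrete here is an assumption: each worker is given a single hypothetical quality score (for example, accuracy on a small set of gold labels), workers whose Z-score falls below a hypothetical threshold of -1.0 are dropped, and the remaining votes are weighted by that same score. The function names `filter_workers_by_zscore` and `weighted_vote` are illustrative, not taken from the paper.

```python
from collections import defaultdict
from statistics import mean, pstdev


def filter_workers_by_zscore(scores, threshold=-1.0):
    """Keep workers whose quality Z-score is at or above `threshold`.

    `scores` maps worker id -> quality score (assumed: higher is better).
    The threshold value is an assumption made for illustration.
    """
    mu = mean(scores.values())
    sigma = pstdev(scores.values())
    if sigma == 0:
        # All workers score the same, so there is nothing to filter.
        return dict(scores)
    return {w: s for w, s in scores.items() if (s - mu) / sigma >= threshold}


def weighted_vote(labels, weights):
    """Return the label with the largest total weight, or None if no votes count.

    `labels` maps worker id -> label for one item; votes from workers
    missing from `weights` (i.e. filtered out) are ignored.
    """
    totals = defaultdict(float)
    for worker, label in labels.items():
        if worker in weights:
            totals[label] += weights[worker]
    if not totals:
        return None
    return max(totals, key=totals.get)


# Hypothetical quality scores for four workers; w4 is the noisy one.
scores = {"w1": 0.90, "w2": 0.85, "w3": 0.80, "w4": 0.20}
kept = filter_workers_by_zscore(scores)

# Graded relevance judgments (0, 1, or 2) for a single query/document pair.
votes = {"w1": 2, "w3": 1, "w4": 1}
consensus = weighted_vote(votes, kept)
```

In this toy example an unweighted majority vote would choose label 1, while dropping the low-scoring worker and weighting the rest selects label 2. Whether such a change helps in practice depends on how well the chosen score reflects true worker quality.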
