AAAI Publications, First AAAI Conference on Human Computation and Crowdsourcing

CASTLE: Crowd-Assisted System for Text Labeling and Extraction
Sean Louis Goldberg, Daisy Zhe Wang, Tim Kraska

Last modified: 2013-11-03


The amount of text data has been growing exponentially, and with it the demand for improved information extraction (IE) to analyze and query such data. While automatic IE systems have proven useful in controlled experiments, in practice the gap between machine-learned extraction and human extraction remains large. In this paper, we propose a system that uses crowdsourcing techniques to help close this gap. A fundamental issue in using a large-scale human workforce is deciding which questions to pose to the crowd. We demonstrate novel solutions using mutual information and token clustering techniques in the domain of bibliographic citation extraction. Our experiments show promising results for crowd assistance as a cost-effective way to close the “last mile” between extraction systems and a human annotator.


crowdsourcing; information extraction
