AAAI Publications, First AAAI Conference on Human Computation and Crowdsourcing

Using Crowdsourcing to Generate an Evaluation Dataset for Name Matching Technologies
Alya Asarina, Olga Simek

Last modified: 2013-11-03


Crowdsourcing can be a fast, flexible, and cost-effective approach to obtaining data for training and evaluating machine learning algorithms. In this paper, we discuss a novel crowdsourcing application: creating a dataset for evaluating name matchers. Name matching is the challenging and subjective task of identifying which names refer to the same person; it is crucial for effective entity disambiguation and search. We have developed an effective question interface and a work quality analysis algorithm for our task, both of which can be applied to other ranking tasks (e.g., search result ranking and recommendation system evaluation). We demonstrate that our crowdsourced dataset can successfully be used to evaluate automatic name-matching algorithms.
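To make the task concrete: an automatic name matcher scores how likely two name strings are to refer to the same person, and a crowdsourced gold standard can then be used to evaluate the ranked output. The sketch below is a minimal illustration only, not the matcher evaluated in the paper; it uses Python's `difflib` string similarity with simple token normalization, and the function names (`name_similarity`, `rank_candidates`) are hypothetical.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Crude similarity score between two names, in [0.0, 1.0].

    Normalizes case and token order so that "Smith, John" and
    "john smith" compare as identical. Real name matchers handle
    nicknames, transliteration, initials, etc.
    """
    def norm(s: str) -> str:
        return " ".join(sorted(s.replace(",", " ").lower().split()))

    return SequenceMatcher(None, norm(a), norm(b)).ratio()


def rank_candidates(query: str, candidates: list[str]) -> list[str]:
    # Rank candidate names by similarity to the query, best first --
    # the kind of ranked list a crowdsourced dataset can score.
    return sorted(candidates, key=lambda c: name_similarity(query, c),
                  reverse=True)
```

For example, `rank_candidates("John Smith", ["Alice Jones", "Smith, John", "Jon Smyth"])` places "Smith, John" first, since after normalization it matches the query exactly.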


crowdsourcing; work quality; ranking; name matching
