Publisher: The Japanese Society for Artificial Intelligence
Abstract: Crowdsourcing allows human intelligence tasks to be outsourced to a large number of unspecified people at low cost. However, because crowd workers vary in ability and diligence, the quality of their work is also uneven and sometimes quite low. Quality control is therefore one of the central issues in crowdsourcing research. In this paper, we address a quality control problem for enumeration tasks, in which workers are asked to enumerate as many answers satisfying given conditions as possible. As examples of enumeration tasks, we consider text collection tasks in addition to POI (point-of-interest) collection tasks. Because workers do not necessarily provide correct answers, and because answers indicating the same object can differ due to orthographic or numerical variations, we propose a two-stage quality control method consisting of an answer clustering stage and a reliability estimation stage. The answer clustering stage uses a new constrained exemplar clustering method to group answers indicating the same object into a cluster and selects a representative answer from each cluster; the reliability estimation stage then uses a modified HITS algorithm to estimate the reliabilities of the representative answers and removes unreliable ones. The effectiveness of our method is demonstrated against baseline methods on several real crowdsourcing datasets of POI collection and text collection tasks.
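
To make the reliability estimation stage concrete, the following is a minimal sketch of a HITS-style mutual-reinforcement iteration on a bipartite worker-answer graph. It is only an illustration of the general HITS idea the abstract refers to: the matrix construction, function name `hits_reliability`, and normalization are assumptions, not the paper's exact modified HITS formulation.

```python
import numpy as np

def hits_reliability(A, iters=50):
    """HITS-style mutual reinforcement on a bipartite worker-answer graph.

    A: (n_workers, n_answers) binary matrix; A[i, j] = 1 if worker i
       submitted an answer in cluster j. Illustrative assumption only;
       the paper's modified HITS may weight or normalize differently.
    Returns (worker_ability, answer_reliability), each L2-normalized.
    """
    n_workers, n_answers = A.shape
    ability = np.ones(n_workers)      # analogous to HITS hub scores
    reliability = np.ones(n_answers)  # analogous to HITS authority scores
    for _ in range(iters):
        # An answer is reliable if it is backed by able workers.
        reliability = A.T @ ability
        reliability /= np.linalg.norm(reliability)
        # A worker is able if they submit reliable answers.
        ability = A @ reliability
        ability /= np.linalg.norm(ability)
    return ability, reliability
```

In a pipeline like the one described, the representative answers whose estimated reliability falls below some threshold would then be discarded; the specific modification to HITS and the thresholding rule are design details of the paper not reproduced here.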