出版社:The Japanese Society for Artificial Intelligence
摘要:It has attracted considerable attention to use crowdsourcing services to collect a large amount of labeled data for machine learning, since crowdsourcing services allow one to ask the general public to label data at very low cost through the Internet. The use of crowdsourcing has introduced a new challenge in machine learning, that is, coping with low quality of crowd-generated data. There have been many recent attempts to address the quality problem of multiple labelers, however, there are two serious drawbacks in the existing approaches, that are, (i) non-convexity and (ii) task homogeneity. Most of the existing methods consider true labels as latent variables, which results in non-convex optimization problems. Also, the existing models assume only single homogeneous tasks, while in realistic situations, clients can offer multiple tasks to crowds and crowd workers can work on different tasks in parallel. In this paper, we propose a convex optimization formulation of learning from crowds by introducing personal models of individual crowds without estimating true labels. We further extend the proposed model to multi-task learning based on the resemblance between the proposed formulation and that for an existing multi-task learning model. We also devise efficient iterative methods for solving the convex optimization problems by exploiting conditional independence structures in multiple classifiers.