Journal: Bulletin of the Technical Committee on Data Engineering
Year: 2012
Volume: 35
Issue: 02
Publisher: IEEE Computer Society
Abstract: Over the past few years, our Data Management, Exploration and Mining (DMX) group at Microsoft
Research has worked closely with the Bing team to address challenging data cleaning and approximate
matching problems. In this article, we describe some of the key Big Data challenges that arise in
Bing services, focusing primarily on two: Bing Maps and Bing Shopping. We describe
ideas that proved crucial in helping meet the quality, performance and scalability goals demanded by
these services. We also briefly reflect on the lessons learned and comment on opportunities for future
work in data cleaning technology for Big Data.