Abstract: Extensible Stylesheet Language Transformations (XSLT) is a de facto standard for transforming and extracting XML data. Efficiently processing large volumes of XML data is a challenge for conventional XSLT processors, which are designed to run on a single machine. To address such data-intensive problems, the MapReduce paradigm from the cloud computing domain has recently received considerable attention in both academia and the IT industry. In this paper, we propose CloudXSLT, a novel MapReduce-based distributed XSLT processing framework for efficient and scalable XML data transformation. First, the architecture of the CloudXSLT framework is outlined. Next, several representation models for XML data and XSLT rules that suit the MapReduce paradigm are defined, and several MapReduce-based distributed XSLT processing algorithms are proposed. Finally, an experiment in a simulated environment with real XML datasets shows that the framework is more efficient and scalable than conventional XSLT processors when processing large XML data.