Journal: International Journal of Computer Science and Network Security
Print ISSN: 1738-7906
Publication year: 2019
Volume: 19
Issue: 4
Pages: 176-186
Publisher: International Journal of Computer Science and Network Security
Abstract: High-Performance Computing (HPC) has recently become important in several sectors, including the scientific and manufacturing fields. Increasingly powerful supercomputers continue to be built, and Exascale supercomputers are expected to become feasible in the next few years. As a result, building massively parallel systems becomes even more important to keep up with upcoming Exascale-related technologies. Building such systems requires a combination of programming models to increase the system's parallelism, especially dual- and tri-level programming models for heterogeneous systems that include CPUs and GPUs. However, building systems with different programming models is difficult and error-prone, and such systems are also hard to test. Testing parallel applications is already a difficult task because parallel errors are hard to detect due to the non-deterministic behavior of parallel applications. Integrating more than one programming model inside the same application makes it even more difficult to test because this integration can introduce new types of errors. We survey the existing testing tools that detect run-time errors in parallel systems. We classify the reviewed tools into categories and sub-categories based on the testing techniques used, the programming models targeted, and the run-time errors detected. Despite the effort put into building testing tools for parallel systems, much work still needs to be done, especially in testing heterogeneous and multi-level programming models. Hopefully, these efforts will keep pace with the expected improvements in HPC systems and lead to systems with fewer errors.