Article Information

  • Title: Performance Evaluation and Enhancements of a Flood Simulator Application for Heterogeneous HPC Environments
  • Authors: Ryan Marshall; Sheikh Ghafoor; Mike Rogers
  • Journal: International Journal of Networking and Computing
  • Print ISSN: 2185-2847
  • Publication year: 2018
  • Volume: 8
  • Issue: 2
  • Pages: 387-407
  • Language: English
  • Publisher: International Journal of Networking and Computing
  • Abstract: This paper presents a practical implementation of a 2D flood simulation model using hybrid distributed-parallel technologies including MPI, OpenMP, CUDA, and evaluations of its performance under various configurations that utilize these technologies. The main objective of this research work was to improve the computational performance of the flood simulation in a hybrid architecture. Modern desktops and small cluster systems owned by domain researchers are able to perform these simulations efficiently due to multicore and GPU computing devices, but lack the expertise needed to fully utilize software libraries designed to take advantage of the latest hardware. By leveraging knowledge of our experimentation environment, we were able to incorporate MPI and multiple GPU devices to improve performance over a single-process OpenMP version up to 18x, depending on the size of the input data. We discuss some observations that have significant effects on overall performance, including process-to-device mapping, communication strategies and data partitioning, and present some experimental results. The limitations of this work are discussed, and we propose some ideas to relieve or overcome such limitations in future work.