Article Information

  • Title: Optimal Periods for Probing Convergence of Infinite-stage Dynamic Programmings on GPUs
  • Authors: Tsutomu Inamoto; Yoshinobu Higami; Shin-ya Kobayashi
  • Journal: International Journal of Networking and Computing
  • Print ISSN: 2185-2847
  • Publication year: 2014
  • Volume: 4
  • Issue: 2
  • Pages: 321-335
  • Language: English
  • Publisher: International Journal of Networking and Computing
  • Abstract: In this paper, we propose a basic technique to minimize the computational time of executing infinite-stage dynamic programming (DP) on a GPU. Infinite-stage DP involves computations that probe whether the value function has come sufficiently close to the optimal one. The cost of this convergence probing becomes apparent when an infinite-stage DP is executed on a GPU, since these computations are unnecessary for finite-stage DPs and are hidden within the loops that update state values when a DP is executed on a CPU. The heart of the proposed technique is to reduce these probing computations by thinning them out. With the proposed technique, the differences between state values before and after an update are periodically transferred to main memory and then checked to probe convergence. This intermittent probing contrasts with ordinary methods, in which the probing computations are performed at every update. The technique also includes a formulation for determining optimal probing periods from simple statistics obtained in preliminary experiments. The effectiveness of the proposed technique is examined on two problems: one is a variant of the animat problem, in which an agent moves around a maze to collect food, and the other is the mountain-car problem, in which a powerless car on a slope struggles to pass over a higher peak. Computational results show that a method using the proposed technique reduces computational times for both problems compared to methods that probe convergence at every update, and the reduction appears more pronounced when the state space is larger.
  • Keywords: dynamic programming; value iteration; GPU
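
The intermittent probing described in the abstract can be illustrated with a minimal CPU-only sketch of value iteration. This is not the authors' GPU implementation, and it does not reproduce their formulation for choosing the period. The function name `value_iteration`, the `period` parameter, and the five-state chain problem are assumptions made for illustration only. On a GPU, the probe step would correspond to transferring the value differences to main memory and checking them there; here it is simply a maximum over a Python list.

```python
def value_iteration(transitions, rewards, gamma, tol, period, max_sweeps=10_000):
    """Synchronous value iteration that probes convergence every `period` sweeps.

    transitions[s][a] is the (deterministic) next state and rewards[s][a] the
    immediate reward for taking action a in state s. Both are hypothetical
    inputs for this sketch.
    Returns (values, sweeps_executed, probes_executed).
    """
    n = len(transitions)
    values = [0.0] * n
    probes = 0
    for sweep in range(1, max_sweeps + 1):
        # Update every state value from the previous sweep's values.
        new_values = [
            max(
                rewards[s][a] + gamma * values[transitions[s][a]]
                for a in range(len(transitions[s]))
            )
            for s in range(n)
        ]
        # Probe convergence only on every `period`-th sweep.
        if sweep % period == 0:
            probes += 1
            residual = max(abs(x - y) for x, y in zip(new_values, values))
            if residual < tol:
                return new_values, sweep, probes
        values = new_values
    return values, max_sweeps, probes


# Hypothetical 5-state chain: action 0 moves left, action 1 moves right.
# State 4 is absorbing; entering it from state 3 yields reward 1.
transitions = [[0, 1], [0, 2], [1, 3], [2, 4], [4, 4]]
rewards = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 1.0], [0.0, 0.0]]

every_sweep = value_iteration(transitions, rewards, gamma=0.9, tol=1e-9, period=1)
thinned = value_iteration(transitions, rewards, gamma=0.9, tol=1e-9, period=4)
print(every_sweep[1:], thinned[1:])
```

In this toy problem, probing every fourth sweep cuts the number of probes but runs a few extra sweeps before convergence is noticed. That trade-off between probing cost and overshoot is what makes the choice of period worth optimizing.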