Article Information

Title: Optimal Periods for Probing Convergence of Infinite-stage Dynamic Programmings on GPUs
Authors: Tsutomu Inamoto; Yoshinobu Higami; Shin-ya Kobayashi et al.
Journal: International Journal of Networking and Computing
Print ISSN: 2185-2847
Publication year: 2014
Volume: 4
Issue: 2
Pages: 321-335
Language: English
Publisher: International Journal of Networking and Computing
Abstract: In this paper, we propose a basic technique for minimizing the computational time of infinite-stage dynamic programming (DP) on a GPU. Infinite-stage DP involves computations that probe whether the value function has come sufficiently close to the optimal one. These convergence-probing computations become conspicuous when an infinite-stage DP is executed on a GPU: they are unnecessary for finite-stage DPs, and on a CPU they hide behind the loops that update state values. The heart of the proposed technique is to suppress these probing computations by thinning them out. With the proposed technique, the differences between state values before and after updating are periodically transferred to main memory and then checked to probe convergence. This intermittent probing contrasts with ordinary methods, in which the probing computations are performed at every update. The technique also includes a formulation for determining optimal probing periods from simple statistics obtained in preliminary experiments. The effectiveness of the proposed technique is examined on two problems: one is a variant of the animat problem, in which an agent moves around a maze to collect food, and the other is the mountain-car problem, in which a powerless car on a slope struggles to pass over a higher peak. Computational results show that a method using the proposed technique reduces computational times on both problems compared with methods that probe convergence at every update, and the reduction appears more pronounced when the state space is larger.
Keywords: dynamic programming; value iteration; GPU
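
The intermittent probing described in the abstract can be illustrated with a minimal CPU-only sketch in Python. This is not the authors' implementation. The chain-shaped problem, the function names `sweep` and `value_iteration`, and the parameter `probe_period` are all assumptions made for illustration. The device-to-host transfer is only simulated by the point at which the residual is computed, and the paper's formulation for choosing the optimal period is not reproduced here.

```python
def sweep(values, rewards, gamma):
    """One value-iteration sweep on a hypothetical chain-shaped problem.

    From state s the agent moves to s-1 or s+1 (clamped at the ends) and
    receives rewards[t] for entering state t.
    """
    n = len(values)
    new_values = []
    for s in range(n):
        neighbours = (max(s - 1, 0), min(s + 1, n - 1))
        new_values.append(max(rewards[t] + gamma * values[t] for t in neighbours))
    return new_values


def value_iteration(rewards, gamma=0.9, tol=1e-6, probe_period=1, max_sweeps=10_000):
    """Run value iteration, probing convergence only every `probe_period` sweeps.

    Returns (values, sweeps_run, probes_made). On a GPU the probe would be the
    step that copies the differences to main memory; here it is just the
    max-difference computation (an assumption of this sketch).
    """
    values = [0.0] * len(rewards)
    probes = 0
    for sweep_index in range(1, max_sweeps + 1):
        new_values = sweep(values, rewards, gamma)
        if sweep_index % probe_period == 0:
            probes += 1
            residual = max(abs(a - b) for a, b in zip(new_values, values))
            if residual < tol:
                return new_values, sweep_index, probes
        values = new_values
    return values, max_sweeps, probes


if __name__ == "__main__":
    rewards = [0.0, 0.0, 0.0, 0.0, 1.0]
    for period in (1, 10):
        _, sweeps, probes = value_iteration(rewards, probe_period=period)
        print(f"period={period}: sweeps={sweeps}, probes={probes}")
```

The sketch shows the trade-off that a probing period has to balance: a longer period performs fewer probes, but may run up to `probe_period - 1` sweeps beyond the point at which convergence would first have been detected.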