Article Information

Title: Sequential Decision Task by Adaptive Reinforcement Learning Method
Authors: Ankur Verma; Pramod Patil
Journal: International Journal of Computer Science and Information Technologies
Electronic ISSN: 0975-9646
Publication year: 2015
Volume: 6
Issue: 4
Pages: 3677-3681
Publisher: TechScience Publications
Abstract: There is great interest in building intrinsic motivation into artificially intelligent systems using the reinforcement learning framework. Many dynamic situations call for sequences of actions suited to the circumstances. The consequences of an action can emerge at many different times after it is taken, so strategies must account for both short-term and long-term consequences. A model-based approach is proposed here, which requires constructing a model of the state transition and payoff probabilities. Such tasks can be described as a dynamical system whose decision behavior changes over time under the influence of the decisions taken. Modeling this behavior is greatly simplified by the concepts of state and of a policy, which associates an action with each system state. The paper also proposes methods, referred to as adaptive methods, for estimating an optimal policy in the absence of a complete model of the decision task. The practical importance of an adaptive method lies in whether it can improve the decision policy sufficiently rapidly.
Keywords: Reinforcement Learning; Decision policy; state-action function; Q-Learning; Temporal Difference; Learning