Journal: International Journal of New Computer Architectures and their Applications
Print ISSN: 2220-9085
Year: 2018
Volume: 8
Issue: 2
Pages: 53-60
Publisher: Society of Digital Information and Wireless Communications
Abstract: In reinforcement learning, the ratio of random actions,
known as the exploration ratio, has often not been adjusted
dynamically, even though this ratio is an indicator of learning
performance. This study proposes a method in which agents learn
using information from the evaluation of another agent’s task
achievement. With the proposed method, the exploration ratio is
adjusted dynamically according to other agents’ behavior. In human
life, an “atmosphere” exists as a means of communication. For
example, people are empirically influenced by a “serious
atmosphere,” such as when working or taking an examination. In this
study, this atmosphere is defined as an agent’s motivation for task
achievement. The study focuses on an agent’s action decision when
another agent has solved the task. In other words, an agent tries
to find an optimal solution if other agents have found one. We
propose an action decision method based on other agents’ behavior
and discuss its effectiveness using the maze problem as an example.
In particular, we focus on the “number of task achievements,” the
“influence for task achievement,” and a quantitative view of how
the task is achieved. As a result, we confirmed that the proposed
method is effectively influenced by other agents’ behavior.
Keywords: Reinforcement Learning; Exploration Ratio; Action
Selection Strategy; Multi-Agent; Behavior using Communication;
Cooperative Work; Interworking Algorithm; Agricultural Weeding Robot
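
The abstract describes an exploration ratio that is adjusted dynamically from other agents' task achievement. The following is only a minimal sketch of that idea, not the authors' method: the record does not give the update rule, so the linear mapping, the function names `adjusted_epsilon` and `select_action`, the bounds `eps_min` and `eps_max`, and the direction of the adjustment (`raise_on_success`) are all assumptions made for illustration.

```python
import random


def adjusted_epsilon(peer_success_rate, eps_min=0.05, eps_max=0.5,
                     raise_on_success=True):
    """Map the fraction of other agents that achieved the task to an
    exploration ratio (epsilon).

    ASSUMPTION: a linear mapping between hypothetical bounds. The source
    does not state the rule or its direction, so the direction is exposed
    as the `raise_on_success` flag.
    """
    if not 0.0 <= peer_success_rate <= 1.0:
        raise ValueError("peer_success_rate must be in [0, 1]")
    level = peer_success_rate if raise_on_success else 1.0 - peer_success_rate
    return eps_min + (eps_max - eps_min) * level


def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy selection: take a random action with probability
    epsilon, otherwise the action with the highest estimated value."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


if __name__ == "__main__":
    # Hypothetical example: 2 of 4 other agents have reached the maze goal.
    eps = adjusted_epsilon(peer_success_rate=2 / 4)
    action = select_action([0.1, 0.9, 0.3, 0.0], eps)
    print(f"epsilon={eps:.3f}, chosen action={action}")
```

Under these assumptions, each agent would recompute its epsilon once per episode from the observed success of the others and then choose maze moves with `select_action`. Setting `raise_on_success=False` gives the opposite reading, where an agent exploits more once its peers have succeeded.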