Article Information

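Title: 逆強化学習とファジィ推論に基づくあいまい性を考慮した報酬関数の設計 (Design of a Reward Function Considering Ambiguity Based on Inverse Reinforcement Learning and Fuzzy Reasoning)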
Authors: 加藤優太; 加納政芳; 中村剛士 et al.
Journal: 知能と情報
Print ISSN: 1347-7986
Online ISSN: 1881-7203
Year: 2021
Volume: 33
Issue: 4
Pages: 827-832
DOI: 10.3156/jsoft.33.4_827
Language: Japanese
Publisher: Japan Society for Fuzzy Theory and Intelligent Informatics
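Abstract: One way to acquire a robot's behavior rules is to design a reward function using inverse reinforcement learning. Because the state space grows exponentially with the number of dimensions, the fraction of state transitions that can be observed shrinks drastically relative to the size of the state space. A reward function can still be designed from partial state-transition information, but the resulting reward function contains ambiguity. Learning with such a reward function therefore requires a reward function that can tolerate ambiguity. In this paper, we propose a method that uses fuzzy reasoning to quantify the ambiguity in a reward function designed by inverse reinforcement learning. The experimental results suggest that the proposed method may be able to learn action sequences that take degrees of risk and safety into account.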
Other abstract: A reward function estimated with inverse reinforcement learning has been used to determine a method for controlling a robot. The number of state transitions that can be observed using the action sequence given to inverse reinforcement learning decreases drastically as the complexity of the planning problem increases. Although a reward function can be designed even from partial state transition information, the obtained reward function contains ambiguity. Learning with such a reward function therefore requires a reward function that can tolerate ambiguity. In this paper, we propose a method that uses fuzzy reasoning to quantify the ambiguity of a reward function designed with inverse reinforcement learning. Experimental results suggest that the proposed method can learn an action sequence while considering the degree of risk and safety.
Keywords (Japanese): 逆強化学習; ファジィ推論; 報酬関数
Keywords (English): inverse reinforcement learning; fuzzy reasoning; reward function