Abstract: Web service composition is a technology for building complex service applications from individual (atomic) services; through composition, distributed applications and enterprise business processes can be integrated from independently developed service components. In this paper, we concentrate on the optimization problem of dynamic web service composition, and our goal is to find an optimal composition policy. Unlike many traditional composition methods, which do not scale to large continuous-time processes, we introduce a hierarchical reinforcement learning technique, a continuous-time unified MAXQ algorithm, to solve large-scale web service composition problems modeled as a continuous-time semi-Markov decision process (SMDP) under either the average- or the discounted-cost criterion. The proposed algorithm avoids the "curse of modeling" and the "curse of dimensionality" that arise in the optimization process. Finally, we use a travel reservation scenario to illustrate the effectiveness of the proposed algorithm; the simulation results show that it achieves better optimization performance and faster learning than flat Q-learning.
Keywords: web service composition; hierarchical reinforcement learning; semi-Markov decision process (SMDP); MAXQ
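
For reference, the flat Q-learning baseline mentioned in the abstract can be sketched by the standard SMDP update under the discounted-cost criterion. This is a generic textbook form, not the paper's unified MAXQ update; the learning rate \alpha, continuous-time discount rate \beta, sojourn time \tau, and accumulated cost c(s,a,\tau) are assumed notation rather than the paper's:

    Q(s,a) \leftarrow Q(s,a) + \alpha \Big[ c(s,a,\tau) + e^{-\beta\tau} \min_{a'} Q(s',a') - Q(s,a) \Big]

The hierarchical MAXQ approach decomposes this action-value function over a task hierarchy (for instance, flight, hotel, and payment subtasks in a travel-reservation setting), which is what allows it to scale where flat Q-learning does not; the exact unified average-/discounted-cost form is defined in the paper itself.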