期刊名称:Electronic Colloquium on Computational Complexity
印刷版ISSN:1433-8092
出版年度:2020
卷号:2020
页码:1-73
出版社:Universität Trier, Lehrstuhl für Theoretische Computer-Forschung
摘要:“Games against Nature” [Pap85] are two-player games of perfect information, in which one player’s moves are made randomly (here, uniformly); the final payoff to the non-random player is given by some [0, 1]-valued function of the move history. Estimating the value of such games under optimal play, and computing near-optimal strategies, is an important goal in the study of decision-making under uncertainty, and has seen significant research in AI and allied areas [HRTP11], with only experimental evaluation of most algorithms’ performance. The problem’s PSPACE-completeness does not rule out nontrivial algorithms. Improved algorithms with theoretical guarantees are known in various cases where the payoff function F has special structure, and Littman, Majercik, and Pitassi [LMP01] give a sampling-based improved algorithm for general F, for turn-orders which restrict the number of non-random player strategies. We study the case of general F for which the players strictly alternate with binary moves (w1, r1, w2, r2, . . . , wn/2 , rn/2 )—for which the approach of [LMP01] does not improve over brute force. We give a randomized algorithm to approximate the value of such games under optimal play, and to execute near-optimal strategies. Our algorithm achieves exponential savings over brute-force, making 2(1−δ)n queries to F for some absolute constant δ > 0, and certifies a lower bound ˆv on the game value v with additive expected error bounded as E[v − vˆ] ≤ exp(−Ω(n)). (On the downside, δ is tiny and the algorithm uses exponential space.) Our algorithm is recursive, and bootstraps a “base case” algorithm for fixed-size inputs. The method of recursive composition used, the specific base-case guarantees needed, and the steps to establish these guarantees are interesting and, we feel, likely to find uses beyond the present work.