Watch the Unobserved: A Simple Approach to Parallelizing Monte Carlo Tree Search

Liu, Anji; Chen, Jianshu; Yu, Mingze; Zhai, Yu; Zhou, Xuewen; Liu, Ji

arXiv:1810.11755 (cs)

[Submitted on 28 Oct 2018 (v1), last revised 25 Feb 2020 (this version, v5)]

Abstract: Monte Carlo Tree Search (MCTS) algorithms have achieved great success on many challenging benchmarks (e.g., Computer Go). However, they generally require a large number of rollouts, which makes them costly to apply. MCTS is also extremely challenging to parallelize because it is inherently sequential: each rollout relies heavily on the statistics (e.g., node visitation counts) estimated from previous simulations to achieve an effective exploration-exploitation tradeoff. In spite of these difficulties, we develop an algorithm, WU-UCT, that effectively parallelizes MCTS, achieving linear speedup with only limited performance loss as the number of workers increases. The key idea in WU-UCT is a set of statistics that we introduce to track the number of ongoing yet incomplete simulation queries (which we call unobserved samples). These statistics are used to modify the UCT tree policy in the selection steps in a principled manner, so that an effective exploration-exploitation tradeoff is retained when we parallelize the most time-consuming expansion and simulation steps. Experiments on a proprietary benchmark and the Atari Game benchmark demonstrate the linear speedup and the superior performance of WU-UCT compared to existing techniques.
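The idea of tracking unobserved samples can be illustrated with a minimal Python sketch. The abstract does not give the modified policy, so the form used here is an assumption: the count of in-flight simulations is added to the completed visit count inside a standard UCT exploration term. The names `Node`, `select_child`, `mark_dispatched`, `complete`, and the parameter `beta` are hypothetical and are not taken from the authors' code.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    value: float = 0.0    # mean return over completed simulations
    visits: int = 0       # completed simulations through this node
    unobserved: int = 0   # dispatched but not yet completed simulations
    children: dict = field(default_factory=dict)  # action -> Node


def select_child(parent, beta=1.0):
    """Pick the action with the highest UCT score, counting in-flight
    simulations as if they were visits (assumed form of the adjustment)."""
    total = max(parent.visits + parent.unobserved, 1)

    def score(child):
        n = child.visits + child.unobserved
        if n == 0:
            return math.inf  # unvisited, undispatched children go first
        return child.value + beta * math.sqrt(2.0 * math.log(total) / n)

    return max(parent.children, key=lambda a: score(parent.children[a]))


def mark_dispatched(path):
    """Called when a simulation is handed to a worker: bump the
    unobserved count on every node along the selected path."""
    for node in path:
        node.unobserved += 1


def complete(path, ret):
    """Called when a worker returns: convert one unobserved sample into
    an observed one and update the running mean on each node."""
    for node in path:
        node.unobserved -= 1
        node.visits += 1
        node.value += (ret - node.value) / node.visits


# Two children with identical statistics; one simulation is in flight
# through "a", so the next selection is steered toward "b".
root = Node(value=0.5, visits=20)
root.children = {"a": Node(0.5, 10), "b": Node(0.5, 10)}
mark_dispatched([root, root.children["a"]])
next_action = select_child(root)
complete([root, root.children["a"]], ret=1.0)
```

In this sketch, a worker that has not yet reported back still discourages other workers from selecting the same child, which is the effect the abstract attributes to the unobserved-sample statistics.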

Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1810.11755 [cs.LG]
	(or arXiv:1810.11755v5 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1810.11755

Submission history

From: Anji Liu
[v1] Sun, 28 Oct 2018 03:24:01 UTC (258 KB)
[v2] Tue, 5 Feb 2019 16:42:24 UTC (459 KB)
[v3] Thu, 26 Sep 2019 21:10:56 UTC (5,651 KB)
[v4] Tue, 11 Feb 2020 03:48:32 UTC (7,787 KB)
[v5] Tue, 25 Feb 2020 22:03:23 UTC (7,788 KB)
