Monte Carlo Tree Search (MCTS)
MCTS is a planning algorithm that builds a decision tree through random simulations and identifies the most promising actions.
MCTS builds decision trees through simulation – the planning engine behind AlphaGo, now also adapted for LLM reasoning.
Explanation
MCTS repeats four steps: selection (UCB), expansion, simulation (rollout), backpropagation. Combined with neural networks (AlphaGo), the NN value replaces simulation.
Marketing Relevance
MCTS is the planning component of AlphaGo/AlphaZero and is increasingly adapted for LLM reasoning (tree-of-thought).
Common Pitfalls
Computationally expensive with large branching factors. Quality depends on rollout policy. Not suitable for real-time decisions.
Origin & History
Coulom (2006) and Kocsis & Szepesvári (2006, UCT) introduced MCTS. AlphaGo (DeepMind, 2016) made MCTS world-famous. AlphaZero (2017) used MCTS + self-play without human data.
Comparisons & Differences
Monte Carlo Tree Search (MCTS) vs. Minimax
Minimax searches the entire tree (exact but slow); MCTS samples and focuses on promising paths – scales better.