Speech by Owen Randall from University of Alberta
Efficiently Solving Games with Expected Work Search
Our speaker, Owen Randall, is from the University of Alberta. He completed his Master's degree there and is now pursuing his Ph.D. He is currently part of Professor Martin Müller's research team, focusing on reinforcement learning and game-tree studies. Professor Martin Müller was the Ph.D. advisor to David Silver, the leader of the AlphaGo team.
During the lecture, Mr. Randall introduced a paper titled "Expected Work Search: Combining Win Rate and Proof Size Estimation," where he serves as the lead author. The method was developed in response to his research team’s discovery that evaluating a game state based solely on win rate or proof size would be insufficient and lead to various problems.
Taking the game of Go as an example, if we rely solely on win rate to evaluate the game state, we may encounter situations where the win rates for various moves are close to 100%. This might give the impression that any move on the board could be equally good. However, the reality is that only a few specific moves will lead to victory in the shortest number of steps. While other moves could still lead to winning the game, they would require more steps and computational resources. Therefore, the aim in such situations is to identify moves that lead to victory in the shortest number of steps (i.e., with the smallest proof size) under similar win rate conditions.
When evaluating the game state based on proof size, discovering a scenario where we can capture the opponent's pieces to increase the number of possible moves leads to a higher branching factor. However, the increase in options can discourage further expansion of such a position through proof number search (PNS), ultimately reducing our chances of winning. Therefore, it's important to strike a balance between proof size and win rate, so that we don't spend excessive time on proofs while still maintaining a good chance of winning.
In this paper, a new approach called Expected Work Search (EWS) is introduced. EWS strikes a balance between win rate and proof size by leveraging metrics from both aspects to prioritize the nodes in the game tree. This prioritization process allows the method to focus on nodes with smaller proof sizes and promising win rates for expansion, ultimately aiming for both speed and high win rates. Compared to traditional Go solvers, this method achieves an impressive sixfold speed increase, showcasing its practicality. For more details, the research paper has been accepted at the esteemed IJCAI conference.
After the lecture, Mr. Randall and the audience enjoyed lunch together. We are truly grateful for the unique opportunity to participate in academic discussions with esteemed speakers from leading universities around the world. This experience has left a lasting impression on all of us.