Mastering the Game of Go with Deep Neural Networks and Tree Search

David Silver, Aja Huang, Chris J. Maddison, et al. · 2016

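AlphaGo. Combines deep policy and value networks with Monte Carlo tree search to become the first program to defeat a human professional Go player on a full-sized board, beating European champion Fan Hui 5-0. A landmark demonstration of RL at scale.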

What you'll get

Outline: a plain-English breakdown of the paper's core idea, prerequisites, and the concepts you'll need to implement it.
Exercises: five to ten hands-on tasks, each with a concept card, a prompt, a starter code stub, and a collapsible reference solution.
Runnable notebook: a single .ipynb you can download and open in Jupyter or VS Code to work through every exercise.
Extensions: suggested follow-up experiments so you don't stop at a faithful reimplementation.