Connect Four AI
An AlphaZero-style AI for Connect Four, built entirely from scratch in Python. This project combines a neural network with Monte Carlo Tree Search (MCTS) to create a competitive game-playing agent that learns through self-play.
How It Works
The AI uses the same approach that powered DeepMind's AlphaZero, which famously mastered Chess, Shogi, and Go through pure self-play without any human knowledge beyond the game rules.
AlphaZero Architecture
The system has two key components:
- Neural Network with dual output heads:
  - Policy head: Predicts probability distribution over 7 columns
  - Value head: Estimates win probability from current position
- Monte Carlo Tree Search (MCTS): Uses neural network predictions to guide exploration, building a search tree to find the strongest move
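To make "uses neural network predictions to guide exploration" concrete, here is a minimal sketch of the child-selection rule used in AlphaZero-style search, often called PUCT. It is an illustration, not this project's code: the `Node` class, its field names, and the constant `c_puct = 1.5` are all assumptions.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One search-tree node; the field names are illustrative assumptions."""
    prior: float                  # policy-head probability of the move leading here
    visits: int = 0               # number of simulations that passed through this node
    value_sum: float = 0.0        # sum of backed-up values, seen from the parent's side
    children: dict = field(default_factory=dict)  # column index -> Node

    def mean_value(self) -> float:
        return self.value_sum / self.visits if self.visits else 0.0


def puct_score(parent: Node, child: Node, c_puct: float = 1.5) -> float:
    """Mean value so far plus an exploration bonus weighted by the network prior."""
    exploration = c_puct * child.prior * math.sqrt(parent.visits) / (1 + child.visits)
    return child.mean_value() + exploration


def select_child(parent: Node) -> int:
    """Return the column whose child currently has the highest PUCT score."""
    return max(parent.children, key=lambda col: puct_score(parent, parent.children[col]))


# Usage: three candidate columns after ten simulations at the root.
root = Node(prior=1.0, visits=10)
root.children = {
    2: Node(prior=0.2, visits=6, value_sum=1.2),   # well explored, mean value 0.2
    3: Node(prior=0.7, visits=1, value_sum=0.1),   # high prior, barely visited
    4: Node(prior=0.1, visits=3, value_sum=-0.9),  # looks bad so far
}
best = select_child(root)
print(best)
```

In this toy position the search picks column 3: the network rates it highly and it has hardly been explored, so its exploration bonus outweighs the better-established value of column 2.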
Training Process
The AI improves through an iterative self-play loop:
- Self-Play: Current best model plays games against itself
- Data Collection: Store board positions, MCTS policies, and game outcomes
- Training: Update neural network to better predict policies and values
- Evaluation: New model plays against previous best; replace if stronger
- Repeat: Continue until new generations stop improving on the previous best
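The data-collection and evaluation steps above can be sketched as follows. This is a hedged illustration under stated assumptions, not the project's implementation: the function names, the `(board, policy, player)` history format, the buffer size of 50,000, and the 55% promotion threshold are all hypothetical choices.

```python
from collections import deque


def label_game(history, winner):
    """Turn one finished self-play game into (board, policy, outcome) examples.

    history: list of (board, mcts_policy, player_to_move), with player in {1, -1}
    winner:  1 or -1 for a decisive game, 0 for a draw
    """
    examples = []
    for board, policy, player in history:
        # The outcome is stored from the point of view of the player to move.
        outcome = 0.0 if winner == 0 else (1.0 if winner == player else -1.0)
        examples.append((board, policy, outcome))
    return examples


def should_promote(wins, losses, draws, threshold=0.55):
    """Promote the candidate if its score rate (draws count half) reaches the threshold."""
    games = wins + losses + draws
    if games == 0:
        return False
    return (wins + 0.5 * draws) / games >= threshold


# Bounded replay buffer: the oldest examples fall off as new ones arrive.
replay_buffer = deque(maxlen=50_000)

# Usage: a three-move toy game won by player 1 (board values are placeholders).
history = [
    ("b0", [0, 0, 0, 1, 0, 0, 0], 1),
    ("b1", [0, 0, 1, 0, 0, 0, 0], -1),
    ("b2", [0, 0, 0, 1, 0, 0, 0], 1),
]
replay_buffer.extend(label_game(history, winner=1))
outcomes = [z for _, _, z in replay_buffer]
print(outcomes)
print(should_promote(wins=12, losses=6, draws=2))
```

Storing each outcome relative to the player to move lets a single value head be trained on positions from both sides of the board.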
Play Against the AI
The demo below uses minimax search with alpha-beta pruning to choose strong moves:
Connect Four
Click a column to drop your piece
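For readers unfamiliar with the technique the demo relies on, here is a minimal Python sketch of minimax with alpha-beta pruning, written in negamax form. It is not the demo's code: the board layout (row 0 at the top, cells holding 1, -1, or 0), the depth limit, the coarse win/loss/unknown scoring, and all function names are assumptions made for illustration.

```python
ROWS, COLS = 6, 7


def legal_moves(board):
    """Columns that still have room, center columns first to help pruning."""
    order = sorted(range(COLS), key=lambda c: abs(c - COLS // 2))
    return [c for c in order if board[0][c] == 0]


def drop(board, col, player):
    """Return a new board with the player's piece dropped into col."""
    new = [row[:] for row in board]
    for r in range(ROWS - 1, -1, -1):   # scan from the bottom row upward
        if new[r][col] == 0:
            new[r][col] = player
            return new
    raise ValueError("column is full")


def has_won(board, player):
    """True if the player has four in a row in any direction."""
    for r in range(ROWS):
        for c in range(COLS):
            for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                cells = [(r + i * dr, c + i * dc) for i in range(4)]
                if all(0 <= rr < ROWS and 0 <= cc < COLS and board[rr][cc] == player
                       for rr, cc in cells):
                    return True
    return False


def negamax(board, player, depth, alpha=-1.0, beta=1.0):
    """Return (score, column) for the player to move: +1 win, -1 loss, 0 draw or unknown."""
    if has_won(board, -player):          # the opponent's last move ended the game
        return -1.0, None
    moves = legal_moves(board)
    if depth == 0 or not moves:          # search horizon reached, or the board is full
        return 0.0, None
    best_score, best_col = -2.0, moves[0]
    for col in moves:
        score = -negamax(drop(board, col, player), -player, depth - 1, -beta, -alpha)[0]
        if score > best_score:
            best_score, best_col = score, col
        alpha = max(alpha, score)
        if alpha >= beta:                # the remaining moves cannot change the result
            break
    return best_score, best_col


# Usage: player 1 has three in a row on the bottom and should complete it in column 3.
board = [[0] * COLS for _ in range(ROWS)]
board[5][0] = board[5][1] = board[5][2] = 1
board[4][0] = board[4][1] = board[5][6] = -1
score, col = negamax(board, player=1, depth=2)
print(score, col)
```

A practical engine would replace the 0 returned at the search horizon with a heuristic evaluation; this sketch only recognizes forced wins and losses within the given depth.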
Features
Self-Play Training
AlphaZero-style training loop with replay buffer and incremental model generations.
Neural Network
Custom implementation with 256 hidden neurons, policy and value heads.
MCTS
Monte Carlo Tree Search with UCB exploration and neural network guidance.
Human vs AI
Interactive gameplay mode to test your skills against trained models.
Model Evaluation
Ladder system comparing successive generations to track improvement.
Debug Tools
Position analysis with neural network output inspection for debugging.
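As an illustration of the Neural Network and Debug Tools features above, here is a minimal sketch of a forward pass through a network with one 256-unit hidden layer and two heads, followed by a printout of its outputs. Everything beyond "256 hidden neurons, policy and value heads" is an assumption: the flat 42-cell input encoding, the ReLU activation, the weight initialization, the parameter names, and the tanh value in [-1, 1] (which maps to a win probability via (v + 1) / 2).

```python
import math
import random

ROWS, COLS, HIDDEN = 6, 7, 256


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one dense layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer):
    """Apply one fully connected layer to the input vector x."""
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]


def softmax(logits):
    peak = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward(board, params):
    """Return (policy over 7 columns, value in [-1, 1]) for a flat 42-cell board."""
    hidden = [max(0.0, h) for h in dense(board, params["hidden"])]   # ReLU
    policy = softmax(dense(hidden, params["policy"]))
    value = math.tanh(dense(hidden, params["value"])[0])
    return policy, value


rng = random.Random(0)                       # fixed seed so the run is repeatable
params = {
    "hidden": init_layer(ROWS * COLS, HIDDEN, rng),
    "policy": init_layer(HIDDEN, COLS, rng),
    "value": init_layer(HIDDEN, 1, rng),
}

# Usage: inspect the untrained network on a board with one piece in the bottom center.
board = [0.0] * (ROWS * COLS)
board[5 * COLS + 3] = 1.0
policy, value = forward(board, params)
for col, p in enumerate(policy):
    print(f"column {col}: {p:.3f}")
print(f"value: {value:+.3f}")
```

With untrained weights the numbers are meaningless, but the same printout on a trained model shows at a glance whether the policy favors sensible columns and whether the value matches the position.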
Project Structure
Why AlphaZero?
Traditional game AI relies on handcrafted evaluation functions and extensive domain knowledge. AlphaZero's approach is fundamentally different: it learns entirely from self-play, discovering strategies that humans might never consider.
While Connect Four is a "solved" game (with perfect play, the first player wins), building an AlphaZero-style agent demonstrates the power of combining deep learning with tree search, the same techniques that have achieved superhuman performance in far more complex games.