Connect Four AI

An AlphaZero-style AI for Connect Four, built entirely from scratch in Python. This project combines a neural network with Monte Carlo Tree Search (MCTS) to create a competitive game-playing agent that learns through self-play.

How It Works

The AI uses the same approach that powered DeepMind's AlphaZero, which famously mastered Chess, Shogi, and Go through pure self-play without any human knowledge beyond the game rules.

AlphaZero Architecture

Board State (6x7 grid → 84D vector) → Neural Network (256 hidden neurons) → MCTS (tree search) → Best Move (action selection)
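
As a concrete illustration of the first step, here is a minimal sketch of how a 6x7 board could be flattened into the 84-dimensional input vector. The two-plane layout and the name `encode_board` are assumptions for illustration; the actual encoding lives in helper.py and may differ.

```python
import numpy as np

def encode_board(board: np.ndarray) -> np.ndarray:
    """Flatten a 6x7 Connect Four board into an 84-dimensional input:
    one 42-entry plane per player (hypothetical layout)."""
    current = (board == 1).astype(np.float32).ravel()    # current player's pieces
    opponent = (board == -1).astype(np.float32).ravel()  # opponent's pieces
    return np.concatenate([current, opponent])           # shape (84,)

# Example: an empty board maps to an all-zero 84D vector
empty = np.zeros((6, 7), dtype=np.int8)
assert encode_board(empty).shape == (84,)
```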

The system has two key components:

  1. Neural Network: Takes the encoded board as input and outputs a policy (a probability for each of the 7 columns) and a value (the expected outcome for the player to move).
  2. MCTS: Uses those predictions to guide a tree search, producing move choices stronger than the raw network output alone.

Training Process

The AI improves through an iterative self-play loop:

  1. Self-Play: Current best model plays games against itself
  2. Data Collection: Store board positions, MCTS policies, and game outcomes
  3. Training: Update neural network to better predict policies and values
  4. Evaluation: New model plays against previous best; replace if stronger
  5. Repeat: Continue until convergence

Key Insight: The neural network and MCTS reinforce each other. Better neural network predictions make MCTS more efficient, and MCTS provides better training targets for the neural network. This creates a powerful feedback loop that drives improvement.
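
A condensed, pseudocode-style version of this loop is sketched below. The callables (`self_play_game`, `train_step`, `evaluate_match`) and the default hyperparameters are placeholders for illustration, not the actual API in train.py.

```python
from collections import deque
import random

def training_loop(best_model, self_play_game, train_step, evaluate_match,
                  num_iterations=100, games_per_iter=50,
                  buffer_size=50_000, batch_size=256, win_threshold=0.55):
    """Illustrative AlphaZero-style loop; the callables stand in for the
    repo's real self-play, training, and evaluation code."""
    replay_buffer = deque(maxlen=buffer_size)

    for _ in range(num_iterations):
        # 1-2. Self-play with the current best model; store
        #      (board, mcts_policy, outcome) examples in the buffer.
        for _ in range(games_per_iter):
            replay_buffer.extend(self_play_game(best_model))

        # 3. Train a candidate network on sampled positions.
        batch = random.sample(list(replay_buffer),
                              min(batch_size, len(replay_buffer)))
        candidate = train_step(best_model, batch)

        # 4. Evaluation gate: promote the candidate only if it clearly
        #    beats the current best model in a head-to-head match.
        if evaluate_match(candidate, best_model, games=40) > win_threshold:
            best_model = candidate

    return best_model
```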

Play Against the AI

The interactive demo uses minimax with alpha-beta pruning (rather than the trained network) to choose its moves:

(Interactive Connect Four board: click a column to drop your piece. You play Red.)
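
For reference, a generic sketch of minimax with alpha-beta pruning is shown below. The game-specific helpers (`legal_moves`, `apply_move`, `evaluate`, `is_terminal`) are passed in as placeholders; they are not the demo's actual function names.

```python
import math

def alphabeta(state, depth, alpha, beta, maximizing,
              legal_moves, apply_move, evaluate, is_terminal):
    """Minimax with alpha-beta pruning over a generic game interface."""
    if depth == 0 or is_terminal(state):
        return evaluate(state), None

    best_move = None
    if maximizing:
        value = -math.inf
        for move in legal_moves(state):
            child_value, _ = alphabeta(apply_move(state, move), depth - 1,
                                       alpha, beta, False,
                                       legal_moves, apply_move, evaluate, is_terminal)
            if child_value > value:
                value, best_move = child_value, move
            alpha = max(alpha, value)
            if alpha >= beta:        # prune: opponent never allows this line
                break
    else:
        value = math.inf
        for move in legal_moves(state):
            child_value, _ = alphabeta(apply_move(state, move), depth - 1,
                                       alpha, beta, True,
                                       legal_moves, apply_move, evaluate, is_terminal)
            if child_value < value:
                value, best_move = child_value, move
            beta = min(beta, value)
            if alpha >= beta:
                break
    return value, best_move
```

Called as `alphabeta(root_state, depth, -math.inf, math.inf, True, ...)`, it returns the best score found and the corresponding column for the side to move.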

Features

Self-Play Training

AlphaZero-style training loop with replay buffer and incremental model generations.

Neural Network

Custom implementation with 256 hidden neurons and separate policy and value heads.
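
A from-scratch forward pass in this spirit might look like the sketch below (84 inputs, one 256-unit hidden layer, a 7-way policy head, and a scalar value head). All names and the exact layer layout beyond the 256 hidden neurons are assumptions, not the code in neural_network.py.

```python
import numpy as np

rng = np.random.default_rng(0)

class PolicyValueNet:
    """Tiny illustrative network: 84 inputs -> 256 hidden -> (7-way policy, 1 value)."""
    def __init__(self, input_size=84, hidden=256, actions=7):
        self.w1 = rng.normal(0, 0.1, (input_size, hidden))
        self.b1 = np.zeros(hidden)
        self.w_policy = rng.normal(0, 0.1, (hidden, actions))
        self.b_policy = np.zeros(actions)
        self.w_value = rng.normal(0, 0.1, (hidden, 1))
        self.b_value = np.zeros(1)

    def predict(self, x):
        h = np.maximum(0, x @ self.w1 + self.b1)           # ReLU hidden layer
        logits = h @ self.w_policy + self.b_policy
        policy = np.exp(logits - logits.max())
        policy /= policy.sum()                             # softmax over the 7 columns
        value = np.tanh(h @ self.w_value + self.b_value)   # position value in [-1, 1]
        return policy, value.item()

# Example: evaluate an all-zero (empty-board) input vector
net = PolicyValueNet()
policy, value = net.predict(np.zeros(84))
```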

MCTS

Monte Carlo Tree Search with UCB exploration and neural network guidance.
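
The selection rule at the heart of that search is the PUCT variant of UCB used in AlphaZero-style MCTS; a minimal sketch is below. The constant `c_puct = 1.5` and the dictionary-based node representation are illustrative choices, not necessarily what mcts.py uses.

```python
import math

def puct_score(child_value, child_visits, child_prior, parent_visits, c_puct=1.5):
    """PUCT: mean action value plus an exploration bonus weighted by the
    network's prior probability for the move."""
    q = child_value / child_visits if child_visits > 0 else 0.0
    u = c_puct * child_prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + u

def select_child(children, parent_visits):
    """children: dicts holding accumulated 'value', 'visits', and 'prior'."""
    return max(children, key=lambda c: puct_score(c["value"], c["visits"],
                                                  c["prior"], parent_visits))

# Example: an unvisited move with a high prior wins over a visited one
children = [
    {"value": 3.0, "visits": 10, "prior": 0.3},
    {"value": 0.0, "visits": 0,  "prior": 0.5},
]
best = select_child(children, parent_visits=10)
```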

Human vs AI

Interactive gameplay mode to test your skills against trained models.
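
A console-style game loop for this mode could be as simple as the sketch below; the `game` and `ai_agent` objects and their methods are hypothetical stand-ins for the repo's board and engine classes.

```python
def play_vs_ai(game, ai_agent):
    """Minimal human-vs-AI loop (illustrative; method names are hypothetical)."""
    while not game.is_over():
        game.print_board()
        if game.current_player == 1:                        # human plays Red
            column = int(input("Your move (column 0-6): "))
        else:
            column = ai_agent.choose_move(game)              # AI move via MCTS
        game.drop_piece(column)
    game.print_board()
    print("Result:", game.result())
```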

Model Evaluation

Ladder system comparing successive generations to track improvement.
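
One way such a ladder match can be scored is sketched below: alternate which model moves first and count wins plus half-points for draws. `play_game` is a placeholder for the repo's actual game-playing routine.

```python
def head_to_head(new_agent, best_agent, play_game, games=40):
    """Illustrative ladder match. `play_game(first, second)` should return
    +1 / 0 / -1 from `first`'s point of view."""
    score = 0.0
    for i in range(games):
        if i % 2 == 0:
            result = play_game(new_agent, best_agent)
        else:
            result = -play_game(best_agent, new_agent)   # flip to new_agent's view
        score += (result + 1) / 2                        # map -1/0/+1 to 0 / 0.5 / 1
    return score / games                                 # fraction of points won
```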

Debug Tools

Position analysis with inspection of the network's policy and value outputs.
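
In the same spirit, inspecting a single position might look like the sketch below. It assumes a network object exposing a `predict(x) -> (policy, value)` method, as in the earlier sketch, which may not match debug.py's actual interface.

```python
import numpy as np

def inspect_position(net, board_vector, top_k=3):
    """Print the value estimate and the top-k column probabilities
    for one encoded position (illustrative debugging helper)."""
    policy, value = net.predict(board_vector)
    print(f"value estimate: {value:+.3f}")
    for column in np.argsort(policy)[::-1][:top_k]:
        print(f"  column {column}: p = {policy[column]:.3f}")
```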

Project Structure

```
# Core modules
train.py            # AlphaZero training loop
neural_network.py   # NN with policy + value heads
mcts.py             # Monte Carlo Tree Search
helper.py           # Game logic & board state
engines.py          # Agent wrappers

# Evaluation & analysis
evaluate.py         # Model generation benchmarks
eval_baselines.py   # Test vs random/heuristic
debug.py            # Policy/value inspection
```

Why AlphaZero?

Traditional game AI relies on handcrafted evaluation functions and extensive domain knowledge. AlphaZero's approach is fundamentally different: it learns entirely from self-play, discovering strategies that humans might never consider.

While Connect Four is a "solved" game (perfect play results in a win for the first player), building an AlphaZero-style agent demonstrates the power of combining deep learning with tree search—the same techniques that have achieved superhuman performance in far more complex games.