Connect Four AI

An AlphaZero-style AI for Connect Four, built entirely from scratch in Python. This project combines a neural network with Monte Carlo Tree Search (MCTS) to create a competitive game-playing agent that learns through self-play.

How It Works

The AI uses the same approach that powered DeepMind's AlphaZero, which famously mastered Chess, Shogi, and Go through pure self-play without any human knowledge beyond the game rules.

AlphaZero Architecture

Board State (6x7 grid → 84D vector) → Neural Network (256 hidden neurons) → MCTS (tree search) → Best Move (action selection)
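
As a concrete illustration of the first step, here is a minimal sketch of how a 6x7 board could be flattened into the 84-dimensional input vector. The two-plane layout and the name `encode_board` are assumptions for illustration; the actual encoding lives in helper.py and may differ.

```python
import numpy as np

def encode_board(board: np.ndarray) -> np.ndarray:
    """Flatten a 6x7 Connect Four board into an 84-dimensional input:
    one 42-entry plane per player (hypothetical layout)."""
    current = (board == 1).astype(np.float32).ravel()    # current player's pieces
    opponent = (board == -1).astype(np.float32).ravel()  # opponent's pieces
    return np.concatenate([current, opponent])           # shape (84,)

# Example: an empty board maps to an all-zero 84D vector
empty = np.zeros((6, 7), dtype=np.int8)
assert encode_board(empty).shape == (84,)
```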

The system has two key components:

  1. Neural Network: Takes the encoded board as input and outputs a policy (a probability for each of the 7 columns) and a value (the expected outcome for the player to move).
  2. MCTS: Uses those predictions to guide a tree search, producing move choices stronger than the raw network output alone.

Training Process

The AI improves through an iterative self-play loop:

  1. Self-Play: Current best model plays games against itself
  2. Data Collection: Store board positions, MCTS policies, and game outcomes
  3. Training: Update neural network to better predict policies and values
  4. Evaluation: New model plays against previous best; replace if stronger
  5. Repeat: Continue until convergence

Key Insight: The neural network and MCTS reinforce each other. Better neural network predictions make MCTS more efficient, and MCTS provides better training targets for the neural network. This creates a powerful feedback loop that drives improvement.
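
A condensed, pseudocode-style version of this loop is sketched below. The callables (`self_play_game`, `train_step`, `evaluate_match`) and the default hyperparameters are placeholders for illustration, not the actual API in train.py.

```python
from collections import deque
import random

def training_loop(best_model, self_play_game, train_step, evaluate_match,
                  num_iterations=100, games_per_iter=50,
                  buffer_size=50_000, batch_size=256, win_threshold=0.55):
    """Illustrative AlphaZero-style loop; the callables stand in for the
    repo's real self-play, training, and evaluation code."""
    replay_buffer = deque(maxlen=buffer_size)

    for _ in range(num_iterations):
        # 1-2. Self-play with the current best model; store
        #      (board, mcts_policy, outcome) examples in the buffer.
        for _ in range(games_per_iter):
            replay_buffer.extend(self_play_game(best_model))

        # 3. Train a candidate network on sampled positions.
        batch = random.sample(list(replay_buffer),
                              min(batch_size, len(replay_buffer)))
        candidate = train_step(best_model, batch)

        # 4. Evaluation gate: promote the candidate only if it clearly
        #    beats the current best model in a head-to-head match.
        if evaluate_match(candidate, best_model, games=40) > win_threshold:
            best_model = candidate

    return best_model
```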

Play Against the AI

The interactive demo uses minimax with alpha-beta pruning (rather than the trained network) to choose its moves:

(Interactive Connect Four board: click a column to drop your piece. You play Red.)
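
For reference, a generic sketch of minimax with alpha-beta pruning is shown below. The game-specific helpers (`legal_moves`, `apply_move`, `evaluate`, `is_terminal`) are passed in as placeholders; they are not the demo's actual function names.

```python
import math

def alphabeta(state, depth, alpha, beta, maximizing,
              legal_moves, apply_move, evaluate, is_terminal):
    """Minimax with alpha-beta pruning over a generic game interface."""
    if depth == 0 or is_terminal(state):
        return evaluate(state), None

    best_move = None
    if maximizing:
        value = -math.inf
        for move in legal_moves(state):
            child_value, _ = alphabeta(apply_move(state, move), depth - 1,
                                       alpha, beta, False,
                                       legal_moves, apply_move, evaluate, is_terminal)
            if child_value > value:
                value, best_move = child_value, move
            alpha = max(alpha, value)
            if alpha >= beta:        # prune: opponent never allows this line
                break
    else:
        value = math.inf
        for move in legal_moves(state):
            child_value, _ = alphabeta(apply_move(state, move), depth - 1,
                                       alpha, beta, True,
                                       legal_moves, apply_move, evaluate, is_terminal)
            if child_value < value:
                value, best_move = child_value, move
            beta = min(beta, value)
            if alpha >= beta:
                break
    return value, best_move
```

Called as `alphabeta(root_state, depth, -math.inf, math.inf, True, ...)`, it returns the best score found and the corresponding column for the side to move.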

Features

Self-Play Training

AlphaZero-style training loop with replay buffer and incremental model generations.

Neural Network

Custom implementation with 256 hidden neurons and separate policy and value heads.
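
A from-scratch forward pass in this spirit might look like the sketch below (84 inputs, one 256-unit hidden layer, a 7-way policy head, and a scalar value head). All names and the exact layer layout beyond the 256 hidden neurons are assumptions, not the code in neural_network.py.

```python
import numpy as np

rng = np.random.default_rng(0)

class PolicyValueNet:
    """Tiny illustrative network: 84 inputs -> 256 hidden -> (7-way policy, 1 value)."""
    def __init__(self, input_size=84, hidden=256, actions=7):
        self.w1 = rng.normal(0, 0.1, (input_size, hidden))
        self.b1 = np.zeros(hidden)
        self.w_policy = rng.normal(0, 0.1, (hidden, actions))
        self.b_policy = np.zeros(actions)
        self.w_value = rng.normal(0, 0.1, (hidden, 1))
        self.b_value = np.zeros(1)

    def predict(self, x):
        h = np.maximum(0, x @ self.w1 + self.b1)           # ReLU hidden layer
        logits = h @ self.w_policy + self.b_policy
        policy = np.exp(logits - logits.max())
        policy /= policy.sum()                             # softmax over the 7 columns
        value = np.tanh(h @ self.w_value + self.b_value)   # position value in [-1, 1]
        return policy, value.item()

# Example: evaluate an all-zero (empty-board) input vector
net = PolicyValueNet()
policy, value = net.predict(np.zeros(84))
```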

MCTS

Monte Carlo Tree Search with UCB exploration and neural network guidance.
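
The selection rule at the heart of that search is the PUCT variant of UCB used in AlphaZero-style MCTS; a minimal sketch is below. The constant `c_puct = 1.5` and the dictionary-based node representation are illustrative choices, not necessarily what mcts.py uses.

```python
import math

def puct_score(child_value, child_visits, child_prior, parent_visits, c_puct=1.5):
    """PUCT: mean action value plus an exploration bonus weighted by the
    network's prior probability for the move."""
    q = child_value / child_visits if child_visits > 0 else 0.0
    u = c_puct * child_prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + u

def select_child(children, parent_visits):
    """children: dicts holding accumulated 'value', 'visits', and 'prior'."""
    return max(children, key=lambda c: puct_score(c["value"], c["visits"],
                                                  c["prior"], parent_visits))

# Example: an unvisited move with a high prior wins over a visited one
children = [
    {"value": 3.0, "visits": 10, "prior": 0.3},
    {"value": 0.0, "visits": 0,  "prior": 0.5},
]
best = select_child(children, parent_visits=10)
```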

Human vs AI

Interactive gameplay mode to test your skills against trained models.
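
A console-style game loop for this mode could be as simple as the sketch below; the `game` and `ai_agent` objects and their methods are hypothetical stand-ins for the repo's board and engine classes.

```python
def play_vs_ai(game, ai_agent):
    """Minimal human-vs-AI loop (illustrative; method names are hypothetical)."""
    while not game.is_over():
        game.print_board()
        if game.current_player == 1:                        # human plays Red
            column = int(input("Your move (column 0-6): "))
        else:
            column = ai_agent.choose_move(game)              # AI move via MCTS
        game.drop_piece(column)
    game.print_board()
    print("Result:", game.result())
```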

Model Evaluation

Ladder system comparing successive generations to track improvement.
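
One way such a ladder match can be scored is sketched below: alternate which model moves first and count wins plus half-points for draws. `play_game` is a placeholder for the repo's actual game-playing routine.

```python
def head_to_head(new_agent, best_agent, play_game, games=40):
    """Illustrative ladder match. `play_game(first, second)` should return
    +1 / 0 / -1 from `first`'s point of view."""
    score = 0.0
    for i in range(games):
        if i % 2 == 0:
            result = play_game(new_agent, best_agent)
        else:
            result = -play_game(best_agent, new_agent)   # flip to new_agent's view
        score += (result + 1) / 2                        # map -1/0/+1 to 0 / 0.5 / 1
    return score / games                                 # fraction of points won
```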

Debug Tools

Position analysis with inspection of the network's policy and value outputs.
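
In the same spirit, inspecting a single position might look like the sketch below. It assumes a network object exposing a `predict(x) -> (policy, value)` method, as in the earlier sketch, which may not match debug.py's actual interface.

```python
import numpy as np

def inspect_position(net, board_vector, top_k=3):
    """Print the value estimate and the top-k column probabilities
    for one encoded position (illustrative debugging helper)."""
    policy, value = net.predict(board_vector)
    print(f"value estimate: {value:+.3f}")
    for column in np.argsort(policy)[::-1][:top_k]:
        print(f"  column {column}: p = {policy[column]:.3f}")
```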

Project Structure

```
# Core modules
train.py            # AlphaZero training loop
neural_network.py   # NN with policy + value heads
mcts.py             # Monte Carlo Tree Search
helper.py           # Game logic & board state
engines.py          # Agent wrappers

# Evaluation & analysis
evaluate.py         # Model generation benchmarks
eval_baselines.py   # Test vs random/heuristic
debug.py            # Policy/value inspection
```

Why AlphaZero?

Traditional game AI relies on handcrafted evaluation functions and extensive domain knowledge. AlphaZero's approach is fundamentally different: it learns entirely from self-play, discovering strategies that humans might never consider.

While Connect Four is a "solved" game (perfect play results in a win for the first player), building an AlphaZero-style agent demonstrates the power of combining deep learning with tree search—the same techniques that have achieved superhuman performance in far more complex games.