Trung Nguyen, Hormazd Nadir Godrej and Luke Kim
Class project for CS 229: Machine Learning at Stanford University.
Each directory is a separate package for a task with the Connect 4 game.
- play: play a game of Connect 4 in the terminal against yourself (or a friend). It serves as a good starting point for familiarizing oneself with the game.
- classification: given a board position, predict who will be the winner. The datasets in use are the UCI and the Kaggle datasets. TODO: add hyperlinks to the datasets
- minimax: solve the game using the Minimax algorithm.
- alpha_zero: solve the game using the AlphaZero algorithm. More details below.
The package alpha_zero is based on:
- The publication from DeepMind (here)
- The AlphaZero implementation for Connect4 by @Zeta36 (here)
- The Reversi development based on the AlphaZero paper by @mokemokechicken (here)
Each package has a different set of requirements, but they all require Python 3.
- play: none.
- classification: requires scikit-learn and pygame.
  pip install scikit-learn pygame
- minimax: requires pygame.
  pip install pygame
- alpha_zero: requires tensorflow, keras and dotenv.
  pip install tensorflow keras python-dotenv
To set up TensorFlow as the backend for Keras:
export KERAS_BACKEND=tensorflow
play package
To play the game:
$ python play/src/connect4.py
classification
To run the logistic regression:
$ python classification/src/data_util.py
The script downloads data from the UCI repository and runs logistic regression on it.
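As a rough illustration of the kind of input such a classifier could consume, the sketch below turns a board into a flat numeric vector. It assumes the UCI connect-4 convention of 42 cells labelled x, o or b (blank). The name encode_board and the +1/-1/0 mapping are illustrative choices and are not taken from data_util.py.

```python
# Hypothetical encoding; data_util.py may represent boards differently.
CELL_VALUES = {"x": 1, "o": -1, "b": 0}


def encode_board(cells):
    """Map a sequence of 42 cell labels ('x', 'o', 'b') to integers."""
    if len(cells) != 42:
        raise ValueError("expected 42 cells (7 columns x 6 rows)")
    return [CELL_VALUES[c] for c in cells]


# Example: an almost empty board with one piece for each player.
row = ["b"] * 42
row[0] = "x"
row[6] = "o"
features = encode_board(row)
```

A vector like this can be passed to any linear model; a one-hot encoding per cell is a common alternative when the three labels should not be treated as ordered.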
minimax
To play against the Minimax solver:
$ python solver/src/minimax_playable.py
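For readers new to the algorithm, here is a minimal sketch of depth-limited minimax on a Connect 4 board. It is not the code in minimax_playable.py. The board layout (a 6x7 list of lists with 1, -1 and 0), the helper names and the plain win/loss scoring are all assumptions made for illustration.

```python
# Minimal depth-limited minimax for Connect 4 (illustrative only).
ROWS, COLS = 6, 7


def new_board():
    return [[0] * COLS for _ in range(ROWS)]


def valid_moves(board):
    # A column is playable while its top cell (row 0) is empty.
    return [c for c in range(COLS) if board[0][c] == 0]


def drop(board, col, player):
    """Return a copy of the board with the player's piece dropped in col."""
    new = [row[:] for row in board]
    for r in range(ROWS - 1, -1, -1):
        if new[r][col] == 0:
            new[r][col] = player
            return new
    raise ValueError("column is full")


def winner(board):
    """Return 1 or -1 if that player has four in a row, else 0."""
    for r in range(ROWS):
        for c in range(COLS):
            p = board[r][c]
            if p == 0:
                continue
            for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                end_r, end_c = r + 3 * dr, c + 3 * dc
                if 0 <= end_r < ROWS and 0 <= end_c < COLS and all(
                    board[r + i * dr][c + i * dc] == p for i in range(4)
                ):
                    return p
    return 0


def minimax(board, depth, player):
    """Score from player 1's point of view: 1 win, -1 loss, 0 otherwise."""
    w = winner(board)
    moves = valid_moves(board)
    if w != 0 or depth == 0 or not moves:
        return w
    scores = [minimax(drop(board, c, player), depth - 1, -player) for c in moves]
    return max(scores) if player == 1 else min(scores)


def best_move(board, depth, player):
    """Pick the column with the best minimax score for the given player."""
    def score(c):
        return minimax(drop(board, c, player), depth - 1, -player)

    moves = valid_moves(board)
    return max(moves, key=score) if player == 1 else min(moves, key=score)
```

With a search depth of 2, best_move is enough to take an immediate win or block an immediate threat. A practical solver would add alpha-beta pruning and a heuristic for non-terminal positions.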
alpha_zero
This AlphaGo Zero implementation consists of three workers, self, opt and eval, that can be run in parallel.
- self: Self-Play, which generates training data by self-play using BestModel.
- opt: Trainer, which trains the model and generates next-generation models.
- eval: Evaluator, which checks whether the next-generation model is better than BestModel and, if so, replaces BestModel.
- data/model/model_best_*: BestModel.
- data/model/next_generation/*: next-generation models.
- data/play_data/play_*.json: generated training data.
- logs/main.log: log file.
python src/connect4_zero/run.py self
When executed, Self-Play will start using BestModel. If BestModel does not exist, a new random model is created and becomes BestModel.
Options:
- --new: create a new BestModel
- --type mini: use the mini config for testing (see src/connect4_zero/configs/mini.py)
python src/connect4_zero/run.py opt
When executed, training will start. The base model is loaded from the latest saved next-generation model; if none exists, BestModel is used. The trained model is saved every 2000 steps (mini-batches) after an epoch.
Options:
- --type mini: use the mini config for testing (see src/connect4_zero/configs/mini.py)
- --total-step: specify the total number of steps (mini-batches). The total step count affects the learning rate during training.
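The base-model lookup described for the trainer can be sketched as follows. This is only an illustration. It assumes that entries under next_generation/ sort chronologically by name, which the text here does not state, and pick_base_model is a hypothetical helper rather than a function from the repository.

```python
from pathlib import Path


def pick_base_model(model_dir):
    """Return the newest next-generation entry, or None if there is none.

    Assumes names under next_generation/ sort chronologically (a guess
    about the naming scheme). A caller would fall back to BestModel
    when None is returned.
    """
    next_gen = Path(model_dir, "next_generation")
    if not next_gen.is_dir():
        return None
    candidates = sorted(next_gen.iterdir())
    return candidates[-1] if candidates else None
```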
python src/connect4_zero/run.py eval
When executed, evaluation will start. It compares BestModel with the latest next-generation model by playing about 200 games. If the next-generation model wins, it becomes BestModel.
Options:
- --type mini: use the mini config for testing (see src/connect4_zero/configs/mini.py)
- --single: run a single loop of evaluation against BestModel, without overwriting it.
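The replacement decision made by the evaluator can be sketched as below. This is a simplified illustration, not the repository's evaluator. play_game is a caller-supplied stand-in for a real match between the two models, and the replace_rate threshold and the choice to ignore draws are assumptions; the actual criterion lives in the configs.

```python
def should_replace(play_game, n_games=200, replace_rate=0.55):
    """Return True if the candidate should replace the current BestModel.

    play_game() must return 1 if the candidate wins, -1 if it loses and
    0 for a draw. n_games and replace_rate are illustrative defaults,
    not values read from the repository's configs.
    """
    wins = losses = 0
    for _ in range(n_games):
        result = play_game()
        if result > 0:
            wins += 1
        elif result < 0:
            losses += 1
    decided = wins + losses
    # Draws are ignored; with no decided games BestModel is kept.
    return decided > 0 and wins / decided >= replace_rate
```

For example, should_replace(lambda: 1) is True for a candidate that always wins, while a candidate that only draws leaves BestModel in place.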
Model configurations can be set in alpha_zero/src/connect4_zero/configs/.
Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.
Team members contact:
- Trung Nguyen: trungcn@stanford.edu
- Hormazd Nadir Godrej: hormazd@stanford.edu
- Luke Kim: mkim14@stanford.edu