A Reinforcement Learning approach to predicting college football play calls using Deep Q-Networks (DQN)
This project uses Deep Reinforcement Learning to predict offensive play calls (run vs. pass) in college football based on game situation. The model learns from historical play-by-play data from the University of Georgia's 2024 season, observing game states like down, distance, field position, score differential, and time remaining to make predictions.
- Custom Gymnasium Environment — Football-specific state representation with 10 game situation features
- DQN Agent — Deep Q-Network implementation using Stable-Baselines3
- Reward Shaping — Configurable bonuses for first downs, touchdowns, and situational awareness
- Hyperparameter Optimization — Bayesian optimization with Optuna including early stopping (pruning)
- Comprehensive Data Pipeline — Automated data fetching, cleaning, and feature engineering
```
CFBPlay_RL/
├── src/
│   ├── Fetch_Data.py                # API data collection from College Football Data
│   ├── Clean_data.py                # Data preprocessing and feature engineering
│   ├── FootballPlayEnv.py           # Custom Gymnasium environment
│   ├── Train_dqn.py                 # DQN training script
│   └── optimize_hyperparameters.py  # Bayesian hyperparameter optimization
├── data/                    # Raw and processed data files
├── models/                  # Saved trained models
├── results/                 # Training curves, confusion matrices, summaries
├── logs/                    # TensorBoard logs
└── optimization_results/    # Hyperparameter search results
```
- Python 3.8+
- College Football Data API key (free to request at collegefootballdata.com)
- Clone the repository

  ```bash
  git clone https://github.com/yourusername/CFBPlay_RL.git
  cd CFBPlay_RL
  ```

- Create a virtual environment

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```

- Install dependencies

  ```bash
  pip install pandas numpy gymnasium stable-baselines3 optuna matplotlib seaborn scikit-learn requests
  ```

- Set your API key

  ```bash
  export CFBD_API_KEY="your_api_key_here"
  ```
Collect play-by-play data from the College Football Data API:
```bash
cd src
python Fetch_Data.py
```
This retrieves all offensive plays for Georgia's 2024 regular season.
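For orientation, a minimal sketch of fetching plays from the CFBD REST API with `requests`. The `/plays` endpoint, Bearer-token header, and the `build_params` helper reflect common CFBD usage but are assumptions; the project's `Fetch_Data.py` may differ.

```python
import os
import requests

API_BASE = "https://api.collegefootballdata.com"

def build_params(team: str, year: int, week: int) -> dict:
    """Query parameters for a single-week /plays request (hypothetical helper)."""
    return {"team": team, "year": year, "week": week, "seasonType": "regular"}

def fetch_plays(team: str, year: int, week: int) -> list:
    """Fetch one week of play-by-play data; requires CFBD_API_KEY to be set."""
    headers = {"Authorization": f"Bearer {os.environ['CFBD_API_KEY']}"}
    resp = requests.get(f"{API_BASE}/plays", headers=headers,
                        params=build_params(team, year, week), timeout=30)
    resp.raise_for_status()
    return resp.json()
```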
Transform raw data into RL-ready features:
```bash
python Clean_data.py
```
Features engineered:
| Feature | Description |
|---|---|
| `down` | Current down (1-4) |
| `distance` | Yards to first down |
| `yardsToGoal` | Yards to end zone |
| `period` | Game quarter |
| `seconds_remaining` | Time left in quarter |
| `score_diff` | Point differential (offense perspective) |
| `offenseTimeouts` | Remaining offensive timeouts |
| `defenseTimeouts` | Remaining defensive timeouts |
| `is_redzone` | Inside opponent's 20-yard line |
| `is_goal_to_go` | Distance ≥ yards to goal |
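The two binary flags are derived from the raw columns above. A minimal sketch of that derivation (field names follow the table; the project's `Clean_data.py` implementation may structure this differently):

```python
def engineer_features(play: dict) -> dict:
    """Add the binary situational flags to a cleaned play record."""
    out = dict(play)
    # Red zone: ball inside the opponent's 20-yard line
    out["is_redzone"] = 1 if play["yardsToGoal"] <= 20 else 0
    # Goal-to-go: the first-down marker is at or beyond the goal line
    out["is_goal_to_go"] = 1 if play["distance"] >= play["yardsToGoal"] else 0
    return out
```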
Train a Deep Q-Network on the processed data:
```bash
python Train_dqn.py
```
Training outputs:
- Saved model (`.zip`)
- Learning curves (`.png`)
- Confusion matrix (`.png`)
- Training summary (`.json`)
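Under the hood, DQN (here provided by Stable-Baselines3) regresses predicted Q-values toward temporal-difference targets. A from-scratch numpy sketch of that target computation, for intuition only:

```python
import numpy as np

def td_targets(rewards, next_q_values, dones, gamma=0.99):
    """DQN regression targets: y = r + gamma * max_a' Q(s', a'),
    with the bootstrap term zeroed out on terminal transitions."""
    return rewards + gamma * (1.0 - dones) * next_q_values.max(axis=1)
```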
Run Bayesian optimization to find the best hyperparameters:
```bash
python optimize_hyperparameters.py
```
Optimized parameters:
- Learning rate
- Replay buffer size
- Batch size
- Discount factor (gamma)
- Exploration schedule
- Network architecture
- Target update interval
The agent observes a 10-dimensional continuous state vector (normalized to [0, 1]):
```python
state_features = [
    'down', 'distance', 'yardsToGoal', 'period',
    'seconds_remaining', 'score_diff', 'offenseTimeouts',
    'defenseTimeouts', 'is_redzone', 'is_goal_to_go'
]
```
Binary discrete actions:
- `0` — Run play
- `1` — Pass play
Base reward: Yards gained on the play
With reward shaping enabled:
| Event | Reward Modifier |
|---|---|
| First down | +10 |
| Touchdown | +30 |
| Turnover | -25 |
| Negative play | -5 |
| 3rd & short run | +3 |
| 3rd & long pass | +3 |
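The table above can be expressed as a small shaping function. This is a sketch of the configurable shaping, not the exact implementation; in particular, the "short" (≤ 2 yards) and "long" (≥ 7 yards) thresholds are assumptions:

```python
def shaped_reward(yards: float, first_down: bool, touchdown: bool,
                  turnover: bool, down: int, distance: float,
                  action: int) -> float:
    """Base reward is yards gained; modifiers mirror the shaping table."""
    r = yards
    if first_down:
        r += 10
    if touchdown:
        r += 30
    if turnover:
        r -= 25
    if yards < 0:
        r -= 5  # negative play
    if down == 3 and distance <= 2 and action == 0:  # 3rd & short run
        r += 3
    if down == 3 and distance >= 7 and action == 1:  # 3rd & long pass
        r += 3
    return r
```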
| Strategy | Accuracy |
|---|---|
| Random | 50.0% |
| Always Pass | 56.8% |
| DQN Agent | Target: >56.8% |
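The "Always Pass" baseline is simply majority-class accuracy on the dataset's play labels, which is why it exceeds random guessing. A sketch of how such a baseline is computed (hypothetical helper, not from the project's code):

```python
from collections import Counter

def majority_baseline_accuracy(labels) -> float:
    """Accuracy of always predicting the most common play type."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)
```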
```
EVALUATION RESULTS
════════════════════════════════════════════════════════════════════════════════
Performance Metrics:
  Average Reward:       XXX.XX ± XX.XX
  Prediction Accuracy:  XX.X% ± X.X%
  Avg Yards/Play:       X.XX
```
Key hyperparameters in `Train_dqn.py`:
```python
TOTAL_TIMESTEPS = 500000       # Training steps
LEARNING_RATE = 0.0001         # Adam optimizer LR
BUFFER_SIZE = 10000            # Replay buffer capacity
BATCH_SIZE = 64                # Minibatch size
GAMMA = 0.99                   # Discount factor
EXPLORATION_FRACTION = 0.3     # Exploration schedule
```
The training pipeline generates:
- Learning Curves — Episode rewards over training
- Confusion Matrix — Run/pass prediction accuracy breakdown
- Optimization History — Hyperparameter search progress (if using Optuna)
- Expand to multiple teams and seasons
- Add more granular play types (screen, deep pass, inside/outside run)
- Incorporate defensive alignment features
- Implement PPO and A2C for comparison
- Real-time game prediction interface
- Add player personnel groupings as features
- `pandas` — Data manipulation
- `numpy` — Numerical computing
- `gymnasium` — RL environment framework
- `stable-baselines3` — DQN implementation
- `optuna` — Hyperparameter optimization
- `matplotlib`/`seaborn` — Visualization
- `scikit-learn` — Metrics and evaluation
- `requests` — API data fetching
This project is for educational and personal portfolio use only.
- College Football Data API for play-by-play data
- Stable-Baselines3 for RL implementations
- Optuna for hyperparameter optimization
Jake Pearlman
Go Dawgs! 🐶