NFLverse is the home for our NFL and college football data work.
The goal is simple:
- Game schedules and results
- Team and conference data
- Rosters and player info
- Play-by-play and drive-level data
- Team and player stats
- Rankings, standings, and advanced metrics
- Historical data and season summaries
- Keep the data traceable back to a source.
- Keep workflows repeatable.
- Treat NFL and CFB as related but separate products.
- Keep the repo easy to grow.
Version notes live in CHANGELOG.md.
The actual version number lives in VERSION.
Normal flow:
- add notes under `## [Unreleased]` in `CHANGELOG.md`
- add one PR label: `release:major`, `release:minor`, `release:patch`, or `release:none`
- let GitHub handle the version bump, tag, and release after merge
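
As a rough illustration of what each label means for the number in VERSION, here is a minimal sketch. It assumes VERSION holds a plain `MAJOR.MINOR.PATCH` string and that the labels follow the usual semantic-versioning rules; `bump_version` is a hypothetical helper, not the automation GitHub actually runs for this repo.

```python
def bump_version(version: str, label: str) -> str:
    """Return the next version string for a release label.

    Hypothetical helper: assumes `version` is a plain MAJOR.MINOR.PATCH string.
    """
    major, minor, patch = (int(part) for part in version.strip().split("."))
    if label == "release:major":
        return f"{major + 1}.0.0"
    if label == "release:minor":
        return f"{major}.{minor + 1}.0"
    if label == "release:patch":
        return f"{major}.{minor}.{patch + 1}"
    if label == "release:none":
        # No release: the version stays as it is.
        return f"{major}.{minor}.{patch}"
    raise ValueError(f"unknown release label: {label!r}")
```

Under these assumptions, `bump_version("1.4.2", "release:minor")` gives `"1.5.0"`, and `release:none` leaves the version unchanged.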
Formatting is checked in pull requests.
Use:

    npm install
    npm run format
    npm run format:check
The first ingestion slice is NFL only.
It pulls verified parquet assets from the official nflverse-data GitHub releases and stores them under data/raw/nfl/.
Initial datasets:
- `players`
- `schedules`
- `teams`
- `rosters`
- `draft_picks`
- `combine`
- `pbp`
The sync step is conservative on purpose:
- `players` and `schedules` pull one parquet each
- `rosters` and `pbp` are season-partitioned
- if you do not pass `--seasons` for a season-partitioned dataset, the pipeline only pulls the latest available season
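
The season-selection rule above can be sketched roughly as follows. This is only an illustration: `resolve_seasons`, its parameters, and the `SEASON_PARTITIONED` set are hypothetical names, and treating every dataset outside that set as a single file is an assumption.

```python
from typing import List, Optional, Sequence

# From the list above: only these two datasets are season-partitioned.
# Assumption: every other dataset is treated as a single parquet file.
SEASON_PARTITIONED = {"rosters", "pbp"}


def resolve_seasons(
    dataset: str,
    requested: Optional[Sequence[int]],
    available: Sequence[int],
) -> List[Optional[int]]:
    """Decide which seasons to pull for one dataset (hypothetical helper).

    Returns [None] for single-file datasets, the requested seasons when
    they were passed, and otherwise only the latest available season.
    """
    if dataset not in SEASON_PARTITIONED:
        return [None]  # one parquet, no season dimension
    if not available:
        raise ValueError(f"no seasons available for {dataset!r}")
    if requested:
        missing = sorted(set(requested) - set(available))
        if missing:
            raise ValueError(f"seasons not available for {dataset!r}: {missing}")
        return sorted(set(requested))
    # Conservative default: only the latest available season.
    return [max(available)]
```

Defaulting to a single season keeps an accidental run without `--seasons` from downloading the full history of a large dataset such as `pbp`.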
Setup:

    make install
Examples:

    make sample-ingest
    make sync-stage-slice
    make build-db
    make build-stage
    make validate-stage
    make run-app
The database step creates a local DuckDB file at data/staging/nflverse.duckdb
and registers raw parquet-backed views in the raw_nfl schema.
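
To make "parquet-backed views" concrete, here is a minimal sketch that generates the SQL such a step might run. It assumes one subdirectory per dataset under data/raw/nfl/ (for example `data/raw/nfl/pbp/*.parquet`); that layout and the `raw_view_statements` helper are illustrative assumptions, not the repo's actual build script.

```python
from pathlib import Path
from typing import List


def raw_view_statements(raw_dir: str = "data/raw/nfl", schema: str = "raw_nfl") -> List[str]:
    """Build SQL that exposes each dataset directory as a view over its parquet files.

    Hypothetical helper: assumes a <raw_dir>/<dataset>/*.parquet layout.
    """
    statements = [f"CREATE SCHEMA IF NOT EXISTS {schema};"]
    dataset_dirs = sorted(p for p in Path(raw_dir).iterdir() if p.is_dir())
    for dataset_dir in dataset_dirs:
        # A view stores only the query, so the data stays in the parquet files.
        pattern = (dataset_dir / "*.parquet").as_posix()
        statements.append(
            f"CREATE OR REPLACE VIEW {schema}.{dataset_dir.name} AS "
            f"SELECT * FROM read_parquet('{pattern}');"
        )
    return statements
```

Each statement could then be passed to `execute` on a connection opened with `duckdb.connect("data/staging/nflverse.duckdb")`. Because a view holds only the query, re-syncing the parquet files updates what the view returns without rebuilding the database.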
The first staged layer builds stage_nfl tables for the 2021-2025 NFL seasons:
- `stage_nfl.teams`
- `stage_nfl.players`
- `stage_nfl.games`
- `stage_nfl.game_teams`
- `stage_nfl.roster_snapshots`
- `stage_nfl.draft_picks`
- `stage_nfl.combine`
- `stage_nfl.rookies`
See docs/nfl-ingestion.md for details.
A small Flask app is available for browsing the local DuckDB tables.
- run `make run-app`
- open http://127.0.0.1:5000
See docs/table-browser.md for details.
Current working layout:

    data/
      raw/
        nfl/
      staging/
      curated/
    config/
    scripts/
    tests/
    docs/

The repo now has an initial NFL ingestion scaffold and a local raw database build step. The larger transformation, modeling, and analytics layers are still ahead.