Compare historical baseball players across any era using the Lahman database. Rank players by configurable weighted stats with era-adjusted z-scores.
- Batter and Pitcher rankings with separate stat sets
- Era-adjusted z-scores so players from different eras are compared fairly
- Configurable stat weights - choose which stats matter and how much
- Position filtering for batters (C, 1B, 2B, 3B, SS, LF, CF, RF, OF, DH)
- Year range selection from 1871 to 2025
- Minimum qualification thresholds (plate appearances / innings pitched)
- Install dependencies:
    pip install -r requirements.txt
- Build the database (uses the CSV files already in the repo):
    python scripts/build_db.py
- Run the app:
    streamlit run app.py
For each stat in each season, a z-score is computed: z = (player_value - league_mean) / league_std_dev. This normalizes stats across eras, so a .300 BA in the dead-ball era is scored against that era's league mean and spread rather than against the steroid era's.
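The per-season z-score can be sketched with pandas as follows. This is a minimal illustration, not the app's actual code: the function name `add_season_zscores`, the column names `yearID` and `BA`, the use of the population standard deviation, and the toy numbers are all assumptions made for the example.

```python
import pandas as pd


def add_season_zscores(df: pd.DataFrame, stat: str, season_col: str = "yearID") -> pd.DataFrame:
    """Return a copy of df with a '<stat>_z' column of per-season z-scores."""
    out = df.copy()
    grouped = out.groupby(season_col)[stat]
    league_mean = grouped.transform("mean")
    # ddof=0 is the population standard deviation; whether the app uses the
    # population or the sample form is an assumption of this sketch.
    league_std = grouped.transform(lambda s: s.std(ddof=0))
    # A season where every player has the same value has std 0; score it as 0
    # instead of dividing by zero.
    z = (out[stat] - league_mean) / league_std.where(league_std > 0)
    out[f"{stat}_z"] = z.fillna(0.0)
    return out


# Toy data (made up for illustration): the same .300 average in two seasons.
seasons = pd.DataFrame(
    {
        "playerID": ["a", "b", "c", "d", "e", "f"],
        "yearID": [1908, 1908, 1908, 2000, 2000, 2000],
        "BA": [0.300, 0.240, 0.210, 0.300, 0.270, 0.330],
    }
)
scored = add_season_zscores(seasons, "BA")
print(scored[["playerID", "yearID", "BA", "BA_z"]])
```

In this toy data the 1908 league mean is .250, so player a's .300 gets a z-score of about 1.34, while player d's .300 sits exactly at the 2000 league mean and scores 0.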
When aggregating across seasons, z-scores are weighted by plate appearances (batters) or innings pitched (pitchers). The final composite score is a weighted average of selected z-scores based on user-configured weights.
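The two weighting steps described above can be sketched like this. The helper names `career_zscores` and `composite_score`, the column names, and the choice to normalize by the sum of the stat weights are assumptions for illustration; the numbers are made up.

```python
import pandas as pd


def career_zscores(df, z_cols, playing_time_col="PA", player_col="playerID"):
    """Playing-time-weighted mean of each z-score column, one row per player."""
    playing_time = df[playing_time_col]
    # Multiply each season's z-scores by that season's playing time.
    weighted = df[z_cols].mul(playing_time, axis=0)
    totals = weighted.groupby(df[player_col]).sum()
    total_time = playing_time.groupby(df[player_col]).sum()
    return totals.div(total_time, axis=0)


def composite_score(career, stat_weights):
    """Weighted average of the selected z-score columns.

    Assumes the weights are non-negative and do not sum to zero.
    """
    w = pd.Series(stat_weights, dtype=float)
    return career[w.index].mul(w, axis=1).sum(axis=1) / w.sum()


# Toy per-season z-scores (made up): player a has two seasons, player b one.
seasons = pd.DataFrame(
    {
        "playerID": ["a", "a", "b"],
        "PA": [600, 200, 500],
        "BA_z": [1.0, -1.0, 0.5],
        "HR_z": [2.0, 0.0, 1.0],
    }
)
career = career_zscores(seasons, ["BA_z", "HR_z"])
scores = composite_score(career, {"BA_z": 1, "HR_z": 3})
print(scores.sort_values(ascending=False))
```

For pitchers the same helper would take innings pitched as `playing_time_col`. Player a's 600-PA season counts three times as much as the 200-PA season, giving career z-scores of 0.5 (BA) and 1.5 (HR) and a composite of 1.25 with these weights.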
Lahman Baseball Database - complete batting, pitching, and fielding statistics from 1871 to 2025.
Deploy for free on Streamlit Community Cloud:
- Push this repo to GitHub
- Connect the repo in Streamlit Cloud
- Set entry point to app.py
- The build script runs automatically if configured as a setup command
- Python with Streamlit for the web UI
- SQLite for the database (built from Lahman CSVs)
- pandas for data manipulation and z-score computation
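As a rough idea of how a SQLite database can be built from CSV files with pandas, here is a hedged sketch. It is not the contents of scripts/build_db.py: the function name `load_csvs_into_sqlite`, the one-table-per-file layout, and the paths in the usage comment are assumptions.

```python
import sqlite3
from pathlib import Path

import pandas as pd


def load_csvs_into_sqlite(csv_dir, db_path):
    """Load every CSV in csv_dir into a SQLite table named after the file.

    Returns the list of table names that were written.
    """
    tables = []
    conn = sqlite3.connect(db_path)
    try:
        for csv_file in sorted(Path(csv_dir).glob("*.csv")):
            frame = pd.read_csv(csv_file)
            # Replace any existing table so the build can be rerun safely.
            frame.to_sql(csv_file.stem, conn, if_exists="replace", index=False)
            tables.append(csv_file.stem)
        conn.commit()
    finally:
        conn.close()
    return tables


# Hypothetical usage; both paths are placeholders, not the repo's real layout:
# load_csvs_into_sqlite("data/csv", "data/lahman.db")
```

Rebuilding with `if_exists="replace"` keeps the step idempotent, which suits a setup command that may run on every deploy.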