Starfish-FL is an agentic federated learning (FL) framework built natively for AI agents. It is a core component of the STARFISH project, where it provides federated learning and analysis for the Analysis Mandate function.
Starfish-FL also offers a user-friendly interface for domains including healthcare, computing resource allocation, and finance. It enables secure, privacy-preserving collaborative machine learning across multiple sites without centralizing sensitive data.
Biostatistics & Clinical Research — Starfish-FL supports the methods biostatisticians use daily — logistic regression, Cox proportional hazards, Kaplan-Meier survival curves, Poisson and negative binomial models for count data, censored regression (Tobit) for detection-limit outcomes, MICE for missing data, and more — all federated out of the box with proper inverse-variance weighted meta-analysis and built-in diagnostics (VIF, residuals, goodness-of-fit tests). Every task is available in both Python and R, so researchers can work in their preferred language. Hospitals and research institutions can collaboratively build models on patient data without sharing sensitive records.
Carbon-Aware Computing — Starfish-FL can predict energy consumption and carbon footprints for containerized workloads across distributed infrastructure. By training regression models in a federated manner across edge and cloud sites, organizations can forecast resource energy demands and make carbon-conscious scheduling decisions — all without centralizing sensitive operational data. See our paper on federated learning for carbon-aware container orchestration for details.
Starfish-FL is a complete federated learning platform consisting of four main components:
- Controller - Site management and FL task execution
- Router - Central coordination and message routing
- CLI - Typer-based CLI (`starfish` command) for human and AI agent use, with a built-in LLM agent for autonomous orchestration and end-to-end experiment automation
- Workbench - Development and testing environment
In this section, we use healthcare as an example of how Starfish-FL can be used.
Sites: Local environments that can act as coordinators or participants in federated learning projects.
Controllers: Installed on each site to manage local training and model aggregation, and to provide a web interface for users.
Coordinators: Sites that create and manage FL projects, orchestrate training rounds, and perform model aggregation.
Participants: Sites that join existing projects and contribute their local data to collaborative training.
Router: Central routing server that maintains global state, facilitates communication between sites, and forwards messages between participants and coordinators.
Projects: Define one or more FL tasks with a specified coordinator and participants.
Tasks: Individual machine learning operations (e.g., LogisticRegression, CoxProportionalHazards, CensoredRegression, PoissonRegression) within a project. Tasks can be implemented in Python or R.
Runs: Execution instances of a project, allowing repeated training over time.
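To make the relationships above concrete, here is a toy sketch of a project definition. The field names are hypothetical illustrations of the concepts, not Starfish-FL's actual schema:

```python
# Illustrative only: field names below are hypothetical and do not
# reflect Starfish-FL's actual project schema.
project = {
    "name": "stroke-outcomes-study",
    "coordinator": "site-hospital-a",           # site that aggregates models
    "participants": ["site-hospital-b", "site-hospital-c"],
    "tasks": [
        {"type": "LogisticRegression", "language": "python"},
        {"type": "RCoxProportionalHazards", "language": "r"},
    ],
}

def validate(project: dict) -> bool:
    """Minimal sanity check: a project needs a coordinator,
    at least one participant, and at least one task."""
    return (
        bool(project.get("coordinator"))
        and len(project.get("participants", [])) >= 1
        and len(project.get("tasks", [])) >= 1
    )

print(validate(project))  # True
```

Each execution of such a project is a run, so the same task list can be trained repeatedly as data accumulates.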
- Clone the repository

  ```shell
  git clone https://github.com/denoslab/starfish-fl.git
  cd starfish-fl
  ```

- Start all services using Workbench

  The workbench provides a complete development environment with all components:

  ```shell
  cd workbench
  make build
  make up
  ```

- Initialize the database (first time only)

  ```shell
  ./init_db.sh
  ```

- Create superuser for router (first time only)

  ```shell
  docker exec -it starfish-router poetry run python3 manage.py makemigrations
  docker exec -it starfish-router poetry run python3 manage.py migrate
  docker exec -it starfish-router poetry run python3 manage.py createsuperuser
  ```

  Make sure the username and password match what's configured in `workbench/config/controller.env`.

- Access the applications

  - Router API: http://localhost:8000/starfish/api/v1/
  - Controller Web Interface: http://localhost:8001/

- Stop the services

  ```shell
  make stop  # Stop services
  make down  # Stop and remove containers
  ```
The Controller component is installed on each site participating in federated learning.
Key Features:
- Web-based user interface for project management
- Local model training and dataset management
- Support for 20 ML tasks across Python and R (regression, classification, survival analysis, censored regression, count data models, multiple imputation)
- Built-in model diagnostics (VIF, residual analysis, goodness-of-fit tests, prediction intervals)
- Real-time progress monitoring
- Celery-based distributed task processing
- Optional LLM agent hooks for per-round summaries, convergence detection, and failure triage (opt-in via task config)
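As an illustration of one of the diagnostics listed above, a variance inflation factor can be computed from the R² of regressing each predictor on the remaining ones. The snippet below is a self-contained sketch of that idea using only NumPy, not the Controller's internal implementation:

```python
import numpy as np

def vif(X: np.ndarray) -> np.ndarray:
    """VIF for each column of X: regress column j on the remaining
    columns (plus an intercept) and return 1 / (1 - R^2)."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        r2 = 1 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        out[j] = 1.0 / (1.0 - r2)
    return out

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.1, size=200)  # nearly collinear with x1
x3 = rng.normal(size=200)                  # independent predictor
X = np.column_stack([x1, x2, x3])
print(vif(X))  # VIF of x1 and x2 is large; VIF of x3 stays near 1
```

A VIF above roughly 5-10 usually flags problematic multicollinearity, which is why it is a standard part of regression output.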
Standalone Setup:
See controller/README.md for detailed installation and configuration.
Quick Start:
```shell
cd controller
docker-compose up -d
docker exec -it starfish-controller poetry run python3 manage.py migrate
```

Access at: http://localhost:8001/
The Router (Routing Server) maintains global state and coordinates communication between sites.
Key Features:
- RESTful API for site and project management
- Message forwarding between participants and coordinators
- Persistent storage for administrative data
- Support for end-to-end encryption
- Automated health checks via cron jobs
- Embedded LLM agent for adaptive aggregation, smart scheduling, and failure triage (opt-in per project)
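The forwarding role can be illustrated conceptually with a toy in-memory router: per-site message queues keyed by site UID, where the router never inspects payloads. This is a pedagogical sketch, not the actual Django implementation:

```python
from collections import defaultdict, deque

class ToyRouter:
    """Toy illustration of the router's forwarding role: it only
    queues opaque payloads for the target site, never reading them."""
    def __init__(self):
        self.queues = defaultdict(deque)
        self.sites = set()

    def register(self, site_uid: str):
        self.sites.add(site_uid)

    def forward(self, sender: str, recipient: str, payload: bytes):
        if recipient not in self.sites:
            raise KeyError(f"unknown site: {recipient}")
        self.queues[recipient].append((sender, payload))

    def poll(self, site_uid: str):
        """Each site polls for messages addressed to it."""
        q = self.queues[site_uid]
        return q.popleft() if q else None

router = ToyRouter()
router.register("coordinator")
router.register("participant-1")
router.forward("participant-1", "coordinator", b"<encrypted model update>")
print(router.poll("coordinator"))
```

Because payloads stay opaque to the router, end-to-end encryption between sites composes naturally with this design.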
Standalone Setup:
See router/README.md for detailed installation and configuration.
Quick Start:
```shell
cd router
docker-compose up -d
docker exec -it starfish-router poetry run python3 manage.py makemigrations
docker exec -it starfish-router poetry run python3 manage.py migrate
docker exec -it starfish-router poetry run python3 manage.py createsuperuser
```

Access at: http://localhost:8000/starfish/api/v1/
The Workbench provides a unified development and testing environment for the entire Starfish platform.
Key Features:
- Docker Compose orchestration for all components
- Unified configuration management
- Development utilities and scripts
- Makefile-based build system
Documentation:
See workbench/README.md for detailed usage.
Commands:
```shell
cd workbench
make build  # Build all services
make up     # Start all services
make stop   # Stop services
make down   # Stop and remove containers
```

- Controller User Guide - Comprehensive guide for using the Controller web interface
- Task Configuration Guide - How to configure FL tasks and models
- CLI Agent Guide - Using the AI agent for autonomous FL orchestration
- Autonomous Experiment Guide - End-to-end experiment automation: dataset analysis, model selection, execution, and result interpretation
- Backend: Python 3.10.10, Django
- Task Queue: Celery
- Databases: PostgreSQL (Router), SQLite (Controller)
- Cache: Redis
- Python ML Libraries: scikit-learn, NumPy, Pandas, statsmodels, scipy, lifelines
- R Runtime: R 4.x with `jsonlite`, `survival`, `mice`, `MASS`
- Containerization: Docker, Docker Compose
Router Tests:

```shell
cd router
docker exec -it starfish-router poetry run python3 manage.py test
```

Controller Tests:

```shell
cd controller
docker exec -it starfish-controller poetry run python3 manage.py test
```

CLI Agent Tests:

```shell
cd cli
poetry install --extras agent
poetry run pytest tests/ -v
```

Each component uses environment files for configuration:
- Controller: `controller/.env` or `workbench/config/controller.env`
- Router: `router/.env` or `workbench/config/router.env`
Key configuration options:
Controller:

- `SITE_UID`: Unique identifier for the site
- `ROUTER_URL`: URL of the routing server
- `ROUTER_USERNAME`: Authentication credentials
- `ROUTER_PASSWORD`: Authentication credentials
- `CELERY_BROKER_URL`: Redis connection for Celery
- `REDIS_HOST`: Redis host for caching
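Put together, a minimal controller environment file might look like the following. All values are placeholders; consult `workbench/config/controller.env` for the actual keys and defaults:

```env
# Placeholder values -- adapt to your deployment
SITE_UID=site-hospital-a
ROUTER_URL=http://localhost:8000/starfish/api/v1/
ROUTER_USERNAME=admin
ROUTER_PASSWORD=changeme
CELERY_BROKER_URL=redis://localhost:6379/0
REDIS_HOST=localhost
```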
Router:

- `DATABASE_HOST`: PostgreSQL host
- `POSTGRES_DB`: Database name
- `POSTGRES_USER`: Database username
- `POSTGRES_PASSWORD`: Database password
- `SECRET_KEY`: Django secret key
Every task below is available in both Python and R (where noted), so researchers and data scientists can work in whichever language they prefer.
| Task | Python | R | Description |
|---|---|---|---|
| Logistic Regression | `LogisticRegression` | `RLogisticRegression` | Binary classification with standard logistic regression |
| Statistical Logistic Regression | `LogisticRegressionStats` | — | Binary classification with statistical inference (coefficients, p-values, CI, odds ratios) |
| Linear Regression | `LinearRegression` | — | Continuous value prediction |
| SVM Regression | `SvmRegression` | — | Support Vector Machine regression |
| ANCOVA | `Ancova` | — | Analysis of Covariance for group comparisons controlling for covariates |
| Ordinal Logistic Regression | `OrdinalLogisticRegression` | — | Proportional odds model for ordered categorical outcomes |
| Mixed Effects Logistic Regression | `MixedEffectsLogisticRegression` | — | Multilevel logistic regression for clustered/hierarchical binary data |
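The federated versions of these regression tasks pool per-site estimates with inverse-variance weighted meta-analysis. The core of fixed-effect IVW pooling fits in a few lines; the sketch below is a generic illustration of the statistic, not Starfish-FL's exact aggregation code:

```python
import math

def ivw_pool(estimates, std_errors):
    """Fixed-effect inverse-variance weighted pooling of per-site
    coefficient estimates. Returns (pooled_estimate, pooled_se)."""
    weights = [1.0 / se**2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total)

# Three sites report the same log-odds ratio with different precision.
betas = [0.40, 0.55, 0.48]
ses = [0.10, 0.20, 0.15]
pooled, se = ivw_pool(betas, ses)
print(round(pooled, 3), round(se, 3))  # → 0.443 0.077
```

Sites with smaller standard errors (more data, or less noise) pull the pooled estimate toward their value, which is why no raw records ever need to leave a site.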
| Task | Python | R | Description |
|---|---|---|---|
| Cox Proportional Hazards | `CoxProportionalHazards` | `RCoxProportionalHazards` | Time-to-event regression with hazard ratios (lifelines / `survival::coxph`) |
| Kaplan-Meier | `KaplanMeier` | `RKaplanMeier` | Non-parametric survival estimation with log-rank test (lifelines / `survival::survfit`) |
| Censored Regression (Tobit) | `CensoredRegression` | `RCensoredRegression` | Tobit Type I model for continuous outcomes with left/right censoring (custom MLE / `survival::survreg`) |
| Task | Python | R | Description |
|---|---|---|---|
| Poisson Regression | `PoissonRegression` | `RPoissonRegression` | GLM for count data with rate ratios (statsmodels / `glm(family = poisson)`) |
| Negative Binomial Regression | `NegativeBinomialRegression` | `RNegativeBinomialRegression` | Overdispersed count data (statsmodels / `MASS::glm.nb`) |
| Task | Python | R | Description |
|---|---|---|---|
| Multiple Imputation (MICE) | `MultipleImputation` | `RMultipleImputation` | Multiple imputation by chained equations with Rubin's rules (sklearn / `mice::mice`) |
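Rubin's rules, used to pool a coefficient across the m imputed datasets, are simple enough to sketch directly. This is a generic illustration of the rules themselves, not the task's internal code:

```python
import math

def rubins_rules(estimates, variances):
    """Pool m point estimates (one per imputed dataset) with Rubin's rules.
    Returns (pooled_estimate, total_standard_error)."""
    m = len(estimates)
    qbar = sum(estimates) / m            # pooled point estimate
    ubar = sum(variances) / m            # mean within-imputation variance
    b = sum((q - qbar) ** 2 for q in estimates) / (m - 1)  # between-imputation
    total_var = ubar + (1 + 1 / m) * b   # Rubin's total variance
    return qbar, math.sqrt(total_var)

# Five imputed datasets give slightly different estimates of one coefficient.
est, se = rubins_rules([1.0, 1.2, 0.9, 1.1, 1.05], [0.04] * 5)
print(round(est, 3), round(se, 3))  # → 1.05 0.235
```

The total standard error is inflated by the between-imputation spread, so uncertainty due to the missing data is carried into the final inference.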
| Task | Python | R | Description |
|---|---|---|---|
| Federated UNet | `FederatedUNet` | — | Federated image segmentation using UNet with FedAvg aggregation |
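FedAvg itself is a sample-size-weighted average of model parameters. The sketch below illustrates one aggregation round with NumPy as a generic illustration of the algorithm, not the task's exact code:

```python
import numpy as np

def fedavg(weights_per_site, n_samples_per_site):
    """One FedAvg round: average each parameter array across sites,
    weighted by each site's local sample count."""
    total = sum(n_samples_per_site)
    return [
        sum(n / total * site[k]
            for site, n in zip(weights_per_site, n_samples_per_site))
        for k in range(len(weights_per_site[0]))
    ]

# Two sites, each holding one weight matrix and one bias vector.
site_a = [np.array([[1.0, 2.0]]), np.array([0.0])]
site_b = [np.array([[3.0, 4.0]]), np.array([1.0])]
merged = fedavg([site_a, site_b], n_samples_per_site=[100, 300])
print(merged[0], merged[1])  # weighted toward site_b, which has 3x the data
```

In the real task, only these parameter arrays (not images or labels) travel between participants and the coordinator each round.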
All regression tasks include built-in diagnostics in their output:
- VIF (multicollinearity), residual summaries, Cook's distance
- Shapiro-Wilk normality test, Hosmer-Lemeshow GOF, overdispersion test
- Schoenfeld residuals for Cox PH proportional hazards assumption
- Censoring summary and AIC/BIC for censored regression
- Confidence and prediction interval summaries
See TASK_GUIDE.md for configuration details and diagnostic field reference.
- End-to-end encryption support for message payloads
- Secure private key exchange between sites
- Authentication required for router access
- Site-specific UIDs for identification
- No centralized data storage - data remains at local sites
Contributions are welcome! Please ensure:
- Code follows existing style and conventions
- Tests are included for new features
- Documentation is updated as needed
- Docker configurations are tested
Apache 2.0
For issues, questions, or contributions:
- Check component-specific README files
- Review user guides and task documentation
- Open an issue in the repository
If you use Starfish in your research, please cite:
```bibtex
@article{bao2026starfish,
  title   = {Privacy-Preserving Federated Analysis Reproduces Non-Inferiority Results from the {AcT} Multicentre Stroke Trial},
  author  = {Bao, Yunkai and Saad, Zainab and Duarte, Kaue and Abbas, Farhan and Sajobi, Tolulope and Holodinsky, Jessalyn K. and Menon, Bijoy K. and Drew, Steve},
  journal = {SSRN preprint},
  year    = {2026},
  doi     = {10.2139/ssrn.6426303},
  url     = {https://ssrn.com/abstract=6426303}
}
```
