This repository contains the Hopf-GAE, a physics-informed deep learning architecture that detects depression-related dynamical abnormalities without ever training on depressed brains. Rather than framing MDD detection as binary classification (which fails with the limited data available), the problem is cast as normative anomaly detection: the model learns what healthy-control dynamics look like and scores each subject by how far they deviate.
The key innovations:
- **Biophysically grounded node features** — every node carries the per-region bifurcation parameter $a_j$, natural frequency $\omega_j$, and goodness-of-fit $\chi^2_j$ estimated by the Stuart-Landau / Hopf bifurcation framework via the UKF-MDD pipeline, plus Yeo 7-network one-hot encodings (11 features per ROI).
- **Multi-relational graph attention** — three edge types (PLV phase synchrony, MVAR Granger causality, SC structural connectivity) with learned per-relation attention weights.
- **Denoising graph autoencoder** — Gaussian noise injection ($\sigma = 0.1$) on the encoder input and dropout ($p = 0.3$) on the latent code replace the variational bottleneck, which collapsed in all tested configurations due to low within-HC variance of bifurcation parameters.
- **Expanded reconstruction targets** — 7-dimensional per-node targets (3 physics + 4 connectivity-derived: PLV node strength, MVAR in/out-strength, within-network PLV) force the bottleneck to encode richer per-ROI structure.
- **Two-level Fisher Linear Discriminant scoring** — data-driven combination of node dynamics and edge connectivity anomalies, with per-relation weighting at level 1 and node-vs-edge weighting at level 2. No manual tuning.
**Framing:** classification (insufficient data) → **normative anomaly detection (this work)**
| Feature | Dim | Source | Meaning |
|---|---|---|---|
| $a_j$ | 1 | UKF-MDD | Bifurcation parameter — distance from critical point |
| $\omega_j$ | 1 | Hilbert phase | Natural oscillation frequency (Hz) |
| $\chi^2_j$ | 1 | UKF fit | Goodness-of-fit (model–data agreement) |
| Network one-hot | 8 | Yeo 7 + Subcortical | Functional network membership |
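As a minimal sketch of the node-feature layout (NumPy; the network ordering is an assumption, not taken from the pipeline):

```python
import numpy as np

# Yeo 7 networks + subcortical; the ordering here is illustrative
NETWORKS = ["Vis", "SomMot", "DorsAttn", "SalVentAttn",
            "Limbic", "Cont", "Default", "Subcortical"]

def node_features(a, omega, chi2, network):
    """Build the 11-dim per-ROI feature vector: 3 physics + 8-way one-hot."""
    one_hot = np.zeros(len(NETWORKS))
    one_hot[NETWORKS.index(network)] = 1.0
    return np.concatenate([[a, omega, chi2], one_hot])

x = node_features(a=-0.02, omega=0.05, chi2=1.3, network="Limbic")
assert x.shape == (11,)
```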
| Feature | Weight | Source | Purpose |
|---|---|---|---|
| $a_j$ | 2.0 | UKF | Primary clinical marker |
| $\omega_j$ | 1.0 | Hilbert | Oscillation dynamics |
| $\chi^2_j$ | 1.0 | UKF fit | Model–data agreement |
| PLV node strength | 0.5 | Edge aggregation | Phase synchrony profile |
| MVAR in-strength | 0.5 | Edge aggregation | Directed input connectivity |
| MVAR out-strength | 0.5 | Edge aggregation | Directed output connectivity |
| Within-network PLV | 0.5 | Edge aggregation | Intra-network coherence |
| Relation | Type | Source | Encoder Weight (Conv₁) |
|---|---|---|---|
| PLV | Undirected | Phase Locking Value | 0.771 |
| SC | Undirected | Structural connectivity | 0.185 |
| MVAR | Directed | Lasso-MVAR | 0.044 |
Each GAT layer maintains separate learnable projections for each of the three relations; their outputs are combined with the learned per-relation attention weights shown above. Two multi-relational GAT layers (conv1: 11→32, conv2: 32→32), together with a masked residual input projection (input_proj: 11→32) and a physics head (32→16→1), form the frozen encoder.
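A simplified multi-relational message-passing layer can be sketched in NumPy (mean aggregation stands in for GAT attention here; all shapes, logits, and graphs are illustrative):

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def multi_relation_layer(X, adjs, Ws, rel_logits):
    """One simplified multi-relational layer: a per-relation linear
    projection plus neighbor averaging, combined with softmax-normalized
    learned relation weights (the real model uses GAT attention)."""
    alphas = softmax(rel_logits)  # e.g. -> ~(0.76, 0.19, 0.05)
    out = np.zeros((X.shape[0], Ws[0].shape[1]))
    for alpha, A, W in zip(alphas, adjs, Ws):
        deg = A.sum(1, keepdims=True).clip(min=1.0)  # row-normalize adjacency
        out += alpha * (A / deg) @ X @ W
    return np.tanh(out)

rng = np.random.default_rng(0)
N, F, H = 216, 11, 32                                    # ROIs, in-dim, hidden
X = rng.normal(size=(N, F))
adjs = [(rng.random((N, N)) < 0.1).astype(float) for _ in range(3)]  # toy PLV, SC, MVAR
Ws = [rng.normal(scale=0.1, size=(F, H)) for _ in range(3)]
H1 = multi_relation_layer(X, adjs, Ws, np.array([1.4, 0.0, -1.4]))
assert H1.shape == (N, H)
```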
Denoising: During training, Gaussian noise ($\sigma = 0.1$) is added to the encoder input, so the latent code must capture stable structure rather than fit noise.
Node path: Deterministic projection $\mathbf{h}_j \to z_j \in \mathbb{R}^8$ → dropout ($p = 0.3$) → linear decoder ($8 \to 7$) → reconstructed $(a, \omega, \chi^2, s_\text{PLV}, s_\text{MVAR-in}, s_\text{MVAR-out}, \text{PLV}_\text{within})$.
Edge path: Three MLP edge decoders (one per relation, 32→16→1) predict edge existence from pairwise combinations of the node embeddings, one score per candidate edge.
Graph-level loss: The per-graph mean and standard deviation of the bifurcation parameter $a$ are included as auxiliary reconstruction targets, so shifts in overall criticality are penalized directly.
| Component | Shape | Parameters | Status |
|---|---|---|---|
| fc_z | 32 → 8 | 264 | Trainable |
| Linear decoder | 8 → 7 | 63 | Trainable |
| Edge decoders (PLV, SC, MVAR) | 3 × (32 → 16 → 1) | 1,635 | Trainable |
| **Total trainable** | | **1,962** | |
Level 1 — Per-relation edge weighting: Each edge type gets a signed Fisher weight measuring how well its reconstruction error separates the groups; the three weighted scores combine into a single composite edge score.
Level 2 — Node-vs-edge weighting: The composite edge score and the node reconstruction error are combined with a second pair of signed Fisher weights into the final subject-level anomaly score.
Both levels are fully data-driven. Signed weights handle reversed signals naturally.
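A one-dimensional signed Fisher weight, the building block at both levels, can be sketched as follows (the pooled-variance normalization is an assumption about the exact form used):

```python
import numpy as np

def fisher_weight(scores_hc, scores_mdd):
    """Signed 1-D Fisher LDA weight: between-class mean difference over
    pooled within-class variance. The sign flips automatically when a
    score runs lower (not higher) in MDD, which is why reversed signals
    need no manual handling."""
    mu_hc, mu_mdd = np.mean(scores_hc), np.mean(scores_mdd)
    var = np.var(scores_hc) + np.var(scores_mdd)
    return (mu_mdd - mu_hc) / (var + 1e-12)

# Level 1: one weight per relation (PLV, SC, MVAR edge errors).
# Level 2: one weight for the composite edge score, one for node error.
hc  = np.array([0.9, 1.1, 1.0, 0.95])   # toy reconstruction errors
mdd = np.array([1.4, 1.6, 1.5])
w = fisher_weight(hc, mdd)
assert w > 0   # MDD scores are higher, so the weight is positive
```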
Loss function (GAE training): feature-weighted node reconstruction MSE, plus binary cross-entropy for the three edge decoders, plus the graph-level term on the mean and standard deviation of $a$.
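A minimal sketch of that combined objective (NumPy; the term weights `lam_edge` and `lam_graph` are assumptions, not values from the pipeline):

```python
import numpy as np

FEAT_W = np.array([2.0, 1.0, 1.0, 0.5, 0.5, 0.5, 0.5])  # a, omega, chi2, 4 conn.

def bce(p, y, eps=1e-7):
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def gae_loss(x_hat, x, edge_probs, edge_labels, lam_edge=1.0, lam_graph=1.0):
    """Feature-weighted node MSE + per-relation edge BCE
    + graph-level penalty on the mean/SD of the bifurcation parameter a."""
    node = np.mean(FEAT_W * (x_hat - x) ** 2)
    edge = sum(bce(p, y) for p, y in zip(edge_probs, edge_labels))
    a_hat, a = x_hat[:, 0], x[:, 0]       # column 0 holds a
    graph = (a_hat.mean() - a.mean()) ** 2 + (a_hat.std() - a.std()) ** 2
    return node + lam_edge * edge + lam_graph * graph

rng = np.random.default_rng(1)
x = rng.normal(size=(216, 7)); x_hat = x + rng.normal(scale=0.1, size=x.shape)
probs  = [rng.random(500) for _ in range(3)]                     # PLV, SC, MVAR
labels = [(rng.random(500) < 0.5).astype(float) for _ in range(3)]
loss = gae_loss(x_hat, x, probs, labels)
assert loss > 0
```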
```
Total parameters: 7,447
├── Frozen encoder: 5,485 (74%)
│   ├── conv1 (3-relation GAT, 11→32): 1,286
│   ├── conv2 (3-relation GAT, 32→32): 3,302
│   ├── input_proj (masked residual, 11→32): 352
│   └── physics_head (32→16→1): 545
└── Trainable GAE: 1,962 (26%)
    ├── fc_z (32→8): 264
    ├── linear_decoder (8→7): 63
    └── edge_decoders (3 × MLP 32→16→1): 1,635
```
```
┌────────────┬─────────────────┬──────────────┬──────────────┬──────────────┬──────────────┐
│ Synthetic  │ HC train        │ HC holdout   │ HC test      │ MDD rest1    │ MDD rest2    │
│ n = 200    │ 24 subj (199s)  │ ~5 subj (36s)│ 6 subj (60s) │ 19 subj      │ 18 subj      │
│ Stage 1    │ Stage 2         │ Test only    │ Test only    │ Test only    │ Test only    │
└────────────┴─────────────────┴──────────────┴──────────────┴──────────────┴──────────────┘
Synthetic + HC train = train | HC holdout + HC test + MDD = never trained on
```
The HC train/test split is by subject (not session) to prevent leakage. HC holdout subjects (~15%) provide an unbiased false-positive-rate estimate (0/36 = 0.0%). MDD subjects are never seen during any training stage, and the HC train vs. test comparison serves as an overfitting check.
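A subject-level split can be sketched as follows (hypothetical subject IDs and fractions; the actual split logic lives in the notebook):

```python
import random

def split_by_subject(sessions, test_frac=0.2, holdout_frac=0.15, seed=42):
    """Split session records by SUBJECT so that no subject contributes
    sessions to more than one partition (prevents session-level leakage)."""
    subjects = sorted({s["subject"] for s in sessions})
    random.Random(seed).shuffle(subjects)
    n = len(subjects)
    n_test, n_hold = round(n * test_frac), round(n * holdout_frac)
    test_ids = set(subjects[:n_test])
    hold_ids = set(subjects[n_test:n_test + n_hold])
    part = lambda pred: [s for s in sessions if pred(s["subject"])]
    return (part(lambda u: u not in test_ids and u not in hold_ids),  # train
            part(lambda u: u in hold_ids),                            # holdout
            part(lambda u: u in test_ids))                            # test

sessions = [{"subject": f"sub-{i:02d}", "session": ses}
            for i in range(35) for ses in ("rest1", "rest2")]
train, hold, test = split_by_subject(sessions)
assert not ({s["subject"] for s in train} & {s["subject"] for s in test})
```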
- **Denoising autoencoder (not variational)** — The variational bottleneck (KL-regularized latent) collapsed in every tested configuration, because within-HC variance of the bifurcation parameters is too low to support it; input noise plus latent dropout regularizes without a KL term.
- **Linear decoder (63 parameters)** — An MLP decoder (far more capacity) can reconstruct targets without forcing structure into the latent code; the linear decoder keeps the representational burden on the bottleneck.
- **Expanded reconstruction targets (7 features)** — Reconstructing only the 3 physics features leaves the bottleneck underconstrained; the 4 connectivity-derived targets force richer per-ROI structure.
- **Node-level (not graph-level) bottleneck** — Graph-level pooling into a single embedding discards ROI localization; per-node latents preserve the per-ROI anomaly map.
- **Edge decoders on all three relations** — Separate PLV, SC, and MVAR decoders give the Fisher weighting relation-specific anomaly signals to exploit.
- **Absolute difference for anomaly scores** — $|\hat{x} - x|$ rather than squared error keeps single-feature outliers from dominating a ROI's score.
- **Feature-weighted reconstruction** — Weights (2.0 for $a$; 1.0 for $\omega$, $\chi^2$; 0.5 for the connectivity targets) keep the primary clinical marker dominant.
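Under those choices, a per-ROI anomaly score reduces to a feature-weighted mean absolute reconstruction error; a minimal sketch:

```python
import numpy as np

FEAT_W = np.array([2.0, 1.0, 1.0, 0.5, 0.5, 0.5, 0.5])  # a, omega, chi2, 4 conn.

def node_anomaly(x_hat, x):
    """Per-ROI anomaly: feature-weighted mean |reconstruction error|.
    Absolute (not squared) differences keep a single mis-reconstructed
    feature from dominating the ROI's score."""
    return (np.abs(x_hat - x) * FEAT_W).sum(axis=1) / FEAT_W.sum()

x = np.zeros((216, 7))
x_hat = np.zeros((216, 7))
x_hat[0, 0] = 0.5                       # one ROI's a is mis-reconstructed
scores = node_anomaly(x_hat, x)
assert scores.argmax() == 0 and np.all(scores[1:] == 0)
```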
| Metric | Value | 95% CI | UKF Reference |
|---|---|---|---|
| HC vs MDD separation | — | — | — |
| Permutation null (10,000) | — | — | — |
| HC holdout FP rate | 0/36 (0.0%) | — | — |
| HC holdout vs MDD | — | — | — |
| Overfitting check | — | — | — |
| Whole-brain intervention | — | — | — |
| Circuit intervention | — | — | — |
| Limbic intervention | — | — | — |
| Subcortical intervention | — | — | — |
| Circuit enrichment (top-10) | 2.50× (8/10), hypergeom. | — | — |
| Circuit enrichment (top-20) | 2.03× (13/20), hypergeom. | — | — |
| Circuit vs non-circuit | — | — | — |
| Heterogeneity (raw) | — | — | — |
| #1 anomalous ROI | RH Default PFCdPFCm₄ | — | Converges with Ch. 5 cluster |
| #1 anomalous network | Limbic | — | — |
All four intervention scales survive Benjamini-Hochberg FDR correction. Active group moves away from HC (increased anomaly), sham moves toward HC (decreased anomaly).
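The Benjamini-Hochberg step is the standard procedure; a minimal sketch with made-up p-values:

```python
import numpy as np

def bh_reject(pvals, alpha=0.05):
    """Benjamini-Hochberg FDR: with p-values sorted ascending, reject all
    hypotheses up to the largest k such that p_(k) <= alpha * k / m."""
    p = np.asarray(pvals, dtype=float)
    m = len(p)
    order = np.argsort(p)
    thresh = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresh
    k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
    reject = np.zeros(m, dtype=bool)
    reject[order[:k]] = True
    return reject

# four intervention-scale p-values (illustrative, not from this work)
p = [0.020, 0.002, 0.031, 0.004]
assert bh_reject(p).all()   # all four survive at alpha = 0.05
```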
| Rank | ROI | Network | Circuit? |
|---|---|---|---|
| 1 | RH Default PFCdPFCm₄ | Default Mode | ✓ |
| 2 | LH Limbic TempPole₁ | Limbic | ✓ |
| 3 | RH Vis₁ | Visual | |
| 4 | NAcc-rh | Subcortical | ✓ |
| 5 | LH Limbic TempPole₂ | Limbic | ✓ |
| 6 | RH SalVentAttn FrOperIns₁ | Salience/VentAttn | |
| 7 | LH Limbic TempPole₄ | Limbic | ✓ |
| 8 | LH Default Temp₅ | Default Mode | ✓ |
| 9 | RH Limbic TempPole₁ | Limbic | ✓ |
| 10 | RH Limbic TempPole₂ | Limbic | ✓ |
| 11 | LH Cont Cing₂ | Frontoparietal | |
| 12 | LH Default PHC₁ | Default Mode | |
| 13 | Thal-rh | Subcortical | ✓ |
| 14 | LH Default Par₁ | Default Mode | |
| 15 | RH Default PFCdPFCm₃ | Default Mode | ✓ |
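The hypergeometric enrichment test can be sketched with the standard library; the circuit-ROI count `K` below is a placeholder chosen for illustration, not a value from this work:

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(X >= k) for X ~ Hypergeom(population N, K successes, n draws)."""
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1)) / comb(N, n)

N = 216          # ROIs in the 216-ROI parcellation
K = 69           # circuit ROIs -- placeholder, not a value from the paper
n, k = 10, 8     # top-10 anomalous ROIs, 8 of them in the circuit
enrichment = (k / n) / (K / N)   # observed vs expected circuit fraction
p = hypergeom_sf(k, N, K, n)
assert 0 < p < 1 and enrichment > 1
```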
Results are robust to the bottleneck dimension $d$:
| $d$ | Params | HC–MDD | WB $d$ | WB $p$ | Sub $d$ | Sub $p$ | Top-10 | hp$_{10}$ |
|---|---|---|---|---|---|---|---|---|
| 3 | 1,762 | +3.68 | +1.36 | 0.020 | +1.79 | 0.002 | 2.50× | 0.002 |
| 4 | 1,802 | +3.49 | +1.36 | 0.018 | +1.80 | 0.002 | 2.50× | 0.002 |
| 6 | 1,882 | +3.54 | +1.26 | 0.031 | +1.75 | 0.004 | 2.19× | 0.013 |
| 8 | 1,962 | +3.44 | +1.30 | 0.026 | +1.75 | 0.005 | 1.88× | 0.059 |
| 12 | 2,122 | +3.57 | +1.29 | 0.027 | +1.70 | 0.006 | 1.88× | 0.059 |
Whole-brain and subcortical intervention effects are significant ($p < 0.05$) at every bottleneck dimension tested; top-10 circuit enrichment reaches hypergeometric significance only for $d \le 6$.
The Hopf-GAE consumes outputs from the R biophysical pipeline (UKF-MDD):
| Input | File | Format |
|---|---|---|
| Bifurcation parameters | `results/v3/sl_stage1_results_216roi.csv` | CSV (one row per ROI per subject per session) |
| PLV matrices | `results/v3/plv/plv_all_216roi.rds` | R list, keyed `"subject_id\|session"` |
| MVAR matrices | `results/v3/s2_mvar_all_216roi.rds` | R list, keyed `"subject_id\|session"` |
| HC comparison data | `results/ch5_v4def/ch5_v4def_results.rds` | R list |
Dependencies fall into two groups: Python packages (installed below) and the upstream R pipeline.
System: Python ≥ 3.9 · PyTorch ≥ 2.0 · PyTorch Geometric ≥ 2.4 · R ≥ 4.2 (for upstream pipeline only)
```bash
# 1. Ensure upstream pipeline has been run
#    (github.com/skaraoglu/UKF-MDD)

# 2. Install Python dependencies
pip install torch torch_geometric pyreadr scikit-learn statsmodels

# 3. Run the full pipeline
jupyter execute main_analysis.ipynb

# Pipeline stages:
#   S1–S6:   Data loading, graph construction, quality control
#   S7–S10:  Synthetic pre-training (encoder, 100 epochs)
#   S11–S12: HC data loading + augmentation, GAE training (200 epochs)
#   S13:     Anomaly scoring (two-level Fisher LDA)
#   S14:     Statistical analysis (FDR, permutation tests, enrichment)
```

If you use this architecture or build on this work, please cite:
Built with PyTorch Geometric · Node dynamics from UKF-MDD · Parcellation: Schaefer 2018 + Melbourne Subcortex