This repository provides a standalone pipeline for tracking infant pose from video and predicting the risk of cerebral palsy (CP) using a spatio‑temporal graph convolutional network (ST‑GCN) ensemble.
> ⚠️ **Important**
>
> This software is intended for research use only. It is not a medical device and must not be used for diagnosis, screening, treatment, or any clinical decision making.
The pipeline has two main stages:

1. **Tracking (TensorFlow/Keras)**
   - Infant detection using a fine‑tuned EfficientPoseDet model.
   - Full‑body infant pose estimation using a fine‑tuned EfficientHourglass model.
   - Output: per‑frame 2D skeleton coordinates (x, y, confidence) for 29 body keypoints.
2. **Prediction (PyTorch)**
   - Temporal resampling and standardization of skeletons.
   - Construction of rich motion features (position, velocity, and acceleration).
   - CP risk prediction using a specialized ensemble of ST‑GCN models.
   - Output: window‑level and overall CP risk with uncertainty estimates.
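The motion features used in the prediction stage can be sketched with finite differences over the keypoint trajectories. This is an illustrative sketch only: the `motion_features` helper, the frame rate, and the array shapes are assumptions, not the repository's exact feature pipeline.

```python
import numpy as np

def motion_features(positions: np.ndarray, fps: float = 30.0) -> np.ndarray:
    """Stack position, velocity, and acceleration channels.

    positions: array of shape (T, K, 2) with per-frame (x, y) keypoints.
    Returns an array of shape (T, K, 6).
    """
    dt = 1.0 / fps
    velocity = np.gradient(positions, dt, axis=0)      # first temporal derivative
    acceleration = np.gradient(velocity, dt, axis=0)   # second temporal derivative
    return np.concatenate([positions, velocity, acceleration], axis=-1)

feats = motion_features(np.random.rand(100, 29, 2))
print(feats.shape)  # (100, 29, 6)
```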
```
├── main.py                  # Entry point: tracking + prediction + CLI
├── utils
│   ├── track_helpers.py     # TensorFlow/Keras tracking helpers
│   └── predict_helpers.py   # PyTorch prediction helpers
├── models
│   ├── track                # TensorFlow/Keras tracking models
│   │   ├── EfficientPoseDet_128x128.pb
│   │   └── EfficientHourglassB4_512x512
│   └── predict              # PyTorch prediction models
│       ├── CP
│       │   └── experiments
│       │       ├── search_19/...
│       │       ├── crossval_19/val1/...
│       │       └── ...
│       └── stgcn.py
├── videos                   # Videos of infant spontaneous movements meeting GMA standards
│   ├── Infant1.mp4
│   └── ...
├── requirements.txt         # Python dependencies
├── LICENSE                  # Software license (CC BY-NC 4.0)
└── README.md
```
Python 3.9 is recommended.
Install core dependencies:
```
pip install -r requirements.txt
```

GPU usage is optional but highly recommended for practical runtime.
The main entry point is `main.py`. It can process either a single video file (e.g., `videos/Infant1.mp4`; see section 4.1) or all videos in a directory (e.g., `videos`; see section 4.2). Both modes support `--visualize` and `--store`:
- `--visualize`: Produces two MP4 videos for each input video, `<video_name>_roi.mp4` and `<video_name>_tracked.mp4`, visualizing the tracking results (see section 5.1 for details).
- `--store`: Stores two CSV files for each input video, `<video_name>_coordinates.csv` with per‑frame keypoint coordinates (see section 5.1) and `<video_name>_risk.csv` with window‑wise and overall CP risk (see section 5.2).
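The flag surface described above could be declared with `argparse`. This is a hypothetical sketch of the CLI in `main.py`: only the flag names (`--path`, `--visualize`, `--store`) come from this README; everything else is an assumption about the script's internals.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Hypothetical sketch; flag names mirror the documentation above.
    parser = argparse.ArgumentParser(
        description="Infant pose tracking and CP risk prediction"
    )
    parser.add_argument("--path", required=True,
                        help="Video file or directory of videos")
    parser.add_argument("--visualize", action="store_true",
                        help="Write ROI and skeleton-overlay MP4s")
    parser.add_argument("--store", action="store_true",
                        help="Write coordinates and risk CSVs")
    return parser

args = build_parser().parse_args(
    ["--path", "videos/Infant1.mp4", "--visualize", "--store"]
)
```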
```
# Single video (section 4.1)
python main.py --path videos/Infant1.mp4 --visualize --store

# All videos in a directory (section 4.2)
python main.py --path videos --visualize --store
```

All outputs are written to an `outputs` folder next to the input video(s).
Coordinates CSV (`<video_name>_coordinates.csv`) with columns:

```
frame,
head_top_x, head_top_y, head_top_conf,
nose_x, nose_y, nose_conf,
...,
left_big_toe_x, left_big_toe_y, left_big_toe_conf
```
Description:
- `frame`: Video frame number corresponding to the original video.
- `<body_keypoint>_x`: Normalized x coordinate in [0, 1] of the body keypoint's 2D location.
- `<body_keypoint>_y`: Normalized y coordinate in [0, 1] of the body keypoint's 2D location.
- `<body_keypoint>_conf`: Confidence of body keypoint placement; higher values reflect more certainty in localization.
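Because the coordinates are normalized to [0, 1], recovering pixel positions requires the source video's resolution. A minimal sketch, assuming the column schema above; the `to_pixels` helper and the example resolution are not part of the pipeline:

```python
import pandas as pd

def to_pixels(df: pd.DataFrame, width: int, height: int, keypoint: str) -> pd.DataFrame:
    """Add pixel-space columns for one keypoint from the normalized CSV schema."""
    out = df.copy()
    out[f"{keypoint}_px"] = out[f"{keypoint}_x"] * width   # x in [0, 1] -> pixels
    out[f"{keypoint}_py"] = out[f"{keypoint}_y"] * height  # y in [0, 1] -> pixels
    return out

# e.g. df = pd.read_csv("outputs/Infant1_coordinates.csv")
#      df = to_pixels(df, 1280, 720, "head_top")  # resolution is an assumption
```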
Visualization videos:

- `<video_name>_roi.mp4`: Video displaying the region‑of‑interest (ROI) crops localized by infant detection.
- `<video_name>_tracked.mp4`: Skeleton overlay of the identified body keypoints on the original video.
Risk CSV (`<video_name>_risk.csv`) with columns:

```
window_cp_risk,
window_cp_risk_lower,
window_cp_risk_upper,
window_uncertainty,
cp_risk,
cp_risk_lower,
cp_risk_upper,
classification_uncertainty,
interval
```
Description:
- `window_cp_risk`: Median CP risk in [0, 1] for each temporal window.
- `window_cp_risk_lower`, `window_cp_risk_upper`: 25th and 75th percentiles of window CP risk across ensemble models.
- `window_uncertainty`: Discrete per‑window uncertainty code (0 = confident low risk, 1 = uncertain low risk, 2 = uncertain high risk, 3 = confident high risk).
- `cp_risk`: Overall median CP risk in [0, 1] across all windows.
- `cp_risk_lower`, `cp_risk_upper`: 25th and 75th percentiles of overall CP risk across ensemble models.
- `classification_uncertainty`: Discrete overall uncertainty code (0 = confident low risk, 1 = uncertain low risk, 2 = uncertain high risk, 3 = confident high risk).
- `interval`: Temporal spacing in seconds between prediction windows.
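When post‑processing the risk CSV, the discrete uncertainty codes can be mapped to readable labels. This lookup is a convenience sketch based on the encoding documented above, not part of the pipeline itself.

```python
# Codes as documented for window_uncertainty / classification_uncertainty.
UNCERTAINTY_LABELS = {
    0: "confident low risk",
    1: "uncertain low risk",
    2: "uncertain high risk",
    3: "confident high risk",
}

def describe_risk(cp_risk: float, code: int) -> str:
    """Render an overall risk value together with its uncertainty label."""
    return f"overall CP risk {cp_risk:.2f} ({UNCERTAINTY_LABELS[code]})"

print(describe_risk(0.12, 0))  # overall CP risk 0.12 (confident low risk)
```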
This repository provides research code for tracking infant movements in video and predicting the risk of CP:
- It has not been certified as a medical device.
- It must not be used for clinical diagnosis, triage, or screening.
All code and artifacts in this repository are released under the Creative Commons Attribution‑NonCommercial 4.0 International (CC BY‑NC 4.0) license.