Tianshuo Xu1, Zhifei Chen1, Leyi Wu1, Hao Lu1, Ying-cong Chen1,2*
1HKUST (GZ) 2HKUST * corresponding author
Motion Forcing decouples physical reasoning from visual synthesis via a hierarchical Point → Shape → Appearance paradigm, enabling precise and physically consistent video generation from a single image and user-drawn trajectories. Given sparse motion anchors (Point), the model first generates dynamic depth (Shape), then renders high-fidelity RGB frames (Appearance), bridging the gap between control signals and complex scene dynamics.
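The Point stage consumes sparse motion anchors derived from a user-drawn trajectory. As a rough illustration of how a freehand trajectory could be reduced to evenly spaced anchors (a minimal sketch; `resample_trajectory`, the arc-length spacing, and the anchor count are assumptions, not the released pipeline):

```python
import math

def resample_trajectory(points, num_anchors):
    """Resample a user-drawn 2D trajectory into evenly spaced motion anchors.

    `points` is a list of (x, y) clicks; anchors are placed at equal
    arc-length intervals along the piecewise-linear path. Hypothetical
    helper for illustration only.
    """
    # Cumulative arc length at each input point.
    dists = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dists.append(dists[-1] + math.hypot(x1 - x0, y1 - y0))
    total = dists[-1]

    anchors = []
    for i in range(num_anchors):
        target = total * i / (num_anchors - 1)
        # Find the polyline segment containing this arc-length position.
        j = next(k for k in range(len(dists) - 1) if dists[k + 1] >= target)
        seg = dists[j + 1] - dists[j]
        t = 0.0 if seg == 0 else (target - dists[j]) / seg
        (x0, y0), (x1, y1) = points[j], points[j + 1]
        anchors.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return anchors
```

A denser click sequence simply yields the same number of anchors spread over a longer path, which keeps the conditioning signal sparse regardless of how carefully the user draws.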
| Turn Left | Turn Right | Speed Up | Slow Down |
|---|---|---|---|
| ![]() | ![]() | ![]() | ![]() |
| Dangerous Cut-In | Double Cut-In | Right Cut-In | Left Cut-In & Brake |
|---|---|---|---|
| ![]() | ![]() | ![]() | ![]() |
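Every scenario above is driven solely by the drawn trajectory; no scenario-specific conditioning is mentioned. As a loose sketch of how such a trajectory could also be synthesized programmatically, for example a left turn as a quarter-circle arc (the function, its heading convention, and its parameters are assumptions for illustration, not part of the release):

```python
import math

def turn_left_trajectory(start, radius, num_points):
    """Hypothetical helper: sample points along a quarter-circle left turn.

    The path starts at `start` heading +y and curves left toward -x.
    Illustrative of the kind of trajectory a user might draw in the demo.
    """
    x0, y0 = start
    cx, cy = x0 - radius, y0  # circle center sits to the left of the start
    step = (math.pi / 2) / (num_points - 1)
    return [
        (cx + radius * math.cos(i * step), cy + radius * math.sin(i * step))
        for i in range(num_points)
    ]
```

Analogous primitives (an S-curve for a cut-in, shortened spacing for braking) could script the other scenarios in the table.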
- Inference code
- Gradio demo
- Pretrained checkpoint
- Data processing pipeline (coming soon)
- Training code (coming soon)
```bash
git clone --recurse-submodules https://github.com/Tianshuo-Xu/Motion-Forcing.git
cd Motion-Forcing
pip install -r requirements.txt
```

Build VGGT:
```bash
git clone git@github.com:facebookresearch/vggt.git
cd vggt
pip install -e .
```

Download depth estimation weights:
```bash
cd Video-Depth-Anything
bash get_weights.sh
```
Download YOLO segmentation weights into weights/yolo11l-seg.pt (used for interactive object selection in the demo).
The CogVideoX base model and the fine-tuned transformer (TSXu/MotionForcing_driving) are downloaded automatically from Hugging Face on the first run.
```bash
python gradio_demo.py
```

Open http://localhost:7860. Upload an image, click objects to draw trajectories, then generate.
We thank the authors of CogVideoX, Video-Depth-Anything, VGGT, and Ultralytics YOLO for their outstanding open-source contributions.
```bibtex
@misc{xu2026motion,
  title={Motion Forcing: A Decoupled Framework for Robust Video Generation in Motion Dynamics},
  author={Tianshuo Xu and Zhifei Chen and Leyi Wu and Hao Lu and Ying-cong Chen},
  year={2026},
  eprint={2603.10408},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2603.10408},
}
```






