Jiyuan Wang1,2, Chunyu Lin1✉, Lei Sun2✝, Rongying Liu1, Mingxing Li2, Lang Nie3, Kang Liao4, Xiangxiang Chu2, Yao Zhao1
1Beijing Jiaotong University
2Alibaba Group
3Chongqing University of Posts and Telecommunications
4Nanyang Technological University
✉Corresponding author. ✝Project leader.
We present FE2E, a DiT-based foundation model for monocular dense geometry prediction. FE2E adapts an advanced image editing model to dense geometry tasks and achieves strong zero-shot performance on both monocular depth and normal estimation.
- [2026-03-17]: Code and Checkpoint are available now!
- [2026-02-21]: FE2E was accepted by CVPR 2026!!! 🎉🎉🎉
- [2025-09-05]: Paper released on arXiv.
This codebase is prepared as an inference/evaluation release.
pip install -r requirements.txt

Recommended local layout:
FE2E/
├── pretrain/
│ ├── step1x-edit-i1258.safetensors
│ ├── step1x-edit-v1p1-official.safetensors
│ └── vae.safetensors
├── lora/
│ └── LDRN.safetensors
├── infer/
│ ├── eth3d/
│ │ └── eth3d.tar
│ └── dsine_eval/
│ ├── nyuv2/
│ └── scannet/
└── logs/
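
Before launching an evaluation, it can help to confirm that the weight files from the layout above are in place. The following is a minimal sketch, not part of the release: `missing_files` is a hypothetical helper, and treating all four files as required is an assumption based only on the tree shown above.

```python
from pathlib import Path

# Relative paths copied from the recommended layout above.
# Assumption: all four files are needed; the release may not require every one.
EXPECTED_FILES = [
    "pretrain/step1x-edit-i1258.safetensors",
    "pretrain/step1x-edit-v1p1-official.safetensors",
    "pretrain/vae.safetensors",
    "lora/LDRN.safetensors",
]


def missing_files(root="."):
    """Return the expected files that are absent under `root` (hypothetical helper)."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_files(".")
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected weight files found.")
```

Run it from the `FE2E/` directory; an empty result means the weights match the layout.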
- [ ] Training code will be released later.
- Download the base weights from the official Step1X-Edit release.
- Download the FE2E LoRA checkpoint.
- Depth benchmarks follow the external evaluation data convention from Marigold.
- Normal benchmarks follow the external evaluation data convention from DSINE.
Supported depth benchmarks:
nyu_v2,kitti,eth3d,diode,scannet
Supported normal benchmarks:
nyuv2,scannet,ibims,sintel
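
Note that the two tasks spell the NYU benchmark differently (`nyu_v2` for depth, `nyuv2` for normal), so a name that is valid for one task may not be valid for the other. A small sketch of a check that catches this before a long run starts; `SUPPORTED` and `check_dataset` are hypothetical names, and the lists are copied from above.

```python
# Benchmark names copied from the lists above.
SUPPORTED = {
    "depth": ["nyu_v2", "kitti", "eth3d", "diode", "scannet"],
    "normal": ["nyuv2", "scannet", "ibims", "sintel"],
}


def check_dataset(task, dataset):
    """Return `dataset` if it is listed for `task`, else raise ValueError (hypothetical helper)."""
    if task not in SUPPORTED:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(SUPPORTED)}")
    if dataset not in SUPPORTED[task]:
        raise ValueError(
            f"{dataset!r} is not a supported {task} benchmark; "
            f"choose from {SUPPORTED[task]}"
        )
    return dataset
```

For example, `check_dataset("depth", "nyu_v2")` passes, while `check_dataset("normal", "nyu_v2")` raises a `ValueError`.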
[dataset] normal:
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
MASTER_PORT=21258 \
PYTHONUNBUFFERED=1 \
python -u evaluation.py \
--model_path ./pretrain \
--eval_data_root ./infer \
--output_dir ./infer/eval_verify_scannet_normal_8gpu \
--num_gpus 8 \
--num_samples -1 \
--lora ./lora/LDRN.safetensors \
--single_denoise \
--prompt_type empty \
--norm_type ln \
--task_name normal \
--normal_eval_datasets [dataset]

[dataset] depth:
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
MASTER_PORT=21257 \
PYTHONUNBUFFERED=1 \
python -u evaluation.py \
--model_path ./pretrain \
--eval_data_root ./infer \
--output_dir ./infer/eval_verify_eth3d_8gpu \
--num_gpus 8 \
--num_samples -1 \
--lora ./lora/LDRN.safetensors \
--single_denoise \
--prompt_type empty \
--norm_type ln \
--task_name depth \
--depth_eval_datasets [dataset]

To see what a successful run looks like, this repo includes run logs in logs/:
logs/verify_scannet_normal_8gpu_20260317_171345.log
logs/verify_eth3d_8gpu_20260317_172004.log
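
To run the evaluation commands above over several benchmarks in turn, the argument list can be assembled programmatically. This is a sketch under stated assumptions: `build_eval_command` is a hypothetical helper, the flags are copied verbatim from the commands above, and the output directory names in the usage loop are illustrative. The environment variables (`CUDA_VISIBLE_DEVICES`, `MASTER_PORT`, `PYTHONUNBUFFERED`) are not set here and still need to be exported as in the original commands.

```python
import shlex


def build_eval_command(task, dataset, output_dir, num_gpus=8):
    """Assemble the evaluation.py argument list used in the commands above.

    `task` is "depth" or "normal"; it selects both --task_name and the
    matching --{task}_eval_datasets flag.
    """
    return [
        "python", "-u", "evaluation.py",
        "--model_path", "./pretrain",
        "--eval_data_root", "./infer",
        "--output_dir", output_dir,
        "--num_gpus", str(num_gpus),
        "--num_samples", "-1",
        "--lora", "./lora/LDRN.safetensors",
        "--single_denoise",
        "--prompt_type", "empty",
        "--norm_type", "ln",
        "--task_name", task,
        f"--{task}_eval_datasets", dataset,
    ]


if __name__ == "__main__":
    for dataset in ["eth3d", "kitti"]:
        # Output directory name is illustrative, not a convention of the repo.
        cmd = build_eval_command("depth", dataset, f"./infer/eval_{dataset}")
        print(shlex.join(cmd))
```

Printing the commands first, rather than executing them, makes it easy to inspect each one; the list can be passed to `subprocess.run` once it looks right.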
If you find our work useful, please cite:
@article{wang2025editor,
title={From Editor to Dense Geometry Estimator},
author={Wang, JiYuan and Lin, Chunyu and Sun, Lei and Liu, Rongying and Nie, Lang and Li, Mingxing and Liao, Kang and Chu, Xiangxiang and Zhao, Yao},
journal={arXiv preprint arXiv:2509.04338},
year={2025}
}
