
LASER: Layer-wise Scale Alignment for Training-Free Streaming 4D Reconstruction

arXiv Project Page

Tianye Ding1*, Yiming Xie1*, Yiqing Liang2*, Moitreya Chatterjee3, Pedro Miraldo3, Huaizu Jiang1
1 Northeastern University, 2 Independent Researcher, 3 Mitsubishi Electric Research Laboratories
* Equal Contribution

📢 Updates

  • [2026-03-12] Loop-closure module released with robustness fix.
  • [2026-02-21] Paper accepted by CVPR 2026.
  • [2025-12-15] ArXiv preprint released.

📝 To-Do List

  • Release framework codebase
  • Release inference code
  • Add data preparation instruction
  • Release evaluation code
  • Add Viser integration
  • Release loop-closure demo

💡 Abstract

We propose LASER, a training-free framework that converts an offline reconstruction model into a streaming system by aligning predictions across consecutive temporal windows. We observe that simple similarity transformation (Sim(3)) alignment fails due to layer depth misalignment: monocular scale ambiguity causes relative depth scales of different scene layers to vary inconsistently between windows. To address this, we introduce layer-wise scale alignment, which segments depth predictions into discrete layers, computes per-layer scale factors, and propagates them across both adjacent windows and timestamps.
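As a rough illustration of the core idea (not the released implementation), the sketch below segments a depth prediction into quantile-based layers and estimates one robust scale factor per layer from the overlap with the previous window. The function name, the quantile layering rule, and the median-ratio estimator are all assumptions made for illustration:

```python
import numpy as np

def layerwise_scale_align(depth_prev, depth_curr, num_layers=4):
    """Toy sketch of layer-wise scale alignment: segment the previous
    window's depth into discrete quantile layers, then estimate one
    robust scale factor per layer from the ratio of overlapping depths."""
    # Layer boundaries from depth quantiles of the previous window
    edges = np.quantile(depth_prev, np.linspace(0, 1, num_layers + 1))
    edges[-1] += 1e-6  # make the last bin inclusive
    layer_ids = np.clip(np.digitize(depth_prev, edges) - 1, 0, num_layers - 1)

    aligned = depth_curr.copy()
    scales = np.ones(num_layers)
    for k in range(num_layers):
        mask = layer_ids == k
        if mask.any():
            # Robust per-layer scale: median ratio over the overlap
            scales[k] = np.median(depth_prev[mask] / depth_curr[mask])
            aligned[mask] = depth_curr[mask] * scales[k]
    return aligned, scales
```

In the actual setting each layer would get its own factor because monocular scale drifts differently per layer; a single global Sim(3) scale cannot correct that.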

🛠️ Installation

# 1. Clone the repository
git clone --recursive git@github.com:neu-vi/LASER.git
cd LASER

# 2. Create environment
conda create -n laser -y python=3.11
conda activate laser

# 3. Install dependencies
pip install -r requirements.txt

# 4. Compile Cython modules
python setup.py build_ext --inplace

# 5. Install Viser
pip install -e viser

(Optional) Download checkpoints needed for loop-closure inference

bash ./scripts/download_weights.sh

🚀 Usage

Inference

To run the inference code, you can use the following command:

export PYTHONPATH="./":$PYTHONPATH

python demo.py \
    --data_path DATA_PATH \
    --output_path "./viser_results" \
    --cache_path "./cache" \
    --sample_interval SAMPLE_INTERVAL \
    --window_size WINDOW_SIZE \
    --overlap OVERLAP \
    --depth_refine

# example inference script
python demo.py \
    --data_path "examples/titanic" \
    --output_path "./viser_results" \
    --cache_path "./cache" \
    --sample_interval 1 \
    --window_size 30 \
    --overlap 10 \
    --depth_refine

The results will be saved in the viser_results/SEQ_NAME directory for later visualization.
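For intuition about the --window_size and --overlap flags, the hypothetical helper below shows one plausible way a frame sequence could be chunked into overlapping temporal windows; demo.py's actual chunking logic may differ:

```python
def make_windows(num_frames, window_size, overlap, sample_interval=1):
    """Split frame indices into overlapping temporal windows.
    Consecutive windows share `overlap` frames, which is what gives the
    alignment step an overlap region to match scales across windows."""
    frames = list(range(0, num_frames, sample_interval))
    stride = window_size - overlap
    windows, start = [], 0
    while start < len(frames):
        windows.append(frames[start:start + window_size])
        if start + window_size >= len(frames):
            break
        start += stride
    return windows
```

With the example settings above (window_size=30, overlap=10), each new window re-predicts the last 10 frames of the previous one, and those shared frames anchor the cross-window alignment.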

Loop-closure inference

Loop-closure inference additionally depends on the faiss package, which can be installed with:

pip install faiss-gpu-cu12 numpy==1.26.4

Run loop-closure inference on kilometer-scale sequences with the following command:

python demo_lc.py \
    --config_path "configs/loop_config.yaml" \
    --data_path DATA_PATH \
    --output_path "./viser_results" \
    --cache_path "./cache" \
    --sample_interval SAMPLE_INTERVAL \
    --window_size WINDOW_SIZE \
    --overlap OVERLAP

# clear the cached intermediate results afterwards
rm -r cache/
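Conceptually, loop closure starts by retrieving earlier frames that look similar to the current one. The sketch below does that with plain NumPy cosine similarity over per-frame global descriptors; loop_candidates and min_gap are illustrative names, and the released code uses faiss to make this nearest-neighbour search scale to kilometer-length sequences:

```python
import numpy as np

def loop_candidates(descs, query_idx, top_k=3, min_gap=30):
    """Return indices of earlier frames most similar to frame query_idx,
    excluding temporal neighbours within min_gap frames (which are
    trivially similar and not true loop closures)."""
    d = descs / np.linalg.norm(descs, axis=1, keepdims=True)
    sims = d[:query_idx] @ d[query_idx]            # cosine similarity
    sims[max(0, query_idx - min_gap):] = -np.inf   # mask recent frames
    order = np.argsort(sims)[::-1][:top_k]
    return [i for i in order if np.isfinite(sims[i])]
```

A retrieved candidate pair would then be verified geometrically before the trajectory is corrected; this sketch covers only the retrieval step.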

Visualization

To visualize the interactive 4D results, you can use the following command:

python viser/visualizer_monst3r.py --data viser_results/SEQ_NAME

# example visualization script
python viser/visualizer_monst3r.py --data viser_results/titanic

Evaluation

Please refer to MonST3R for dataset setup details.

Put all datasets in data/.

Video Depth

Sintel

export PYTHONPATH="./":$PYTHONPATH

CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node=1 --master_port=12345 eval_launch.py \
    --mode=eval_pose \
    --model=streaming_pi3 \
    --eval_dataset=sintel \
    --output_dir="outputs/video_depth/sintel_depth" \
    --full_seq \
    --no_crop

CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node=1 --master_port=12345 depth_metric.py \
    --eval_dataset=sintel \
    --result_dir="outputs/video_depth/sintel_depth" \
    --output_dir="outputs/video_depth"

Bonn

export PYTHONPATH="./":$PYTHONPATH

CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node=1 --master_port=12345 eval_launch.py \
    --mode=eval_pose \
    --model=streaming_pi3 \
    --eval_dataset=bonn \
    --output_dir="outputs/video_depth/bonn_depth" \
    --no_crop

CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node=1 --master_port=12345 depth_metric.py \
    --eval_dataset=bonn \
    --result_dir="outputs/video_depth/bonn_depth" \
    --output_dir="outputs/video_depth"

KITTI

export PYTHONPATH="./":$PYTHONPATH

CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node=1 --master_port=12345 eval_launch.py \
    --mode=eval_pose \
    --model=streaming_pi3 \
    --eval_dataset=kitti \
    --output_dir="outputs/video_depth/kitti_depth" \
    --no_crop \
    --flow_loss_weight 0 \
    --translation_weight 1e-3

CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node=1 --master_port=12345 depth_metric.py \
    --eval_dataset=kitti \
    --result_dir="outputs/video_depth/kitti_depth" \
    --output_dir="outputs/video_depth"
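For reference, a common video-depth metric in this setting is AbsRel computed after per-sequence median scale alignment. The sketch below is an assumed, simplified version of what depth_metric.py reports; the exact protocol follows MonST3R:

```python
import numpy as np

def abs_rel(pred, gt):
    """AbsRel after median scale alignment: scale the prediction so its
    median matches the ground truth, then average |pred - gt| / gt."""
    mask = gt > 0                                    # valid ground truth only
    pred, gt = pred[mask], gt[mask]
    pred = pred * (np.median(gt) / np.median(pred))  # median scale alignment
    return float(np.mean(np.abs(pred - gt) / gt))
```

Median alignment removes the global monocular scale ambiguity, so the metric measures depth structure rather than absolute scale.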

Camera Pose

Sintel

export PYTHONPATH="./":$PYTHONPATH

CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node=1 --master_port=12345 eval_launch.py \
    --mode=eval_pose \
    --model=streaming_pi3 \
    --eval_dataset=sintel \
    --output_dir="outputs/cam_pose/sintel_pose"

ScanNet

export PYTHONPATH="./":$PYTHONPATH

CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node=1 --master_port=12345 eval_launch.py \
    --mode=eval_pose \
    --model=streaming_pi3 \
    --eval_dataset=scannet \
    --output_dir="outputs/cam_pose/scannet_pose"

TUM

export PYTHONPATH="./":$PYTHONPATH

CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node=1 --master_port=12345 eval_launch.py \
    --mode=eval_pose \
    --model=streaming_pi3 \
    --eval_dataset=tum \
    --output_dir="outputs/cam_pose/tum_pose"
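Camera-pose benchmarks of this kind typically report ATE-RMSE after Sim(3) (Umeyama) alignment of the predicted camera centres to ground truth. The function below is an illustrative sketch of that metric, not the evaluation code shipped in this repository:

```python
import numpy as np

def ate_rmse(pred, gt):
    """Absolute Trajectory Error (RMSE) after Sim(3) alignment via the
    Umeyama method. pred, gt: (N, 3) arrays of camera centres."""
    mu_p, mu_g = pred.mean(0), gt.mean(0)
    P, G = pred - mu_p, gt - mu_g                  # centred trajectories
    cov = G.T @ P / len(pred)                      # cross-covariance
    U, S, Vt = np.linalg.svd(cov)
    D = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:   # guard against reflection
        D[2, 2] = -1.0
    R = U @ D @ Vt                                 # optimal rotation
    var_p = (P ** 2).sum() / len(pred)             # variance of predictions
    s = np.trace(np.diag(S) @ D) / var_p           # optimal scale
    err = G - s * P @ R.T                          # residuals after alignment
    return float(np.sqrt((err ** 2).sum(1).mean()))
```

Aligning with a full Sim(3) before scoring means the metric is insensitive to the arbitrary global scale, rotation, and origin of a monocular reconstruction.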

Citation

If you find this repository useful in your research, please consider giving it a star ⭐ and a citation:

@article{ding2025laser,
  title={LASER: Layer-wise Scale Alignment for Training-Free Streaming 4D Reconstruction},
  author={Ding, Tianye and Xie, Yiming and Liang, Yiqing and Chatterjee, Moitreya and Miraldo, Pedro and Jiang, Huaizu},
  year={2025}
}

Acknowledgements

We would like to thank the authors for the following excellent open source projects: VGGT, π3, MonST3R, CUT3R, VGGT-Long and many other inspiring works in the community.
