OVGGT: O(1) Constant-Cost Streaming Visual Geometry Transformer

Si-Yu Lu¹, Po-Ting Chen², Hui-Che Hsu², Sin-Ye Jhong², Wen-Huang Cheng¹, Yung-Yao Chen²

¹National Taiwan University ²National Taiwan University of Science and Technology

TL;DR: OVGGT is a training-free framework enabling streaming 3D reconstruction from arbitrarily long video with constant memory and compute — achieving O(1) per-frame cost while surpassing full-cache baselines in accuracy.

Left: Quantitative comparison on 7-Scenes across 200 frames. Right: Qualitative 3D reconstructions demonstrating OVGGT's stability over long sequences (50–500 frames).

News

Overview

OVGGT is a training-free framework that enables streaming 3D reconstruction from arbitrarily long video with constant memory and compute. It combines Self-Selective Caching (SSC) for zero-overhead KV cache compression via FFN residual magnitudes, and Dynamic Anchor Protection (DAP) to shield geometrically critical tokens from eviction, suppressing coordinate drift over long sequences. OVGGT is fully compatible with FlashAttention and processes videos within a fixed VRAM envelope while surpassing full-cache baselines in accuracy.
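As a rough illustration of the idea in this overview, the toy sketch below keeps a cache at a fixed budget by evicting the lowest-scoring token while never evicting tokens flagged as anchors. It is not the OVGGT implementation: `BoundedKVCache`, `simulate`, the integer scores (standing in for FFN residual magnitudes), and the anchor rule are all hypothetical and chosen only to make the behavior easy to check.

```python
from dataclasses import dataclass, field


@dataclass
class BoundedKVCache:
    """Toy fixed-budget cache that evicts the lowest-scoring unprotected token."""
    budget: int
    entries: list = field(default_factory=list)  # tuples: (token_id, score, is_anchor)

    def add(self, token_id, score, is_anchor=False):
        self.entries.append((token_id, score, is_anchor))
        while len(self.entries) > self.budget:
            # Only non-anchor tokens are eviction candidates.
            candidates = [e for e in self.entries if not e[2]]
            if not candidates:
                break  # everything left is protected, so nothing can be evicted
            self.entries.remove(min(candidates, key=lambda e: e[1]))

    def token_ids(self):
        return [e[0] for e in self.entries]


def simulate(num_frames, tokens_per_frame=4, budget=8):
    """Stream frames through the cache and record its size after each frame."""
    cache = BoundedKVCache(budget=budget)
    sizes = []
    for frame in range(num_frames):
        for t in range(tokens_per_frame):
            token_id = frame * tokens_per_frame + t
            # Hypothetical importance score standing in for an FFN residual magnitude.
            score = (token_id * 7) % 11
            # Hypothetical anchor rule: protect the very first token of the stream.
            cache.add(token_id, score, is_anchor=(token_id == 0))
        sizes.append(len(cache.entries))
    return cache, sizes


cache, sizes = simulate(num_frames=50)
print(max(sizes))               # cache size never exceeds the budget
print(0 in cache.token_ids())   # the anchor token is still cached
```

Because the cache size is capped at `budget`, the memory held per step does not grow with the number of frames, which is the sense in which per-frame cost stays constant in this toy model. Note that the anchor has the lowest score in the example and survives only because it is protected.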

⚙️ Installation

Clone OVGGT

git clone https://github.com/<your-username>/OVGGT.git
cd OVGGT

Create conda environment

conda create -n OVGGT python=3.11 cmake=3.14.0
conda activate OVGGT

Install requirements

pip install -r requirements.txt
conda install 'llvm-openmp<16'

Download Checkpoints

Please download the StreamVGGT checkpoint from Hugging Face or Tsinghua Cloud.

Evaluation

The evaluation code follows MonST3R, CUT3R, TTT3R, StreamVGGT and InfiniteVGGT.

cd src/

Multi-view Reconstruction

bash eval/mv_recon/run.sh

Results will be saved in eval_results/mv_recon/${model_name}_${ckpt_name}/logs_all.txt.

Video Depth

bash eval/video_depth/run.sh

Results will be saved in eval_results/video_depth/${data}_${model_name}/result_scale.json.

Pose Evaluation

bash eval/pose_evaluation/run.sh

Results will be saved in eval_results/pose_evaluation/${data}_${model_name}/_error_log.txt.
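For scripting around the evaluation outputs, the three result locations can be expanded from their templates. The sketch below assumes `data`, `model_name`, and `ckpt_name` are shell-style variables set by the run scripts and that the pose path follows the same `${data}` pattern as the video-depth path. `result_path` is a hypothetical helper, and the values `sintel` and `OVGGT` are placeholders rather than names taken from the repository.

```python
from pathlib import PurePosixPath
from string import Template

# Path templates copied from the evaluation instructions.
RESULT_TEMPLATES = {
    "mv_recon": "eval_results/mv_recon/${model_name}_${ckpt_name}/logs_all.txt",
    "video_depth": "eval_results/video_depth/${data}_${model_name}/result_scale.json",
    "pose_evaluation": "eval_results/pose_evaluation/${data}_${model_name}/_error_log.txt",
}


def result_path(task, **names):
    """Fill in a result-path template; raises KeyError if a variable is missing."""
    return PurePosixPath(Template(RESULT_TEMPLATES[task]).substitute(**names))


# Placeholder values for illustration only.
print(result_path("video_depth", data="sintel", model_name="OVGGT"))
```

The paths are relative, so they are presumably resolved against the directory the run scripts are launched from.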

🚀 Quick Start

Viser Demo (Interactive 3D Visualization)

We provide a Viser demo for OVGGT, based on the demo code from InfiniteVGGT. Launch it as follows:

python demo_viser.py \
    --seq_path path/to/nrgbd/image_sequence \
    --frame_interval 10 \
    --gt_path path/to/nrgbd/gt_camera  # --gt_path is optional
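The effect of `--frame_interval` can be pictured as plain subsampling. The sketch below assumes the flag keeps every N-th image of the sorted sequence. That reading is not confirmed by the source, and `subsample_frames` and the file names are hypothetical.

```python
def subsample_frames(frame_names, frame_interval):
    """Keep every `frame_interval`-th frame of the name-sorted sequence."""
    if frame_interval < 1:
        raise ValueError("frame_interval must be a positive integer")
    return sorted(frame_names)[::frame_interval]


# Hypothetical 35-frame sequence with zero-padded names so sorting matches frame order.
names = [f"frame_{i:04d}.png" for i in range(35)]
kept = subsample_frames(names, 10)
print(kept)
```

Under this assumption, a larger interval trades temporal density for fewer frames to process.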

Gradio Demo (Web UI)

We provide a Gradio demo for OVGGT, based on the demo code from VGGT. Launch it as follows:

pip install -r requirements_demo.txt
python demo_gradio.py

🙏 Acknowledgements

Our code is based on the following brilliant repositories:

DUSt3R, MonST3R, Spann3R, CUT3R, VGGT, Point3R, StreamVGGT, TTT3R, Evict3R, InfiniteVGGT

Many thanks to these authors!

📝 Citation

@article{lu2026ovggt,
  title={OVGGT: O(1) Constant-Cost Streaming Visual Geometry Transformer},
  author={Si-Yu Lu and Po-Ting Chen and Hui-Che Hsu and Sin-Ye Jhong and Wen-Huang Cheng and Yung-Yao Chen},
  journal={arXiv preprint arXiv:2603.05959},
  year={2026}
}
