ryosuke-yamada/lam3c

3D sans 3D Scans: Scalable Pre-training from Video-Generated Point Clouds

arXiv CVPR 2026 Website HuggingFace Models Dataset


This repository contains the official implementation of LAM3C (pronounced “lah-MEK” /lɑːˈmɛk/) and the RoomTours pipeline.

LAM3C Message

The bottleneck of 3D self-supervised learning is not only algorithms, but also the scarcity of large-scale 3D data.
By turning the vast sea of unlabeled internet videos into point clouds, we unlock a scalable source of 3D supervision.

LAM3C scaling results

TL;DR

This paper shows that 3D self-supervised representations can be learned using only video-generated point clouds reconstructed from unlabeled videos, without relying on real 3D scans.

We introduce:

  • RoomTours — a scalable pipeline that converts unlabeled indoor videos into video-generated point clouds.
  • LAM3C — a 3D self-supervised learning framework designed to learn robust representations from noisy video-generated point clouds.

LAM3C transfers well to indoor semantic and instance segmentation tasks.

News

  • Mar 2026: Released RoomTours generation code and inference demo visualization.
  • Feb 2026: LAM3C was accepted to CVPR 2026 (main track).

Overview

Requirements

  • Conda
  • Python 3.9
  • PyTorch 2.5.0
  • CUDA 12.4
  • NVIDIA GPU for CUDA execution

Installation

This repo provides two installation modes: standalone mode and package mode.

  • The standalone mode is recommended for users who want quick inference and visualization. The easiest way to set up the environment is via the provided conda environment file, which installs everything including CUDA and PyTorch:

    # Create and activate conda environment named as 'lam3c'
    
    # run `unset CUDA_PATH` if you have installed cuda in your local environment
    conda env create -f environment.yml --verbose
    conda activate lam3c
    
    # if torch-scatter installation fails, install explicitly from the PyG wheel index
    pip install --no-build-isolation torch-scatter -f https://data.pyg.org/whl/torch-2.5.0+cu124.html
    
    # optional: install FlashAttention after torch is available in this env
    # (required on some systems to avoid pip build-isolation issues)
    pip install --no-build-isolation git+https://github.com/Dao-AILab/flash-attention.git

    FlashAttention is optional. If installation fails in your environment, you can skip it and use the fallback path (see Model section in Quick Start).

  • The package mode is recommended for users who want to integrate our model into their own codebase. We provide a setup.py file; install the package by running the following commands:

    # Ensure CUDA and PyTorch are already installed in your local environment
    
    # CUDA_VERSION: CUDA version of the local environment (e.g., 124), check with 'nvcc --version'
    # TORCH_VERSION: torch version of the local environment (e.g., 2.5.0), check with 'python -c "import torch; print(torch.__version__)"'
    pip install spconv-cu${CUDA_VERSION}
    pip install --no-build-isolation torch-scatter -f https://data.pyg.org/whl/torch-${TORCH_VERSION}+cu${CUDA_VERSION}.html
    # optional:
    pip install --no-build-isolation git+https://github.com/Dao-AILab/flash-attention.git
    pip install huggingface_hub timm
    
    # (optional, or directly copy the lam3c folder to your project)
    python setup.py install

    Additionally, running our demo code requires the following packages:

    pip install open3d fast_pytorch_kmeans psutil numpy==1.26.4  # currently, open3d does not support numpy 2.x
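The torch-scatter wheel index URL used above follows a fixed pattern. As a convenience, here is a small helper (our own sketch, not part of this repo) that builds it from your local versions:

```python
def pyg_wheel_index(torch_version: str, cuda_version: str) -> str:
    """Build the PyG wheel index URL used for torch-scatter installs.

    torch_version: e.g. "2.5.0" (from `python -c "import torch; print(torch.__version__)"`)
    cuda_version:  e.g. "12.4" or "124" (from `nvcc --version`)
    """
    cu = cuda_version.replace(".", "")  # "12.4" -> "124"
    return f"https://data.pyg.org/whl/torch-{torch_version}+cu{cu}.html"

print(pyg_wheel_index("2.5.0", "12.4"))
# -> https://data.pyg.org/whl/torch-2.5.0+cu124.html
```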

Quick Start

Let's begin with simple visualization demos using LAM3C, our pre-trained PointTransformerV3 (PTv3) model.

Visualization

LAM3C demo

We provide demos for PCA feature visualization, similarity heatmaps, semantic segmentation, and batched forward inference in the demo folder.

# 1) activate environment
conda activate lam3c
unset PYTHONPATH
export PYTHONNOUSERSITE=1
export PYTHONPATH=./

# 2) (optional) headless output on servers (writes .ply files to outputs/)
# export LAM3C_HEADLESS=1

# 3-A) recommended: run with HuggingFace checkpoints
# default model size is large
python demo/0_pca.py --hf-repo-id aist-cvrt/lam3c-roomtours
python demo/1_similarity.py --hf-repo-id aist-cvrt/lam3c-roomtours
python demo/2_sem_seg.py --hf-repo-id aist-cvrt/lam3c-roomtours
python demo/3_batch_forward.py --hf-repo-id aist-cvrt/lam3c-roomtours

# switch to base model size on HuggingFace
python demo/0_pca.py --hf-repo-id aist-cvrt/lam3c-roomtours --model-size base
python demo/1_similarity.py --hf-repo-id aist-cvrt/lam3c-roomtours --model-size base
python demo/2_sem_seg.py --hf-repo-id aist-cvrt/lam3c-roomtours --model-size base
python demo/3_batch_forward.py --hf-repo-id aist-cvrt/lam3c-roomtours --model-size base

# 3-B) fallback option: run with local preset checkpoints
# Use this when HF access is unavailable or blocked.
# Place the 4 checkpoint files under weights/ with names listed in "Model Zoo".
# Note: legacy checkpoint names ending with "_scennet.pth" are also supported.
python demo/0_pca.py --model-size base --ckpt weights/lam3c_ptv3-base_roomtours49k.pth
python demo/0_pca.py --model-size large --ckpt weights/lam3c_ptv3-large_roomtours49k.pth
python demo/1_similarity.py --model-size base --ckpt weights/lam3c_ptv3-base_roomtours49k.pth
python demo/1_similarity.py --model-size large --ckpt weights/lam3c_ptv3-large_roomtours49k.pth
python demo/2_sem_seg.py --model-size base --ckpt weights/lam3c_ptv3-base_roomtours49k.pth --head-ckpt weights/lam3c_ptv3-base_roomtours49k_probe-head_scannet.pth
python demo/2_sem_seg.py --model-size large --ckpt weights/lam3c_ptv3-large_roomtours49k.pth --head-ckpt weights/lam3c_ptv3-large_roomtours49k_probe-head_scannet.pth
python demo/3_batch_forward.py --model-size base --ckpt weights/lam3c_ptv3-base_roomtours49k.pth
python demo/3_batch_forward.py --model-size large --ckpt weights/lam3c_ptv3-large_roomtours49k.pth

# 3-C) run with custom checkpoints
python demo/0_pca.py --ckpt /path/to/custom_backbone.infer.pth
python demo/2_sem_seg.py --ckpt /path/to/custom_backbone.infer.pth --head-ckpt /path/to/custom_linear_head.pth
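The demo commands above share a small set of flags (`--hf-repo-id`, `--model-size`, `--ckpt`, `--head-ckpt`). A minimal argparse sketch of that interface, with defaults assumed from the comments above (this is an illustration, not the demos' actual parser):

```python
import argparse

def build_demo_parser() -> argparse.ArgumentParser:
    # Flags mirror the demo commands above; help strings and defaults are assumptions.
    p = argparse.ArgumentParser(description="LAM3C demo loader (sketch)")
    p.add_argument("--hf-repo-id", default=None,
                   help="HuggingFace repo id, e.g. aist-cvrt/lam3c-roomtours")
    p.add_argument("--model-size", choices=["base", "large"], default="large",
                   help="default model size is large")
    p.add_argument("--ckpt", default=None,
                   help="local backbone checkpoint path (fallback when HF is unavailable)")
    p.add_argument("--head-ckpt", default=None,
                   help="linear probe head checkpoint (used by demo/2_sem_seg.py)")
    return p

args = build_demo_parser().parse_args(
    ["--hf-repo-id", "aist-cvrt/lam3c-roomtours", "--model-size", "base"]
)
print(args.model_size)  # -> base
```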

Inference on custom data

Instructions for inference on custom data have moved to demo/README.md, for easier access while working in the demo folder.

Model Zoo

Download checkpoints from Google Drive and place them under weights/ for local execution.
For HuggingFace-based loading in the examples below, use repo id aist-cvrt/lam3c-roomtours.

LP = linear probing. FT = full fine-tuning.

Pretrained Backbones & Benchmark Results

| Model | Backbone | Checkpoint | Pretraining Data | ScanNet (LP / FT) | ScanNet200 (LP / FT) | ScanNet++ Val (LP / FT) | S3DIS Area 5 (LP / FT) | Weights | Logs |
|---|---|---|---|---|---|---|---|---|---|
| LAM3C | PTv3-Base | lam3c_ptv3-base_roomtours49k.pth | RoomTours49k (VGPC) | 66.0 / 75.1 | 25.3 / 35.1 | 34.2 / 43.1 | 65.7 / 72.9 | Google Drive, HuggingFace | log |
| LAM3C | PTv3-Large | lam3c_ptv3-large_roomtours49k.pth | RoomTours49k (VGPC) | 69.5 / 79.5 | 28.1 / 35.5 | 35.9 / 43.1 | 69.5 / 75.5 | Google Drive, HuggingFace | log |

Linear Probe Heads

| Head | Backbone | Checkpoint | Target Dataset | Weights |
|---|---|---|---|---|
| ScanNet linear probe head | PTv3-Base | lam3c_ptv3-base_roomtours49k_probe-head_scannet.pth | ScanNet | Google Drive, HuggingFace |
| ScanNet linear probe head | PTv3-Large | lam3c_ptv3-large_roomtours49k_probe-head_scannet.pth | ScanNet | Google Drive, HuggingFace |

A single pretrained backbone is evaluated across multiple downstream datasets.
See detailed experiment logs for dataset-wise training and evaluation logs.
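The checkpoint filenames in the tables above follow a single naming scheme. A small helper (our own sketch, assuming that scheme) to build the expected local paths under weights/:

```python
def lam3c_checkpoint_names(model_size: str) -> dict:
    """Expected local filenames under weights/, per the Model Zoo tables.

    model_size: "base" or "large".
    """
    if model_size not in ("base", "large"):
        raise ValueError(f"unknown model size: {model_size}")
    stem = f"lam3c_ptv3-{model_size}_roomtours49k"
    return {
        "backbone": f"weights/{stem}.pth",
        "scannet_probe_head": f"weights/{stem}_probe-head_scannet.pth",
    }

print(lam3c_checkpoint_names("large")["backbone"])
# -> weights/lam3c_ptv3-large_roomtours49k.pth
```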

RoomTours Pipeline

RoomTours converts unlabeled indoor videos into training-ready point clouds for LAM3C pre-training. The pipeline includes video download, scene segmentation, Pi3 reconstruction, and point-cloud preprocessing. For setup and commands, see roomtours_gen/README.md.
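The stage ordering above (download, scene segmentation, Pi3 reconstruction, preprocessing) can be sketched as a simple composed pipeline. The stage bodies here are placeholders and the function names are assumptions; the real implementation lives in roomtours_gen/:

```python
from typing import Callable, List

# Placeholder stages mirroring the pipeline order described above.
def download_videos(urls: List[str]) -> List[str]:
    return list(urls)  # placeholder: would fetch the raw video files

def segment_scenes(videos: List[str]) -> List[str]:
    return [f"{v}:scene0" for v in videos]  # placeholder: split videos into scene clips

def reconstruct_pi3(scenes: List[str]) -> List[str]:
    return [f"{s}:points" for s in scenes]  # placeholder: Pi3 point-cloud reconstruction

def preprocess(clouds: List[str]) -> List[str]:
    return [f"{c}:clean" for c in clouds]  # placeholder: point-cloud preprocessing

def roomtours(urls: List[str]) -> List[str]:
    stages: List[Callable[[List[str]], List[str]]] = [
        download_videos, segment_scenes, reconstruct_pi3, preprocess,
    ]
    data = urls
    for stage in stages:
        data = stage(data)
    return data

print(roomtours(["video_a"]))
# -> ['video_a:scene0:points:clean']
```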

RoomTours

LAM3C Pre-train

LAM3C pretraining is integrated on top of the vendored Pointcept training pipeline. Use tools/train_lam3c.sh as the repo-root entrypoint.

# PTv3-Base (recommended)
bash tools/train_lam3c.sh \
  configs/lam3c/pretrain/lam3c_v1m1_ptv3_base.py \
  logs/lam3c_pretrain_base

# PTv3-Large (optional)
# bash tools/train_lam3c.sh \
#   configs/lam3c/pretrain/lam3c_v1m1_ptv3_large.py \
#   logs/lam3c_pretrain_large

For setup notes, dry-run checks, and more options, see docs/pretrain.md.

Experiment Logs

Detailed dataset-wise training and evaluation logs are linked from the Logs column of the Model Zoo tables above.

Citation

If you find our LAM3C work useful, please cite:

@inproceedings{yamada2026lam3c,
  title={3D sans 3D Scans: Scalable Pre-training from Video-Generated Point Clouds},
  author={Ryousuke Yamada and Kohsuke Ide and Yoshihiro Fukuhara and Hirokatsu Kataoka and Gilles Puy and Andrei Bursuc and Yuki M. Asano},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2026}
}

License

  • Code: MIT License.
  • LAM3C weights: Creative Commons BY-NC 4.0 (free for research/education, no commercial use).
  • RoomTours / generated data: Creative Commons BY-NC 4.0 (free for research/education, no commercial use).
  • Demo sample data: loaded from Pointcept's pointcept/demo dataset repository on Hugging Face.
  • Upstream attribution: This repository includes/adapts code from Sonata.

See LICENSE, NOTICE, and THIRD_PARTY_NOTICES.md for details.
