This repository contains the official implementation of LAM3C (pronounced “lah-MEK” /lɑːˈmɛk/) and the RoomTours pipeline.
The bottleneck of 3D self-supervised learning is not only algorithms, but also the scarcity of large-scale 3D data.
By turning the vast sea of unlabeled internet videos into point clouds, we unlock a scalable source of 3D supervision.
This work shows that 3D self-supervised models can be pretrained using only video-generated point clouds reconstructed from unlabeled videos, without relying on any real 3D scans.
We introduce:
- RoomTours — a scalable pipeline that converts unlabeled indoor videos into video-generated point clouds.
- LAM3C — a 3D self-supervised learning framework designed to learn robust representations from noisy video-generated point clouds.
LAM3C transfers well to indoor semantic and instance segmentation tasks.
## News

- Mar 2026: Released RoomTours generation code and inference demo visualization.
- Feb 2026: LAM3C was accepted to CVPR 2026 (main track).
## Requirements

- Conda
- Python 3.9
- PyTorch 2.5.0
- CUDA 12.4
- NVIDIA GPU for CUDA execution
## Installation

This repo provides two installation modes: standalone mode and package mode.

### Standalone Mode

The standalone mode is recommended for users who want to use the code for quick inference and visualization. The easiest way to set up the environment is the provided conda environment file, which installs everything including CUDA and PyTorch:

```bash
# Create and activate a conda environment named 'lam3c'
# run `unset CUDA_PATH` if you have installed CUDA in your local environment
conda env create -f environment.yml --verbose
conda activate lam3c
# if torch-scatter installation fails, install explicitly from the PyG wheel index
pip install --no-build-isolation torch-scatter -f https://data.pyg.org/whl/torch-2.5.0+cu124.html
# optional: install FlashAttention after torch is available in this env
# (required on some systems to avoid pip build-isolation issues)
pip install --no-build-isolation git+https://github.com/Dao-AILab/flash-attention.git
```
FlashAttention is optional. If installation fails in your environment, you can skip it and use the fallback path (see Model section in Quick Start).
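If you are unsure whether FlashAttention ended up installed, a minimal sketch for detecting it at runtime is shown below; the `enable_flash` flag name is illustrative only (check the demo scripts for the actual option):

```python
# Sketch: detect whether the flash_attn package is importable, so a
# fallback attention path can be selected. Not part of the repo's code.
import importlib.util


def flash_attention_available() -> bool:
    """Return True if the flash_attn package can be imported."""
    return importlib.util.find_spec("flash_attn") is not None


# Hypothetical flag; the real configuration option may differ.
enable_flash = flash_attention_available()
print(f"FlashAttention available: {enable_flash}")
```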
### Package Mode

The package mode is recommended for users who want to integrate our model into their own codebase. We provide a `setup.py` file for installation:

```bash
# Ensure CUDA and PyTorch are already installed in your local environment
# CUDA_VERSION: CUDA version of the local environment (e.g., 124); check with `nvcc --version`
# TORCH_VERSION: torch version of the local environment (e.g., 2.5.0); check with `python -c "import torch; print(torch.__version__)"`
pip install spconv-cu${CUDA_VERSION}
pip install --no-build-isolation torch-scatter -f https://data.pyg.org/whl/torch-${TORCH_VERSION}+cu${CUDA_VERSION}.html
# optional:
pip install --no-build-isolation git+https://github.com/Dao-AILab/flash-attention.git
pip install huggingface_hub timm
# (optional, or directly copy the lam3c folder to your project)
python setup.py install
```
Additionally, for running our demo code, the following packages are also required:
```bash
pip install open3d fast_pytorch_kmeans psutil numpy==1.26.4  # currently, open3d does not support numpy 2.x
```
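If open3d imports fail later, a quick sanity check of the numpy pin can help; this small snippet is not part of the repo:

```python
# Sanity check: open3d currently requires numpy 1.x, so warn on numpy 2.x.
import numpy as np

major = int(np.__version__.split(".")[0])
if major >= 2:
    print(f"warning: numpy {np.__version__} installed; open3d needs numpy 1.x")
else:
    print(f"numpy {np.__version__} OK for open3d")
```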
## Quick Start

Let's begin with simple visualization demos using LAM3C, our pretrained PointTransformerV3 (PTv3) model.
We provide demos for PCA feature visualization, similarity heatmaps, semantic segmentation, and batched forward inference in the demo folder.
```bash
# 1) activate environment
conda activate lam3c
unset PYTHONPATH
export PYTHONNOUSERSITE=1
export PYTHONPATH=./

# 2) (optional) headless output on servers (writes .ply files to outputs/)
# export LAM3C_HEADLESS=1

# 3-A) recommended: run with HuggingFace checkpoints
# default model size is large
python demo/0_pca.py --hf-repo-id aist-cvrt/lam3c-roomtours
python demo/1_similarity.py --hf-repo-id aist-cvrt/lam3c-roomtours
python demo/2_sem_seg.py --hf-repo-id aist-cvrt/lam3c-roomtours
python demo/3_batch_forward.py --hf-repo-id aist-cvrt/lam3c-roomtours

# switch to the base model size on HuggingFace
python demo/0_pca.py --hf-repo-id aist-cvrt/lam3c-roomtours --model-size base
python demo/1_similarity.py --hf-repo-id aist-cvrt/lam3c-roomtours --model-size base
python demo/2_sem_seg.py --hf-repo-id aist-cvrt/lam3c-roomtours --model-size base
python demo/3_batch_forward.py --hf-repo-id aist-cvrt/lam3c-roomtours --model-size base

# 3-B) fallback option: run with local preset checkpoints
# Use this when HF access is unavailable or blocked.
# Place the 4 checkpoint files under weights/ with the names listed in "Model Zoo".
# Note: legacy checkpoint names ending with "_scennet.pth" are also supported.
python demo/0_pca.py --model-size base --ckpt weights/lam3c_ptv3-base_roomtours49k.pth
python demo/0_pca.py --model-size large --ckpt weights/lam3c_ptv3-large_roomtours49k.pth
python demo/1_similarity.py --model-size base --ckpt weights/lam3c_ptv3-base_roomtours49k.pth
python demo/1_similarity.py --model-size large --ckpt weights/lam3c_ptv3-large_roomtours49k.pth
python demo/2_sem_seg.py --model-size base --ckpt weights/lam3c_ptv3-base_roomtours49k.pth --head-ckpt weights/lam3c_ptv3-base_roomtours49k_probe-head_scannet.pth
python demo/2_sem_seg.py --model-size large --ckpt weights/lam3c_ptv3-large_roomtours49k.pth --head-ckpt weights/lam3c_ptv3-large_roomtours49k_probe-head_scannet.pth
python demo/3_batch_forward.py --model-size base --ckpt weights/lam3c_ptv3-base_roomtours49k.pth
python demo/3_batch_forward.py --model-size large --ckpt weights/lam3c_ptv3-large_roomtours49k.pth

# 3-C) run with custom checkpoints
python demo/0_pca.py --ckpt /path/to/custom_backbone.infer.pth
python demo/2_sem_seg.py --ckpt /path/to/custom_backbone.infer.pth --head-ckpt /path/to/custom_linear_head.pth
```

These commands are also available in demo/README.md for easier access while working in the demo folder.
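The checkpoint-name conventions above, including the legacy `_scennet.pth` suffix, can be resolved with a small helper. The `resolve_ckpt` function below is a hypothetical sketch, not part of the repo:

```python
# Sketch: resolve a local backbone checkpoint path by model size,
# accepting the legacy "_scennet.pth" suffix alongside the current naming.
from pathlib import Path


def resolve_ckpt(weights_dir: str, model_size: str) -> Path:
    """Find the backbone checkpoint for `model_size` under `weights_dir`."""
    candidates = [
        f"lam3c_ptv3-{model_size}_roomtours49k.pth",
        f"lam3c_ptv3-{model_size}_roomtours49k_scennet.pth",  # assumed legacy form
    ]
    for name in candidates:
        path = Path(weights_dir) / name
        if path.is_file():
            return path
    raise FileNotFoundError(f"no checkpoint for '{model_size}' in {weights_dir}")
```

Usage: `resolve_ckpt("weights", "base")` returns the first matching file or raises if neither naming variant is present.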
## Model Zoo

Download checkpoints from Google Drive and place them under `weights/` for local execution.
For HuggingFace-based loading in the examples below, use repo id `aist-cvrt/lam3c-roomtours`.

LP = linear probing. FT = full fine-tuning.
| Model | Backbone | Checkpoint | Pretraining Data | ScanNet LP | ScanNet FT | ScanNet200 LP | ScanNet200 FT | ScanNet++ Val LP | ScanNet++ Val FT | S3DIS Area 5 LP | S3DIS Area 5 FT | Weights | Logs |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LAM3C | PTv3-Base | lam3c_ptv3-base_roomtours49k.pth | RoomTours49k (VGPC) | 66.0 | 75.1 | 25.3 | 35.1 | 34.2 | 43.1 | 65.7 | 72.9 | Google Drive / HuggingFace | log |
| LAM3C | PTv3-Large | lam3c_ptv3-large_roomtours49k.pth | RoomTours49k (VGPC) | 69.5 | 79.5 | 28.1 | 35.5 | 35.9 | 43.1 | 69.5 | 75.5 | Google Drive / HuggingFace | log |
| Head | Backbone | Checkpoint | Target Dataset | Weights |
|---|---|---|---|---|
| ScanNet linear probe head | PTv3-Base | lam3c_ptv3-base_roomtours49k_probe-head_scannet.pth | ScanNet | Google Drive / HuggingFace |
| ScanNet linear probe head | PTv3-Large | lam3c_ptv3-large_roomtours49k_probe-head_scannet.pth | ScanNet | Google Drive / HuggingFace |
A single pretrained backbone is evaluated across multiple downstream datasets.
See detailed experiment logs for dataset-wise training and evaluation logs.
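For readers unfamiliar with the LP protocol above: linear probing freezes the pretrained backbone and trains only a linear classifier on its features. A minimal numpy sketch, with random vectors standing in for frozen PTv3 features:

```python
# Sketch of linear probing (LP): fit a linear classifier on frozen features.
# The features and labels here are synthetic placeholders, not real outputs.
import numpy as np

rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 64))     # stand-in for frozen per-point features
labels = rng.integers(0, 5, size=200)  # stand-in for semantic labels

# one-vs-all least-squares linear probe
targets = np.eye(5)[labels]
W, *_ = np.linalg.lstsq(feats, targets, rcond=None)
pred = (feats @ W).argmax(axis=1)
acc = (pred == labels).mean()
print(f"probe train accuracy: {acc:.2f}")
```

In the actual evaluation, the classifier is trained on each downstream dataset's features with gradient descent; the point is only that the backbone's weights never change under LP.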
## RoomTours

RoomTours converts unlabeled indoor videos into training-ready point clouds for LAM3C pre-training.
The pipeline includes video download, scene segmentation, Pi3 reconstruction, and point-cloud preprocessing.
For setup and commands, see `roomtours_gen/README.md`.
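To give a flavor of the point-cloud preprocessing stage, one common operation is voxel downsampling, which keeps one point per occupied voxel. A minimal numpy sketch, not the pipeline's actual code (that lives under `roomtours_gen/`):

```python
# Sketch: voxel downsampling of a point cloud. Each point is assigned to a
# voxel by integer division of its coordinates; one point per voxel is kept.
import numpy as np


def voxel_downsample(points: np.ndarray, voxel_size: float) -> np.ndarray:
    """Keep the first point encountered in each occupied voxel."""
    coords = np.floor(points / voxel_size).astype(np.int64)
    _, idx = np.unique(coords, axis=0, return_index=True)
    return points[np.sort(idx)]


pts = np.random.default_rng(0).random((10000, 3))  # synthetic cloud in a unit cube
down = voxel_downsample(pts, voxel_size=0.05)
print(f"{pts.shape[0]} -> {down.shape[0]} points")
```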
## Pretraining

LAM3C pretraining is built on top of the vendored Pointcept training pipeline.
Use `tools/train_lam3c.sh` as the repo-root entrypoint.

```bash
# PTv3-Base (recommended)
bash tools/train_lam3c.sh \
    configs/lam3c/pretrain/lam3c_v1m1_ptv3_base.py \
    logs/lam3c_pretrain_base

# PTv3-Large (optional)
# bash tools/train_lam3c.sh \
#     configs/lam3c/pretrain/lam3c_v1m1_ptv3_large.py \
#     logs/lam3c_pretrain_large
```

For setup notes, dry-run checks, and more options, see `docs/pretrain.md`.
Detailed dataset-wise training and evaluation logs are linked in the Model Zoo table above.
## Citation

If you find our LAM3C work useful, please cite:

```bibtex
@inproceedings{yamada2026lam3c,
  title={3D sans 3D Scans: Scalable Pre-training from Video-Generated Point Clouds},
  author={Ryousuke Yamada and Kohsuke Ide and Yoshihiro Fukuhara and Hirokatsu Kataoka and Gilles Puy and Andrei Bursuc and Yuki M. Asano},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2026}
}
```

## License

- Code: MIT License.
- LAM3C weights: Creative Commons BY-NC 4.0 (free for research/education, no commercial use).
- RoomTours / generated data: Creative Commons BY-NC 4.0 (free for research/education, no commercial use).
- Demo sample data: loaded from Pointcept's `pointcept/demo` dataset repository on Hugging Face.
- Upstream attribution: This repository includes/adapts code from Sonata.
See LICENSE, NOTICE, and THIRD_PARTY_NOTICES.md for details.



