This is the code repository for the paper:
DC-Scene: Data-Centric Learning for 3D Scene Understanding
Ting Huang*, Zeyu Zhang*†, Ruicheng Zhang and Yang Zhao**
*Equal contribution. †Project lead. **Corresponding author
3DV 2026 Exploration Edge Track
Demo video: dc_0.mp4
If you use any content of this repo for your work, please cite our paper:
@article{huang2025dc,
title={DC-Scene: Data-Centric Learning for 3D Scene Understanding},
author={Huang, Ting and Zhang, Zeyu and Zhang, Ruicheng and Zhao, Yang},
journal={arXiv preprint arXiv:2505.15232},
year={2025}
}
3D scene understanding plays a fundamental role in vision applications such as robotics, autonomous driving, and augmented reality. However, advancing learning-based 3D scene understanding remains challenging due to two key limitations: (1) the large scale and complexity of 3D scenes lead to higher computational costs and slower training compared to 2D counterparts; and (2) high-quality annotated 3D datasets are significantly scarcer than those available for 2D vision. These challenges underscore the need for more efficient learning paradigms. In this work, we propose DC-Scene, a data-centric framework tailored for 3D scene understanding, which emphasizes enhancing data quality and training efficiency. Specifically, we introduce a CLIP-driven dual-indicator quality (DIQ) filter, combining vision-language alignment scores with caption-loss perplexity, along with a curriculum scheduler that progressively expands the training pool from the top 25% to 75% of scene–caption pairs. This strategy filters out noisy samples and significantly reduces dependence on large-scale labeled 3D data. Extensive experiments on ScanRefer and Nr3D demonstrate that DC-Scene achieves state-of-the-art performance (86.1 CIDEr with the top-75% subset vs. 85.4 with the full dataset) while reducing training cost by approximately two-thirds, confirming that a compact set of high-quality samples can outperform exhaustive training.
- CLIP-driven dual-indicator quality (DIQ) filter: combines vision-language alignment scores with caption-loss perplexity to identify and filter out low-quality scene–caption pairs (see the sketch below).
- Curriculum learning scheduler: progressively expands the training pool so that training proceeds from easy to hard samples.
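As a rough illustration of how the two components fit together, here is a minimal PyTorch sketch. All names are hypothetical; it assumes pre-computed CLIP scene/caption embeddings and per-sample caption losses, and it is not the exact implementation in this repo.

import torch

def diq_scores(scene_emb, caption_emb, caption_loss):
    # Indicator 1: vision-language alignment (cosine similarity of CLIP embeddings).
    align = torch.nn.functional.cosine_similarity(scene_emb, caption_emb, dim=-1)
    # Indicator 2: caption-loss perplexity; lower means a cleaner, easier sample.
    ppl = torch.exp(caption_loss)
    # Min-max normalize each indicator before combining.
    a = (align - align.min()) / (align.max() - align.min() + 1e-8)
    p = (ppl - ppl.min()) / (ppl.max() - ppl.min() + 1e-8)
    # High alignment and low perplexity yield a high quality score.
    return a - p

def curriculum_pool(scores, epoch, total_epochs, start=0.25, end=0.75):
    # Linearly grow the training pool from the top 25% to the top 75% of samples.
    frac = start + (end - start) * min(epoch / max(total_epochs - 1, 1), 1.0)
    k = max(1, int(frac * len(scores)))
    return torch.topk(scores, k).indices

At each epoch, the indices returned by curriculum_pool would select which scene–caption pairs are fed to the model, so early epochs see only the highest-quality samples.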
- Python >= 3.7
- PyTorch >= 1.8
- CUDA-compatible GPU
You can set up your own conda virtual environment by running the commands below.
# create a clean conda environment from scratch
conda create --name dcscene python=3.8
conda activate dcscene
# install required packages
pip install -r requirements.txt
Two datasets require path configuration. For the ScanRefer dataset, set the DATASET_ROOT_DIR and DATASET_METADATA_DIR global variables in datasets/scene_scanrefer.py. For the Nr3D dataset, set the corresponding two global variables in datasets/scene_nr3d.py.
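For example, the ScanRefer globals might look like this (the paths below are placeholders; point them at your local copies):

# datasets/scene_scanrefer.py
DATASET_ROOT_DIR = "/path/to/scannet_data"          # placeholder path
DATASET_METADATA_DIR = "/path/to/scannet_metadata"  # placeholder path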
After setting the dataset paths and any training parameters, start training:
# w/o 2D input
python main.py --use_color --use_normal --checkpoint_dir ckpt/DC_Scene
# w/ 2D input
python main.py --use_color --use_multiview --checkpoint_dir ckpt_2D/DC_Scene
To evaluate a trained model, pass the checkpoint path via --test_ckpt:
# w/o 2D input
python main.py --use_color --use_normal --test_ckpt ckpt/DC_Scene/checkpoint_best.pth --test_caption
# w/ 2D input
python main.py --use_color --use_multiview --test_ckpt ckpt_2D/DC_Scene/checkpoint_best.pth --test_caption
