OnlineSI: Taming Large Language Model for Online 3D Understanding and Grounding

Installation

Create the conda environment:

conda create -n online python=3.11
conda activate online
conda install -y -c nvidia/label/cuda-12.4.0 cuda-toolkit conda-forge::sparsehash
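Before moving on, it can help to confirm that the tools the later steps rely on are visible inside the activated environment. The sketch below is not part of the repository; `check_tools` is a hypothetical helper name, and the assumption that `nvcc` is provided by the `cuda-toolkit` package should be verified on your machine.

```shell
# Hypothetical helper (not from the OnlineSI repo): report whether each
# named command is on PATH, and return non-zero if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "ok: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Usage, inside the activated "online" environment:
#   check_tools python nvcc
#   python --version   # should report a 3.11.x release
#   nvcc --version     # assumed to report the CUDA 12.4 toolkit
```

If `nvcc` is reported as missing, the CUDA extension build in the CUT3R step is likely to fail, so it is worth resolving this first.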

Install SpatialLM

cd external/SpatialLM
pip install poetry && poetry config virtualenvs.create false --local
poetry install
poe install-sonata

Install CUT3R and download weights

cd ../CUT3R
pip install -r requirements.txt
cd src/croco/models/curope/
python setup.py build_ext --inplace
cd ../../../../
pip install gdown
cd src
gdown --fuzzy "https://drive.google.com/file/d/1Asz-ZB3FfpzZYwunhQvNPZEUA8XUNAYD/view?usp=drive_link"
cd ..
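Interrupted downloads can leave an empty or missing weights file that only surfaces as an error much later. A minimal sketch of a check follows; `check_ckpt` is a hypothetical helper, and the file name in the usage comment is a placeholder because the name of the downloaded file is not stated here.

```shell
# Hypothetical helper (not from the OnlineSI repo): succeed only if the
# given file exists and is non-empty.
check_ckpt() {
    if [ -s "$1" ]; then
        echo "found: $1"
    else
        echo "missing or empty: $1" >&2
        return 1
    fi
}

# Usage from external/CUT3R; replace the placeholder with the name of the
# file that gdown actually saved:
#   check_ckpt src/<downloaded_checkpoint_file>
```

The same helper can be pointed at any other weights file you download during installation.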

Install Grounded SAM 2 and download weights

cd ..
git clone git@github.com:IDEA-Research/Grounded-SAM-2.git
cd Grounded-SAM-2/checkpoints
bash download_ckpts.sh
cd ../gdino_checkpoints
bash download_ckpts.sh
cd ..
pip install -e .
pip install --no-build-isolation -e grounding_dino

Install dependencies

cd ../..
pip install -r requirements.txt

Data Preprocessing

Please follow the instructions in src/data_preprocess/DATA_README.md.

Training & Evaluation

Both training and evaluation are implemented in src/train.py. We use Slurm to manage our computing resources.

Training:

cd src
bash scripts/train_batch.sh

Evaluation:

cd src
bash scripts/eval.sh

Some recipes to reproduce the results in our paper:

SpatialLM-Merge: train with src/configs/spatiallm_partial_0812/baseline/spatiallm_merge_baseline_train.yaml, evaluate with src/configs/spatiallm_merge_0924/baseline/spatiallm_merge_eval_baseline.yaml
SpatialLM-Finetune: train with src/configs/spatiallm_partial_0812/baseline/spatiallm_partial_gt_baseline.yaml, evaluate with src/configs/spatiallm_partial_0812/baseline/spatiallm_partial_eval_baseline.yaml
OnlineSI (Ours): train and evaluate with src/configs/spatiallm_partial_semantic_0916/baseline/spatiallm_partial_semantic_method.yaml
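Since the recipes above name several config files in different directories, a quick check that each one exists in your checkout can save a failed job submission. The following is a sketch only; `report_missing` is a hypothetical helper that is not part of the repository, and the paths in the usage comment are copied from the recipes with forward slashes.

```shell
# Hypothetical helper (not from the OnlineSI repo): print every listed path
# that is not a regular file under <root>; fail if any is missing.
# Usage: report_missing <root> <relative_path>...
report_missing() {
    root=$1
    shift
    rc=0
    for rel in "$@"; do
        if [ ! -f "$root/$rel" ]; then
            echo "missing: $rel"
            rc=1
        fi
    done
    return "$rc"
}

# Usage from the repository root (OnlineSI recipe shown; add the other
# recipe paths as further arguments):
#   report_missing . \
#       src/configs/spatiallm_partial_semantic_0916/baseline/spatiallm_partial_semantic_method.yaml
```

The function prints nothing and returns zero when every listed file is present.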

Training can be time-consuming, so we provide a checkpoint trained with configs/spatiallm_partial_semantic_0916/baseline/spatiallm_partial_semantic_method.yaml. You can download it here.

The codebase has been reorganized and cleaned up, and it has not been fully tested since. If you encounter any problems, feel free to open an issue.
