Conversation
Pull request overview
This PR extends the video segmentation pipeline by adding a new Meshroom node (VideoSegmentationSam3Boxes) that generates masks from tracked bounding boxes stored in a JSON file, and aligns parts of the existing SAM3 text-based video segmentation to use absolute frame IDs and updated ID mapping.
Changes:
- Added `VideoSegmentationSam3Boxes` node to segment video frames using per-frame bounding boxes (with multi-resolution inputs and mask inversion support).
- Added `segmentationRDS/bboxUtils.py` to parse/merge/expand boxes and split them into consecutive-frame chunks.
- Updated SAM3 utilities and the text node to use the new `mapIds` signature and to key box dictionaries by absolute frame IDs.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| `segmentationRDS/sam3Utils.py` | Changed `mapIds` signature and scaled ROI using mask dimensions instead of passed-in w/h. |
| `segmentationRDS/bboxUtils.py` | New helper module for reading/merging/expanding boxes and creating tracking chunks. |
| `meshroom/imageSegmentation/VideoSegmentationSam3Text.py` | Updated `mapIds` calls and changed box dictionary indexing to absolute frame IDs; adjusted mask filling values. |
| `meshroom/imageSegmentation/VideoSegmentationSam3Boxes.py` | New node implementation for box-driven video segmentation with the SAM3 video predictor and multi-resolution crop handling. |
Comment on `segmentationRDS/bboxUtils.py`, `merge_boxes` docstring:

```python
def merge_boxes(box1: list, box2: list, iou_threshold: float = 0.5) -> tuple[list, str]:
    """
    Merge 2 boxes xyxy by taking the bounding boxe, if their IoU is higher than the threshold.
```

Suggested change (typo "boxe"):

```diff
-    Merge 2 boxes xyxy by taking the bounding boxe, if their IoU is higher than the threshold.
+    Merge 2 boxes xyxy by taking the bounding box, if their IoU is higher than the threshold.
```
Comment on the `merge_boxes` return path:

```python
        ]
        return merged, f"bounding (IoU={iou:.2f})"
    else:
        return box1, f"forward (IoU={iou:.2f} < seuil={iou_threshold})"
```

Suggested change (translate the French "seuil" to English):

```diff
-        return box1, f"forward (IoU={iou:.2f} < seuil={iou_threshold})"
+        return box1, f"forward (IoU={iou:.2f} < threshold={iou_threshold})"
```
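For context, the two return paths above suggest the following overall shape for `merge_boxes`. This is a minimal sketch, not the PR's actual implementation; the `iou` helper is an assumption inferred from the f-strings:

```python
def iou(box1: list, box2: list) -> float:
    # Intersection-over-union of two xyxy boxes.
    ix1, iy1 = max(box1[0], box2[0]), max(box1[1], box2[1])
    ix2, iy2 = min(box1[2], box2[2]), min(box1[3], box2[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    area2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union = area1 + area2 - inter
    return inter / union if union > 0 else 0.0

def merge_boxes(box1: list, box2: list, iou_threshold: float = 0.5) -> tuple[list, str]:
    """Merge 2 boxes xyxy by taking the bounding box, if their IoU is higher than the threshold."""
    iou_val = iou(box1, box2)
    if iou_val >= iou_threshold:
        # Union box: smallest xyxy box containing both inputs.
        merged = [min(box1[0], box2[0]), min(box1[1], box2[1]),
                  max(box1[2], box2[2]), max(box1[3], box2[3])]
        return merged, f"bounding (IoU={iou_val:.2f})"
    # Below the threshold: carry the previous box forward unchanged.
    return box1, f"forward (IoU={iou_val:.2f} < threshold={iou_threshold})"
```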
Comment on the box expansion code:

```python
            expanded_display = [int(new_x1), int(new_y1), int(new_x2), int(new_y2)]
```

Suggested change (drop the stale step number from the comment):

```diff
-    # 3. Back conversion to source space
+    # Back conversion to source space
```
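The expand-then-convert-back flow being commented on can be sketched as follows. Both helper names are hypothetical, and the sketch assumes display x-coordinates are source x-coordinates multiplied by the pixel aspect ratio:

```python
def expand_box(box: list, margin: int, width: int, height: int) -> list:
    # Grow an xyxy box by a fixed pixel margin, clamped to the image bounds.
    x1, y1, x2, y2 = box
    return [max(0, x1 - margin), max(0, y1 - margin),
            min(width, x2 + margin), min(height, y2 + margin)]

def display_to_source(box: list, par: float) -> list:
    # Back conversion to source space: undo the pixel aspect ratio on the
    # horizontal axis (assumes display_x = source_x * par).
    x1, y1, x2, y2 = box
    return [x1 / par, y1, x2 / par, y2]
```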
Comment on the chunking helper's docstring:

```python
) -> dict:
    """
    Extract bounding boxes per object and organize them in chunck of consecutive frames.
```

Suggested change ("applicated" is not English):

```diff
-    Coordinates in the json file are supposed to be in the original source space, with the pixel aspect ratio not applicated.
+    Coordinates in the json file are supposed to be in the original source space, with the pixel aspect ratio not applied.
```
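The chunking behaviour the docstring describes — grouping per-object boxes into runs of consecutive frame ids — can be sketched as below. This is a hypothetical helper illustrating the idea, not the PR's code:

```python
def split_into_chunks(frames_to_box: dict) -> list:
    # frames_to_box maps an absolute frame id to its box; the result is a
    # list of dicts, one per run of consecutive frame ids.
    chunks = []
    current = {}
    prev = None
    for fid in sorted(frames_to_box):
        if prev is not None and fid != prev + 1:
            # Gap in the frame ids: close the current chunk.
            chunks.append(current)
            current = {}
        current[fid] = frames_to_box[fid]
        prev = fid
    if current:
        chunks.append(current)
    return chunks
```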
Comment on the size thresholds constant:

```python
import json
from dataclasses import dataclass, field

THRESHOLDS = [252, 504, 1008]
```

Suggested change (more descriptive name):

```diff
-THRESHOLDS = [252, 504, 1008]
+SIZE_THRESHOLDS = [252, 504, 1008]
```
Comment on the resolution fallback logic:

```python
if target_size < 504 and not x4_ok:
    target_size = 504
if target_size < 1008 and not x2_ok:
    target_size = 1008
```

Suggested change (reuse the thresholds list instead of magic numbers):

```diff
-if target_size < 504 and not x4_ok:
-    target_size = 504
-if target_size < 1008 and not x2_ok:
-    target_size = 1008
+if target_size < SIZE_THRESHOLDS[1] and not x4_ok:
+    target_size = SIZE_THRESHOLDS[1]
+if target_size < SIZE_THRESHOLDS[2] and not x2_ok:
+    target_size = SIZE_THRESHOLDS[2]
```
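For context, the fallback logic under review reads as: pick the smallest crop size that fits the box, then fall back to a larger crop when the required upscaled source (x4 for the smallest size, x2 for the middle one) is missing. A sketch under those assumptions (`select_target_size` is a hypothetical name wrapping the PR's snippet):

```python
SIZE_THRESHOLDS = [252, 504, 1008]

def select_target_size(box_size: int, x2_ok: bool, x4_ok: bool) -> int:
    # Smallest crop size that contains the box.
    target_size = SIZE_THRESHOLDS[-1]
    for threshold in SIZE_THRESHOLDS:
        if box_size <= threshold:
            target_size = threshold
            break
    # A small crop needs a higher-resolution source: without the x4
    # upscale, 252-pixel crops are unusable, so fall back to 504...
    if target_size < SIZE_THRESHOLDS[1] and not x4_ok:
        target_size = SIZE_THRESHOLDS[1]
    # ...and without the x2 upscale, fall back to 1008.
    if target_size < SIZE_THRESHOLDS[2] and not x2_ok:
        target_size = SIZE_THRESHOLDS[2]
    return target_size
```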
Comment on the x2 input attribute:

```python
desc.File(
    name="inputx2",
    label="Inputx2",
    description="Folder containing source images upscale by 2.",
```

Suggested change (grammar):

```diff
-    description="Folder containing source images upscale by 2.",
+    description="Folder containing source images upscaled by 2.",
```
Comment on the x4 input attribute:

```python
desc.File(
    name="inputx4",
    label="Inputx4",
    description="Folder containing source images upscale by 4.",
```

Suggested change (grammar):

```diff
-    description="Folder containing source images upscale by 4.",
+    description="Folder containing source images upscaled by 4.",
```
Comment on the input validation error:

```python
    image_paths.sort(key=lambda x: x[0])
else:
    raise ValueError(f"Input path '{input_path}' is not a valid path (folder or sfmData file).")
```

Suggested change (at this branch only the sfmData case remains, so the message should say so):

```diff
-    raise ValueError(f"Input path '{input_path}' is not a valid path (folder or sfmData file).")
+    raise ValueError(f"Input path '{input_path}' is not a valid sfmData file.")
```
Comment on the view iteration:

```python
for id, v in views.items():
    image_x1_path = Path(v.getImage().getImagePath())
    image_x1_name = image_x1_path.name
    image_x2_path = None
    if os.path.isfile(os.path.join(path_folder_x2, image_x1_name)):
        image_x2_path = os.path.join(path_folder_x2, image_x1_name)
    image_x4_path = None
    if os.path.isfile(os.path.join(path_folder_x4, image_x1_name)):
        image_x4_path = os.path.join(path_folder_x4, image_x1_name)
    intrinsic = dataAV.getIntrinsicSharedPtr(v.getIntrinsicId())
    pinhole = camera.Pinhole.cast(intrinsic)
    par = 1.0
    if pinhole is not None:
        par = pinhole.getPixelAspectRatio()
    image_paths.append((image_x1_path, str(id), v.getFrameId(), v.getImage().getWidth(),
                        v.getImage().getHeight(), par, image_x2_path, image_x4_path))
```

Suggested change (validate that all views share the same dimensions, pixel aspect ratio, and upscaled-image availability):

```python
commonParams = None
for id, v in views.items():
    image_x1_path = Path(v.getImage().getImagePath())
    image_x1_name = image_x1_path.name
    image_x2_path = None
    if os.path.isfile(os.path.join(path_folder_x2, image_x1_name)):
        image_x2_path = os.path.join(path_folder_x2, image_x1_name)
    image_x4_path = None
    if os.path.isfile(os.path.join(path_folder_x4, image_x1_name)):
        image_x4_path = os.path.join(path_folder_x4, image_x1_name)
    intrinsic = dataAV.getIntrinsicSharedPtr(v.getIntrinsicId())
    pinhole = camera.Pinhole.cast(intrinsic)
    par = 1.0
    if pinhole is not None:
        par = pinhole.getPixelAspectRatio()
    if commonParams is None:
        commonParams = [v.getImage().getWidth(), v.getImage().getHeight(), par,
                        image_x2_path is None, image_x4_path is None]
    if commonParams != [v.getImage().getWidth(), v.getImage().getHeight(), par,
                        image_x2_path is None, image_x4_path is None]:
        raise ValueError("All images do not have same dimensions or one image is missing its upscaled version.")
    image_paths.append((image_x1_path, str(id), v.getFrameId(), v.getImage().getWidth(),
                        v.getImage().getHeight(), par, image_x2_path, image_x4_path))
```
This pull request introduces a new segmentation node and makes several improvements and bug fixes to the video segmentation pipeline. The main addition is the new `VideoSegmentationSam3Boxes` node, which segments video frames using bounding boxes from a JSON file. Additionally, several changes in `VideoSegmentationSam3Text.py` improve the consistency of mask and bounding box handling.

New Node Addition:
- Added `VideoSegmentationSam3Boxes` for segmenting video frames based on bounding boxes from a JSON file, supporting multiple input resolutions, GPU usage, mask inversion, and flexible output options. This node integrates with the SAM3 video predictor and handles mask generation, file management, and metadata.

Improvements in VideoSegmentationSam3Text:
- Removed the passed-in width/height from `sam3Utils.mapIds`, as this information is not needed (the ROI is now scaled from the mask dimensions).

These changes collectively improve the flexibility, correctness, and usability of the video segmentation pipeline, especially for workflows involving bounding box-based segmentation and multi-resolution inputs.
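The absolute-frame-id keying mentioned above can be illustrated with a small sketch. All names here are hypothetical: `chunk_start` stands for the absolute id of a chunk's first frame, and `chunk_boxes` for that chunk's per-frame box list:

```python
def key_by_absolute_frame(chunk_boxes: list, chunk_start: int) -> dict:
    # Re-key chunk-local box indices (0, 1, ...) to absolute frame ids,
    # so lookups no longer depend on which chunk a frame belongs to.
    return {chunk_start + local_idx: box
            for local_idx, box in enumerate(chunk_boxes)}
```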