VLNActionDataset: Integer division (//) for num_rounds discards final action segments and STOP signals, leading to the agent failing to learn to stop #88

In `streamvln/dataset/vln_action_dataset.py`, line 657 computes `num_rounds` with floor division (`//`):
```python
num_rounds = (actions_len - valid_idx) // self.num_frames
```
The loop then iterates over `range(num_rounds + 1)` and skips the last, empty window (when `n * self.num_frames == actions_len - valid_idx`), which results in only `num_rounds` samples being added per episode.
This means:
- For any episode where `(actions_len - valid_idx) % self.num_frames != 0`, the final action segment (including the STOP step) is discarded entirely.
- The training set loses the STOP step of every such episode, so the agent sees too few examples of when to stop during navigation.
- This could lead to the agent failing to stop at the target during inference, hurting navigation success rate and SPL.
Example
- `actions_len - valid_idx = 87`
- `self.num_frames = 32`
- `num_rounds = 87 // 32 = 2`

Only windows 0~31 and 32~63 are kept; the final segment 64~86 (23 steps, including STOP) is lost.
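
The arithmetic above can be checked with a small standalone sketch. It does not reproduce the repository code: `window_bounds` is a hypothetical helper that only enumerates the windows floor division and ceiling division would yield for a given number of steps.

```python
def window_bounds(total_steps, num_frames, keep_partial):
    """Return (start, end) index pairs with end exclusive.

    Hypothetical helper for illustration only, not part of StreamVLN.
    """
    if keep_partial:
        # Ceiling division without floats: counts the trailing partial window.
        num_windows = (total_steps + num_frames - 1) // num_frames
    else:
        # Floor division: counts full windows only.
        num_windows = total_steps // num_frames
    return [
        (n * num_frames, min((n + 1) * num_frames, total_steps))
        for n in range(num_windows)
    ]


floor_windows = window_bounds(87, 32, keep_partial=False)
ceil_windows = window_bounds(87, 32, keep_partial=True)

print(floor_windows)  # [(0, 32), (32, 64)]
print(ceil_windows)   # [(0, 32), (32, 64), (64, 87)]
```

With floor division, the 23 steps at indices 64 to 86 never appear in any window; ceiling division adds a third, shorter window that covers them.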
Suggested Fix
Instead of discarding the final segment, use ceiling division to keep every action segment, including the last partial one, and pad shorter segments to `self.num_frames` during collation. This way:
- No action or STOP steps are discarded.
- The agent can learn to recognize and execute the STOP signal.
- Batches keep a fixed shape thanks to padding, so training stays stable.
Could we modify the sampling logic to preserve all action segments (including partial ones) instead of using floor division?
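
A minimal sketch of what this could look like, under stated assumptions: `split_and_pad`, `IGNORE_INDEX`, and the `STOP` id are hypothetical names and values chosen for illustration, not taken from the repository. Padding is done at split time here only to keep the example short; in practice it could live in the collate function.

```python
IGNORE_INDEX = -100  # assumption: label value the loss is configured to ignore
STOP = 0             # assumption: id of the STOP action in this toy example


def split_and_pad(actions, num_frames, pad_value=IGNORE_INDEX):
    """Split an action list into fixed-size windows, padding the last one.

    Returns a list of (window, mask) pairs; mask[i] is True for real steps
    and False for padding. Hypothetical helper, not part of StreamVLN.
    """
    # Ceiling division so the trailing partial window is kept.
    num_windows = (len(actions) + num_frames - 1) // num_frames
    samples = []
    for n in range(num_windows):
        chunk = actions[n * num_frames:(n + 1) * num_frames]
        pad = num_frames - len(chunk)
        window = chunk + [pad_value] * pad
        mask = [True] * len(chunk) + [False] * pad
        samples.append((window, mask))
    return samples


# Toy episode of 87 steps: 86 non-STOP actions (id 1) followed by STOP.
episode = [1] * 86 + [STOP]
samples = split_and_pad(episode, num_frames=32)
last_window, last_mask = samples[-1]

print(len(samples))     # 3
print(last_window[22])  # 0, the STOP step is preserved
print(sum(last_mask))   # 23 real steps in the last window
```

The mask lets the loss skip padded positions, so the partial window contributes only its 23 real steps, STOP included.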
