feat(rst): add glacier support#212

Merged
swartzn merged 7 commits into main from swartzn/add-glacier-support
Dec 17, 2025

@swartzn swartzn commented Jul 1, 2025

FYI, this PR's basic glacier/archival functionality is ready for review but I intend to make a few more minor improvements.

What does this PR do / why do we need it?

Required for all PRs.

Add support for S3-compatible archival storage classes, which require restoring an object before it can be downloaded.

Note: ThinkParQ/protobuf#54 is required.

Related Issue(s)

Required when applicable.

Where should the reviewer(s) start reviewing this?

Only required for larger PRs when this may not be immediately obvious.

Are there any specific topics we should discuss before merging?

Not required.

What are the next steps after this PR?

Not required.

Checklist before merging:

Required for all PRs.

When creating a PR these are items to keep in mind that cannot be checked by GitHub actions:

  • Documentation:
    • Does developer documentation (code comments, readme, etc.) need to be added or updated?
    • Does the user documentation need to be expanded or updated for this change?
  • Testing:
    • Does this functionality require changing or adding new unit tests?
    • Does this functionality require changing or adding new integration tests?
  • Git Hygiene:

For more details refer to the Go coding standards and the pull request process.

@swartzn swartzn requested a review from a team as a code owner July 1, 2025 19:14
@swartzn swartzn force-pushed the swartzn/add-glacier-support branch from 39cb377 to 7a29032 Compare July 2, 2025 19:11
@swartzn swartzn changed the base branch from main to swartzn/add-priority-wait-queue July 7, 2025 10:27
@swartzn swartzn self-assigned this Aug 4, 2025

swartzn commented Aug 4, 2025

@swartzn Squash commits before merging

@swartzn swartzn force-pushed the swartzn/add-priority-wait-queue branch from dbcd5f0 to 294dd70 Compare August 25, 2025 10:49
@swartzn swartzn force-pushed the swartzn/add-priority-wait-queue branch from 294dd70 to 3afcac2 Compare September 24, 2025 16:06
@swartzn swartzn force-pushed the swartzn/add-glacier-support branch from 8b36f0c to d1854e5 Compare October 23, 2025 12:27
@swartzn swartzn force-pushed the swartzn/add-glacier-support branch from d1854e5 to e5d4ba6 Compare October 24, 2025 17:11
@swartzn swartzn force-pushed the swartzn/add-priority-wait-queue branch 2 times, most recently from a53ddea to 010690c Compare October 25, 2025 21:40
@swartzn swartzn force-pushed the swartzn/add-glacier-support branch 3 times, most recently from 63c143d to e0c9087 Compare October 28, 2025 14:19
@swartzn swartzn force-pushed the swartzn/add-priority-wait-queue branch from 5a857ba to e748d00 Compare October 28, 2025 14:42
@swartzn swartzn force-pushed the swartzn/add-glacier-support branch from e0c9087 to b1dad7e Compare October 28, 2025 14:43
@swartzn swartzn force-pushed the swartzn/add-priority-wait-queue branch 2 times, most recently from 1a58355 to a8b1024 Compare October 28, 2025 19:39
Base automatically changed from swartzn/add-priority-wait-queue to main October 28, 2025 19:43
@swartzn swartzn force-pushed the swartzn/add-glacier-support branch from b1dad7e to ef4f20e Compare October 28, 2025 23:55
@swartzn swartzn changed the title Swartzn/add glacier support feat(rst): add glacier support Oct 28, 2025
@swartzn swartzn force-pushed the swartzn/add-glacier-support branch from a1b5873 to 45aeb6e Compare October 29, 2025 18:34
@swartzn swartzn force-pushed the swartzn/add-glacier-support branch from 45aeb6e to e354e96 Compare November 3, 2025 22:27
@swartzn swartzn force-pushed the swartzn/add-glacier-support branch 2 times, most recently from 4b139e9 to dc7f599 Compare November 25, 2025 11:40

swartzn commented Nov 25, 2025

Tracking the scheduler’s library move from 'rst/sync/internal/workmgr/' to 'common/scheduler/' must be explicit due to the size of the changes. So, keep commit 143726b.

@swartzn swartzn force-pushed the swartzn/add-glacier-support branch from d7e8462 to fcb4f53 Compare December 5, 2025 20:07
@swartzn swartzn force-pushed the swartzn/add-glacier-support branch 2 times, most recently from 01f5da3 to 33a0130 Compare December 15, 2025 12:21

@iamjoemccormick iamjoemccormick left a comment

LGTM! For the record, I only minimally tested this, but I know you conducted much more extensive testing.

Generalize the scheduler so it can be consumed as a library component.
Decouple token release from scheduler stats updates
- Release tokens more frequently and on demand to reduce latency.
- Improve worker queue backpressure.
Improve token distribution
- Emit priority tokens in the priority order worker managers should follow.
- Distribute tokens in round-robin fashion to improve fairness and distribution.
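
A minimal sketch of the token distribution described above, not the scheduler code from this PR. The names `Token` and `distribute` are hypothetical, and the sketch assumes a lower `Priority` value means more urgent work. Tokens are sorted by priority and dealt out to the worker managers one at a time, so each manager sees its tokens in priority order and no manager gets more than one extra token.

```go
package main

import (
	"fmt"
	"sort"
)

// Token is a hypothetical scheduler token. A lower Priority value is
// assumed to mean the work should be handled sooner.
type Token struct {
	Priority int
	ID       int
}

// distribute sorts tokens by priority and deals them to n workers in
// round-robin order. n must be greater than zero.
func distribute(tokens []Token, n int) [][]Token {
	// Copy so the caller's slice is not reordered.
	sorted := append([]Token(nil), tokens...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Priority < sorted[j].Priority
	})
	out := make([][]Token, n)
	for i, t := range sorted {
		out[i%n] = append(out[i%n], t)
	}
	return out
}

func main() {
	tokens := []Token{{2, 1}, {0, 2}, {1, 3}, {0, 4}, {2, 5}}
	for w, ts := range distribute(tokens, 2) {
		fmt.Println("worker", w, ts)
	}
}
```

The stable sort keeps tokens of equal priority in their arrival order, which is one simple way to avoid starving older work within a priority level.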
Add scheduler specific metrics.
Stream local paths in lexicographic order.
Support stopping early with max path limit.
Support resuming stream after a specific path.
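
The three streaming behaviours above (lexicographic order, a maximum path limit, and resuming after a specific path) can be sketched over an in-memory listing. `streamPaths` is a hypothetical helper for illustration only; the PR's real implementation streams from the filesystem and is not shown here.

```go
package main

import (
	"fmt"
	"sort"
)

// streamPaths returns up to maxPaths paths, in lexicographic order, that
// sort strictly after resumeAfter. The second return value is a resume
// token: the last path emitted when the limit cut the stream short, or ""
// when the stream is complete. A maxPaths of 0 or less means no limit.
func streamPaths(paths []string, resumeAfter string, maxPaths int) ([]string, string) {
	sorted := append([]string(nil), paths...)
	sort.Strings(sorted)
	// Index of the first path strictly greater than resumeAfter.
	start := sort.Search(len(sorted), func(i int) bool {
		return sorted[i] > resumeAfter
	})
	rest := sorted[start:]
	if maxPaths > 0 && len(rest) > maxPaths {
		page := rest[:maxPaths]
		return page, page[len(page)-1]
	}
	return rest, ""
}

func main() {
	paths := []string{"/d/c", "/d/a", "/d/b", "/d/e", "/d/d"}
	token := ""
	for {
		page, next := streamPaths(paths, token, 2)
		fmt.Println(page)
		if next == "" {
			break
		}
		token = next
	}
}
```

Using the last emitted path as the token works only because the stream is ordered: everything at or before the token has already been delivered.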
Add resume support for general and directory S3 buckets.
Update rst.Provider GetWalk() with filesystem.StreamPathResult.
The builder job now supports resumable walks by persisting the walk’s resume token in the work’s
externalId before notifying the sync worker to reschedule. This enables large builder requests to
yield execution time to other work and helps prevent concentrated load on the remote node.
- Dynamically add job request workers to keep throughput balanced without oversaturating the system.
- Remove the no longer used rst WalkResponse and WalkPath.
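
A rough sketch of the resumable builder flow described above, under stated assumptions: `Work`, `walkFn`, `runBuilder`, and `sliceWalk` are hypothetical names, and the real work entry and walk types in this PR will differ. Each round processes a bounded slice of the walk, stores the resume token in the work's external ID, and reports whether the caller should reschedule.

```go
package main

import "fmt"

// Work is a hypothetical stand-in for a builder work entry. ExternalID
// carries the walk's resume token between scheduling rounds.
type Work struct {
	ExternalID string
	Submitted  []string
}

// walkFn is a hypothetical walk: it returns up to limit paths that follow
// the resume token, plus the next token ("" when the walk is finished).
type walkFn func(resume string, limit int) (paths []string, next string)

// runBuilder processes one slice of the walk. It returns true when the walk
// is complete, or false after storing the resume token in w.ExternalID so
// the caller can reschedule the work and let other work run.
func runBuilder(w *Work, walk walkFn, limit int) bool {
	paths, next := walk(w.ExternalID, limit)
	w.Submitted = append(w.Submitted, paths...)
	w.ExternalID = next
	return next == ""
}

// sliceWalk is a toy walk over a sorted in-memory listing.
func sliceWalk(all []string) walkFn {
	return func(resume string, limit int) ([]string, string) {
		if limit < 1 {
			limit = 1
		}
		start := 0
		for start < len(all) && all[start] <= resume {
			start++
		}
		end := start + limit
		if end >= len(all) {
			return all[start:], ""
		}
		return all[start:end], all[end-1]
	}
}

func main() {
	w := &Work{}
	walk := sliceWalk([]string{"a", "b", "c", "d", "e"})
	for round := 1; ; round++ {
		done := runBuilder(w, walk, 2)
		fmt.Printf("round %d: submitted=%v token=%q done=%v\n", round, w.Submitted, w.ExternalID, done)
		if done {
			break
		}
	}
}
```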

The sync manager now fully utilizes the common scheduler implementation. On startup, it scans the
work journal and creates scheduler tokens for both new and rescheduled work. Pulling rescheduled
work is now deadline-based, preventing unnecessary scans until at least one request is ready to be
queued again.
- Update sync manager with common scheduler implementation
- Separate the logic around pulling in new work from rescheduled.
- Improve sync manager and worker metrics.
- Fix dropped scheduler tokens when cancelling a job request that's already in the work queue map.
- Fix a possible deadlock on the completedWork channel.
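
One way to picture the deadline-based pull of rescheduled work is a min-heap keyed on the time each request becomes ready. This is an illustrative sketch only: `rescheduled`, `deadlineQueue`, and `pullReady` are hypothetical names, deadlines are plain Unix seconds, and the PR's sync manager works against the work journal rather than an in-memory heap.

```go
package main

import (
	"container/heap"
	"fmt"
	"time"
)

// rescheduled is a hypothetical record of work waiting to be retried.
// ReadyAt is a Unix timestamp in seconds.
type rescheduled struct {
	ID      string
	ReadyAt int64
}

// deadlineQueue is a min-heap ordered by ReadyAt.
type deadlineQueue []rescheduled

func (q deadlineQueue) Len() int           { return len(q) }
func (q deadlineQueue) Less(i, j int) bool { return q[i].ReadyAt < q[j].ReadyAt }
func (q deadlineQueue) Swap(i, j int)      { q[i], q[j] = q[j], q[i] }
func (q *deadlineQueue) Push(x any)        { *q = append(*q, x.(rescheduled)) }
func (q *deadlineQueue) Pop() any {
	old := *q
	n := len(old)
	item := old[n-1]
	*q = old[:n-1]
	return item
}

// add schedules id to become ready at readyAt.
func (q *deadlineQueue) add(id string, readyAt int64) {
	heap.Push(q, rescheduled{ID: id, ReadyAt: readyAt})
}

// pullReady pops every entry whose deadline has passed and reports the
// earliest remaining deadline, so the caller can skip scanning until then.
// ok is false when nothing is left waiting.
func pullReady(q *deadlineQueue, now int64) (ready []string, next int64, ok bool) {
	for q.Len() > 0 && (*q)[0].ReadyAt <= now {
		ready = append(ready, heap.Pop(q).(rescheduled).ID)
	}
	if q.Len() == 0 {
		return ready, 0, false
	}
	return ready, (*q)[0].ReadyAt, true
}

func main() {
	now := time.Now().Unix()
	q := &deadlineQueue{}
	q.add("restore-b", now+120)
	q.add("restore-a", now-1)
	ready, next, ok := pullReady(q, now)
	fmt.Println(ready, ok, next-now)
}
```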
S3 clients can now restore objects from archival storage classes, which are defined in the storage
class definitions for the remote target's configuration. When a requested object is archived, the
restore is initiated and the work is rescheduled with a retry time.
- Update protobuf to add support for archival storage classes
- Document storage class definitions in beegfs-remote.toml
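
The restore-then-reschedule flow above can be sketched against a small interface instead of a real S3 client. Everything here is an assumption made for illustration: the `StorageClass` fields do not reflect the actual beegfs-remote.toml schema, and `Store`, `ObjectState`, `fetch`, and `fakeStore` are hypothetical. With a real client, the `Head` and `Restore` steps would roughly correspond to the S3 HeadObject and RestoreObject operations.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// StorageClass is a hypothetical storage class definition.
type StorageClass struct {
	Name       string
	Archival   bool          // objects must be restored before download
	RetryAfter time.Duration // how long to wait before checking again
}

// ObjectState is a hypothetical summary of an object's metadata.
type ObjectState struct {
	StorageClass      string
	Restored          bool // a restored copy is currently available
	RestoreInProgress bool
}

// Store abstracts the remote calls this sketch needs.
type Store interface {
	Head(key string) (ObjectState, error)
	Restore(key string) error
	Download(key string) error
}

// fetch downloads key, or starts a restore and returns a retry delay when
// the object is archived and not yet restored. A zero delay with a nil
// error means the download completed.
func fetch(s Store, classes map[string]StorageClass, key string) (time.Duration, error) {
	state, err := s.Head(key)
	if err != nil {
		return 0, err
	}
	class, known := classes[state.StorageClass]
	if !known || !class.Archival || state.Restored {
		return 0, s.Download(key)
	}
	if !state.RestoreInProgress {
		if err := s.Restore(key); err != nil {
			return 0, err
		}
	}
	return class.RetryAfter, nil
}

// fakeStore is an in-memory Store used only to exercise fetch.
type fakeStore struct {
	objects   map[string]*ObjectState
	downloads int
}

func (f *fakeStore) Head(key string) (ObjectState, error) {
	o, ok := f.objects[key]
	if !ok {
		return ObjectState{}, errors.New("no such key: " + key)
	}
	return *o, nil
}
func (f *fakeStore) Restore(key string) error  { f.objects[key].RestoreInProgress = true; return nil }
func (f *fakeStore) Download(key string) error { f.downloads++; return nil }

func main() {
	classes := map[string]StorageClass{
		"GLACIER": {Name: "GLACIER", Archival: true, RetryAfter: time.Hour},
	}
	s := &fakeStore{objects: map[string]*ObjectState{"k": {StorageClass: "GLACIER"}}}
	retry, err := fetch(s, classes, "k")
	fmt.Println("retry after:", retry, "err:", err)
	s.objects["k"].Restored = true // pretend the restore finished
	retry, err = fetch(s, classes, "k")
	fmt.Println("retry after:", retry, "err:", err, "downloads:", s.downloads)
}
```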
@swartzn swartzn force-pushed the swartzn/add-glacier-support branch from 33a0130 to 913a399 Compare December 17, 2025 18:12
@swartzn swartzn merged commit f93e9a9 into main Dec 17, 2025
6 checks passed
@swartzn swartzn deleted the swartzn/add-glacier-support branch December 17, 2025 19:04

2 participants