@swartzn Squash commits before merging.
swartzn
commented
Oct 23, 2025
iamjoemccormick
requested changes
Oct 29, 2025
The scheduler library's move from 'rst/sync/internal/workmgr/' to 'common/scheduler/' must be tracked explicitly because of the size of the changes, so keep commit 143726b.
iamjoemccormick
requested changes
Nov 25, 2025
iamjoemccormick
requested changes
Nov 25, 2025
iamjoemccormick
requested changes
Dec 4, 2025
iamjoemccormick
requested changes
Dec 5, 2025
iamjoemccormick
approved these changes
Dec 16, 2025
Member
iamjoemccormick
left a comment
LGTM! For the record, I only minimally tested this, but I know you conducted much more extensive testing.
Generalize the scheduler so it can be consumed as a library component.

Decouple token release from scheduler stats updates
- Release tokens more frequently and on demand to reduce latency
- Improves worker queue backpressure

Improve token distribution
- Emit priority tokens in the priority order worker managers should follow.
- Distribute tokens in round-robin fashion to improve fairness and distribution.

Add scheduler-specific metrics.
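To illustrate the round-robin distribution described above, here is a minimal sketch. It is not taken from the scheduler in this PR: `distributeRoundRobin` and its `demand` parameter are hypothetical names, and the sketch ignores priorities entirely. It only shows how handing out one token per queue per rotation keeps a single busy queue from absorbing the whole budget.

```go
package main

// distributeRoundRobin hands out up to `tokens` tokens across queues, one at a
// time in rotation, skipping queues that have no remaining demand. demand[i] is
// the number of tokens queue i could still use. It returns the per-queue grants.
// Hypothetical helper for illustration; not the scheduler's actual API.
func distributeRoundRobin(tokens int, demand []int) []int {
	grants := make([]int, len(demand))

	// Total outstanding demand, so the loop stops once every queue is satisfied.
	remaining := 0
	for _, d := range demand {
		if d > 0 {
			remaining += d
		}
	}

	for i := 0; tokens > 0 && remaining > 0; i = (i + 1) % len(demand) {
		if grants[i] < demand[i] {
			grants[i]++
			tokens--
			remaining--
		}
	}
	return grants
}
```

With five tokens and demands of 3, 1 and 4, the grants come out as 2, 1 and 2 rather than 3, 1 and 1, which is the fairness property the commit message refers to.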
Stream local paths in lexicographic order. Support stopping early with max path limit. Support resuming stream after a specific path.
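The resume and limit semantics can be sketched as follows. This is an assumption-laden simplification: it works on an in-memory slice rather than streaming from the filesystem, and `streamPaths`, `resumeAfter` and `max` are hypothetical names, not the `filesystem.StreamPathResult` API from this PR.

```go
package main

import "sort"

// streamPaths returns up to max paths that sort strictly after resumeAfter, in
// lexicographic order, plus the token to pass as resumeAfter on the next call.
// The token is "" once the listing is exhausted. A max of zero or less means
// "no limit". Hypothetical helper for illustration only.
func streamPaths(paths []string, resumeAfter string, max int) (batch []string, next string) {
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]string(nil), paths...)
	sort.Strings(sorted)

	// First index whose path is strictly greater than the resume point.
	start := sort.Search(len(sorted), func(i int) bool { return sorted[i] > resumeAfter })

	end := start + max
	if max <= 0 || end >= len(sorted) {
		return sorted[start:], ""
	}
	batch = sorted[start:end]
	return batch, batch[len(batch)-1]
}
```

Because the resume point is the last path returned and the order is total, a caller can stop after any batch and continue later without skipping or repeating entries, provided the underlying set of paths has not changed in between.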
Add resume support for general purpose and directory S3 buckets. Update rst.Provider GetWalk() with filesystem.StreamPathResult.
The builder job now supports resumable walks by persisting the walk’s resume token in the work’s externalId before notifying the sync worker to reschedule. This enables large builder requests to yield execution time to other work and helps prevent concentrated load on the remote node.
- Dynamically add job request workers to keep throughput balanced without oversaturating the system.
- Remove the no longer used rst WalkResponse and WalkPath.

The sync manager now fully utilizes the common scheduler implementation. On startup, it scans the work journal and creates scheduler tokens for both new and rescheduled work. Pulling rescheduled work is now deadline-based, preventing unnecessary scans until at least one request is ready to be queued again.
- Update sync manager with common scheduler implementation
- Separate the logic for pulling in new work from rescheduled work.
- Improve sync manager and worker metrics.
- Fix dropping scheduler tokens when cancelling a job request that's already in the work queue map.
- Fix a possible deadlock on the completedWork channel.
S3 clients can now restore objects from archival storage classes, which are defined in the storage class definitions of the remote target's configuration. When a requested object is archived, a restore is initiated and the work is rescheduled with a retry time.
- Update protobuf to add support for archival storage classes
- Document storage class definitions in beegfs-remote.toml
FYI, this PR's basic glacier/archival functionality is ready for review but I intend to make a few more minor improvements.
What does this PR do / why do we need it?
Required for all PRs.
Add support for S3-compatible archival storage classes, which require restoring an object before it can be downloaded.
Note: ThinkParQ/protobuf#54 is required.
Related Issue(s)
Required when applicable.
Where should the reviewer(s) start reviewing this?
Only required for larger PRs when this may not be immediately obvious.
Are there any specific topics we should discuss before merging?
Not required.
What are the next steps after this PR?
Not required.
Checklist before merging:
Required for all PRs.
When creating a PR these are items to keep in mind that cannot be checked by GitHub actions:
For more details refer to the Go coding standards and the pull request process.