
add optional wait_status to exception event to inform finish event status #504

Merged
mergify[bot] merged 3 commits into flux-framework:master from grondo:exception-status
Mar 20, 2026

@grondo grondo commented Mar 19, 2026

This PR adds an optional wait_status field to exception events. It is meant to be used by the job shell to force the finish event status value when job cleanup triggered by the exception could otherwise mask a more important exit status (e.g. when a single task segfaults and exit-on-error is set).

This is just a proposal for discussion at this point. Happy to investigate other approaches!

Problem: When a severity 0 exception is raised on a job, the
subsequent job cleanup may shadow an abnormal exit status that
caused the exception, for example if a task gets a segfault and
exit-on-error is set.

Add an optional wait_status field to exception events to capture
any relevant abnormal wait status that should be preserved in the
eventlog.
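
To make the proposal concrete, here is a minimal Python sketch of what an exception event carrying the optional wait_status field might look like. The entry layout (timestamp, name, context with type, severity, note) and the helper names `make_exception_event` and `describe_wait_status` are assumptions for illustration only; this PR only states that wait_status is an optional field on exception events.

```python
import json
import os
import signal


def make_exception_event(exc_type, severity, note, wait_status=None, timestamp=0.0):
    """Build an eventlog-style exception entry (layout is an assumption)."""
    context = {"type": exc_type, "severity": severity, "note": note}
    if wait_status is not None:
        # Optional field proposed in this PR; omitted when not provided.
        context["wait_status"] = wait_status
    return {"timestamp": timestamp, "name": "exception", "context": context}


def describe_wait_status(status):
    """Decode a POSIX wait status into a short human-readable string."""
    if os.WIFSIGNALED(status):
        return "killed by signal %d" % os.WTERMSIG(status)
    if os.WIFEXITED(status):
        return "exited with code %d" % os.WEXITSTATUS(status)
    return "unknown status %d" % status


# Hypothetical case: a task dies from SIGSEGV. On Linux, the wait status of a
# signaled process (without a core dump) equals the signal number.
segv_status = int(signal.SIGSEGV)
event = make_exception_event(
    "exec", 0, "task terminated by SIGSEGV", wait_status=segv_status
)
print(json.dumps(event))
print(describe_wait_status(event["context"]["wait_status"]))
```

Leaving wait_status out of the context entirely when it is not set keeps the field optional, so existing consumers of exception events would see no change.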

@github-actions

⚠️ linkcheck failed with status code 2

@garlick garlick (Member) left a comment

Ah seems good to me.

grondo added 2 commits March 19, 2026 06:51
Problem: An exception event wait_status is proposed as a way to
override the default computation of the finish event status as the
maximum of all shell wait status codes, but this is not codified in
the finish event status field description in RFC 21.

Require that a severity 0 exception with a wait_status field be used
to derive the finish event status when present.

Problem: RFC 21 requires that the finish event status field be derived
from the exception wait_status field if present, but RFC 32 does not
call this out for the exec system finish response.

Amend the finish response with the updated requirements.
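
The rule described in the two commit messages above can be sketched as follows. This is only an illustration under stated assumptions, not the RFC text or any Flux implementation: the function name `derive_finish_status`, the event dictionary layout, and the choice to use the first matching exception when several are present are all hypothetical.

```python
def derive_finish_status(shell_statuses, events):
    """Return the finish event status for a job.

    If a severity 0 exception carrying wait_status is present, that value is
    used. Otherwise fall back to the default: the maximum of all shell wait
    status codes. Taking the first matching exception is an assumption.
    """
    for event in events:
        if event.get("name") != "exception":
            continue
        context = event.get("context", {})
        if context.get("severity") == 0 and "wait_status" in context:
            return context["wait_status"]
    # Default computation; 0 for an empty list is also an assumption.
    return max(shell_statuses, default=0)


# Hypothetical values: cleanup after the exception leaves one shell with exit
# code 143 (wait status 143 << 8), which would win under the default maximum.
shell_statuses = [0, 0, 143 << 8]
events = [
    {
        "name": "exception",
        "context": {
            "type": "exec",
            "severity": 0,
            "note": "task terminated by SIGSEGV",
            "wait_status": 11,
        },
    }
]

print(derive_finish_status(shell_statuses, []))      # default maximum
print(derive_finish_status(shell_statuses, events))  # exception override
```

With the override in place, the status that caused the exception is reported in the finish event instead of whatever the subsequent cleanup produced.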
@grondo grondo force-pushed the exception-status branch from dc4e37f to 7f4b52a Compare March 19, 2026 13:51
@github-actions

⚠️ linkcheck failed with status code 2

@mergify mergify bot added the queued label Mar 20, 2026
@mergify mergify bot merged commit 282df7e into flux-framework:master Mar 20, 2026
6 of 7 checks passed

mergify bot commented Mar 20, 2026

Merge Queue Status

  • Entered queue 2026-03-20 01:20 UTC · Rule: default
  • Checks skipped · PR is already up-to-date
  • Merged 2026-03-20 01:20 UTC · at 7f4b52aa10378275543502f4d5ae96d4d2f482c8

This pull request spent 5 seconds in the queue, with no time running CI.

Required conditions to merge
  • any of [🛡 GitHub branch protection]:
    • check-success = docs/readthedocs.org:flux-rfc
    • check-neutral = docs/readthedocs.org:flux-rfc
    • check-skipped = docs/readthedocs.org:flux-rfc
  • any of [🛡 GitHub branch protection]:
    • check-success = make check
    • check-neutral = make check
    • check-skipped = make check
  • any of [🛡 GitHub branch protection]:
    • check-success = validate commits
    • check-neutral = validate commits
    • check-skipped = validate commits

@mergify mergify bot removed the queued label Mar 20, 2026
@grondo grondo deleted the exception-status branch March 20, 2026 01:34