SCHED-957: OOM killer should kill docker containers before slurmd #2319

Draft
faucct wants to merge 1 commit into main from feature/SCHED-957-oom-killer-should-kill-docker-containers-before-slurmd

faucct commented Mar 13, 2026

Problem

When users spawn memory-heavy processes, e.g. Docker containers, and the node runs out of memory, the OOM killer can kill slurmd instead of those processes.

Solution

Lowered the OOM score of slurmd so that the OOM killer picks users' job processes before slurmd. The same adjustment can be extended to other important processes, e.g. supervisord.
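
For reviewers less familiar with the mechanism, here is a minimal sketch of how a process's OOM score can be adjusted on Linux by writing to `/proc/<pid>/oom_score_adj` (valid range -1000 to 1000; lower values make the process a less likely OOM-killer victim). This is not the code from this PR: the helper name `set_oom_score_adj`, the `proc_root` parameter, and the example value `-500` are assumptions made for illustration only.

```python
from pathlib import Path

# Kernel-defined bounds for oom_score_adj.
OOM_SCORE_ADJ_MIN = -1000
OOM_SCORE_ADJ_MAX = 1000


def set_oom_score_adj(pid: int, value: int, proc_root: str = "/proc") -> Path:
    """Write `value` to <proc_root>/<pid>/oom_score_adj and return the path.

    Hypothetical helper, not part of this PR. `proc_root` exists only so
    the function can be exercised against a scratch directory.
    """
    if not OOM_SCORE_ADJ_MIN <= value <= OOM_SCORE_ADJ_MAX:
        raise ValueError(
            f"oom_score_adj must be in [{OOM_SCORE_ADJ_MIN}, {OOM_SCORE_ADJ_MAX}], got {value}"
        )
    path = Path(proc_root) / str(pid) / "oom_score_adj"
    # Lowering the value on a real system typically requires root
    # (CAP_SYS_RESOURCE); a PermissionError propagates to the caller.
    path.write_text(f"{value}\n")
    return path
```

A call might look like `set_oom_score_adj(slurmd_pid, -500)`, where `slurmd_pid` is the PID of the running slurmd. Child processes inherit the value at fork time, so job processes would need their own score set higher for slurmd to stay below them.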

Testing

Tested manually.

Release Notes

Adjust the slurmd OOM score to be lower than that of users' jobs.

faucct force-pushed the feature/SCHED-957-oom-killer-should-kill-docker-containers-before-slurmd branch 2 times, most recently from 4e35d12 to be3155a (March 13, 2026 15:46)
faucct force-pushed the feature/SCHED-957-oom-killer-should-kill-docker-containers-before-slurmd branch from be3155a to 783a3ba (March 13, 2026 15:47)