Updates to use amazon linux 2023 ami and reflect the actual RealMemory of the compute nodes#34
okram999 wants to merge 2 commits into aws-samples:plugin-v2 from
cd /home/ec2-user/slurm-*
/home/ec2-user/slurm-*/configure --prefix=/nfs/slurm
make -j 4
tar -xf slurm-*.tar.bz2
What if SlurmPackageUrl is not .tar.bz2?
@bollig - if it changes, the script will break. It's an upstream packaging decision that devs normally don't change on a whim. The archive extension appears to be consistently .tar.bz2: https://download.schedmd.com/slurm/.
We can try to handle some of the other extensions, but it will not be foolproof.
Something ugly like this:
wget -q "${SlurmPackageUrl}"
# Extract based on file extension
if ls slurm-*.tar.gz >/dev/null 2>&1; then
    tar -xzf slurm-*.tar.gz
elif ls slurm-*.tar.bz2 >/dev/null 2>&1; then
    tar -xjf slurm-*.tar.bz2
elif ls slurm-*.tgz >/dev/null 2>&1; then
    tar -xzf slurm-*.tgz
elif ls slurm-*.tar >/dev/null 2>&1; then
    tar -xf slurm-*.tar
else
    echo "No recognized Slurm archive found"
    exit 1
fi
# Change to the extracted directory, excluding any archive files
cd "$(ls -d /home/ec2-user/slurm-* | grep -v -E '\.tar\.gz$|\.tar\.bz2$|\.tgz$|\.tar$')"
Fair. This is OK as is; I was just thinking about people who may roll/distribute their own patched version of Slurm.
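For reference, the extension handling discussed above could also key off the archive's file name rather than globbing the directory. This is only a minimal sketch, not what the PR implements: `extract_slurm_archive` is a hypothetical helper name, and it assumes the archive name can be derived from `SlurmPackageUrl` with `basename`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): choose tar flags from the
# archive's file name and extract into an optional destination directory.
extract_slurm_archive() {
    local archive="$1"
    local dest="${2:-.}"
    case "$archive" in
        *.tar.gz|*.tgz) tar -xzf "$archive" -C "$dest" ;;
        *.tar.bz2)      tar -xjf "$archive" -C "$dest" ;;
        *.tar)          tar -xf  "$archive" -C "$dest" ;;
        *)
            # Unknown extension: report it and let the caller decide what to do.
            echo "No recognized Slurm archive: $archive" >&2
            return 1
            ;;
    esac
}

# Usage sketch, assuming SlurmPackageUrl has already been substituted:
#   wget -q "$SlurmPackageUrl"
#   extract_slurm_archive "$(basename "$SlurmPackageUrl")" /home/ec2-user
```

Returning a non-zero status instead of calling `exit` keeps the helper reusable; the calling script can still abort with `extract_slurm_archive ... || exit 1`.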
Default: 2
Description: Number of vCPUs for the compute node instance type
ComputeNodeMemory:
Why statically define this? Why not auto-detect RealMemory and have users provide a SchedulableMemory percentage (see ParallelCluster for example)?
SchedulableMemory will need further research and is something for my next PR. :)
AFAIK, SchedulableMemory is a Slurm configuration that's shipped with AWS ParallelCluster, and we are not dealing with ParallelCluster here. As for auto-detecting RealMemory, we would need an EC2 describe call for the instance type specified. That would also require a broader change, which is beyond the scope of this PR.
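To make the auto-detection idea concrete, here is a rough sketch of what the EC2 describe call plus a schedulable percentage might look like. It is not part of this PR. The helper names `instance_memory_mib` and `schedulable_memory_mib` are hypothetical, the 95 percent default is an assumption chosen only for illustration, and the first function assumes the AWS CLI is installed with permission to call `ec2:DescribeInstanceTypes`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: look up the advertised memory (in MiB) of an
# instance type through the EC2 DescribeInstanceTypes API.
instance_memory_mib() {
    aws ec2 describe-instance-types \
        --instance-types "$1" \
        --query 'InstanceTypes[0].MemoryInfo.SizeInMiB' \
        --output text
}

# Hypothetical helper: scale the advertised memory by a percentage so that
# RealMemory stays below what the OS actually exposes on the node.
# Integer arithmetic, so the result is rounded down.
schedulable_memory_mib() {
    local total_mib="$1"
    local percent="${2:-95}"   # assumed default, for illustration only
    echo $(( total_mib * percent / 100 ))
}

# Usage sketch (the instance type is an example value):
#   total="$(instance_memory_mib c5.large)"
#   real_memory="$(schedulable_memory_mib "$total" 95)"
```

The percentage matters because the memory advertised for an instance type is higher than what the operating system reports as available, so a RealMemory value set to the advertised figure could be more than the node actually has.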
Issue #, if available:
Update the quickstart template.yaml to use Amazon Linux 2023 and include RealMemory in the partitions.json
Description of changes:
- yum to dnf
- RealMemory in the partitions.json to reflect the available memory of the compute nodes. If not specified, Slurm failed to fetch the actual available memory.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.