cgroup.conf(5)
Slurm Configuration File
cgroup.conf - Slurm configuration file for cgroup support
cgroup.conf is an ASCII file which defines parameters used by Slurm's
Linux cgroup related plugins. The file location can be modified at system
build time using the DEFAULT_SLURM_CONF parameter or at execution time by
setting the SLURM_CONF environment variable. The file will always be located
in the same directory as the slurm.conf file.
Parameter names are case insensitive. Any text following a
"#" in the configuration file is treated as a comment through the
end of that line. Changes to the configuration file take effect upon restart
of Slurm daemons, daemon receipt of the SIGHUP signal, or execution of the
command "scontrol reconfigure" unless otherwise noted.
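The two syntax rules above (case-insensitive parameter names, "#" comments running to the end of the line) can be illustrated with a minimal sketch. The function name parse_cgroup_conf is hypothetical and this is not Slurm's own parser; it assumes simple Name=Value lines and ignores anything else Slurm's configuration syntax may support.

```python
def parse_cgroup_conf(text):
    """Parse Name=Value lines into a dict keyed by lower-cased name.

    Illustrative only: a simplified reading of the rules in this page,
    not the parser Slurm itself uses.
    """
    params = {}
    for raw in text.splitlines():
        # Everything after "#" is a comment through the end of the line.
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        # Parameter names are case insensitive, so normalize them.
        params[name.strip().lower()] = value.strip()
    return params


sample = """
# Slurm cgroup support configuration file
CgroupAutomount=yes
CONSTRAINCORES=yes   # same parameter as ConstrainCores
"""
print(parse_cgroup_conf(sample))
```

Lower-casing the keys is one simple way to make lookups independent of how a site capitalizes its parameters.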
For general Slurm Cgroups information, see the Cgroups Guide at
<https://slurm.schedmd.com/cgroups.html>.
The following cgroup.conf parameters are defined to control the
general behavior of Slurm cgroup plugins.
- CgroupAutomount=<yes|no>
- Slurm cgroup plugins require a valid and functional cgroup subsystem to be
mounted under /sys/fs/cgroup/<subsystem_name>. When launched,
plugins check their subsystem availability. If not available, the plugin
launch fails unless CgroupAutomount is set to yes. In that case, the
plugin will first try to mount the required subsystems.
- CgroupMountpoint=PATH
- Specify the PATH under which cgroups should be mounted. This should
be a writable directory which will contain cgroups mounted one per
subsystem. The default PATH is /sys/fs/cgroup.
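As a rough pre-flight check before deciding whether CgroupAutomount is needed, an administrator could look for one directory per subsystem under the mount point. The sketch below is an assumption-laden illustration: the helper name missing_subsystems is hypothetical, and testing for a directory is a weaker check than whatever availability test the plugins perform at launch.

```python
import os


def missing_subsystems(mountpoint, subsystems):
    """Return the subsystems that have no directory under mountpoint.

    Assumption: a directory <mountpoint>/<subsystem_name> is taken as a
    hint that the subsystem is mounted. This does not prove that the
    subsystem is functional.
    """
    return [name for name in subsystems
            if not os.path.isdir(os.path.join(mountpoint, name))]
```

For example, calling missing_subsystems("/sys/fs/cgroup", ["cpuset", "devices", "memory"]) lists whichever of the subsystems mentioned in this page are absent from the default CgroupMountpoint.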
The following cgroup.conf parameters are defined to control the behavior of
the task/cgroup plugin:
- AllowedKmemSpace=<number>
- Constrain the job cgroup kernel memory to this amount of the allocated
memory, specified in bytes. The AllowedKmemSpace must be between
the upper and lower memory limits, specified by MaxKmemPercent and
MinKmemSpace, respectively. If AllowedKmemSpace goes beyond
the upper or lower limit, it will be reset to that upper or lower limit,
whichever has been exceeded.
- AllowedRAMSpace=<number>
- Constrain the job/step cgroup RAM to this percentage of the allocated
memory. The percentage supplied may be expressed as a floating point number,
e.g. 101.5. Sets the cgroup soft memory limit at the allocated memory size
and then sets the job/step hard memory limit at (AllowedRAMSpace/100)
* allocated memory. If the job/step exceeds the hard limit, it might
trigger Out Of Memory (OOM) events (including oom-kill), which will be
logged to the kernel log ring buffer (dmesg in Linux). Setting AllowedRAMSpace
above 100 may cause system Out Of Memory (OOM) events, as it allows the
job/step to allocate more memory than is configured on the nodes. Reducing
the configured node available memory to avoid system OOM events is suggested.
Setting AllowedRAMSpace below 100 will result in jobs receiving less
memory than allocated, and the soft memory limit will be set to the same
value as the hard limit. Also see ConstrainRAMSpace. The default value is
100.
- AllowedSwapSpace=<number>
- Constrain the job cgroup swap space to this percentage of the allocated
memory. The default value is 0, which means that RAM+Swap will be limited
to AllowedRAMSpace. The supplied percentage may be expressed as a
floating point number, e.g. 50.5. If the limit is exceeded, the job steps
will be killed and a warning message will be written to standard error.
Also see ConstrainSwapSpace. NOTE: Setting AllowedSwapSpace to 0
does not restrict the Linux kernel from using swap space. To control how
the kernel uses swap space, see MemorySwappiness.
- ConstrainCores=<yes|no>
- If configured to "yes" then constrain allowed cores to the
subset of allocated resources. This functionality makes use of the cpuset
subsystem. Due to a bug fixed in version 1.11.5 of HWLOC, the
task/affinity plugin may be required in addition to task/cgroup for this
to function properly. The default value is "no".
- ConstrainDevices=<yes|no>
- If configured to "yes" then constrain the job's allowed devices
based on GRES allocated resources. It uses the devices subsystem for that.
The default value is "no".
- ConstrainKmemSpace=<yes|no>
- If configured to "yes" then constrain the job's Kmem RAM usage
in addition to RAM usage. Only takes effect if ConstrainRAMSpace is
set to "yes". If enabled, the job's Kmem limit will be assigned
the value of AllowedKmemSpace or the value coming from
MaxKmemPercent. The default value is "no" which will
leave Kmem setting untouched by Slurm. Also see AllowedKmemSpace,
MaxKmemPercent.
- ConstrainRAMSpace=<yes|no>
- If configured to "yes" then constrain the job's RAM usage by
setting the memory soft limit to the allocated memory and the hard limit
to the allocated memory * (AllowedRAMSpace/100). The default value is
"no", in which case the job's RAM limit will be set to its swap
space limit if ConstrainSwapSpace is set to "yes". Also
see AllowedSwapSpace, AllowedRAMSpace and
ConstrainSwapSpace. NOTE: When enabled, ConstrainRAMSpace can lead
to a noticeable decline in per-node job throughput. Sites with
high-throughput requirements should carefully weigh the tradeoff between
per-node throughput and the potential problems that can arise from
unconstrained memory usage on the node. See
<https://slurm.schedmd.com/high_throughput.html> for further
discussion.
- ConstrainSwapSpace=<yes|no>
- If configured to "yes" then constrain the job's swap space
usage. The default value is "no". Note that when set to
"yes" and ConstrainRAMSpace is set to "no",
AllowedRAMSpace is automatically set to 100% in order to limit the
RAM+Swap amount to 100% of the job's requirement plus the percentage of
allowed swap space. This amount is then used for both the RAM and RAM+Swap
limits. In that particular case, ConstrainRAMSpace is therefore
automatically enabled with the same limit as the one used to constrain swap
space. Also see AllowedSwapSpace.
- MaxRAMPercent=PERCENT
- Set an upper bound in percent of total RAM on the RAM constraint for a
job. This will be the memory constraint applied to jobs that are not
explicitly allocated memory by Slurm (i.e. Slurm's select plugin is not
configured to manage memory allocations). The PERCENT may be an
arbitrary floating point number. The default value is 100.
- MaxSwapPercent=PERCENT
- Set an upper bound (in percent of total RAM) on the amount of RAM+Swap
that may be used for a job. This will be the swap limit applied to jobs on
systems where memory is not being explicitly allocated to the job. The
PERCENT may be an arbitrary floating point number between 0 and
100. The default value is 100.
- MaxKmemPercent=PERCENT
- Set an upper bound in percent of total RAM as the maximum Kmem for a job.
The PERCENT may be an arbitrary floating point number. However, the
product of MaxKmemPercent and job requested memory has to fall
between MinKmemSpace and job requested memory; otherwise the
boundary value is used. The default value is 100.
- MemorySwappiness=<number>
- Configure the kernel's priority for swapping out anonymous pages (such as
program data) versus file cache pages for the job cgroup. Valid values are
between 0 and 100, inclusive. A value of 0 prevents the kernel from
swapping out program data. A value of 100 gives equal priority to
swapping out file cache or anonymous pages. If not set, then the kernel's
default swappiness value will be used. Either ConstrainRAMSpace or
ConstrainSwapSpace must be set to yes in order for this
parameter to be applied.
- MinKmemSpace=<number>
- Set a lower bound (in MB) on the memory limits defined by
AllowedKmemSpace. The default limit is 30M.
- MinRAMSpace=<number>
- Set a lower bound (in MB) on the memory limits defined by
AllowedRAMSpace and AllowedSwapSpace. This prevents
accidentally creating a memory cgroup with such a low limit that
slurmstepd is immediately killed due to lack of RAM. The default limit is
30M.
- TaskAffinity=<yes|no>
- If configured to "yes" then set a default task affinity to bind
each step task to a subset of the allocated cores using
sched_setaffinity. The default value is "no". Note: This
feature requires the Portable Hardware Locality (hwloc) library to be
installed.
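The arithmetic described under AllowedRAMSpace, AllowedSwapSpace and MinRAMSpace can be sketched as follows. This is one reading of the parameter descriptions above, not Slurm's implementation: the function name memory_limits_mb is hypothetical, all values are taken in MB, the RAM+Swap limit is assumed to be the hard RAM limit plus AllowedSwapSpace percent of the allocation, and the MaxRAMPercent and MaxSwapPercent caps are left out.

```python
def memory_limits_mb(allocated_mb, allowed_ram_space=100.0,
                     allowed_swap_space=0.0, min_ram_space_mb=30.0):
    """Return (soft, hard, ram_plus_swap) limits in MB for a job/step.

    Illustrative interpretation of this page only; the upper bounds set
    by MaxRAMPercent and MaxSwapPercent are not modeled.
    """
    # Hard limit: (AllowedRAMSpace/100) * allocated memory, but never
    # below the MinRAMSpace floor (default 30 MB).
    hard = max(allocated_mb * allowed_ram_space / 100.0, min_ram_space_mb)
    # Soft limit: the allocated size, lowered to the hard limit when
    # AllowedRAMSpace is below 100.
    soft = min(float(allocated_mb), hard)
    # RAM+Swap limit (assumption): hard RAM limit plus AllowedSwapSpace
    # percent of the allocation. With the default of 0 this equals the
    # RAM limit.
    ram_plus_swap = hard + allocated_mb * allowed_swap_space / 100.0
    return soft, hard, ram_plus_swap


# A 4000 MB allocation with AllowedRAMSpace=101.5, AllowedSwapSpace=50.
print(memory_limits_mb(4000, 101.5, 50.0))
```

With these inputs the soft limit stays at the 4000 MB allocation, the hard limit rises to 4060 MB, and the RAM+Swap limit becomes 6060 MB.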
Debian and derivatives (e.g. Ubuntu) usually exclude the memory and memsw (swap)
cgroups by default. To include them, add the following parameters to the
kernel command line: cgroup_enable=memory swapaccount=1
This can usually be placed in /etc/default/grub inside the
GRUB_CMDLINE_LINUX variable. A command such as update-grub must be
run after updating the file.
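Whether the two kernel parameters are active can be checked against the kernel command line after a reboot. The sketch below is a minimal illustration with a hypothetical helper name, missing_kernel_params; it only looks for the two exact tokens named above.

```python
REQUIRED_PARAMS = ("cgroup_enable=memory", "swapaccount=1")


def missing_kernel_params(cmdline, required=REQUIRED_PARAMS):
    """Return the required parameters not found in a kernel command line."""
    # The kernel command line is a whitespace-separated list of tokens.
    present = set(cmdline.split())
    return [param for param in required if param not in present]


example = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet cgroup_enable=memory"
print(missing_kernel_params(example))
```

On a running Linux system the input would typically be the contents of /proc/cmdline; an empty result means both parameters are present.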
###
# Slurm cgroup support configuration file
###
CgroupAutomount=yes
ConstrainCores=yes
#
Copyright (C) 2010-2012 Lawrence Livermore National Security. Produced at
Lawrence Livermore National Laboratory (cf, DISCLAIMER).
Copyright (C) 2010-2016 SchedMD LLC.
This file is part of Slurm, a resource management program. For
details, see <https://slurm.schedmd.com/>.
Slurm is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2 of the License, or (at your option)
any later version.
Slurm is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
more details.