cfb15f0bb9
This patch have no functional modification. Just improve the doc and log for isolcpu, nohz_full and numa node banning cpus, for providing more info for users. Signed-off-by: Tao Liu <ltao@redhat.com>
223 lines
8.7 KiB
Groff
223 lines
8.7 KiB
Groff
.de Sh \" Subsection
|
||
.br
|
||
.if t .Sp
|
||
.ne 5
|
||
.PP
|
||
\fB\\$1\fR
|
||
.PP
|
||
..
|
||
.de Sp \" Vertical space (when we can't use .PP)
|
||
.if t .sp .5v
|
||
.if n .sp
|
||
..
|
||
.de Ip \" List item
|
||
.br
|
||
.ie \\n(.$>=3 .ne \\$3
|
||
.el .ne 3
|
||
.IP "\\$1" \\$2
|
||
..
|
||
.TH "IRQBALANCE" 1 "Dec 2006" "Linux" "irqbalance"
|
||
.SH NAME
|
||
irqbalance \- distribute hardware interrupts across processors on a multiprocessor system
|
||
.SH "SYNOPSIS"
|
||
|
||
.nf
|
||
\fBirqbalance\fR
|
||
.fi
|
||
|
||
.SH "DESCRIPTION"
|
||
|
||
.PP
|
||
The purpose of \fBirqbalance\fR is to distribute hardware interrupts across
|
||
processors on a multiprocessor system in order to increase performance\&.
|
||
|
||
.SH "OPTIONS"
|
||
|
||
.TP
|
||
.B -o, --oneshot
|
||
Causes irqbalance to be run once, after which the daemon exits.
|
||
.TP
|
||
|
||
.B -d, --debug
|
||
Causes irqbalance to print extra debug information. Implies --foreground.
|
||
|
||
.TP
|
||
.B -f, --foreground
|
||
Causes irqbalance to run in the foreground (without --debug).
|
||
|
||
.TP
|
||
.B -j, --journal
|
||
Enables log output optimized for systemd-journal.
|
||
|
||
.TP
|
||
.B -p, --powerthresh=<threshold>
|
||
Set the threshold at which we attempt to move a CPU into powersave mode
|
||
If more than <threshold> CPUs are more than 1 standard deviation below the
|
||
average CPU softirq workload, and no CPUs are more than 1 standard deviation
|
||
above (and have more than 1 IRQ assigned to them), attempt to place 1 CPU in
|
||
powersave mode. In powersave mode, a CPU will not have any IRQs balanced to it,
|
||
in an effort to prevent that CPU from waking up without need.
|
||
|
||
.TP
|
||
.B -i, --banirq=<irqnum>
|
||
Add the specified IRQ to the set of banned IRQs. irqbalance will not affect
|
||
the affinity of any IRQs on the banned list, allowing them to be specified
|
||
manually. This option is additive and can be specified multiple times. For
|
||
example to ban IRQs 43 and 44 from balancing, use the following command line:
|
||
.B irqbalance --banirq=43 --banirq=44
|
||
|
||
.TP
|
||
.B -m, --banmod=<module_name>
|
||
Add the specified module to the set of banned modules, similar to --banirq.
|
||
irqbalance will not affect the affinity of any IRQs of given modules, allowing
|
||
them to be specified manually. This option is additive and can be specified
|
||
multiple times. For example to ban all IRQs of module foo and module bar from
|
||
balancing, use the following command line:
|
||
.B irqbalance --banmod=foo --banmod=bar
|
||
|
||
.TP
|
||
.B -c, --deepestcache=<integer>
|
||
This allows a user to specify the cache level at which irqbalance partitions
|
||
cache domains. Specifying a deeper cache may allow a greater degree of
|
||
flexibility for irqbalance to assign IRQ affinity to achieve greater performance
|
||
increases, but setting a cache depth too large on some systems (specifically
|
||
where all CPUs on a system share the deepest cache level), will cause irqbalance
|
||
to see balancing as unnecessary.
|
||
.B irqbalance --deepestcache=2
|
||
.P
|
||
The default value for deepestcache is 2.
|
||
|
||
.TP
|
||
.B -l, --policyscript=<script>
|
||
When specified, the referenced script or directory will execute once for each discovered IRQ,
|
||
with the sysfs device path and IRQ number passed as arguments. Note that the
|
||
device path argument will point to the parent directory from which the IRQ
|
||
attributes directory may be directly opened.
|
||
Policy scripts specified need to be owned and executable by the user of irqbalance process,
|
||
if a directory is specified, non-executable files will be skipped.
|
||
The script may specify zero or more key=value pairs that will guide irqbalance in
|
||
the management of that IRQ. Key=value pairs are printed by the script on stdout
|
||
and will be captured and interpreted by irqbalance. Irqbalance expects a zero
|
||
exit code from the provided utility. Recognized key=value pairs are:
|
||
.TP
|
||
.I ban=[true | false]
|
||
Directs irqbalance to exclude the passed in IRQ from balancing.
|
||
.TP
|
||
.I balance_level=[none | package | cache | core]
|
||
This allows a user to override the balance level of a given IRQ. By default the
|
||
balance level is determined automatically based on the pci device class of the
|
||
device that owns the IRQ.
|
||
.TP
|
||
.I numa_node=<integer>
|
||
This allows a user to override the NUMA node that sysfs indicates a given device
|
||
IRQ is local to. Often, systems will not specify this information in ACPI, and as a
|
||
result devices are considered equidistant from all NUMA nodes in a system.
|
||
This option allows for that hardware provided information to be overridden, so
|
||
that irqbalance can bias IRQ affinity for these devices toward its most local
|
||
node. Note that specifying a -1 here forces irqbalance to consider an interrupt
|
||
from a device to be equidistant from all nodes.
|
||
.TP
|
||
Note that, if a directory is specified rather than a regular file, all files in
|
||
the directory will be considered policy scripts, and executed on adding of an
|
||
irq to a database. If such a directory is specified, scripts in the directory
|
||
must additionally exit with one of the following exit codes:
|
||
.TP
|
||
.I 0
|
||
This indicates the script has a policy for the referenced irq, and that further
|
||
script processing should stop
|
||
.TP
|
||
.I 1
|
||
This indicates that the script has no policy for the referenced irq, and that
|
||
script processing should continue
|
||
.TP
|
||
.I 2
|
||
This indicates that an error has occurred in the script, and it should be skipped
|
||
(further processing to continue)
|
||
|
||
.TP
|
||
.B --migrateval, -e <val>
|
||
Specify a minimum migration ratio to trigger a rebalancing
|
||
Normally any improvement in load distribution will trigger the migration of an
|
||
irq, as long as preforming the migration will not simply move the load to a new
|
||
cpu. By specifying a migration value, the load balance improvement is subject
|
||
to hysteresis defined by this value, which is inversely propotional to the
|
||
value. For example, a value of 2 in this option tells irqbalance that the
|
||
improvement in load distribution must be at least 50%, a value of 4 indicates
|
||
the load distribution improvement must be at least 25%, etc
|
||
|
||
.TP
|
||
.B -s, --pid=<file>
|
||
Have irqbalance write its process id to the specified file. By default no
|
||
pidfile is written. The written pidfile is automatically unlinked when
|
||
irqbalance exits. It is ignored when used with --debug or --foreground.
|
||
.TP
|
||
.B -t, --interval=<time>
|
||
Set the measurement time for irqbalance. irqbalance will sleep for <time>
|
||
seconds between samples of the irq load on the system cpus. Defaults to 10.
|
||
.SH "ENVIRONMENT VARIABLES"
|
||
.TP
|
||
.B IRQBALANCE_ONESHOT
|
||
Same as --oneshot.
|
||
|
||
.TP
|
||
.B IRQBALANCE_DEBUG
|
||
Same as --debug.
|
||
|
||
.TP
|
||
.B IRQBALANCE_BANNED_CPUS
|
||
Provides a mask of CPUs which irqbalance should ignore and never assign interrupts to.
|
||
If not specified, irqbalance use mask of isolated and adaptive-ticks CPUs on the
|
||
system as the default value. The "isolcpus=" boot parameter specifies the isolated CPUs. The "nohz_full=" boot parameter specifies the adaptive-ticks CPUs. By default, no CPU will be an isolated or adaptive-ticks CPU.
|
||
This is a hexmask without the leading ’0x’. On systems with large numbers of
|
||
processors, each group of eight hex digits is separated by a comma ’,’. i.e.
|
||
‘export IRQBALANCE_BANNED_CPUS=fc0‘ would prevent irqbalance from assigning irqs
|
||
to the 7th-12th cpus (cpu6-cpu11) or ‘export IRQBALANCE_BANNED_CPUS=ff000000,00000001‘
|
||
would prevent irqbalance from assigning irqs to the 1st (cpu0) and 57th-64th cpus
|
||
(cpu56-cpu63).
|
||
Notes: This environment variable will be discarded, please use IRQBALANCE_BANNED_CPULIST
|
||
instead. Before deleting this environment variable, Introduce a deprecation period first
|
||
for the consider of compatibility.
|
||
|
||
.TP
|
||
.B IRQBALANCE_BANNED_CPULIST
|
||
Provides a cpulist which irqbalance should ignore and never assign interrupts to.
|
||
If not specified, irqbalance use mask of isolated and adaptive-ticks CPUs on the
|
||
system as the default value.
|
||
|
||
.SH "SIGNALS"
|
||
.TP
|
||
.B SIGHUP
|
||
Forces a rescan of the available IRQs and system topology.
|
||
|
||
.SH "API"
|
||
irqbalance is able to communicate via socket and return it's current assignment
|
||
tree and setup, as well as set new settings based on sent values. Socket is abstract,
|
||
with a name in form of
|
||
.B irqbalance<PID>.sock
|
||
, where <PID> is the process ID of irqbalance instance to communicate with.
|
||
Possible values to send:
|
||
.TP
|
||
.B stats
|
||
Retrieve assignment tree of IRQs to CPUs, in recursive manner. For each CPU node
|
||
in tree, it's type, number, load and whether the save mode is active are sent. For
|
||
each assigned IRQ type, it's number, load, number of IRQs since last rebalancing
|
||
and it's class are sent. Refer to types.h file for explanation of defines.
|
||
.TP
|
||
.B setup
|
||
Get the current value of sleep interval, mask of banned CPUs and list of banned IRQs.
|
||
.TP
|
||
.B settings sleep <s>
|
||
Set new value of sleep interval, <s> >= 1.
|
||
.TP
|
||
.B settings cpus <cpu_number1> <cpu_number2> ...
|
||
Ban listed CPUs from IRQ handling, all old values of banned CPUs are forgotten.
|
||
.TP
|
||
.B settings ban irqs <irq1> <irq2> ...
|
||
Ban listed IRQs from being balanced, all old values of banned IRQs are forgotten.
|
||
.PP
|
||
irqbalance checks SCM_CREDENTIALS of sender (only root user is allowed to interact).
|
||
Based on chosen tools, ancillary message with credentials needs to be sent with request.
|
||
|
||
.SH "HOMEPAGE"
|
||
https://github.com/Irqbalance/irqbalance
|
||
|