kernel

To manipulate kernel modules we need the kmod package installed on our system:

# [ yum | dnf ] install kmod

The lsmod command presents the contents of /proc/modules in a more human-readable form.
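
As a rough illustration of what lsmod does, the following sketch prints a /proc/modules-style file in similar columns. The function name list_modules is made up for this example, and the column widths only approximate the real output.

```shell
# list_modules [FILE]: hypothetical helper; prints FILE (default /proc/modules)
# in lsmod-like columns: module name, size in bytes, use count and users.
list_modules() {
    awk 'BEGIN { printf "%-24s %10s  %s\n", "Module", "Size", "Used by" }
         {
             users = ($4 == "-") ? "" : $4   # "-" means no other module uses it
             sub(/,$/, "", users)            # drop the trailing comma
             printf "%-24s %10s  %s %s\n", $1, $2, $3, users
         }' "${1:-/proc/modules}"
}
```

Called without arguments it reads the live /proc/modules; passing a file name makes it easy to try on a saved copy.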

We can get more detail on a particular module with the modinfo command:

# modinfo ip_tables
filename: /lib/modules/3.10.0-123.el7.x86_64/kernel/net/ipv4/netfilter/ip_tables.ko
description: IPv4 packet filter
author: Netfilter Core Team <coreteam@netfilter.org>
license: GPL
srcversion: 44A16130862F8CA2ECA59D9
depends:
intree: Y
vermagic: 3.10.0-123.el7.x86_64 SMP mod_unload modversions
signer: Fermi National Accelerator Laboratory: Scientific Linux kernel signing key
sig_key: 0E:88:DF:6B:94:F4:EB:C4:DC:8D:B7:7E:13:B0:6F:6C:C5:18:30:C6
sig_hashalgo: sha256
.
# modinfo e1000e | grep "^parm:"
parm: debug:Debug level (0=none,…,16=all) (int)
parm: copybreak:Maximum size of packet that is copied to a new buffer on receive (uint)
parm: TxIntDelay:Transmit Interrupt Delay (array of int)
parm: TxAbsIntDelay:Transmit Absolute Interrupt Delay (array of int)
parm: RxIntDelay:Receive Interrupt Delay (array of int)
parm: RxAbsIntDelay:Receive Absolute Interrupt Delay (array of int)
.
# modinfo e1000e | grep "^depends:"
depends: ptp
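
When only the parameter names are needed, the parm: lines can be reduced further. A small sketch, assuming modinfo output in the format shown above; parm_names is a hypothetical helper that reads that output on standard input.

```shell
# parm_names: hypothetical filter; reads modinfo output on stdin and prints
# just the parameter names from the "parm:" lines, one per line.
parm_names() {
    sed -n 's/^parm:[[:space:]]*\([^:]*\):.*/\1/p'
}

# Example:
# modinfo e1000e | parm_names
```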

And we can dump the effective modprobe configuration of all modules (options, aliases, blacklists, etc.) with:

# modprobe -c

Sometimes we need to load kernel modules (e.g. to run some binary) that have not yet been loaded. modprobe also loads any modules the requested one depends on:

# modprobe -v fcoe
insmod /lib/modules/3.10.0-123.el7.x86_64/kernel/drivers/scsi/scsi_tgt.ko
insmod /lib/modules/3.10.0-123.el7.x86_64/kernel/drivers/scsi/scsi_transport_fc.ko
insmod /lib/modules/3.10.0-123.el7.x86_64/kernel/drivers/scsi/libfc/libfc.ko
insmod /lib/modules/3.10.0-123.el7.x86_64/kernel/drivers/scsi/fcoe/libfcoe.ko
insmod /lib/modules/3.10.0-123.el7.x86_64/kernel/drivers/scsi/fcoe/fcoe.ko
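
Before loading a module from a script, it can be worth checking whether it is already present. A minimal sketch, assuming the /proc/modules format in which the module name is the first field; is_loaded is a hypothetical helper, and dashes are mapped to underscores because that is how loaded modules are listed.

```shell
# is_loaded NAME [FILE]: hypothetical helper; succeeds if module NAME appears
# in FILE (default /proc/modules). Loaded modules are listed with underscores,
# so dashes in NAME are converted first.
is_loaded() {
    name=$(printf '%s' "$1" | tr '-' '_')
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' \
        "${2:-/proc/modules}"
}

# Example (as root): load fcoe only when it is missing.
# is_loaded fcoe || modprobe -v fcoe
```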

And very rarely we might need to unload certain modules:

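# modprobe -r fcoe  → unload the FCoE module if it is not in use, along with any dependencies left unused
# rmmod fcoe        → unload only the fcoe module if it is not in use
# rmmod -w fcoe     → if in use, wait until it isn’t and then unload it (deprecated; dropped from recent kmod releases)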

If we need to change the value of some input parameter for a module, we can do so as follows:

# lsmod | grep e1000e      → make sure module is not loaded and, if it is, unload it!
# modprobe e1000e InterruptThrottleRate=3000,3000,3000 debug=1
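
Once a module is loaded, the values it is actually using can be read back from /sys/module/<module>/parameters, where each exported parameter appears as a file. The sketch below assumes that layout; show_params is a hypothetical helper and its second argument exists only so that it can be tried against a scratch directory.

```shell
# show_params MODULE [SYSROOT]: hypothetical helper; prints "name = value" for
# every parameter file under SYSROOT/MODULE/parameters (default /sys/module).
show_params() {
    dir="${2:-/sys/module}/$1/parameters"
    [ -d "$dir" ] || { echo "no parameters exported for $1" >&2; return 1; }
    for f in "$dir"/*; do
        [ -r "$f" ] || continue          # skip unreadable or missing entries
        printf '%s = %s\n' "${f##*/}" "$(cat "$f")"
    done
}

# Example:
# show_params e1000e
```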

If we need to change some kernel module settings permanently, there are two directories to know. Files in /etc/modules-load.d list the modules to be loaded at startup, one name per line, while files in /etc/modprobe.d carry their configuration (options, aliases, blacklists). For example, if we need the virtio-net kernel module loaded at startup, we can achieve that simply with the command:

# echo "virtio-net" > /etc/modules-load.d/virtio-net.conf
# cat /etc/modules-load.d/virtio-net.conf
virtio-net

At boot time, systemd-modules-load.service reads the files /etc/modules-load.d/*.conf and loads the modules listed in them, applying any parameters set with options lines in /etc/modprobe.d/*.conf.
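
Module parameters are made persistent with an options line in a file under /etc/modprobe.d. The following sketch writes such a line; write_options is a hypothetical helper, the parameter values are the illustrative ones used earlier, and the target directory is an argument so the result can be reviewed before anything is placed in /etc.

```shell
# write_options DIR MODULE PARAM=VALUE...: hypothetical helper; writes
# DIR/MODULE.conf containing a single modprobe.d "options" line.
write_options() {
    dir=$1 module=$2
    shift 2
    printf 'options %s %s\n' "$module" "$*" > "$dir/$module.conf"
}

# Example: stage the file in a scratch directory and review it.
# write_options /tmp e1000e InterruptThrottleRate=3000,3000,3000 debug=1
# cat /tmp/e1000e.conf
```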

As for kernel parameters, the categories vary from system to system but the main ones are: dev, fs, kernel, net, sunrpc and vm. In most systems the default kernel parameters are set to acceptable values. But we might want to change some of them to tighten up the security of the system or to enable it to run specific software (e.g. databases, web servers, etc.).

To view all the kernel parameters and their current values we use:

# sysctl -a                   → list all parameters and values
abi.vsyscall32 = 1
crypto.fips_enabled = 0
debug.exception-trace = 1
debug.kprobes-optimization = 1
dev.cdrom.autoclose = 1
dev.cdrom.autoeject = 0
dev.cdrom.check_media = 0
dev.cdrom.debug = 0
[…]
.
# sysctl -a -r "^sunrpc"      → list parameters matching regexp
sunrpc.max_resvport = 1023
sunrpc.min_resvport = 665
sunrpc.nfs_debug = 0
sunrpc.nfsd_debug = 0
[…]
.
# sysctl -a -N -r "^crypto"   → just list the parameters, no values shown
crypto.fips_enabled
.
# sysctl -n kernel.hostname   → just list the value
orap1.company.net

The default values for kernel parameters are either hard-coded, determined at compilation time or set in the files underneath the directory /usr/lib/sysctl.d/.

# ls -l /usr/lib/sysctl.d/
total 12
-rw-r--r--. 1 root root 466 Mar 5 2015 00-system.conf
-rw-r--r--. 1 root root 710 Sep 15 15:12 50-default.conf
-rw-r--r--. 1 root root 499 Sep 15 15:15 libvirtd.conf

If we need to change any parameter permanently, we should add the key-value pair to /etc/sysctl.conf or, even better, to its own file in /etc/sysctl.d.

# cat /etc/sysctl.conf
# System default settings live in /usr/lib/sysctl.d/00-system.conf.
# To override those settings, enter new settings here, or in an /etc/sysctl.d/<name>.conf file
#
# For more information, see sysctl.conf(5) and sysctl.d(5).
# cat /etc/sysctl.d/oracle.conf
# Oracle settings
kernel.shmmni = 4096
kernel.shmmax = 4398046511104
kernel.shmall = 1073741824
kernel.sem = 250 32000 100 128
fs.aio-max-nr = 1048576
fs.file-max = 6815744
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576

Files in /etc/sysctl.d/ are read in lexicographic order of their names and, when the same parameter is set more than once, the last value read wins. On RHEL 7 derivatives /etc/sysctl.conf is itself linked into that directory as 99-sysctl.conf, so a file that sorts after it, such as oracle.conf, overrides it. We should make sure the key-value pairs do not conflict across configuration files since, if they do, the file ordering alone decides which value is applied.
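
To see which value would win for a given parameter, the configuration files can be scanned in the order they are read, keeping the last match. A sketch under the assumption that the files are passed in that order (a shell glob such as /etc/sysctl.d/*.conf expands in sorted order); effective_value is a hypothetical helper and ignores any other directories that may also be consulted.

```shell
# effective_value KEY FILE...: hypothetical helper; prints the value of KEY
# from the last FILE that sets it, mimicking "last one read wins".
effective_value() {
    key=$1
    shift
    awk -F= -v k="$key" '
        /^[ \t]*[#;]/ { next }                   # skip comment lines
        NF < 2        { next }                   # skip lines without "="
        {
            name = $1
            gsub(/[ \t]/, "", name)
            if (name == k) {
                val = substr($0, index($0, "=") + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", val)
                last = val
                seen = 1
            }
        }
        END { if (seen) print last; else exit 1 }' "$@"
}

# Example:
# effective_value net.core.rmem_max /etc/sysctl.d/*.conf
```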

Changing the values in the configuration files does not affect a running system, as those values won’t be applied until the next reboot. If we need the changes to take effect immediately, we have three options.

We can update the kernel values by reloading sysctl.conf or any file underneath /etc/sysctl.d/:

# sysctl -p                 → reloads and enforces the settings in /etc/sysctl.conf
# sysctl -p /etc/sysctl.d/oracle.conf

We can use sysctl -w …

# sysctl -w net.core.rmem_max=10485760

… or we can do it directly (a bit less safe) …

# echo "10485760" > /proc/sys/net/core/rmem_max

But we have to remember that values set with sysctl -w or written directly to /proc/sys won’t survive a reboot. So if we need the changes to persist, we should also add the key-value pairs to the configuration files.
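
One way to confirm that the running values and a configuration file agree is to compare them key by key. The sketch below is only an illustration: check_sysctl_file and squeeze are hypothetical helpers, the file is assumed to hold plain key = value lines, and the optional second argument points the check at a scratch tree instead of /proc/sys.

```shell
# squeeze: collapse runs of blanks/tabs to single spaces and trim the ends.
squeeze() { awk '{ $1 = $1; print }'; }

# check_sysctl_file CONF [ROOT]: hypothetical helper; compares each key=value
# line in CONF with the running value under ROOT (default /proc/sys).
# Prints the differences and returns non-zero if there are any.
check_sysctl_file() {
    conf=$1 root=${2:-/proc/sys} rc=0
    while IFS='=' read -r key value; do
        key=$(printf '%s' "$key" | tr -d '[:space:]')
        case $key in ''|'#'*|';'*) continue ;; esac   # blanks and comments
        path=$root/$(printf '%s' "$key" | tr . /)
        if [ ! -r "$path" ]; then
            echo "MISSING $key"; rc=1; continue
        fi
        want=$(printf '%s\n' "$value" | squeeze)
        have=$(squeeze < "$path")
        if [ "$want" != "$have" ]; then
            echo "DIFFERS $key: file=[$want] running=[$have]"; rc=1
        fi
    done < "$conf"
    return $rc
}

# Example:
# check_sysctl_file /etc/sysctl.d/oracle.conf
```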

All the kernel parameters can be read/written in the /proc/sys pseudo filesystem:

# ls -l /proc/sys/
total 0
dr-xr-xr-x. 1 root root 0 Oct 14 11:14 abi
dr-xr-xr-x. 1 root root 0 Oct 13 21:03 crypto
dr-xr-xr-x. 1 root root 0 Oct 14 11:14 debug
dr-xr-xr-x. 1 root root 0 Oct 14 11:14 dev
dr-xr-xr-x. 1 root root 0 Oct 13 21:03 fs
dr-xr-xr-x. 1 root root 0 Oct 13 21:03 kernel
dr-xr-xr-x. 1 root root 0 Oct 13 21:03 net
dr-xr-xr-x. 1 root root 0 Oct 14 11:14 sunrpc
dr-xr-xr-x. 1 root root 0 Oct 13 21:03 vm
# tree fs
fs
├── aio-max-nr
├── aio-nr
├── binfmt_misc
│ ├── register
│ └── status
├── dentry-state
├── dir-notify-enable
├── epoll
│ └── max_user_watches
├── file-max
├── file-nr
[…]
# sysctl -a -r "^fs"
fs.aio-max-nr = 1048576
fs.aio-nr = 0
fs.binfmt_misc.kshcomp = enabled
fs.binfmt_misc.kshcomp = interpreter /bin/ksh93
fs.binfmt_misc.kshcomp = flags:
fs.binfmt_misc.kshcomp = offset 0
fs.binfmt_misc.kshcomp = magic 0b1308
fs.binfmt_misc.status = enabled
fs.dentry-state = 217546 203735 45 0 0 0
fs.dir-notify-enable = 1
fs.epoll.max_user_watches = 791162
fs.file-max = 6815744
fs.file-nr = 5440 0 6815744
[…]

The tree structure above shows that the actual parameters are always the leaves: each dot-separated sysctl name maps to a file path under /proc/sys.
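
The correspondence between a sysctl name and its file is a plain substitution of dots by slashes. A minimal sketch; sysctl_path is a hypothetical helper and assumes no component of the name itself contains a dot.

```shell
# sysctl_path KEY: hypothetical helper; prints the /proc/sys file behind a
# dot-separated sysctl name. Assumes no component itself contains a dot
# (interface names such as VLAN devices can, and would need extra care).
sysctl_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

# Example: read a value through the path instead of through sysctl.
# cat "$(sysctl_path kernel.hostname)"
```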

Let’s go through some kernel parameters that could be changed to increase performance or tighten security (the values here are guidance only!):

Network tuning

net.core.rmem_max = 10485760                → max OS receive buffer size to 10MB
net.core.wmem_max = 10485760                → max OS send buffer to 10MB
net.core.netdev_max_backlog = 5000          → max of packets queued for kernel processing
net.core.somaxconn = 1024                   → max connections backlogged waiting for socket accept
net.ipv4.tcp_syncookies = 1                 → avoid SYN flood DoS attacks
net.ipv4.tcp_fastopen = 1                   → speeds-up successive connections between 2 end-points
net.ipv4.tcp_window_scaling = 1              → enable window scaling if system can take it
net.ipv4.tcp_timestamps = 1                  → enable better measurement of RTT
net.ipv4.tcp_max_tw_buckets = 1000000        → pool size of time-wait sockets
net.ipv4.udp_rmem_min = 16384               → min size in bytes of UDP socket read buffer
net.ipv4.udp_wmem_min = 16384                → min size in bytes of UDP socket write buffer
net.ipv4.ip_local_port_range = 9000 65500    → port range available for network connections

Network security

net.ipv4.tcp_max_syn_backlog = 4096         → max half-open TCP connections (SYN received, handshake not completed)
net.ipv4.conf.*.rp_filter = 1               → drop packets that come from “impossible” places
net.ipv4.conf.*.log_martians = 1            → log packets with impossible addresses to the kernel log
net.ipv4.ip_forward = 0                     → do not forward packets between interfaces (might break tunnels, VPN, etc)
net.ipv4.conf.all.forwarding = 0            → disable forwarding in all existing interfaces
net.ipv4.conf.default.forwarding = 0        → disable forwarding in all future interfaces
net.ipv4.conf.<interface>.forwarding = 0     → disable forwarding for a specific interface
net.ipv4.conf.all.send_redirects = 0        → not needed unless acting as router/gateway
net.ipv4.conf.all.accept_local = 0          → rejects packets with local source addresses
net.ipv4.conf.all.accept_redirects = 0      → redirects are a security risk unless they’re secure
net.ipv4.conf.all.secure_redirects = 1       → accept redirects only from the specified gateways
net.ipv4.conf.all.accept_source_route = 0   → safer to ignore source route requests
net.ipv4.icmp_echo_ignore_broadcasts = 1    → ignore ICMP echo requests sent via broadcast
net.ipv4.icmp_echo_ignore_all = 1           → ignore all ICMP echo requests
net.ipv6.conf.all.router_solicitations = 0  → send no router solicitations (hosts use them to find routers)
net.ipv6.conf.all.accept_ra_defrtr = 0      → do not accept default routes sent by RAs
net.ipv6.conf.all.accept_ra_pinfo = 0        → do not accept Prefix Information sent by RAs
net.ipv6.conf.all.autoconf = 0              → do not use PI sent by RAs for device autoconfig
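
The * in entries such as net.ipv4.conf.*.rp_filter stands for every entry under net/ipv4/conf: all, default and one per interface. A sketch that lists a setting for each of them, assuming that directory layout; conf_setting is a hypothetical helper and the optional second argument only serves to point it at a scratch tree.

```shell
# conf_setting NAME [ROOT]: hypothetical helper; prints NAME (e.g. rp_filter)
# for every entry under ROOT/net/ipv4/conf (ROOT defaults to /proc/sys).
conf_setting() {
    for f in "${2:-/proc/sys}"/net/ipv4/conf/*/"$1"; do
        [ -r "$f" ] || continue
        iface=${f%/*}                    # strip the trailing /NAME
        printf 'net.ipv4.conf.%s.%s = %s\n' "${iface##*/}" "$1" "$(cat "$f")"
    done
}

# Example:
# conf_setting rp_filter
```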

Disk I/O tuning

fs.aio-max-nr = 1048576                     → max async I/O concurrent ops
fs.file-max = 4194304                       → max number of entries of the system-wide file handle table
fs.nr_open = 1048576                         → max number of concurrent file handles for single process
fs.inode-max = 12582912                      → max number of inodes in cache (old kernels only; absent from current ones)
fs.mqueue.queues_max = 256                  → max number of mqueues system-wide
fs.mqueue.msg_max = 1024                     → max number of messages in a mqueue
fs.mqueue.msgsize_max = 8192                → max size in bytes of a single mqueue message
fs.pipe-max-size = 1048576                  → max size in bytes of a pipe

Kernel security

kernel.dmesg_restrict = 1        → prevent non-privileged users from using dmesg
kernel.exec-shield = 1           → prevents execution in non-executable memory regions (Red Hat specific; not present on all kernels)
kernel.kptr_restrict = 1         → kernel ptr addresses are hidden unless user has CAP_SYSLOG privs
kernel.msgmax = 8192             → max size in bytes of SysV queue message
kernel.msgmnb = 819200           → max size in bytes of SysV queue
kernel.msgmni = 32000             → max number of SysV queues system-wide
kernel.shmmax = 4294967295       → max size in bytes of a shared memory segment
kernel.shmmni = 4096             → max number of shared memory segments
kernel.shmall = 268435456        → max number of shared memory pages
kernel.sem = 512 32000 512 128   → SEMMSL (max semaphores per set), SEMMNS (max semaphores total),
.                                  SEMOPM (max ops per call), SEMMNI (max semaphore sets)
kernel.randomize_va_space = 2    → enable Address Space Layout Randomization to
.                                  prevent certain buffer overflow attacks
kernel.threads-max = 125810      → max number of threads system-wide (will be automatically reduced if more
.                                  than 1/8th of RAM would be consumed)
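
kernel.shmmax is expressed in bytes while kernel.shmall is expressed in pages, so the two are related through the page size. The sketch below does the conversion; bytes_to_pages is a hypothetical helper, and the page size is taken from getconf PAGE_SIZE unless given explicitly. With 4096-byte pages, the shmmax of 4398046511104 bytes from the Oracle example corresponds to the shmall of 1073741824 pages listed with it.

```shell
# bytes_to_pages BYTES [PAGE_SIZE]: hypothetical helper; prints how many
# pages are needed to hold BYTES, rounding up.
bytes_to_pages() {
    page=${2:-$(getconf PAGE_SIZE)}
    echo $(( ($1 + page - 1) / page ))
}

# Example: shmall value able to cover a given shmmax.
# bytes_to_pages 4398046511104 4096
```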

The optimal values for a given system might differ considerably from the ones suggested above, so test thoroughly.
