Selecting a Linux I/O Scheduler

As documented in /usr/src/linux/Documentation/block/switching-sched.txt, the I/O scheduler on any particular block device can be changed at runtime. There may be some latency as the previous scheduler's requests are all flushed before bringing the new scheduler into use, but it can be changed without problems even while the device is under heavy use.

# cat /sys/block/hda/queue/scheduler
noop deadline [cfq]
# echo deadline > /sys/block/hda/queue/scheduler
# cat /sys/block/hda/queue/scheduler
noop [deadline] cfq
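
The bracketed entry in that output is the active scheduler. As a minimal sketch (the function names active_scheduler and list_schedulers are made up for this example and are not part of any standard tool), the bracket format can be parsed and reported for every block device like this:

```shell
#!/bin/sh
# Print the bracketed (active) entry of a scheduler line,
# e.g. "noop deadline [cfq]" -> "cfq". Prints nothing if no entry is bracketed.
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Print "<device>: <active scheduler>" for every device under the given
# sysfs root. The default root /sys/block is an assumption about where
# sysfs is mounted.
list_schedulers() {
    sysroot="${1:-/sys/block}"
    for f in "$sysroot"/*/queue/scheduler; do
        [ -r "$f" ] || continue
        dev=${f#"$sysroot"/}      # strip the leading root directory
        dev=${dev%%/*}            # keep only the device name
        printf '%s: %s\n' "$dev" "$(active_scheduler "$(cat "$f")")"
    done
}
```

Calling list_schedulers with no arguments reads the live values from /sys/block; reading does not require root.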

Ideally, there would be a single scheduler to satisfy all needs. It doesn't seem to exist yet. The kernel often doesn't have enough knowledge to choose the best scheduler for your workload:

  • noop is often the best choice for memory-backed block devices (e.g. ramdisks) and other non-rotational media (flash) where trying to reschedule I/O is a waste of resources
  • deadline is a lightweight scheduler which tries to put a hard limit on latency
  • cfq tries to maintain system-wide fairness of I/O bandwidth

The default was anticipatory for a long time, and it received a lot of tuning, but it was removed in 2.6.33 (early 2010). cfq became the default some time ago, as its performance is reasonable and fairness is a good goal for multi-user systems (and even single-user desktops). Some scenarios are different. Databases are the usual example: they tend to have their own peculiar scheduling and access patterns, and are often the most important service on the host (so who cares about fairness?). For these workloads, anticipatory has a long history of being tunable for best performance, and deadline very quickly passes all requests through to the underlying device.

Selecting the right Linux I/O scheduler for a host equipped with NVMe SSD?

"none" (aka "noop") is the correct scheduler to use for this device.

I/O schedulers are primarily useful for slower storage devices with limited queueing (e.g., single mechanical hard drives). The purpose of an I/O scheduler is to reorder I/O requests to get more important ones serviced earlier. For a device with a very large internal queue, and very fast service (like a PCIe SSD!), an I/O scheduler won't do you any good; you're better off just submitting all requests to the device immediately.
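
A rough sketch of how this advice could be automated: the kernel exposes a queue/rotational flag per device in sysfs (0 for non-rotational media), and the scheduler file lists whichever pass-through scheduler the kernel offers. The function names below are hypothetical, the sample line "[none] mq-deadline kyber" is only an illustration of what a newer kernel may print, and the default /sys/block root is an assumption.

```shell
#!/bin/sh
# Print the pass-through scheduler ("none" or "noop") offered in a scheduler
# line such as "[none] mq-deadline kyber"; fail if neither is listed.
passthrough_scheduler() {
    for name in $(printf '%s\n' "$1" | tr -d '[]'); do
        case "$name" in
            none|noop) printf '%s\n' "$name"; return 0 ;;
        esac
    done
    return 1
}

# Select the pass-through scheduler for one device, but only if the kernel
# reports the device as non-rotational.
# $1 = device name (e.g. nvme0n1), $2 = sysfs root (assumed /sys/block).
set_passthrough_if_flash() {
    q="${2:-/sys/block}/$1/queue"
    [ "$(cat "$q/rotational" 2>/dev/null)" = "0" ] || return 1
    sched=$(passthrough_scheduler "$(cat "$q/scheduler")") || return 1
    printf '%s\n' "$sched" > "$q/scheduler"
}
```

Writing to the real sysfs file requires root, for example by running set_passthrough_if_flash nvme0n1 from a root shell.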

How to select the Linux process scheduler in menuconfig during kernel compilation

The following StackOverflow thread answers a similar question:

According to the above, changing the scheduler type is done dynamically at runtime.

TL;DR

cat /sys/block/sda/queue/scheduler

to check what is running

sudo bash -c 'echo deadline > /sys/block/sda/queue/scheduler'

to change.

Excuse me if you meant compiling the options into a "from scratch" kernel build. In that case, in the 2.6.15-rc4 configuration it is under:
Block layer --->
IO Schedulers --->
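
To see which schedulers an already built kernel was configured with, one option is to search its saved configuration for the IOSCHED options. This is only a sketch: it assumes the distribution installs the config as /boot/config-$(uname -r), which is common but not universal, and the function name compiled_schedulers is invented for this example.

```shell
#!/bin/sh
# Print the I/O scheduler options that are enabled in a kernel config file.
# $1 = path to the config (assumed default: /boot/config-<running kernel>).
compiled_schedulers() {
    cfg="${1:-/boot/config-$(uname -r)}"
    # Lines such as CONFIG_IOSCHED_DEADLINE=y or CONFIG_DEFAULT_IOSCHED="cfq";
    # disabled options start with "#" and are skipped by the anchor.
    grep -E '^CONFIG_(MQ_|DEFAULT_)?IOSCHED' "$cfg"
}
```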

Edit

After realizing I misread the question and referenced the IO scheduler rather than the Process Scheduler:

The CFS is the only process scheduler in the new kernels. It is possible to play with its attributes to make it more "real time" with the sched command

Change I/O scheduler not using sd* to refer to the disk

I guess there's no easier/cleaner solution than to create a script, as osgx suggested.

In dmesg I did not find anything like a SerialID of the disks so I came up with a different solution which might also be easier for others to apply.

Create a file called setscheduler.sh in /etc/init.d/ and add the following content:

#!/bin/bash

# List of UUIDs (one per line).
# For each drive whose I/O scheduler you want to change, add the UUID of a
# single partition located on that drive.
UUID_LIST=(
    2669b09e-75cd-4f45-bedb-8cb405444287
)

DISK_PATH="/dev/disk/by-uuid"
BLOCK_PATH="/sys/block"

for UUID in "${UUID_LIST[@]}" ; do
    if [[ -L "${DISK_PATH}/${UUID}" ]] ; then
        # The symlink target looks like ../../sda1; extract the disk name (sda).
        TARGET=$( readlink "${DISK_PATH}/${UUID}" )
        DISK=$( expr "${TARGET}" : '.*\(sd[a-z]\)' )

        # Skip the UUID if no sdX name could be extracted from the target.
        if [[ -n "${DISK}" && -d "${BLOCK_PATH}/${DISK}" ]] ; then
            echo deadline > "${BLOCK_PATH}/${DISK}/queue/scheduler"
            echo 1 > "${BLOCK_PATH}/${DISK}/queue/iosched/fifo_batch"
        fi
    fi
done

Make the file executable:

sudo chmod +x /etc/init.d/setscheduler.sh

Register it as an init.d script:

sudo update-rc.d setscheduler.sh defaults
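
The script above only recognises disks named sd[a-z]. As a possible variation (a sketch, not part of the original answer), the owning disk can instead be found through sysfs: a partition's directory contains a file called partition and sits inside the directory of its parent disk. The function names parent_disk and disk_for_uuid are hypothetical, and the default paths /sys/class/block and /dev/disk/by-uuid are assumed to exist as on a typical udev-based system.

```shell
#!/bin/sh
# Print the whole-disk name that owns a block device, e.g. sda1 -> sda.
# A device that is not a partition is printed unchanged.
# $1 = kernel device name, $2 = class directory (assumed /sys/class/block).
parent_disk() {
    path=$(readlink -f "${2:-/sys/class/block}/$1") || return 1
    [ -d "$path" ] || return 1
    if [ -e "$path/partition" ]; then
        basename "$(dirname "$path")"   # partition: parent directory is the disk
    else
        basename "$path"                # already a whole disk
    fi
}

# Resolve a filesystem UUID to the name of the disk that holds it.
disk_for_uuid() {
    dev=$(readlink -f "/dev/disk/by-uuid/$1") || return 1
    parent_disk "$(basename "$dev")"
}
```

With this, DISK=$(disk_for_uuid "$UUID") could replace the expr line in the loop, which would also cover device names that do not start with sd.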

Changing the disk scheduler on the fly

You can change the IO scheduler on the fly without fear. It is protected by appropriate locking to make sure no transactions are lost.


