Selecting the Right Linux I/O Scheduler for a Host Equipped with an NVMe SSD

Selecting the right Linux I/O scheduler for a host equipped with NVMe SSD?

"none" (aka "noop") is the correct scheduler to use for this device.

I/O schedulers are primarily useful for slower storage devices with limited queueing (e.g., a single mechanical hard drive); the purpose of an I/O scheduler is to reorder I/O requests so that the more important ones are serviced earlier. For a device with a very large internal queue and very fast service (like a PCIe SSD!), an I/O scheduler won't do you any good; you're better off just submitting all requests to the device immediately.
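For reference, a quick way to check and change this at runtime is via sysfs. This is a minimal sketch that assumes the drive shows up as nvme0n1; adjust the device name for your system.

# The active scheduler is shown in brackets, e.g. [none]
$ cat /sys/block/nvme0n1/queue/scheduler

# Switch to "none" for the running system (root required)
$ echo none | sudo tee /sys/block/nvme0n1/queue/scheduler

# To make the setting persistent across reboots, one option is a udev rule,
# e.g. in /etc/udev/rules.d/60-ioscheduler.rules:
# ACTION=="add|change", KERNEL=="nvme[0-9]n[0-9]", ATTR{queue/scheduler}="none"

Note that on blk-mq kernels (which NVMe always uses) the value is spelled "none"; "noop" is the name of the equivalent scheduler from the legacy single-queue block layer.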

Decrease in random read IOPS on an NVMe SSD when requests are issued over a small region

This almost certainly has to do with the internal structure of the drive. Internally, the drive is built from many flash chips and may have multiple internal buses (channels). If you issue requests over a small range, they all resolve to one or a few chips and have to be queued behind each other. If you spread the accesses across the whole device, the requests land on many internal chips and buses, can be serviced in parallel, and therefore deliver more throughput.
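You can observe this yourself with fio. A rough sketch, assuming fio is installed and the drive appears as /dev/nvme0n1 (the commands below only read, but double-check the device name before running them):

# 4 KiB random reads confined to the first 1 GiB of the device
$ sudo fio --name=small-region --filename=/dev/nvme0n1 --direct=1 \
    --rw=randread --bs=4k --iodepth=32 --numjobs=4 --group_reporting \
    --size=1G --time_based --runtime=60

# Same workload spread over the whole device (no --size, so fio uses the full capacity)
$ sudo fio --name=whole-device --filename=/dev/nvme0n1 --direct=1 \
    --rw=randread --bs=4k --iodepth=32 --numjobs=4 --group_reporting \
    --time_based --runtime=60

If the explanation above is right, the second run should report noticeably higher IOPS, since the requests are spread over more internal chips and channels.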

Best enterprise scheduler that you worked with?

The best one was homegrown.

Second best was Autosys.

Risk of running the ZIL on a single SSD

You would need to check your zpool version; if it's >= 19, it supports "Log device removal", i.e. ZIL removal.

$ zpool upgrade -v
This system is currently running ZFS pool version 28.

The following versions are supported:

VER DESCRIPTION
--- --------------------------------------------------------
1 Initial ZFS version
2 Ditto blocks (replicated metadata)
3 Hot spares and double parity RAID-Z
4 zpool history
5 Compression using the gzip algorithm
6 bootfs pool property
7 Separate intent log devices
8 Delegated administration
9 refquota and refreservation properties
10 Cache devices
11 Improved scrub performance
12 Snapshot properties
13 snapused property
14 passthrough-x aclinherit
15 user/group space accounting
16 stmf property support
17 Triple-parity RAID-Z
18 Snapshot user holds
19 Log device removal
20 Compression using zle (zero-length encoding)
21 Deduplication
22 Received properties
23 Slim ZIL
24 System attributes
25 Improved scrub stats
26 Improved snapshot deletion performance
27 Improved snapshot creation performance
28 Multiple vdev replacements

This means I have version 28 (>= 19), so I can remove my ZIL device from my pool at any time. I doubt you'll wear out your single ZIL drive within a year or two unless you are performing synchronous writes all the time, which is exactly the workload a ZIL helps even out. There have been a lot of improvements to ZFS, and from what I've read, if your ZIL device breaks the pool will simply revert to writing the intent log directly to the data pool. Of course, you can still roll back to a 'good' snapshot (do make sure you take snapshots).
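For completeness, removing (or re-adding) a separate log device is a one-liner once the pool version supports it. A sketch using a hypothetical pool named "tank" and placeholder device names:

# Show the pool layout; a separate intent log appears under a "logs" section
$ zpool status tank

# Remove the log device (requires pool version >= 19)
$ zpool remove tank <log-device>

# Add an SSD back as a log device later if desired
$ zpool add tank log <new-ssd-device>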


