
Learn how to analyze and reduce Linux I/O latency with tools like fio, blktrace, and perf. Optimize disk performance to improve throughput, latency, and system efficiency.
In the world of high-performance computing and data-driven applications, disk I/O latency can often be the Achilles’ heel for system performance. Whether you’re running a database, file server, or virtualized environment, understanding and reducing I/O latency can result in significant performance improvements. In this post, we will explore how to analyze and reduce Linux I/O latency using three powerful tools: fio, blktrace, and perf. These tools provide insights into I/O performance bottlenecks and enable system administrators and developers to optimize disk access patterns for better throughput.
I/O latency refers to the delay between initiating an I/O operation (like reading from or writing to a disk) and the completion of that operation. In Linux, I/O latency can be caused by a variety of factors, including disk hardware limitations, inefficient file systems, kernel bottlenecks, or high contention for system resources. I/O latency is particularly critical in environments that require high-speed data processing, like databases or virtualization, where delays can cause slowdowns and degrade overall system performance.
Before diving into how to analyze and reduce I/O latency, it’s important to understand the different components that contribute to this delay. These include:
|
|
|
By monitoring and identifying which component of I/O latency is problematic, you can take targeted action to reduce the overall latency.
Each of the tools we’ll be discussing has a specific role in understanding and troubleshooting I/O latency.
🟢 fio (Flexible I/O Tester)
fio is a benchmarking tool that allows users to generate custom I/O workloads and measure the performance of different storage devices under varying conditions. It can simulate real-world workloads (like sequential or random access) and provide detailed metrics like latency, throughput, and IOPS (Input/Output Operations Per Second).
🟢 blktrace
blktrace is a kernel-level tool for tracing block layer events in Linux. It provides fine-grained insights into the block I/O system, including request queuing, completion times, and I/O scheduling.
🟢 perf
perf is a performance analysis tool for Linux that provides insights into CPU and system performance. While it is often used for CPU profiling, it can also be used to gather detailed I/O performance data by sampling events related to I/O operations.
One of the first steps in understanding I/O latency is benchmarking the performance of your storage system. fio is ideal for this purpose. It allows you to simulate different workloads and measure latency, throughput, and IOPS.
Here is an example of how to run a basic benchmark with fio:
fio --name=mytest --ioengine=libaio --rw=randwrite --bs=4k --numjobs=16 --size=10G --runtime=60m --time_based --output=fio_report.txt
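▶️ Explanation of the options
- --name=mytest: the name of the job, used to label the results.
- --ioengine=libaio: use Linux native asynchronous I/O. fio's documentation notes that Linux may only support queued behavior with non-buffered I/O, so this engine is usually paired with --direct=1.
- --rw=randwrite: issue random writes. Other patterns include read, write, randread, and randrw.
- --bs=4k: use a block size of 4 KiB for each I/O.
- --numjobs=16: run 16 copies of the job in parallel.
- --size=10G: the amount of data each job works on. With no --filename given, every job creates its own file, so 16 jobs need roughly 160 GiB of free space.
- --runtime=60m: stop the test after 60 minutes.
- --time_based: keep running for the full runtime, repeating the workload if the whole file has been written before the time is up.
- --output=fio_report.txt: write the report to fio_report.txt instead of standard output.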
Once the test completes, fio will generate a report with several key metrics:
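- Latency: how long each I/O takes. fio reports submission latency (slat), completion latency (clat), and total latency (lat), with minimum, maximum, average, and percentile values.
- IOPS: the number of I/O operations completed per second.
- Throughput (bandwidth): the amount of data transferred per second.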
By running different tests, you can identify the I/O patterns that cause the highest latency, whether they are random reads, sequential writes, or something else.
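As a minimal sketch of running different tests, the loop below repeats one fio job across four access patterns so the reports can be compared side by side. The test file path, the 1G size, the 60-second runtime, the --direct=1 flag, and the DRY_RUN switch are assumptions made for this example and are not part of the command shown earlier. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: compare latency across access patterns with fio.
# TEST_FILE, RUNTIME, and DRY_RUN are hypothetical knobs for this example.
TEST_FILE="${TEST_FILE:-/tmp/fio_testfile}"
RUNTIME="${RUNTIME:-60}"      # seconds per pattern
DRY_RUN="${DRY_RUN:-1}"       # 1 = print the commands, 0 = execute them

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

for pattern in read write randread randwrite; do
    # One report file per pattern: fio_read.txt, fio_write.txt, ...
    run fio --name="$pattern" --filename="$TEST_FILE" \
        --ioengine=libaio --direct=1 --rw="$pattern" --bs=4k \
        --size=1G --runtime="$RUNTIME" --time_based \
        --output="fio_${pattern}.txt"
done
```

Set DRY_RUN=0 to actually run the jobs. The write patterns overwrite the contents of TEST_FILE, so point it at a scratch file rather than at real data.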
blktrace is a tool that provides deep insights into the internal workings of the Linux block layer. It allows you to trace the activity of block devices and gather information on the queuing and completion of I/O requests. The following command starts a trace on the device /dev/sda:
sudo blktrace -d /dev/sda -o trace_output
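The -d option specifies the block device to trace, and -o sets the base name of the output files; blktrace writes one file per CPU (trace_output.blktrace.0, trace_output.blktrace.1, and so on). Press Ctrl+C to stop tracing, then use the blkparse tool to analyze the trace data: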
sudo blkparse -i trace_output.blktrace.0
Input file trace_output.blktrace.0 added
Input file trace_output.blktrace.1 added
252,0 0 1 0.000000000 256586 A FWFSM 14702113 + 0 <- (253,2) 6311457
252,0 1 1 0.000000709 264662 A FWFSM 33574989 + 0 <- (253,3) 12601421
252,0 0 2 0.000000827 256586 Q FWFSM [kworker/0:0]
252,0 1 2 0.000001265 264662 Q FWFSM [kworker/1:1]
252,0 1 3 0.000006576 264662 G FWFSM [kworker/1:1]
252,0 0 3 0.000006589 256586 G FWFSM [kworker/0:0]
252,0 0 4 0.000033828 266264 A FWFSM 92283166 + 0 <- (253,4) 12589342
252,0 0 5 0.000034161 266264 Q FWFSM [kworker/0:3]
252,0 0 6 0.000035255 266264 G FWFSM [kworker/0:3]
252,0 1 4 0.033742570 0 C WSM 33574989 [0]
252,0 0 7 0.033743562 0 C WSM 14702113 [0]
252,0 1 5 0.033769097 264662 A WFSM 33574989 + 6 <- (253,3) 12601421
...omitted for brevity...
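blkparse prints one line per block-layer event: the device (major,minor), CPU, sequence number, timestamp, PID, action, the RWBS flags, and, where applicable, the starting sector and size. In the action column, A marks a remap, Q a queued request, G a request allocation, and C a completion. Timestamps are in seconds with nanosecond resolution, so the latency of a request is the difference between its Q and C timestamps, which is useful for pinpointing delays at the I/O request level. At the end of its output, blkparse also prints a summary: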
...
Total (trace_output):
Reads Queued: 0, 0KiB Writes Queued: 17, 42KiB
Read Dispatches: 0, 0KiB Write Dispatches: 11, 42KiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 0, 0KiB Writes Completed: 21, 42KiB
Read Merges: 0, 0KiB Write Merges: 1, 4KiB
IO unplugs: 4 Timer unplugs: 0
Throughput (R/W): 0KiB/s / 3KiB/s
Events (trace_output): 103 entries
Skips: 0 forward (0 - 0.0%)
🖥️ Example blktrace Output
| Timestamp | Event Type | Sector | Queue Time | Service Time |
|---|---|---|---|---|
| 1000.000 | Read | 12345 | 200 us | 1.5 ms |
| 1001.500 | Write | 67890 | 250 us | 2.0 ms |
The blktrace output helps you see how much time is spent queuing and servicing each I/O request, enabling you to identify bottlenecks.
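One rough way to get per-request numbers is to pair each queue (Q) event with the completion (C) event for the same device and starting sector, then subtract the timestamps. The sketch below does that with awk on blkparse's default text output. The function name q2c_latency and the sample lines are hypothetical, and the matching is deliberately naive: merged, split, or remapped requests will not pair up, so treat the result as an estimate. The btt tool that ships with blktrace performs this analysis more thoroughly.

```shell
#!/bin/sh
# Sketch: estimate queue-to-completion latency from blkparse text output.
# Default blkparse columns: dev cpu seq time pid action rwbs sector + size
q2c_latency() {
    awk '
        # Remember when each request (device, sector) was queued.
        $6 == "Q" && $9 == "+" { queued[$1, $8] = $4 }
        # On completion, print the sector and the latency in microseconds.
        $6 == "C" && $9 == "+" && (($1, $8) in queued) {
            printf "%s %.0f\n", $8, ($4 - queued[$1, $8]) * 1000000
            delete queued[$1, $8]
        }
    '
}

# Hypothetical sample lines in the default blkparse format.
q2c_latency <<'EOF'
8,0 0 1 0.000000000 100 Q W 1000 + 8 [app]
8,0 0 2 0.000010000 100 G W 1000 + 8 [app]
8,0 0 3 0.000250000 0 C W 1000 + 8 [0]
EOF
```

The sample prints `1000 250`: the request starting at sector 1000 took about 250 microseconds from queue to completion. On a real trace, the equivalent would be `blkparse -i trace_output | q2c_latency`.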
While fio and blktrace give you a great deal of detail about the I/O system, perf can provide a broader picture by profiling system-wide performance, including CPU usage during I/O operations.
For example, you can use perf to monitor block I/O events in real-time:
sudo perf stat -e block:block_rq_issue,block:block_rq_complete -a sleep 60
This command collects statistics on the block request issue and completion events over a 60-second period.
Key metrics output by perf:
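- block:block_rq_issue: the number of block requests issued to the device driver during the measurement window.
- block:block_rq_complete: the number of block requests that completed during the measurement window.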
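Raw event counts are easier to compare once they are divided by the measurement window. The sketch below assumes perf stat is run with the -x, option, which switches to comma-separated output, and that the counter value is the first field and the event name the third. The helper name perf_rate and the sample line are hypothetical; check the field order against your perf version before relying on it.

```shell
#!/bin/sh
# Sketch: turn perf stat CSV counts into per-second rates.
# Assumed CSV layout (perf stat -x,): value,unit,event,...
perf_rate() {
    # $1 = length of the measurement window in seconds
    awk -F, -v secs="$1" '
        $3 ~ /^block:/ { printf "%s %.1f\n", $3, $1 / secs }
    '
}

# Hypothetical sample line: 6000 requests issued in a 60-second window.
printf '6000,,block:block_rq_issue,60000000000,100.00,,\n' | perf_rate 60
```

perf stat writes its results to standard error, so a live run would look like `sudo perf stat -x, -e block:block_rq_issue,block:block_rq_complete -a sleep 60 2>&1 | perf_rate 60`.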
You can also use perf record and perf report to generate detailed performance profiles:
sudo perf record -e block:block_rq_issue -a
sudo perf report
The perf report command will show you a summary of the I/O operations, allowing you to pinpoint which processes are consuming the most I/O resources and causing latency.
After analyzing the data from fio, blktrace, and perf, you can take steps to reduce I/O latency. Some common strategies include:
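- Tune the I/O scheduler: the scheduler decides the order in which queued requests are dispatched to the device, and the best choice depends on the hardware and the workload.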
To check the current scheduler for a device:
cat /sys/block/sda/queue/scheduler
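To change the scheduler to deadline (this requires root, and the setting does not survive a reboot; use a name from the list printed by the previous command, where multi-queue kernels show this scheduler as mq-deadline):
echo deadline | sudo tee /sys/block/sda/queue/scheduler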
|
|
|
|
|
|
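Returning to the scheduler check above: the kernel marks the active scheduler with square brackets in /sys/block/<device>/queue/scheduler. A small sketch like the following lists the active scheduler for every block device, which makes it easier to confirm a change took effect. The function name active_scheduler is hypothetical.

```shell
#!/bin/sh
# Sketch: report the active I/O scheduler for each block device.
# The active scheduler is the bracketed entry, e.g. "[mq-deadline] kyber none".
active_scheduler() {
    # Reads a scheduler line on stdin and prints the bracketed name.
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

for f in /sys/block/*/queue/scheduler; do
    [ -r "$f" ] || continue          # skip when the path does not exist
    dev=${f#/sys/block/}             # strip the leading directory
    dev=${dev%%/*}                   # keep only the device name
    # Prints an empty value if the file has no bracketed entry.
    printf '%s: %s\n' "$dev" "$(active_scheduler < "$f")"
done
```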
Reducing I/O latency on Linux systems is crucial for high-performance applications, especially in environments that require fast data access. By using tools like fio, blktrace, and perf, you can effectively monitor, analyze, and address I/O bottlenecks. With the right insights, you can make informed decisions on system configuration, hardware upgrades, and software optimizations to minimize latency and boost overall performance.

