PIPE_BUF: The Atomic Write Limit for Reliable Piping with dd

What is PIPE_BUF?

PIPE_BUF is a constant defined by the Linux kernel headers that specifies the maximum size, in bytes, of an atomic write to a pipe. Any data written to a pipe in a chunk of PIPE_BUF bytes or less is guaranteed to be written atomically and not interleaved with writes from other processes.

PIPE_BUF is not the total buffer capacity of a pipe. On Linux a pipe's capacity defaults to 64 KiB and can be resized per pipe with fcntl(F_SETPIPE_SZ), while PIPE_BUF is a fixed constant. It serves as a safe upper bound for atomic writes to pipes shared between processes.

Definition of PIPE_BUF Constant

The PIPE_BUF constant is defined in the Linux kernel header file linux/limits.h. It is typically available in user space by including limits.h.

#define PIPE_BUF    4096    /* max #bytes atomic in write to a pipe */ 

As shown above, PIPE_BUF specifies the maximum number of bytes that can be written atomically to a pipe. Any write of this size or smaller is atomic and self-contained.
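
One quick way to confirm the value your compiler sees, assuming gcc and the glibc headers are installed, is to expand the macro through the preprocessor:

echo PIPE_BUF | gcc -include limits.h -E -x c - | tail -n 1

On a typical glibc-based Linux system this prints 4096.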

Maximum Size for Atomic Writes to Pipes

The key attribute of PIPE_BUF is that it defines the largest write to a pipe that is guaranteed to be atomic. Any chunk of data no larger than PIPE_BUF bytes is handled atomically by the kernel and appears in the pipe as a single contiguous unit.

For example, if PIPE_BUF is 4096 bytes and a program writes 2048 bytes to a pipe, the write will be atomic: the 2048-byte chunk travels through the pipe in full, with no other writer's data interspersed within it. A single 8192-byte write, by contrast, may be split by the kernel and interleaved with data from other writers.
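
The guarantee is easy to observe with a named pipe and two concurrent writers. This is a minimal sketch assuming GNU coreutils (stdbuf forces one write() per line, keeping each write far below PIPE_BUF); the file names are arbitrary:

mkfifo /tmp/atomic.fifo
cat /tmp/atomic.fifo > /tmp/merged.log &                  # single reader
stdbuf -oL seq -f 'writer1 %g' 5000 > /tmp/atomic.fifo &  # one write per line
stdbuf -oL seq -f 'writer2 %g' 5000 > /tmp/atomic.fifo &
wait
grep -cv '^writer[12] ' /tmp/merged.log                   # prints 0: no line was torn
rm /tmp/atomic.fifo /tmp/merged.log

Because every write is a short, complete line, the merged log contains whole lines from each writer and the grep check finds no fragments.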

Fixed at Compile Time, Not Tunable

The PIPE_BUF limit is baked into the kernel and C library headers when they are compiled. It is not derived from memory availability at boot, and it cannot be configured from user space or changed at runtime.

The PIPE_BUF value can vary across Unix systems: POSIX only guarantees a minimum of 512 bytes (_POSIX_PIPE_BUF). On Linux it is hard-coded to 4096 bytes on all architectures. Portable programs should therefore query the limit rather than hard-code it.

Why PIPE_BUF Matters for Piping with dd

The dd command is commonly used for copying data between files and devices in Linux, and it frequently appears in pipelines, reading from or writing to other processes through a pipe.

Since dd is so often combined with pipes, it is worth knowing exactly what the kernel does and does not guarantee. PIPE_BUF governs atomicity between multiple writers; for a single dd feeding a single reader, the pipe is a reliable, ordered byte stream, and the practical hazards lie elsewhere, as discussed below.

dd Copies Data in Blocks

The dd command copies, converts, and formats data by reading and writing in fixed-size blocks. This block size is configurable with the bs=N option.

For example, dd can be told to read and write 512 bytes at a time with the bs=512 option, so all data is transferred in 512-byte blocks:

dd if=input.img of=output.img bs=512

This configurable block size lets dd tailor transfers to throughput needs and device I/O characteristics.

If Block Size > PIPE_BUF, Writes May Be Split

The direction of the guarantee matters: writes of at most PIPE_BUF bytes are atomic, while writes larger than PIPE_BUF may be broken into smaller segments by the kernel. If several processes write to the same pipe, those segments can interleave. In the common case of a single dd feeding a single reader, the data still arrives complete and in order; only the chunking changes.

For example, say PIPE_BUF is 4096 bytes. With bs=2048, every write dd makes is atomic. With bs=65536, each write may be cut into smaller segments inside the kernel, and a second writer sharing the pipe could slip its data between them.

When Data Can Actually Be Corrupted

Interleaving only corrupts data when multiple writers share one pipe and the reader expects each write to arrive as a unit, as with several processes appending records to the same FIFO. The more common pitfall with dd is on the read side: reads from a pipe may return fewer bytes than bs, and with count= or conv=sync those short reads silently drop or pad data.

For reliable piping, keep multi-writer records at or below PIPE_BUF, and when dd reads from a pipe with count= or conv=sync in effect, add iflag=fullblock (a GNU dd flag) so every input block is read in full.
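
The short-read pitfall is easy to reproduce. In this sketch the writer delivers its four bytes in two dribbles a second apart; without iflag=fullblock, GNU dd's count=1 stops after the first partial read:

(echo a; sleep 1; echo b) | dd bs=4 count=1 2>/dev/null                   # prints only "a"
(echo a; sleep 1; echo b) | dd bs=4 count=1 iflag=fullblock 2>/dev/null   # prints "a" and "b"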

Checking the PIPE_BUF Value

To check the current system's PIPE_BUF value in Linux, use the getconf command. PIPE_BUF is formally a per-file (pathconf) limit, so getconf requires a path argument; on Linux the answer is the same for any path:

getconf PIPE_BUF /

This prints the number of bytes the kernel guarantees to write atomically to a pipe; on Linux it prints 4096.
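
In a script, the value can be captured and reused, for instance to size records that must stay atomic:

pipe_buf=$(getconf PIPE_BUF /)
echo "atomic pipe writes up to ${pipe_buf} bytes"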

4096 Bytes on Linux; the 64 KiB Figure Is Pipe Capacity

On Linux, getconf reports 4096 regardless of distribution or architecture. The larger figures sometimes quoted, such as 16384 or 65536 bytes, describe a pipe's total capacity, not PIPE_BUF: since kernel 2.6.11 a pipe holds 64 KiB of in-flight data by default.

It is the capacity, not PIPE_BUF, that costs memory, and it is the capacity that can be tuned: fcntl(F_SETPIPE_SZ) resizes an individual pipe, up to the system-wide ceiling in /proc/sys/fs/pipe-max-size.
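
You can inspect that ceiling directly; on stock kernels it defaults to 1 MiB:

cat /proc/sys/fs/pipe-max-size    # typically prints 1048576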

Using dd with Pipes Safely

To use dd with pipes safely, focus on the read side: whenever dd reads from a pipe and the byte count matters (count=, skip=, or conv=sync is in play), add iflag=fullblock. Block size itself does not threaten integrity in a single-writer pipeline, so choose bs for throughput.

This ensures every input block is assembled in full before dd counts it, so no data is silently dropped or zero-padded.
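
A minimal sketch, assuming GNU dd and coreutils (the file name sample.bin is arbitrary): copy exactly 100 MiB out of a stream. Without fullblock, each short pipe read would be counted as one block and dd would copy less than 100 MiB:

head -c 200M /dev/urandom | dd of=sample.bin bs=1M count=100 iflag=fullblock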

Rules of Thumb for Reliable Piping

Two simple rules cover the failure modes: keep each record at or below PIPE_BUF when several processes write to one pipe, and use iflag=fullblock whenever dd reads a counted or padded stream from a pipe.

Together these guarantee that multi-writer records are never torn across PIPE_BUF boundaries and that dd never mistakes a partial read for a full block.

Example: Imaging a Disk with bs=64k

Atomicity is not a concern in a pipeline like the one below, because dd is the pipe's only writer; a 64 KiB block size is simply a good throughput default:

dd if=/dev/sda bs=64k | gzip > diskimage.gz

This reads the disk in 64 KiB blocks, pipes the output to gzip, and writes a compressed disk image. However the kernel chunks those writes internally, gzip receives every byte in order.
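
Restoring is the reverse pipeline. Here dd reads from a pipe, so fullblock is worth adding as cheap insurance (GNU dd; /dev/sda as the target is illustrative):

gzip -dc diskimage.gz | dd of=/dev/sda bs=64k iflag=fullblock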

Tuning Block Size for Optimal Throughput

Since block size does not affect correctness in a single-writer pipeline, bs can be chosen purely for performance, and throughput between dd and other processes can often be improved by increasing it well beyond the defaults.

Larger bs Values Can Increase Throughput

Larger block sizes aligned to the underlying device's characteristics can boost throughput significantly, because per-call overhead such as system calls and context switches is amortized across more data on each read and write.

Testing different block sizes can help identify the optimal setting for throughput on a particular storage device, as shown in the benchmark below.

But Too Large a Block Size Wastes Buffer Memory

However, there are downsides to extremely large block sizes: dd allocates a buffer of bs bytes, so each block occupies that much memory while data moves through the system.

So while throughput may increase, setting the block size too high consumes memory needlessly; a multi-gigabyte bs forces dd to allocate a buffer that large, which can push a busy system into memory pressure.

Example Benchmarking Different Values

Here is an example testing dd read speeds from a storage device with different bs values to tune for optimal real-world throughput:

# Baseline (4k equals PIPE_BUF on Linux)
dd if=/dev/sda bs=4k | pv > /dev/null
4GB 0:02:06 [32.5MB/s]

# Try larger block sizes
dd if=/dev/sda bs=128k | pv > /dev/null
4GB 0:01:27 [46.9MB/s]

dd if=/dev/sda bs=512k | pv > /dev/null
4GB 0:01:17 [53.2MB/s]

dd if=/dev/sda bs=1M | pv > /dev/null
4GB 0:01:14 [55.3MB/s]

Here throughput climbs steeply up to 512 KiB and then plateaus: doubling to 1 MiB buys only about 4% more speed while doubling the buffer. For this device, 512 KiB is the sweet spot.
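
A sweep like this is easy to script. The sketch below assumes GNU dd (for iflag=count_bytes, which fixes the amount read regardless of bs) and reads the same 1 GiB at each size; the final line dd prints on stderr reports the rate:

for bs in 4k 64k 512k 1M 4M; do
    echo "bs=$bs"
    dd if=/dev/sda of=/dev/null bs=$bs count=1G iflag=count_bytes 2>&1 | tail -n 1
done

Note that the page cache will inflate repeat runs; adding iflag=direct (where the device supports O_DIRECT) or dropping caches between runs gives more honest numbers.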
