The following is part of a series on the different aspects of disk I/O performance and optimization for Oracle databases. Each tip is excerpted from the not-yet-released Rampant TechPress book, "Oracle disk I/O tuning," by Mike Ault. Check back to the main series page for upcoming installments.
ATA tuning in Linux
You will be happy to know that Linux, like other true UNIX variants (SCO not included), is specifically designed not to fragment hard drives. Compared with Windows, fragmentation in Linux is typically reduced by more than 98 percent, so it is usually not a concern.
However, Linux uses whatever default drive interface settings were implemented by the developer of the ATA kernel module that interfaces to your EIDE/ATA disk drive. When the driver was written, the drives available were no doubt inferior to what you are running now, and the driver may have been written against older interfaces as well. Fear not! Linux provides the hdparm utility to reset the values for the ATA interface to take advantage of new drives and better interfaces. Thus, if you want to tune ATA drives in Linux, use hdparm. In some cases, proper use of the hdparm parameters has increased the disk I/O rate nearly fivefold over pre-tuned values. When no flags are given, -acdgkmnru is assumed.
The basic tuning-related arguments for hdparm are:
- -a - Get/set sector count for file system read-ahead. This is used to improve performance in sequential reads of large files by pre-fetching additional blocks in anticipation of them being needed by the running task. In the current kernel version (2.0.10), this has a default setting of 8 sectors (4KB). This value seems good for most purposes, but in a system where most file accesses are random seeks, a smaller setting might provide better performance. Also, many IDE drives have a separate built-in read-ahead function, which alleviates the need for a file system read-ahead in many situations.
- -A - Disable/enable the IDE drive's read-lookahead feature (usually ON by default).
- -B - Set the Advanced Power Management (APM) feature, if the drive supports it. A low value means aggressive power management and a high value means better performance. A value of 255 disables APM on the drive. When a drive spins down after reaching its APM timeout, it may take up to 30 seconds to respond to the first request after the timeout.
- -c - Query/enable (E)IDE 32-bit I/O support. A numeric parameter can be used to enable or disable 32-bit I/O support. Currently supported values are 0 to disable 32-bit I/O support (falling back to 16-bit), 1 to enable 32-bit data transfers, and 3 to enable 32-bit data transfers with a special sync sequence required by many chipsets. The value 3 works with nearly all 32-bit IDE chipsets, but incurs slightly more overhead. Note that "32-bit" refers to data transfers across a PCI or VLB bus to the interface card only; all (E)IDE drives still have a 16-bit connection over the ribbon cable from the interface card.
- -d - Disable/enable the "using_dma" flag for this drive. This option now works with most combinations of drives and PCI interfaces that support DMA and are known to the IDE driver. It is also a good idea to use the appropriate -X option in combination with -d1 to ensure the drive itself is programmed for the correct DMA mode, although most BIOSes should do this for you at boot time. Using DMA nearly always gives the best performance, with fast I/O throughput and low CPU usage. But there are at least a few configurations of chipsets and drives for which DMA does not make much of a difference, or may even slow things down (on really messed up hardware!). Your mileage may vary.
- -f - Sync and flush the buffer cache for the device on exit. This operation is also performed as part of the -t and -T timings.
- -g - Display the drive geometry (cylinders, heads, sectors), the size (in sectors) of the device, and the starting offset (in sectors) of the device from the beginning of the drive.
- -i - Display the identification info that was obtained from the drive at boot time, if available. This is a feature of modern IDE drives and may not be supported by older devices. The data returned may or may not be current, depending on activity since booting the system. However, the current multiple sector mode count is always shown. For a more detailed interpretation of the identification info, refer to AT Attachment Interface for Disk Drives (ANSI ASC X3T9.2 working draft, revision 4a, April 19/93).
- -I - Request identification info directly from the drive, which is displayed in a new expanded format with considerably more detail than with the older -i flag. There is a special "no seatbelts" variation on this option, -Istdin, which cannot be combined with any other options, and which accepts a drive identification block as standard input instead of using a /dev/hd* parameter. The format of this block must be exactly the same as that found in the /proc/ide/*/hd*/identify "files". This variation is designed for use with "libraries" of drive identification information and can also be used on ATAPI drives, which may give media errors with the standard mechanism.
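As a rough illustration of how the tuning flags covered so far (-a, -c, and -d) might be combined, the following Python sketch assembles an hdparm command line without running it. The helper names build_hdparm_cmd and readahead_kib, the device path /dev/hda, and the 512-byte sector size are assumptions made for this example; they are not part of hdparm itself. Check the values against your own hardware before applying anything.

```python
SECTOR_BYTES = 512  # assumption: traditional 512-byte sectors


def readahead_kib(sectors):
    """Convert a -a read-ahead sector count to KiB."""
    return sectors * SECTOR_BYTES / 1024


def build_hdparm_cmd(device, readahead=None, io32=None, dma=None):
    """Assemble an hdparm argument list from the -a, -c, and -d flags.

    Nothing is executed here; the caller decides whether to run the result.
    """
    args = ["hdparm"]
    if readahead is not None:
        if readahead < 0:
            raise ValueError("read-ahead sector count must be non-negative")
        args.append(f"-a{readahead}")
    if io32 is not None:
        # The text lists 0, 1, and 3 as the supported -c values.
        if io32 not in (0, 1, 3):
            raise ValueError("-c accepts 0, 1, or 3")
        args.append(f"-c{io32}")
    if dma is not None:
        args.append(f"-d{1 if dma else 0}")
    args.append(device)
    return args


# Hypothetical example: 8-sector read-ahead, 32-bit I/O with sync, DMA on.
cmd = build_hdparm_cmd("/dev/hda", readahead=8, io32=3, dma=True)
print(" ".join(cmd))
print(readahead_kib(8), "KiB read-ahead")
```

Printing the command rather than executing it is deliberate: these settings change how the kernel talks to the drive, so it is safer to review the line and then run it by hand as root.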