Monthly Archives: October 2009

HFS+ and File System Fragmentation

A common question asked by Mac users is: Does my HFS+ file system get fragmented and what should I do about it?  This question is most often asked by those who have experience with the Windows operating system, where defragmentation tools are readily available, visible to the user, and frequently recommended (at least historically).  Apple does not generally recommend defragmentation for HFS+, but let’s dig a little deeper and see why that is.

What is file system fragmentation?

Hard drives are block devices: they read and write multiple bytes at a time, in groups called blocks.  The physical drive has a device block size (512 bytes is typical), while the operating system’s file system implements a block size of its own (4 kB is typical).  That is to say, a file is read or written in discrete groups of 4,096 bytes.  Any file larger than this must occupy multiple blocks.  For example, a 16 kB file occupies four 4 kB blocks on the hard drive.  A 9 kB file occupies three 4 kB blocks, leaving 3 kB of the last block unused.  (This wasted space is known as internal fragmentation, and we won’t discuss it further here.)  A file’s blocks may or may not be contiguous (adjacent to one another).  When they are not, the disk head must seek to multiple locations on the platter to retrieve the complete file.  This movement is slow, relatively speaking, so from a performance standpoint less head movement is better.  Optimal performance (for a single file, at least) therefore implies that all the blocks making up the file are contiguous, so they can be read with a minimum of head movement.
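The block arithmetic above can be sketched in a few lines of C.  This is only an illustration: the function names blocks_needed and slack_bytes are made up for this example and do not come from any HFS+ source.

```c
#include <assert.h>
#include <stdint.h>

/* Number of file system blocks needed to hold file_size bytes.
 * The division rounds up, because a partly filled block is still
 * a whole block on disk. */
uint64_t blocks_needed(uint64_t file_size, uint32_t block_size)
{
    assert(block_size > 0);
    return (file_size + block_size - 1) / block_size;
}

/* Bytes allocated to the file but not used by it: the unused tail
 * of the last block (internal fragmentation). */
uint64_t slack_bytes(uint64_t file_size, uint32_t block_size)
{
    return blocks_needed(file_size, block_size) * block_size - file_size;
}
```

With a 4,096-byte block size, blocks_needed(16 * 1024, 4096) is 4 and blocks_needed(9 * 1024, 4096) is 3, with slack_bytes reporting 3,072 unused bytes for the 9 kB file.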

Where does fragmentation come from?

Suppose the operating system is writing a file that requires 4 blocks.  If the hard drive is relatively empty, odds are good the 4 blocks can be written contiguously.  Now suppose that at some time in the future the file doubles in size and requires an additional 4 blocks.  If other data has since been written next to the original blocks, there is no room to extend the file in place, and the additional 4 blocks must be written at some other, non-adjacent location.  Any access to the whole file now requires an additional seek (for the disk head to move from the original set of blocks to the second set); the file is now fragmented.  This is known as external fragmentation, and unless preventative measures are taken it can grow worse over time.  As a disk is used, the free space tends to get split up: as existing files are deleted, free blocks appear in locations that were previously in use, and these blocks need to be reused.  Over time, this cycle of using and freeing space leaves the available space spread out across the disk in an irregular pattern, with fewer and fewer large areas of contiguous free blocks.
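To make the scenario concrete, here is a toy model of a 16-block disk.  It is a sketch under a deliberately naive assumption: the allocator always takes the lowest-numbered free blocks, which is not how HFS+ allocates.  All names (toy_disk, toy_allocate, toy_free, toy_fragments) are hypothetical.

```c
#include <assert.h>

#define DISK_BLOCKS 16

/* Toy disk: owner[i] is 0 when block i is free, otherwise the id of
 * the file that owns block i. */
typedef struct {
    int owner[DISK_BLOCKS];
} toy_disk;

/* Give up to `count` blocks to `file_id`, always taking the
 * lowest-numbered free blocks first.  Returns the number of blocks
 * actually allocated. */
int toy_allocate(toy_disk *d, int file_id, int count)
{
    int got = 0;
    assert(file_id != 0);               /* 0 is reserved for "free" */
    for (int i = 0; i < DISK_BLOCKS && got < count; i++) {
        if (d->owner[i] == 0) {
            d->owner[i] = file_id;
            got++;
        }
    }
    return got;
}

/* Release every block owned by `file_id`. */
void toy_free(toy_disk *d, int file_id)
{
    for (int i = 0; i < DISK_BLOCKS; i++) {
        if (d->owner[i] == file_id)
            d->owner[i] = 0;
    }
}

/* Count the contiguous runs of blocks owned by `file_id`.
 * One run means the file is contiguous; more means fragmented. */
int toy_fragments(const toy_disk *d, int file_id)
{
    int runs = 0;
    for (int i = 0; i < DISK_BLOCKS; i++) {
        if (d->owner[i] == file_id &&
            (i == 0 || d->owner[i - 1] != file_id))
            runs++;
    }
    return runs;
}
```

Starting from an empty disk, allocating 4 blocks to file 1 and then 4 blocks to file 2 leaves both files contiguous.  Growing file 1 by another 4 blocks puts the new blocks after file 2, so toy_fragments reports 2 runs for file 1.  Deleting file 2 and then writing a 6-block file 3 fills the 4-block hole and spills the rest elsewhere, so the new file is fragmented from the moment it is written.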

How the operating system deals with fragmentation

There are two general approaches to handling external fragmentation: avoiding it in the first place, and cleaning it up when it does happen.  These approaches are not mutually exclusive, and Mac OS X combines several techniques of both kinds.  To avoid fragmentation it uses extent-based allocation and delayed allocation; when fragmentation does occur, “on the fly” defragmentation can be applied (under fairly specific circumstances).

Extents

HFS+ uses an extent-based allocation scheme.  An extent (also known as a block run) consists of a starting block number and a count of contiguous blocks.  One or more extents are used to store a file’s contents, but the algorithm for selecting extents prefers a single extent (i.e. contiguous storage).  This scheme inherently tends towards contiguous storage: the system attempts to store the file in the minimum number of extents that provide the space needed, and an extent by definition represents contiguous storage.  Contrast this with a pure block-based allocation scheme such as that of the legacy Windows File Allocation Table (FAT) file system, which tends to allocate one block at a time, so a file’s blocks can end up widely dispersed.  Many modern file systems use extent-based allocation, including NTFS, the successor to FAT.  HFS+ also tries to avoid reusing recently freed space where possible; that is, it passes over the extents freed by deleted files, which are likely to be widely dispersed and therefore highly fragmented.
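The idea of an extent as a (starting block, block count) pair can be illustrated with a short C sketch.  The struct and the function blocks_to_extents are illustrative names only, not the on-disk HFS+ structures or any kernel routine; the function simply collapses a sorted list of block numbers into the runs it contains.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* An extent: a run of contiguous blocks, described by where it
 * starts and how many blocks it covers. */
typedef struct {
    uint32_t start_block;
    uint32_t block_count;
} extent;

/* Collapse a list of block numbers (sorted ascending, no duplicates)
 * into extents.  Writes at most max_extents entries to `out` and
 * returns the number written. */
size_t blocks_to_extents(const uint32_t *blocks, size_t n,
                         extent *out, size_t max_extents)
{
    size_t used = 0;
    assert(blocks != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (used > 0 &&
            out[used - 1].start_block + out[used - 1].block_count == blocks[i]) {
            /* This block directly follows the current run: extend it. */
            out[used - 1].block_count++;
        } else {
            /* A gap: start a new extent, if there is room for one. */
            if (used == max_extents)
                break;
            out[used].start_block = blocks[i];
            out[used].block_count = 1;
            used++;
        }
    }
    return used;
}
```

For the block list 10, 11, 12, 13, 40, 41, 90 this yields three extents: (10, 4), (40, 2) and (90, 1).  A file stored contiguously always collapses to a single extent, which is why the extent count is a convenient measure of fragmentation.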

Delayed allocation

HFS+ also uses a technique known as delayed allocation.  When an application requests that data be written, the actual write to disk is delayed as long as possible, and in the meantime the data is buffered in memory.  Eventually the file must be written to disk, but because of the delay there is a much better chance that all the data can be written to a set of contiguous blocks (a single extent).  Contrast this with an approach where bytes are written to storage as soon as possible, which could result in an insufficiently sized extent being selected, necessitating further extents shortly thereafter.  The trade-off is an increased risk of lost data if power fails before the data is written to disk.

On the fly defragmentation

HFS+ can also detect and correct fragmented files under certain circumstances.  When a file is opened, the kernel checks it, and if certain conditions are met the file is defragmented on the fly.  One of the advantages of using Mac OS X (perhaps not for the casual user, but certainly for a power user with an interest in the internal workings) is that a large portion of the source code is readily accessible via the Darwin project.  Here we see a snippet of source containing the actual logic for deciding when to defragment a file (this is from the hfs_vnop_open function).  You’ll notice a number of constraints: the volume must be writable and journaled; the file must be a regular file that is not otherwise in use; it must be no larger than 20 MB; it must have at least 8 extents; the system must have been up for more than 3 minutes; and, to prevent thrashing, the file must not have been modified in the last minute.

	/*
	 * On the first (non-busy) open of a fragmented
	 * file attempt to de-frag it (if its less than 20MB).
	 */
	if ((hfsmp->hfs_flags & HFS_READ_ONLY) ||
	    (hfsmp->jnl == NULL) ||
#if NAMEDSTREAMS
	    !vnode_isreg(vp) || vnode_isinuse(vp, 0) || vnode_isnamedstream(vp)) {
#else
	    !vnode_isreg(vp) || vnode_isinuse(vp, 0)) {
#endif
		return (0);
	}

	if ((error = hfs_lock(cp, HFS_EXCLUSIVE_LOCK)))
		return (error);
	fp = VTOF(vp);
	if (fp->ff_blocks &&
	    fp->ff_extents[7].blockCount != 0 &&
	    fp->ff_size <= (20 * 1024 * 1024)) {
		int no_mods = 0;
		struct timeval now;
		/*
		 * Wait until system bootup is done (3 min).
		 * And don't relocate a file that's been modified
		 * within the past minute -- this can lead to
		 * system thrashing.
		 */

		if (!past_bootup) {
			microuptime(&tv);
			if (tv.tv_sec > (60*3)) {
				past_bootup = 1;
			}
		}

		microtime(&now);
		if ((now.tv_sec - cp->c_mtime) > 60) {
			no_mods = 1;
		} 

		if (past_bootup && no_mods) {
			(void) hfs_relocate(vp, hfsmp->nextAllocation + 4096,
					vfs_context_ucred(ap->a_context),
					vfs_context_proc(ap->a_context));
		}
	}
	hfs_unlock(cp);

The call to hfs_relocate performs the actual defragmentation, and is also used in the implementation of the next feature.
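The conditions in the kernel snippet can be restated as a small user-space predicate.  This is a simplified sketch, not kernel code: the open_info struct and should_defragment function are hypothetical, the named-stream check is omitted, and the kernel’s one-time past_bootup flag is replaced by a plain uptime comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of what the kernel knows when a file is opened. */
typedef struct {
    bool     volume_read_only;        /* volume mounted read-only        */
    bool     volume_journaled;        /* volume has a journal            */
    bool     regular_file;            /* not a directory, symlink, etc.  */
    bool     in_use;                  /* already open elsewhere          */
    uint32_t extent_count;            /* extents in the data fork        */
    uint64_t size_bytes;              /* logical file size               */
    int64_t  seconds_since_boot;      /* system uptime                   */
    int64_t  seconds_since_modified;  /* now minus modification time     */
} open_info;

/* Returns true when every condition from the snippet holds. */
bool should_defragment(const open_info *f)
{
    assert(f != NULL);
    if (f->volume_read_only || !f->volume_journaled)
        return false;
    if (!f->regular_file || f->in_use)
        return false;
    if (f->extent_count < 8)                    /* ff_extents[7] in use */
        return false;
    if (f->size_bytes > 20ULL * 1024 * 1024)    /* at most 20 MB        */
        return false;
    if (f->seconds_since_boot <= 60 * 3)        /* still booting        */
        return false;
    if (f->seconds_since_modified <= 60)        /* recently modified    */
        return false;
    return true;
}
```

Written this way, it is easy to see that a file must pass every test to be relocated: a 10 MB file in 8 extents qualifies only if it is idle, on a journaled read-write volume, on a system that has been up for more than 3 minutes.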

Adaptive hot file clustering

Adaptive hot file clustering is another mechanism by which files are defragmented.  This performance-enhancing feature of Mac OS X attempts to keep the most frequently used files in the “hot zone” of the disk, in other words, the area which can be accessed most quickly.  A “temperature” is calculated for each file over a recording period, and files are periodically moved into or out of the hot zone based on this temperature.  The end result is that small, frequently used files are put in the most advantageous location.  Because space in the hot zone is limited, the technique is subject to certain restrictions, including a maximum file size and a maximum number of files.  Finally, as files are moved into the hot zone they are automatically defragmented, if necessary, courtesy of the hfs_relocate function.

Caveat: free space required

Obviously, the defragmentation magic described above is subject to contiguous free space being available on the disk.  There can come a time when the disk’s remaining free space is simply too fragmented for the on the fly defragmentation to operate properly.  When this condition occurs (flagged as HFS_FRAGMENTED_FREESPACE) the hfs_relocate function will no longer work as desired.

In the real world

Amit Singh’s hfsdebug tool can be used to measure the actual fragmentation of an HFS+ file system.  Let’s look at real values from a MacBook Pro laptop.  You can see that the laptop’s 149 GiB hard drive is relatively full, at 87% used:

[Mac-Book-Pro ~]$ df -h
Filesystem      Size   Used  Avail Capacity  Mounted on
/dev/disk0s2   149Gi  128Gi   21Gi    87%    /

The hfsdebug command requires superuser privileges, so run it via the sudo command.  When given the -f parameter it outputs a list of all fragmented files, which can be quite voluminous, so here we pipe the output into the tail command and keep only the last 10 lines.

[Mac-Book-Pro ~]$ sudo ./hfsdebug -f -t 5 | tail -10
# Top 5 Files with the Most Extents on the Volume
rank    extents   blk/extents       cnid path
1          2370          1.44    1415451 Macintosh HD:/Users/User1/Pictures/iPhoto Library/face_blob.db
2          1066          1.36    1415450 Macintosh HD:/Users/User1/Pictures/iPhoto Library/face.db
3           263        139.72    1056058 Macintosh HD:/Users/User2/Library/Caches/com.apple.Safari/Cache.db
4           178         10.66     961880 Macintosh HD:/Users/User2/Library/PubSub/Database/Database.sqlite3
5           131       1940.60    1087232 Macintosh HD:/Users/User2/Downloads/xcode314_2809_developerdvd.dmg

Out of 496715 non-zero data forks total, 496275 (99.911 %) have no fragmentation.
Out of 43688 non-zero resource forks total, 43688 (100.000 %) have no fragmentation.

You can see that despite the heavy utilization of the disk, there is in fact little significant fragmentation.  At the same time, there are some heavily fragmented files, which are listed via the -t parameter of the hfsdebug command.  The largest of these (the Xcode disk image and the Safari cache, for example) are well over the 20 MB limit for on-the-fly defragmentation; the smaller database files are presumably modified too recently, or in use too often, to qualify.  Lastly, you will see that although this hard disk is quite full, there are still several large (1 GB+) regions of contiguous free space available, as displayed via the -0 option of hfsdebug.

[Mac-Book-Pro ~ ]$ sudo ./hfsdebug -0 | grep GB
 300992         0x1d5e8         0x66da7     1.15 GB
 944875         0xd8114        0x1bebfe     3.60 GB
 262145       0x217d835       0x21bd835     1.00 GB
 2115346       0x232a4b9       0x252ebca     8.07 GB
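The sizes in the last column follow directly from the block counts in the first.  The sketch below assumes 4,096-byte allocation blocks and assumes the three numeric columns are block count, first block, and last block; both assumptions are consistent with the figures shown, but neither is taken from the hfsdebug documentation.  The function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Size in binary gigabytes (2^30 bytes) of a run of `blocks` blocks. */
double blocks_to_gib(uint64_t blocks, uint32_t block_size)
{
    return (double)blocks * block_size / (1024.0 * 1024.0 * 1024.0);
}

/* Last block of a run that starts at `start` and covers `count` blocks. */
uint64_t extent_last_block(uint64_t start, uint64_t count)
{
    assert(count > 0);
    return start + count - 1;
}
```

For the first row, blocks_to_gib(300992, 4096) is roughly 1.15, and extent_last_block(0x1d5e8, 300992) is 0x66da7, matching the output.  The same arithmetic applied to the file listing (extents times blocks per extent times 4 kB) gives a rough size for each fragmented file.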

Summary

In summary, modern operating systems such as Mac OS X have largely eliminated file system fragmentation as a performance concern for the average (home) computer user, through a combination of techniques that avoid fragmentation and techniques that correct it when it does occur.  My recommendation: if your hard drive is not close to full capacity, you probably have no action to take.  If you do suspect fragmentation is affecting performance, you can easily confirm whether this is the case with the hfsdebug tool.  If you find significant fragmentation (5% or more of files), you can either obtain a third-party tool, or perform a Time Machine backup, do a clean install of the OS, and then restore your files.  For the average user, a reasonable strategy is to always perform a clean install when applying a major operating system update (10.5 to 10.6, for example), as this will correct any fragmentation that is present.  Given the ease of use of Time Machine for both backup and restore, there are really no daunting technical hurdles for the average Mac user to overcome.  Lastly, if you have less than 15% free space on your hard drive, you should consider upgrading to a larger disk, as the features of HFS+ that keep fragmentation at bay need free space to work properly.

Recommendations

  • Try to keep 15% to 20% free space on your hard drive; defragmentation can’t work if there is no space for it to use!
  • Perform a Time Machine backup and a clean install for major operating system upgrades (e.g. 10.5 to 10.6), which will remove most, if not all, fragmentation.
  • For the average home user, nothing beyond the above should be needed.
  • Results for machines used as servers may vary, as the workload is different.  It is best to use hfsdebug to find out whether fragmentation is a problem.