Backup Disk Use with Time Machine: A Preliminary Empirical Study

Scot W. Stevenson <scot.stevenson@gmail.com>

Version 1.1, 6. January 2008


Apple's Time Machine software automatically writes backups to an external drive. To develop an understanding of the drive size needed in a two-computer household, daily data on backup disk use was collected over a period of 17 days. This data was used to extrapolate how long it would take to fill a 250 gbyte hard drive. Though the results themselves cannot be easily generalized, it is hoped that others might find the method useful.


Introduction

In late 2007, Apple introduced the Time Machine backup software as part of its Mac OS X 10.5 "Leopard" operating system. An external drive or server share is used for hourly copies of the user's original hard drive. These hourly backups are condensed to daily and then weekly backups. When the disk is full, the system starts deleting the oldest weekly backups.

The author decided to add two SATA hard drives to an existing Ubuntu server in a RAID 1 (mirrored) configuration to hold the backups from two Macs. Real-world financial considerations such as the high price of cat food limited the funds available for computer hardware. An estimate was therefore needed of how small the drives could be while still holding both backups for at least two years; the choice was between 250 gbytes and 500 gbytes. The two-year limit is based on the assumption that there is a high chance of mechanical drive failure due to the heavy use.

To judge the size needed, a test run of Time Machine was performed using an existing external hard drive under everyday conditions.
 

Methods

Setup

The two source computers used were the author's iMac and his wife's MacBook, both Intel Core Dual machines. The backup hard disk was a LaCie 250 gbyte drive connected to the iMac via Firewire 400. It was exported on the local network so that Time Machine on the MacBook could write its data to it as an image. Both Macs ran OS X 10.5.1 during the study. On each computer, only the home directory of the primary user was flagged for backup.

Use before data collection

Time Machine on the iMac was enabled on 3. November 2007 while running OS X 10.5.0. The initial backup was 38.3 gbytes, according to the Finder information tab. The MacBook was not upgraded from "Tiger" to "Leopard" until 10.5.1 was released. Its first backup was performed on 17. November and was 21.9 gbytes in size, according to the Finder information of the mounted image bundle. Data collection was started on 21. November, when the setup was considered to be stable. At this time, 77.7 gbytes of the drive were reported as used by the df command.

Data collection

The space used on the backup drive was recorded once a day by a shell script run from /etc/periodic/daily. The script isolates the statistics for the backup drive in mbytes, filters out all but the most important entries, and appends them to a list together with a time stamp:

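#!/bin/sh
# Append a time stamp and the backup drive's df figures (capacity, used,
# free in mbytes, percent used) to the log. "Desk" matches the drive's df line.
echo "$(date) - $(df -m | awk '/Desk/ {print $2, $3, $4, $5}')" >> /Users/scot/Desktop/tm_stats.txt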


This produces a list of entries in the following format:

Wed Nov 21 03:15:33 CET 2007 - 238347 77722 160624 33%

The last four numbers are the total capacity of the drive, the space used, and the free space remaining (all in mbytes), followed by the percentage used.
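
The day-to-day differences can be derived directly from a log in this format. The following is a minimal sketch and was not part of the original setup: the function name daily_diffs is hypothetical, and it assumes that every line has exactly the format shown above, so that the space used is the third field from the end.

```sh
#!/bin/sh
# Hypothetical helper: read log lines on standard input and print
# "day month year used [difference]" for each entry.
# Assumes the line format shown above; the used space in mbytes is
# then the third field from the end of the line.
daily_diffs() {
    awk '{
        used = $(NF - 2)
        if (NR > 1) {
            # Change in mbytes since the previous log entry
            print $3, $2, $6, used, used - prev
        } else {
            # First entry: nothing to compare against yet
            print $3, $2, $6, used
        }
        prev = used
    }'
}
```

Called as "daily_diffs < /Users/scot/Desktop/tm_stats.txt", it prints one line per log entry, with the change from the previous entry added from the second line on.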

Data was collected from 21. November 2007 to 7. December 2007, when the author had to leave the country.

No attempt was made to ensure that Time Machine was able to create backups every hour. Both computers were set to go to sleep after a period of inactivity that ranged from a few minutes (MacBook on battery power) to two hours (iMac). Since Time Machine on the MacBook required the iMac to be awake, its backups were more irregular. No attempt was made to ensure that the data was collected at the same time each day.

Data analysis

The data was collected in a NeoOffice spreadsheet. The amount of disk space used per day was computed and averaged using the functions provided.
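
The same two statistics could also be computed without a spreadsheet. The following is a minimal sketch, not the method used in the study: the function names mean and median are hypothetical, and both assume the daily differences in mbytes arrive one per line on standard input.

```sh
#!/bin/sh
# Hypothetical helpers; input is one daily difference (mbytes) per line.

# Arithmetic mean, printed with one decimal place.
mean() {
    awk '{ sum += $1; n++ }
        END { if (n) printf "%.1f\n", sum / n }'
}

# Median: sort numerically, then take the middle value, or the
# average of the two middle values for an even number of entries.
median() {
    sort -n | awk '{ v[NR] = $1 }
        END {
            if (NR == 0) exit
            if (NR % 2) {
                printf "%.1f\n", v[(NR + 1) / 2]
            } else {
                printf "%.1f\n", (v[NR / 2] + v[NR / 2 + 1]) / 2
            }
        }'
}
```

The median is less sensitive than the mean to single large additions, which is why both are worth computing for a short series like this one.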

Further Notes

Both computers are mainly used for e-mail and vanilla texts such as blog postings. The largest additions were Thanksgiving family pictures and iTunes purchases. iMovie HD films were stored and edited in directories not included in the backup. iMovie 08 was not used at all, as it is an abomination. FileVault was not enabled on either machine.


Results

The average size increase per day was 92.9 mbytes. With 77.7 gbytes used for the initial backups, a 250 gbyte drive would be full in 4.7 years.

The median size increase per day was 33.0 mbytes. With 77.7 gbytes used for the initial backups, a 250 gbyte drive would be full in 13 years.
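
The extrapolation itself is a simple division: the free space at the start of data collection divided by the daily increase gives the number of days until the drive is full. The following sketch of that arithmetic rests on assumptions: it takes the capacity and used space from the first df log entry (238347 and 77722 mbytes) rather than the nominal 250 gbytes, and it uses a 365-day year. The function name years_until_full is hypothetical.

```sh
#!/bin/sh
# Hypothetical helper: years until the drive is full at a constant
# daily increase.
# $1 = capacity, $2 = space used, $3 = daily increase (all in mbytes)
years_until_full() {
    awk -v cap="$1" -v used="$2" -v rate="$3" 'BEGIN {
        # A zero or negative rate would never fill the drive
        if (rate <= 0) { print "never"; exit }
        printf "%.1f\n", (cap - used) / rate / 365
    }'
}
```

Under these assumptions, "years_until_full 238347 77722 92.9" prints 4.7 and "years_until_full 238347 77722 33.0" prints 13.3, in line with the figures given above.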


Raw data of space used on backup hard drive

Date            Mbytes used   Difference (mbytes)
21. Nov 2007    77722
22. Nov 2007    77825         103
23. Nov 2007    77855         30
24. Nov 2007    77940         85
25. Nov 2007    78008         68
26. Nov 2007    79700         1692
27. Nov 2007    79273         -427
28. Nov 2007    78774         -499
29. Nov 2007    78810         36
30. Nov 2007    78869         59
1. Dec 2007     78902         33
2. Dec 2007     78885         -17
3. Dec 2007     78899         14
4. Dec 2007     78929         30
5. Dec 2007     78913         -16
6. Dec 2007     78916         3
7. Dec 2007     79264         348


Graph of Space used on Backup Drive
               
Data was collected at 3:15 a.m., with three exceptions: 27. Nov (2:43 p.m.), 3. Dec (8:05 a.m.), and 4. Dec (7:52 a.m.). The 1.7 gbytes added on 26. Nov are Thanksgiving family pictures added to iPhoto.

The graphic is scaled to show the complete hard drive capacity -- 238 gbytes as given by the df command -- consistent with the aim of the study to estimate when the total capacity would be reached.


Discussion

This study was performed for a very specific purpose under very specific conditions. A general analysis of Time Machine's disk use was not attempted. Therefore, except for the basic methodology, only limited generalization is possible.

The average daily use was less than expected. It can be argued that the short time period sampled was an abnormal "quiet period." Support for this comes from the growth before the study: the two initial backups -- 38.3 gbytes for the iMac on 3. November and 21.9 gbytes for the MacBook on 17. November -- total 60.2 gbytes, yet df reported 77.7 gbytes of disk space as used four days later at the start of the study, an increase of 17.5 gbytes. However, during that time both computers were subject to reorganizing and general removal of disk clutter. It is felt that this is responsible for the pre-study increase.

The period examined was too short for Time Machine to condense the daily backups to weekly backups. For the aim of the study, this was considered secondary, since the combined backups should use even less space, increasing the argument for the smaller hard drive size.

It should be noted that even so, the amount of space used on the backup drive actually decreased several times. Most noticeable are the two days after 1.7 gbytes of Thanksgiving pictures were added to the system on 26. November, when about 425 and 500 mbytes were freed. The author's understanding of Time Machine's workings is insufficient to explain this.

As a result of this study, two 250 gbyte drives will be purchased and combined into a RAID 1 backup drive on an existing Ubuntu server after the release of OS X 10.5.2 "Leopard". As part of that setup, long-term data will be collected for further analysis.

Colophon

Version 1.0 finished 5. January 2008. HTML version created with KompoZer. Graphic is a screen shot of the NeoOffice spreadsheet.



Last change: 6. January 2008