The volume of digital data being produced is growing at an ever-increasing pace. According to an International Data Corporation (IDC) study, 264 exabytes of data were created in 2007. This staggering volume is projected to grow at an annual rate of 57%, faster than the expected growth of storage capacity. Moreover, new regulatory requirements mean that a larger fraction of this data will have to be preserved. All of this translates into a growing need for cost-effective digital archives.
While hard disk drive (HDD) technology has made significant progress over the years, so has magnetic tape recording, and tape remains the least expensive long-term archiving medium. Current tape technology achieves an areal density of about 1 Gb/in² and a cartridge capacity on the order of a terabyte. An analysis of the limits of current tape technology suggests that the areal density can be pushed a further two orders of magnitude, leading to cartridge capacities in excess of 100 terabytes. This makes tape a very attractive technology for data archiving, with a sustainable roadmap for the next ten to twenty years, well beyond the anticipated scaling limits of HDD technology.
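As a rough illustration of what such a roadmap implies, the sketch below scales the cartridge capacity with areal density and derives the yearly growth rate needed to gain two orders of magnitude in ten or in twenty years. It assumes that capacity scales linearly with areal density (same tape area per cartridge) and that the 100-fold gain is spread evenly over the period; the function names are illustrative only and do not come from the study.

```python
def scaled_capacity_tb(capacity_tb, density_gain):
    """Cartridge capacity after an areal-density gain.

    Assumption: the recorded tape area is unchanged, so capacity
    scales linearly with areal density.
    """
    return capacity_tb * density_gain


def implied_annual_growth(total_gain, years):
    """Constant yearly growth rate that compounds to total_gain over years."""
    return total_gain ** (1.0 / years) - 1.0


if __name__ == "__main__":
    # About 1 TB today, two orders of magnitude more areal density.
    print(f"capacity: {scaled_capacity_tb(1.0, 100.0):.0f} TB")
    for years in (10, 20):
        rate = implied_annual_growth(100.0, years)
        print(f"{years} years: {rate:.1%} per year")
```

Under these assumptions, a 100-fold gain corresponds to areal density growing by roughly 58% per year over ten years, or roughly 26% per year over twenty.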
Figure 1: IBM System Storage TS3500 Tape Library, a highly scalable, automated tape library for mainframe and open systems backup and archiving in midrange to enterprise environments with a capacity of up to 45 PB (with 3:1 compression).
It is clear that tape will never become the primary storage medium for average computer users. HDDs are much better suited to this purpose, with access times of a few milliseconds, storage densities of 300-400 Gb/in² and capacities of up to a terabyte.
However, for long-term archiving, backup and disaster recovery, there are considerable advantages to using tape:
- energy savings: once data is recorded, the medium is passive; it sits in a rack and no power is needed
- security: once the data is recorded and the cartridge removed from the drive, the data is inaccessible until the cartridge is reinstalled. This means that the data cannot be corrupted by a virus while it is offline. Security is further enhanced by drive-level encryption
- lifetime: because the medium is passive, it is extremely reliable with a long lifetime. Some tapes have been in use for forty years
- reliability: tape media is removable and interchangeable, meaning that, unlike with HDDs, mechanical failure of a drive does not lead to data loss, because a cartridge can simply be mounted in another drive.
All these factors contribute to the major net advantage:
- cost: estimates of the savings in total operating cost of tape backup relative to HDDs range from a factor of three to a factor of twenty-three, even if the latest developments, such as data deduplication, are taken into account. In archival applications, where deduplication cannot be used effectively, cost savings can be even higher.
Today’s archival tapes have an areal density of about 1 Gb/in². A recent study indicates that improvements in technology may increase this density to 100 Gb/in² without a fundamental change in the tape recording paradigm. There are five main technologies involved:
1. Media: the main challenge for tape systems is that the tape medium is flexible, whereas HDD media are rigid. HDD heads ‘fly’ over the media, while those for tape systems are in contact with the tape. Current tapes are based on metal particles, but promising research is underway into new tape media such as barium-ferrite, which provides a smoother surface and improved signal quality.
2. Heads: HDDs currently use very sophisticated head technology, based on tunneling magnetoresistive (TMR) sensors. It is expected that tape heads will also move to TMR, whose higher sensitivity leads to an improved signal-to-noise ratio for detection.
3. Transport and track-following control: the spacing between adjacent tracks today is around 10 µm. The target for the future is to reduce this to the order of 0.2 µm, which requires much better control of the lateral positioning of the tape head in the sub-micrometer range as well as very tight tape speed and tension control.
4. Signal processing: signal processing plays a crucial role in reliably retrieving the recorded digital information. High areal recording densities pose significant challenges in terms of "write" and "read" operations. The main challenge is to ensure highly reliable operation of both these signal-processing functions, including adaptive equalization as well as gain and timing control, despite significant reductions in the available signal-to-noise ratio. To achieve the envisaged linear recording densities leading to multi-terabyte tape systems, we are investigating novel advanced noise-predictive detection schemes that take into account the special statistical properties of the noise process.
5. Error protection: to guarantee an uncorrectable bit error probability of less than 1 × 10⁻¹⁷, redundancy is added to protect the data with error-correcting codes (ECC). In the Linear Tape-Open (LTO-4) standard, the overhead amounts to 27%. We are investigating ways to reduce the overhead without sacrificing performance. For instance, data is currently coded first for error correction and then for modulation constraints. In the future, a reversal of this order, called ‘reverse concatenation’, could lead to a gain in efficiency and enable more powerful iterative data detection and decoding schemes.
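The figures quoted in items 3 and 5 can be turned into rough numbers. The sketch below assumes that areal density is the product of track density and linear density, and that the quoted track spacings (about 10 µm today, 0.2 µm as the target) correspond to the 1 Gb/in² and 100 Gb/in² densities. It also shows two possible readings of the 27% ECC overhead, since the text does not say whether it is measured against the recorded bits or against the user data. All function names are hypothetical.

```python
def track_density_gain(pitch_now_um, pitch_target_um):
    """Gain in tracks per unit tape width when the track pitch shrinks."""
    return pitch_now_um / pitch_target_um


def required_linear_gain(areal_gain, track_gain):
    """Linear-density gain still needed, given areal = track x linear."""
    return areal_gain / track_gain


def user_share_of_recorded(overhead):
    """User-data share if the overhead is a fraction of the recorded bits."""
    return 1.0 - overhead


def user_share_relative(overhead):
    """User-data share if the overhead is measured relative to user data."""
    return 1.0 / (1.0 + overhead)


if __name__ == "__main__":
    track_gain = track_density_gain(10.0, 0.2)
    linear_gain = required_linear_gain(100.0, track_gain)
    print(f"track density gain:  {track_gain:.0f}x")
    print(f"linear density gain: {linear_gain:.0f}x")
    # Two readings of the 27% overhead (assumption: the source is ambiguous).
    print(f"user share, reading 1: {user_share_of_recorded(0.27):.3f}")
    print(f"user share, reading 2: {user_share_relative(0.27):.3f}")
```

Under these assumptions, narrowing the track spacing would account for a factor of about 50, leaving roughly a factor of 2 to come from higher linear density. Depending on the reading of the overhead figure, between 73% and about 79% of the recorded bits carry user data.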
Tape will remain a vital storage medium for the foreseeable future, as there is a good chance of reaching a density of 100 Gb/in² on tape. This will help solve the huge problem of preserving our ever-growing mountain of data in an economical and ecological manner. IBM is working on advanced technologies to keep tape storage systems as attractive for long-term data storage in the future as they have been in the past.
Links:
http://www.emc.com/collateral/analyst-reports/diverse-exploding-digital-universe.pdf
http://www.research.ibm.com/journal/rd/524/argumedo.pdf
http://www.clipper.com/research/TCG2008009.pdf
http://www.ultrium.com/pdf/Tape%20Fallacies%20Commentary%20Final.pdf
Please contact:
Jens Jelitto, Mark Lantz, Evangelos Eleftheriou
IBM Research GmbH, Zurich Research Laboratory, Switzerland
E-mail: