Piql White Paper - Magnetic Tape Technology in a Multi-node Archival Storage System

TAPE TECHNOLOGIES RELIABILITY

The reliability of tape technology, as measured by the uncorrectable bit error rate (UBER), is superior to current disk technologies and continues to improve generation by generation. The UBER of LTO-8 is now stated as 1 in 10^19 bits, an order of magnitude better than IBM's TS1155 at 1 in 10^18 and equivalent to the current Oracle T10000D.

Bit error rates are not the whole story, as disk systems provide much greater overall levels of data integrity through technologies such as RAID (Redundant Array of Inexpensive Disks) or erasure coding. Despite attempts, RAIT (Redundant Array of Inexpensive Tapes) has never been widely employed, so the reliability of a tape system is generally the native reliability of the technology. This should be more than adequate: with a reliability figure of 1 in 10^19, a drive would need to run for 130 years at 300 MB/sec to encounter an error event. This reliability specification is theoretical, however, and assumes a perfect working environment, maintenance of tape drives and media, and use within specified duty cycles.

A paper from CERN describes how, after their tape-based archive system had collected over 70 petabytes of data during the first run of the Large Hadron Collider (LHC) experiment, a planned shutdown period was used to migrate the complete data archive to higher-density tape media. The CERN physics archive comprised over 50 000 tape cartridges of 4 different types from 2 generations of tape drives: 5TB and 1TB cartridges from Oracle, and 4TB and 1TB cartridges from IBM. As new drive generations were arriving on the market and 1TB media was becoming obsolete, the shutdown period offered an opportunity to migrate or re-pack tape media. Re-packing is a function of "Enterprise" tape technologies whereby older tape media can be reformatted in newer generation drives to achieve higher storage capacities.

TWO SIGNIFICANT OBSERVATIONS COME OUT OF THE CERN REPORT:

1. During the repack exercise, 13 tape cartridges were identified on which a significant portion of the data could not be read due to environmental contamination. Considering that CERN use the highest quality tape media and libraries and maintain high standards in their datacentres, this must be a significant risk to any tape archive. CERN subsequently put in place environmental quality measuring devices and today actively monitor a number of factors including air particulates, temperature and humidity.

2. An observed data integrity figure of 1 in 10^16 annually. This had been 1 in 10^14 in 2009 and was improved through a comprehensive series of changes to operational and maintenance procedures. A UBER of 1 in 10^14 relates to a data loss rate of around 90,000 in 100 million files written annually, or near to 0.1%. This is quite a different picture from the theoretical UBER figures, and the improvement, as the paper makes clear, required considerable effort.

Tape is not infallible, but its positive characteristics outweigh the risks if these are mitigated. For reasons of cost, CERN have no option to keep more than one copy of most of their data. As such, they must mitigate risk by whatever procedural methods they can in order to achieve the maximum native reliability from their tape infrastructure. It is beyond most organisations to take the precautions described, but most will follow a multi-node storage architecture in order to mitigate the risk. It would seem prudent, then, to also follow the multi-technology principle by employing different technologies in each storage node.
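
The two headline figures quoted in this section can be sanity-checked with a short calculation. The sketch below is illustrative only and rests on assumptions that are not stated in the source: the UBER is read as one uncorrectable error per 10^19 bits, "300MB/sec" is taken as 300 x 10^6 bytes per second, and the drive is assumed to stream continuously. The function names are hypothetical. The second function only checks the quoted ratio of files lost to files written; it does not derive the 90,000 figure from the UBER, which would require file sizes that the text does not give.

```python
# Illustrative check of the figures quoted in the text (assumptions noted inline).

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # average year length, assumed


def years_to_first_error(bits_per_error: float, bytes_per_sec: float) -> float:
    """Years of continuous streaming needed to read `bits_per_error` bits."""
    bytes_per_error = bits_per_error / 8       # 8 bits per byte
    seconds = bytes_per_error / bytes_per_sec  # time to stream that many bytes
    return seconds / SECONDS_PER_YEAR


def loss_rate_percent(files_lost: int, files_written: int) -> float:
    """Share of files lost, expressed as a percentage."""
    return 100.0 * files_lost / files_written


# UBER of 1 in 10^19 bits at 300 MB/sec (decimal megabytes, assumed).
years = years_to_first_error(1e19, 300e6)

# 90,000 files lost out of 100 million written, as quoted in the text.
rate = loss_rate_percent(90_000, 100_000_000)

print(f"Years to first expected error: {years:.0f}")  # about 132
print(f"File loss rate: {rate:.2f} %")                # 0.09 %
```

Under these assumptions the result is roughly 132 years, consistent with the "130 years" quoted above, and 90,000 in 100 million is 0.09%, which the text rounds to 0.1%.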
