White paper - Alternative Storage Technologies

Alternative Storage Technologies Whitepaper


THE COMPLEX NEEDS OF PRESERVATION

The world is overflowing with digital data; in fact, the digital universe is doubling every two years according to IDC [1]. A share of this digital universe has long-term value. This is, for instance, data that is considered part of our cultural heritage, strategic business information, or data that must be preserved for legal reasons. To ensure long-term survival, such data needs to be treated differently from short-lived information.

Do the existing storage technologies meet the complex requirements of long-term data preservation? What are the unique characteristics of the available alternatives? These are some of the issues we will look into. In our evaluation of the alternatives we will focus in particular on the following aspects:

• How well are data security and integrity maintained?
• How easy will it be to access the data in the future?
• How cost-efficient is the technology in a long-term perspective?

What does «digital» really mean?

When a file is stored digitally, the data is stored as binary code, i.e. sequences of the digits 0 and 1, which are written onto a storage medium. Decoding these digits brings back a perfect copy of the original file. “Digital storage”, including the Cloud, normally means a digital application of an analogue storage medium.
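The round trip described above, from file content to a sequence of 0s and 1s and back to a perfect copy, can be illustrated with a minimal Python sketch. The function names and the sample content are illustrative assumptions and do not come from any particular storage product.

```python
def to_bits(data: bytes) -> str:
    """Return the data as a string of 0s and 1s, eight digits per byte."""
    return "".join(f"{byte:08b}" for byte in data)


def from_bits(bits: str) -> bytes:
    """Decode a string of 0s and 1s back into the original bytes."""
    if len(bits) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


# Illustrative sample content; any file content would work the same way.
original = "archive".encode("utf-8")
bits = to_bits(original)       # what would be written to the medium
restored = from_bits(bits)     # what decoding brings back
```

As long as every digit is read back correctly, the restored bytes are identical to the original, which is what distinguishes digital storage from a lossy analogue copy.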

[1] IDC, “The Digital Universe of Opportunities”, April 2014, www.emc.com/leadership/digital-universe/index.htm


MAGNETIC TAPE

What it is: Magnetic tape is an analogue medium for magnetic recording, made of a thin magnetizable coating on a long, narrow strip of plastic film. The coating consists of magnetic material, e.g. ferric oxide powder, mixed with a binder that attaches it to the plastic, and includes a dry lubricant to avoid damaging the tape drive. The magnetic coating makes it possible to retain electronically encoded data in digital format. The tape is normally packaged in a plastic cartridge for protection, and tape drives are used to read and write the data.

Cheap mass storage: Magnetic tape allows large amounts of digital data to be stored at a relatively low cost, making it well suited for mass storage.

Handle with care: When using tape for archiving, one has to bear in mind that tape is a sensitive material that requires proper handling and environmental conditions. For instance, magnetic fields are a threat to any magnetic storage, and smoke and small particles can cause damage and loss of data. It is therefore imperative to have clean operating conditions with the right temperature and humidity levels. With minimal usage, proper handling and storage in optimal conditions, magnetic tapes have a life expectancy of between 10 and 30 years [2]. However, due to the delicate material and its high storage density, the risk of damage and failure increases each time the tape is used. Consequently, usage should be kept to a minimum, and it is considered best practice to migrate the data to newer tape formats at least every 5-10 years.

Migrations and vendor lock-in: A drawback of magnetic tape is that data retrieval depends on specific reading devices, i.e. tape drives, which need to be maintained. New tape drives are normally not able to read data from older generations of tape [3]. This means that using magnetic tape for long-term preservation requires regular and never-ending investments in migration.

Linear Tape Open (LTO) is an open magnetic tape storage technology that was developed in the late 1990s by HP, IBM and Certance (now Quantum). LTO is widely used for mass storage, and the latest generation (LTO 7) can store up to 6 terabytes. For long-term preservation it is important to be aware that data capacity increases with each generation, but that this has a negative impact on the expected lifetime. Like other magnetic tapes, LTO requires a migration-based archiving strategy to ensure data safety and accessibility.

[2] Fujifilm, “LTO Ultrium Technology”, 2012, www.fujifilmusa.com/shared/bin/LTO_Data_Tape_Seminar_2012.pdf
[3] LTO Ultrium readers are two generations backward compatible
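To give a feel for what a migration-based strategy implies over a long horizon, the following Python sketch counts how many migrations fall within a retention period for a given migration interval. The 100-year retention period is an assumption chosen purely for illustration, and the function name is hypothetical; only the 5-10 year interval comes from the text.

```python
import math


def migrations_needed(retention_years: float, interval_years: float) -> int:
    """Count migrations at interval, 2 * interval, ... that fall strictly
    before the end of the retention period (none is needed at year 0)."""
    if retention_years <= 0 or interval_years <= 0:
        raise ValueError("both arguments must be positive")
    return max(0, math.ceil(retention_years / interval_years) - 1)


# Assumed 100-year retention period, with the 5-10 year interval from the text.
migrations_every_10_years = migrations_needed(100, 10)
migrations_every_5_years = migrations_needed(100, 5)
```

Each of these migrations carries its own hardware, media and labour cost, which is why the interval matters so much for the long-term budget.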


OPTICAL DISCS

What it is: Optical discs are flat, circular discs that encode digital data on one of their flat surfaces. The reflective layer is usually made of aluminium, but for preservation purposes a gold reflective layer is preferred [4]. Optical disc drives (ODD) use a laser beam to engrave the data onto the disc, and laser light is also used to read the data back while the disc spins at high speed.

CDs and DVDs: The two main types of optical discs used by archivists are CDs and DVDs. These can be further split into:

• CD-R and DVD-R: Write once, read many
• CD-RW and DVD-RW: Rewritable
• DVD-RAM: Rewritable discs formatted for random access (like a computer hard disk)

Blu-Ray is an optical disc format designed to supersede the DVD format with a higher capacity. However, it is more popular among households than among professional archivists.

M-Disk is an optical disc which was introduced in 2013, claiming a lifetime of 1000 years. However, a Blu-Ray reader is needed for data retrieval, and the manufacturer gives no answer to the challenge of whether such readers will be available in the future.

Ease of use: Optical discs such as CDs and DVDs are easy to use, and both the media and the hardware are relatively inexpensive to purchase. In addition, the laser-based reader does not touch the disc, which makes mechanical failure less likely, and most readers are backward compatible. These factors have made optical discs widely used by archivists.

High failure rates: Optical discs were originally developed primarily as a mass consumer product, so elaborate measures are required when using them for long-term preservation. Optical discs have a relatively high risk of failure [5], and it is strongly recommended to always test thoroughly to ensure that the required standards are met. Scratching and environmental factors such as dust, heat or UV light can cause severe damage to the discs. Optical discs also offer low data capacity compared to alternative technologies.

Frequent migrations: It is challenging to reliably estimate the real lifetime of optical discs. They are fragile, and their life span depends on factors such as manufacturing quality, recording quality, and physical handling and storage. To lower the risk of failure, best practice is to migrate frequently onto newer formats. This is in sharp contrast to manufacturers’ claims of an expected lifetime of up to several hundred years.

[4] DPC, “Handbook in Digital Preservation”, www.dpconline.org/advice/preservationhandbook/media-and-formats/media

[5] “Risks Associated with the Use of Recordable CDs and DVDs as a Reliable Storage Medium in Archival Collections – Strategies and Alternatives”, Memory of the World Programme, UNESCO, 2006, http://unesdoc.unesco.org/images/0014/001477/147782E.pdf


HARD DISK DRIVE (HDD)

What it is: A hard disk is a rapidly rotating disk used for storing digital information. The data is recorded by magnetizing a thin film of ferromagnetic material. A read/write head on an arm accesses the data while the disk is spinning.

Instant access: Hard disks provide instant access to the data, and the reading and writing device is an integrated part of the storage medium. They enable cost-efficient storage of large amounts of data, with a storage capacity of up to 10 terabytes per disk (2016).

Short durability: High failure rates make hard disks inappropriate for long-term preservation. A 2013 study projects an average lifespan of six years [6]. It is therefore common to combine a number of disks for redundancy in a so-called RAID (redundant array of independent disks). From a mechanical point of view, a frequent issue is that the disk will not spin, and failure rates increase with age. Another common reason for failure is that the read/write head scrapes the rotating platter and thereby causes damage and data loss. Furthermore, a hard drive is a magnetic storage technology, and its magnetic strength will naturally deteriorate over time when it is placed offline on a shelf for archiving.

Solid-State Drives (SSDs) use flash memory for digital storage, unlike the magnetic technology of HDDs. They are more mechanically robust than HDDs and require less power, but have less data capacity and are more expensive. SSDs are well suited for portable devices and applications requiring fast data access, but are not considered to have the attributes required for long-term digital preservation.
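How combining disks can provide redundancy is easiest to see with parity, the mechanism used by parity-based RAID levels such as RAID 5. The Python sketch below is a simplified illustration only: the text does not say which RAID level is meant, and the block contents and function name are assumptions. One parity block is computed as the XOR of the data blocks, and a lost block is rebuilt from the surviving blocks plus parity.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equally sized blocks together, byte by byte."""
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("need at least one block, all of the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Three illustrative data blocks, one per disk, plus one parity block.
data_disks = [b"ARCH", b"IVE!", b"DATA"]
parity = xor_blocks(data_disks)

# Simulate losing the second disk and rebuilding it from the survivors and parity.
survivors = [data_disks[0], data_disks[2], parity]
rebuilt = xor_blocks(survivors)
```

Parity of this kind protects against the loss of a single disk at a time; it does not remove the need to replace failed disks, so the underlying durability problem remains.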

[6] http://blog.backblaze.com/2013/11/12/how-long-do-disk-drives-last, retrieved November 21st, 2013


THE CLOUD

What it is: “The Cloud” is a term used for storage services where the data is stored in virtualized pools operated by third parties. These hosting companies operate large data centres filled with servers, which could be located anywhere in the world. The data centres are normally subject to strict security measures such as restricted access, environmental control and emergency backup power supply.

Store and ignore: For the end user, the Cloud offers clear advantages: instant access, ease of use with no need for maintenance, and storage space that is inexpensive to buy or lease. This makes it well suited for basic storage needs. However, for valuable data that requires secure, long-term preservation, the Cloud involves some major concerns related to security, privacy and ownership.

Ownership and privacy concerns: The Cloud involves entrusting a third party with storing the data. This means the service provider has access to the data and may accidentally or deliberately disclose or alter the information. The service provider may also go out of business, raising questions as to who owns the servers and the data on them. In this situation the data owner has limited control, and the data becomes inaccessible the moment the network connection or power supply is switched off.

Security issues: Data accessible through the web is vulnerable to hacking. Although service providers strive to increase the security level, there is no guarantee that the data is out of reach of individual hackers, companies or national security agencies that might have an interest in accessing the information. Some data centres offer offline backup, normally in the form of magnetic tape, in addition to cloud storage. This makes the data more secure, but a migration-based archiving strategy is then needed when the time perspective is long.

Amazon Glacier is an online file storage service developed for long-term storage of data that does not require instant access. Like other Cloud services, Glacier is based on hard drives, but the data retrieval time is 3 to 5 hours according to Amazon. Storage is cheap, but users are charged for retrieving the data, hence the risk of vendor lock-in should be considered. The privacy, security and ownership concerns related to cloud storage also apply to Amazon Glacier.
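A data owner who entrusts files to a third party can at least detect accidental or deliberate alteration by recording a cryptographic checksum before upload and comparing it after retrieval. The Python sketch below shows this general practice using SHA-256 from the standard library; it is not a procedure prescribed by any particular cloud provider, and the function names and sample content are illustrative assumptions.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, recorded_digest: str) -> bool:
    """Check retrieved data against the digest recorded before upload."""
    return sha256_hex(data) == recorded_digest


# Record the digest locally before handing the file to the provider.
document = b"Board minutes, 12 March"
recorded = sha256_hex(document)

# Later: verify what the provider returns.
unchanged_ok = is_intact(document, recorded)
tampered_ok = is_intact(b"Board minutes, 13 March", recorded)
```

A checksum only reveals that something has changed. It neither prevents the change nor restores the original, so the digest and an independent copy must be kept outside the provider's control.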

Data centre with servers


MICROFILM

What it is: Microfilm is 16mm or 35mm film stored on open reels or in cassettes. It contains small images, usually in black and white, although some microfilm formats also support digital data. The two main types of microfilm are:

• Silver halide film: The recommended alternative for long-term preservation. The image is transferred to the film by using silver emulsion on a polyester strip.
• Vesicular film: Creates the image on the polyester strip by using microscopic bubbles instead of silver emulsion, making it a less expensive but also a less durable solution.

Special cameras capable of photographing at reduced size are used to transfer an image to microfilm. The image is then printed on the film and chemically processed in a laboratory. The film processing makes the recording process more complex than for some of the alternative technologies. In spite of this, microfilm is widely used for long-term preservation of data, for a number of reasons.

Long durability and no need for decoding: Microfilm has a life expectancy of up to several hundred years when stored properly at the recommended temperature and humidity levels. Additionally, as the data is normally analogue, data stored on microfilm can be read back by simply using a magnifying glass. Hence data retrieval is independent of specific reading devices, and the images on the film require no software decoding. Desktop readers with large screens and zoom lenses are normally used for data retrieval. The downside is that accessing and reproducing data stored on traditional microfilm is a manual and cumbersome process.

A migration-free WORM: The durability and secure read-back of information written as images make microfilm a migration-free storage alternative, potentially creating large cost savings and eliminating the risk of migration-related data loss. As an analogue medium stored offline, microfilm is also a secure solution. As a true WORM (Write Once, Read Many) medium, it is literally impossible to edit or delete the information that is written to the film. Hence microfilm meets many of the criteria for long-term preservation.

Manual handling: The major drawback of microfilm is that the workflow for reproducing data is time-consuming and largely manual. Microfilm was designed for storing analogue images, and in a world overflowing with digital data, traditional microfilm is simply not an efficient alternative.

A microfiche has many of the same attributes as microfilm, but is formatted as a card rather than as a film reel. Microfiches are normally stored in open-top envelopes in drawers or boxes. An ultrafiche is a compact version of a microfiche, storing images at much higher densities and often made directly from computers.

Microfilm reader


PIQL’S TECHNOLOGY

What it is: Piql offers services based on a turnkey system designed to comply with the needs of long-term digital preservation. The result is piqlFilm, a storage medium that allows digital data to be preserved safely and efficiently and to be retrieved easily, independent of future access to specific technologies or vendors. Piql uses high-resolution photosensitive film as a digital storage medium. Data is written to film as large QR-codes, each containing 8.8 million pixels. This allows any kind of data to be preserved offline, on a storage medium with a documented lifetime of 500 years.

Piql has developed all software and hardware needed to write and read back data. Yet there is no vendor lock-in, as the decoding software is open source and the data can be retrieved by using any digital camera and computer available in the future.

Future-proof WORM: Piql’s solution offers many of the same preservation qualities as microfilm, but applied to digital data. As a true WORM medium stored offline, it is impossible to manipulate or delete data once it is written. Digital data written as 0 and 1 obviously needs decoding to get back the original file. To make data retrieval future-proof, explanations of how to decode and retrieve the information are written as human-readable text on the storage medium. By being self-contained with a documented lifetime of 500 years, valuable data can be preserved for the unforeseeable future. The technology is also migration-free in terms of storage medium, reading device and file formats (provided that archival formats are used). The latter is due to the fact that file format specifications can be written in readable text on the storage medium.

Data preserved in digital and visual formats

For digital and visual formats: Piql’s technology differs from traditional microfilm in enabling digital data to be stored alongside visual images, making it suitable for the digital era we have entered. The flexibility in combining digital and visual formats opens up new possibilities and additional security, such as visual previews of digital files.

Search and access files: Piql’s technology allows for metadata searches and accessibility on a different level than traditional microfilm. As a seamless element within a standard IT infrastructure, it lets users search for the requested file and get it back in its original format within minutes (retrieval time will vary according to each client’s preferences).

Migration-free long-term preservation: Piql’s preservation technology is not recommended for data that needs to be instantly accessible to ensure business continuity. It is rather a secure, migration-free and future-proof option for valuable data that needs to be preserved long-term.
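The general idea of writing binary data as a two-dimensional pattern of light and dark pixels, and decoding it again from an image, can be shown with a toy Python sketch. This is not Piql's actual frame format, which the text does not specify; the one-bit-per-pixel layout, the zero padding, the tiny frame width and the function names are all assumptions made for illustration.

```python
def encode_frame(data: bytes, width: int) -> list[list[int]]:
    """Lay the bits of the data out row by row in a grid that is `width`
    pixels wide (1 = dark, 0 = light), padding the last row with zeros."""
    bits = [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]
    bits.extend([0] * ((-len(bits)) % width))
    return [bits[i:i + width] for i in range(0, len(bits), width)]


def decode_frame(frame: list[list[int]], length: int) -> bytes:
    """Read `length` bytes back from the pixel grid, ignoring the padding."""
    bits = [pixel for row in frame for pixel in row][: length * 8]
    decoded = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        decoded.append(byte)
    return bytes(decoded)


payload = b"film"
frame = encode_frame(payload, width=10)    # toy frame, 10 pixels wide
restored = decode_frame(frame, len(payload))
```

Note that the decoder needs to know the payload length and the layout rules. Writing such decoding instructions onto the medium in human-readable form is what makes a format self-contained.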

Piql’s film and box are tested to ensure the preserved data will be kept intact for hundreds of years.


Disruptive storage technologies

New technologies have emerged moving away from the primary drivers of increasing capacity and speed. They aim to make storage smarter, more flexible and easier to manage.

Even though these technologies are not digital storage media, they enhance the performance and management of storage functions. They include NVMe, storage-class memory and intent-based storage management.

NVMe/NVMe-oF

NVMe (Non-Volatile Memory Express) is a powerful communications protocol targeted specifically at high-speed flash storage systems. It offers significantly higher performance and lower latencies for existing applications than the legacy protocols SATA and SAS. It enables new capabilities for real-time data processing in data centre, cloud and edge environments. NVMe-oF (NVMe over Fabrics) creates a very high-performance storage network with latencies that rival direct-attached storage (DAS). As a result, flash devices can be shared among servers when needed. A lack of robustness and maturity has so far limited NVMe/NVMe-oF adoption. However, new enhancements, such as the newly announced NVMe over TCP, are accelerating adoption.

STORAGE-CLASS MEMORY (SCM)

SCM allows for some processing to be performed at the storage layer rather than in the host CPU’s main memory. This computational storage increases efficiency and performance. SCM is in the range of 1,000 times faster than NAND-based flash alternatives, meaning microsecond rather than millisecond latency. Extensive adoption has not happened yet; however, Intel has launched the Optane DCPMM persistent memory module, which could accelerate adoption.

INTENT-BASED STORAGE MANAGEMENT

Intent-based storage management improves the planning, design and implementation of storage architectures, particularly for organizations coping with mission-critical environments. Intent-based approaches can deliver the same benefits as in networking, such as rapid scaling, operational agility and adoption of emerging technology. A developer who specifies a desired outcome (such as “I need fast storage”) is not consumed with administrative overhead and can provision containers, microservices or conventional applications more rapidly. As with any disruptive technology, the downside of intent-based storage management is the hurdle of deployment versus promised value. Intent-based storage is not a one-size-fits-all technology.

While improving the efficiency, performance and management of business processes through storage optimization, these disruptive technologies don’t solve the imminent problem of the survivability of the storage medium and its content into the future.
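What specifying a desired outcome rather than a configuration could look like is sketched below in Python. The sketch is purely hypothetical: the catalogue of storage classes, their latency and cost figures, and the `resolve_intent` function are invented placeholders and do not describe any real intent-based storage product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageClass:
    name: str
    latency_ms: float    # placeholder figure, not a measurement
    cost_per_tb: float   # placeholder figure, not a price quote


# Hypothetical catalogue the management layer can choose from.
CATALOGUE = [
    StorageClass("nvme-flash", latency_ms=0.1, cost_per_tb=200.0),
    StorageClass("sas-hdd", latency_ms=10.0, cost_per_tb=40.0),
    StorageClass("tape-archive", latency_ms=60000.0, cost_per_tb=10.0),
]


def resolve_intent(max_latency_ms: float) -> StorageClass:
    """Translate the intent 'at most this latency' into the cheapest
    storage class in the catalogue that satisfies it."""
    candidates = [c for c in CATALOGUE if c.latency_ms <= max_latency_ms]
    if not candidates:
        raise LookupError("no storage class satisfies the requested latency")
    return min(candidates, key=lambda c: c.cost_per_tb)


# "I need fast storage" expressed as a latency bound of one millisecond.
fast = resolve_intent(max_latency_ms=1.0)
```

The developer states only the outcome; which devices, protocols and tiers are used to meet it is left to the management layer.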


Emerging storage medium technologies

Several innovations aim to solve the challenges of data density (lack of storage space) and environmental impact (data centres’ energy consumption). However, most of the innovations are at an early stage of development, and the writing and reading technologies are complicated and costly.

For these technologies to become a reality, they have to be affordable, and the data retrieval technology must be open, available and easy to recreate in the future. The following are some of the new emerging technologies.

GRAPHENE

Graphene is made from sheets of carbon atoms arranged in a lattice. Each sheet is just one atom thick.

An international group of Russian and Japanese scientists has developed a material that will significantly increase the recording density of data storage devices such as SSDs and flash drives. They based their work on spintronics instead of electronics.

In spintronics, devices operate on the principle of magnetoresistance. There are three layers, of which the first and third are ferromagnetic and the middle one is nonmagnetic. Electrons passing through such a “sandwich” structure are scattered differently in the magnetized edge layers depending on their spin, which affects the resulting resistance of the device. Information in the form of the standard logical bits, 0 and 1, can then be read by detecting an increase or decrease in this resistance.

The scientists created the three layers using a combination of graphene and the semi-metallic Heusler alloy Co2FeGaGe. By selecting the Heusler alloy composition and the methods of its application, it was possible to create a thinner sample that significantly increases the capacity of magnetic memory devices without increasing their physical size.

Next, the scientists plan to scale the experimental sample and modify the structure.
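The read-out principle described for magnetoresistive devices, telling 0 from 1 by whether the resistance is higher or lower, can be expressed in a few lines of Python. The resistance values and the threshold below are made up for illustration, and which resistance state represents which bit is an assumed convention; neither is taken from the research described above.

```python
def read_bits(resistances_ohm: list[float], threshold_ohm: float) -> list[int]:
    """Map each measured resistance to a bit: a resistance above the
    threshold reads as 1, otherwise as 0 (an assumed convention)."""
    return [1 if r > threshold_ohm else 0 for r in resistances_ohm]


# Made-up readings for four memory cells; real devices and values will differ.
measured = [105.0, 98.5, 104.2, 99.1]
bits = read_bits(measured, threshold_ohm=101.5)
```

The larger the difference between the two resistance states, the more reliably the bits can be told apart.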

5D OPTICAL STORAGE

Researchers are using extremely durable quartz glass as a storage medium, which can survive disasters such as fires or solar flares that are potentially harmful to data centres. Besides the robustness of the medium, they can use additional degrees of freedom for data storage, which help to increase capacity. The storage solution is described as five-dimensional: information is encoded in multiple layers using the usual three spatial dimensions, and additionally in the orientation and size of the imprinted structures, giving five degrees of freedom for data storage.

The storage allows for hundreds of terabytes of data capacity per disc. It also has thermal stability up to 1,800 degrees Fahrenheit.

The main bottleneck at present is writing speed. The project is still in development, and there are no specifications for the reading device.

5D memory crystal


HELIUM DRIVES

Hard drives have traditionally not been airtight. Modern large-capacity helium drives are hermetically sealed to offer more storage per square inch, run cooler than air-filled hard drives and use less power to spin the disks thanks to helium’s lower resistance. The increased storage capacity is an obvious benefit, and running cooler is a huge advantage: many hard drives succumb to data-destroying damage as a result of overheating, and helium drives make this far less likely.

These drives are already available on the market. They are expensive, but the decreased risk of data loss from overheating might be worth it. However, they still have moving parts and use magnetic particles, so the risk of data loss from mechanical or magnetic forces remains.

SHINGLED MAGNETIC RECORDING (SMR)

SMR is a newer way for hard drives to record data. Drives with this technology push tracks closer together to achieve higher areal densities. Tracks are overlapped like shingles on a roof, increasing storage capacity in the same amount of space.

Each newly written track partly overlaps, and thereby trims, the previous one without compromising its content, which is what makes the greater storage capacity possible. Because the read head is narrower than the write head, the data can still be read after trimming.

One downside is slower random writes. SMR drives use write heads that are large relative to the width of the tracks, so writing one track also affects the following tracks. This limits SMR drives to sequential writing, where the following tracks do not yet hold data. To perform random writes (such as changes to data already on the drive), the data must first be arranged sequentially in cache, in this case by the drive, and entire sections of the drive must then be rewritten sequentially.
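Why random writes are slow on SMR drives can be made concrete with a toy model of a single shingled section (often called a zone). In the Python sketch below, changing one track forces every following track in the zone to be staged in cache and rewritten in order. The class, its method names and the track contents are illustrative assumptions and greatly simplify what real drive firmware does.

```python
class ShingledZone:
    """Toy model of one SMR zone in which writing a track clobbers the next one."""

    def __init__(self, tracks: list[str]) -> None:
        self.tracks = list(tracks)
        self.track_writes = 0  # physical track writes performed so far

    def append(self, data: str) -> None:
        """Sequential write: nothing follows the new track, so nothing is harmed."""
        self.tracks.append(data)
        self.track_writes += 1

    def update(self, index: int, data: str) -> None:
        """Random write: cache the following tracks, then rewrite them in order."""
        if not 0 <= index < len(self.tracks):
            raise IndexError("track index out of range")
        cached_tail = self.tracks[index + 1:]   # tracks that would be overwritten
        self.tracks = self.tracks[:index]
        for track in [data, *cached_tail]:
            self.append(track)


zone = ShingledZone(["a", "b", "c", "d"])
zone.update(1, "B")   # changing one track costs three physical track writes
```

In this model the cost of an update grows with the number of tracks that follow it in the zone, whereas appending at the end always costs a single write.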

SMR vs. Traditional HDD


DNA (DEOXYRIBONUCLEIC ACID)

DNA is the molecule that dictates how an organism develops. The DNA in our bodies is made up of nucleotides that form pairs in a specific order. There are four different nucleotides in DNA: adenine (A), cytosine (C), guanine (G) and thymine (T).

DNA data storage works by encoding digital data sequences (0s and 1s) into DNA sequences (A, C, G, T). The encoded sequence is then synthesized into artificial DNA. To retrieve information from this DNA, one must sequence the synthetic DNA and decode the nucleotide sequences back into binary data, which is extremely difficult.

The main advantages of DNA data storage are:

• Capacity: The storage capacity is incredibly high. DNA can hold 215 petabytes of data in a single gram.
• Longevity: DNA can last a very long time, in any condition, for tens of thousands of years without needing any special care or treatment.
• Size: DNA is extremely small and cannot be seen with the naked eye, making it very useful for storing data in a confined space.

The main disadvantages of DNA data storage are:

• Cost: The price is high. The cost of encoding data is an estimated $12,400 per megabyte, and an extra $220 is needed for retrieval/decoding.
• Read and write speeds: Writing to and reading from DNA is a lengthy and painstaking process.
• Not rewritable and no random access: Once data is encoded into DNA, there is no way of changing it without redoing the encoding process. There is also no random access functionality, which means that a specific part of the data cannot be accessed without decoding all of it.

For now, DNA-based storage and computing are not likely to be a noticeable part of everyday life, but they could have a massive impact on the big-picture view of humanity.
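The mapping from binary data to nucleotide sequences can be illustrated with the simplest possible scheme: two bits per nucleotide. The Python sketch below uses an assumed mapping (00 to A, 01 to C, 10 to G, 11 to T) and hypothetical function names. Practical DNA storage schemes add error correction and avoid problematic sequences, which this sketch does not attempt.

```python
# Assumed two-bits-per-nucleotide mapping, chosen for illustration only.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode_dna(data: bytes) -> str:
    """Encode bytes as a nucleotide sequence, four bases per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode_dna(strand: str) -> bytes:
    """Decode a nucleotide sequence produced by encode_dna back into bytes."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4 bases")
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode_dna(b"A")      # the byte 01000001 becomes C, A, A, C
restored = decode_dna(strand)
```

The computational mapping is the easy part. The cost and the slow speeds listed above come from physically synthesizing and sequencing the DNA.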

DNA encoding and decoding processes

DIFFERENT TECHNOLOGIES FOR DIFFERENT NEEDS

We have seen that each of the storage technologies has different characteristics, making them suitable for different needs. Data owners need to thoroughly consider their requirements in terms of, for example, acceptable data retrieval time, security level and budget limitations during the archival period.

The following table provides an overview of some of the attributes of each of the commercial storage technologies. It is important to keep in mind that this is a general overview of technologies that are constantly in development. Within each category, there are differences between vendors and products.

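Attribute                        Magnetic Tape   Optical Disks   Hard Disk Drive   Cloud   Microfilm   piqlFilm
Projected lifetime (years) [7]   30              100             6                 N/A     500         750+

The table also compares the technologies on the following attributes; the per-technology marks for these rows are not reproduced in this text version:

• Digital data
• Analogue data [8]
• True WORM [9] (Optical Disks: [10])
• Offline format [11]
• Instant retrieval
• Vendor-neutral reading device
• Migration-free storage medium
• True archival format [12]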

[7] General figures; there are variations from vendor to vendor
[8] Human-readable text or pictures
[9] Write Once, Read Many, meaning the data is unalterable after being recorded on the storage medium
[10] CD-R and DVD-R only
[11] Does not require any maintenance of the medium to ensure data integrity
[12] Self-contained format: descriptions of how the data is stored, file formats, information on how to decode the data, etc.

Piql secures long-term future access to valuable digital data. Piql, headquartered in Norway since 2002, offers Piql Preservation Services to clients around the world, using technology that converts high-resolution film into a digital preservation medium for the future.

Piql AS • Grønland 56, N-3045 Drammen, Norway • tel +47 905 33 432 • office@piql.com • www.piql.com

©2021 - Piql AS - All rights reserved | www.piql.com | Rev. 01-2021
