Digitization vs Digital Preservation and Preservation Level Policy Making

Preservation Level Policy Making

Preservation level speaks to the “actions or objectives a repository intends to fulfill in regard to an Object in its custody.”[1] In practice, preservation level indicates the degree of preservation an object receives, ranging from simple “bit-level” preservation to permanent preservation. Preservation level policy provides a foundation for how records are maintained: records deemed worthy of permanent preservation require more rigorous, and potentially more expensive, service to ensure their long-term viability. Long-term viability includes both renderability and understandability.[2] In practice this means that the records must survive both shifts in technology and unforeseen accidents. Ultimately, the practices and processes that ensure these characteristics must be informed by policy; otherwise staff have no consistent philosophy or operating procedure to guide their work.

Preservation level policy is also sound for economic reasons. Compared with lower levels of preservation, permanent records often require complex software or additional staff hours to maintain, and these resources must be budgeted for. It follows that many institutions cannot practically maintain every record at the permanent level; staff and funding shortages make this difficult, especially in active and expanding archives. Policy enables sensible preservation “triage”: records deemed to be of lesser archival significance can be assigned to lower levels of preservation so that the permanence of important collections is assured.

Preservation policy also ensures that objects are preserved to an exacting standard. PREMIS 2.0 defines a semantic unit for recording why a particular preservation level was assigned to an object.[3] Such contextual information serves as a statement of intent, so that future archivists who encounter the object can treat it appropriately. One might designate an object for “full preservation,” i.e. to maintain its original characteristics as nearly as possible. Policy serves a critical role in ensuring consistent metadata and thus appropriate preservation. PREMIS 2.0 semantics even provide a framework for recording current and future preservation levels in an object’s metadata; the process and timeline by which a transition between levels occurs must also be informed by policy.
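As a rough illustration of how a preservation level and its rationale might be recorded, the following Python sketch models a small history of level assignments for one object. The field names loosely follow the PREMIS 2.0 preservationLevel subunits (value, rationale, date assigned); the class, the function, the level names “bit-level” and “full,” and the sample dates are hypothetical and are not taken from any institution's policy.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class PreservationLevel:
    value: str           # modeled on preservationLevelValue (vocabulary is hypothetical)
    rationale: str       # modeled on preservationLevelRationale
    date_assigned: date  # modeled on preservationLevelDateAssigned


def current_level(levels: List[PreservationLevel], today: date) -> Optional[PreservationLevel]:
    """Return the most recently assigned level that is already in effect."""
    in_effect = [level for level in levels if level.date_assigned <= today]
    if not in_effect:
        return None
    return max(in_effect, key=lambda level: level.date_assigned)


# Hypothetical history: a planned upgrade from bit-level to full preservation.
history = [
    PreservationLevel("bit-level", "Accessioned pending appraisal", date(2010, 1, 1)),
    PreservationLevel("full", "Appraised as permanently valuable", date(2030, 1, 1)),
]
print(current_level(history, date(2020, 6, 1)).value)  # bit-level
```

Recording the rationale alongside the value is what lets a later archivist see not only which level applies but why it was chosen.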

Yet another useful function of policy is standardization. Well-written policy becomes a field manual to which archivists can refer. It helps avoid the analog nightmare of opening a dusty drawer to find stacks of records with rusted paperclips and acidic leaves burning through them. The digital world has its parallels: corrupt hard disks, missing files, and spreadsheets and other files that overwrite one another when migrated. A standardized approach to preservation and metadata production minimizes the likelihood of malpractice, because it becomes part of the canon of the institution.[4]

Preservation level policy can also be useful for personal collections, for much the same reasons as for archival institutions, although at a smaller and more immediate scale. Old papers of nostalgic value can be copied to a DVD or storage disk and checksum-verified every month or so, but records deemed irreplaceable and of prime importance must be approached differently. Redundant backup systems must be devised, along with a way of regularly checking the integrity of both the originals and the copies. The price of neglect is lost files: in our digital age a corrupt file is usually lost forever, and only rarely recovered by expensive and intrusive restoration services.
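A periodic integrity check of this kind can be sketched in a few lines of Python. The example below records a SHA-256 checksum for every file under a folder and later reports which files have gone missing or changed. The function names and the choice of SHA-256 are assumptions made for illustration, not a prescribed method.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify(root, manifest):
    """Compare files under root with a stored manifest.

    Returns two lists of relative paths: (missing, changed).
    """
    root = Path(root)
    missing, changed = [], []
    for relative, expected in manifest.items():
        target = root / relative
        if not target.is_file():
            missing.append(relative)
        elif sha256_of(target) != expected:
            changed.append(relative)
    return missing, changed
```

In use, the manifest would be built once, stored somewhere other than the disk it describes, and passed to verify() on whatever schedule the policy sets. The same manifest can be checked against each backup copy.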

Digitization vs Digital Preservation

Simply put: digitizing a record is not the same as preserving it. Digital preservation speaks to the processes and procedures which ensure long-term access to a digital record. It is the science of guarding digital records against both mechanical breakdown and technical obsolescence over time. While digitizing implies the transfer of an analog record to digital form, it does not necessarily imply a plan to ensure the renderability and understandability of that record for future generations. Digital preservation takes the wisdom of the archivist and applies it to the realm of computers and the internet: it is not enough simply to “hope for the best” when storing digital files, as Conway (2000) so succinctly demonstrated. Specifically, digital files left to their own devices, with no special effort taken to maintain them, typically become inoperable within a decade. This is the shortest lifespan of any medium to date, shorter even than that of highly acidic paper. And while highly acidic paper becomes brittle and browns with age, alerting owners to impending deterioration, digital records become corrupt silently and often en masse. In a world in which “born digital” records are becoming the norm, and in which analog means of information storage and retrieval are being replaced by electronic ones, serious attention must be paid to preserving digital data, lest the new host of documents detailing our cultural heritage be lost to neglect.[5] To summarize: digitization is a process of migration from the analog to the digital; digital preservation is the process, science and philosophy of ensuring that digital records are not lost to time.

A major aspect of digital preservation is addressing digital obsolescence: the incompatibility of older records with computer systems running newer hardware and software, which often leaves the record difficult or impossible to access. During the 1990s the de facto archival medium for digital files was tape. Today few drives remain that can read those tapes, to say nothing of the state of the tapes themselves, so the information stored on them is difficult or impossible to retrieve. This is one example of digital obsolescence, in which technology changes faster than stored information is carried forward. The situation is complicated by a lack of standard protocols for digital preservation, although OAIS and PREMIS have attempted to address these issues in recent years.[6] Wise digital preservation includes a serious evaluation of the medium, both hardware and software, to ensure continued access to records.

The issues of physical deterioration and digital obsolescence can be answered by a suite of tools: metadata, refreshing, migration, copying, and emulation. Metadata is contextual information attached to the record, including data regarding its provenance, the technologies which created it, the hardware on which it is stored, and more. The hope is that attaching such information will minimize the likelihood of obsolescence and benign neglect. PREMIS offers a standardized data dictionary for this purpose, which is slowly but surely becoming standard in the digital archives universe. Refreshing, which is what the class did in the digital curation project, is the transfer of data from one storage medium to another so that the bits are regularly renewed and do not degrade. Refreshing digital records is a short-term solution to the long-term problem of digital medium disintegration.[7] Migration is the conversion of the record to “newer system environments.”[8] In practice migration means moving a record from one file format to another, often for purposes of renderability and understandability, or from one operating system to a newer one so as to avoid obsolescence. The overall goal of migration is to maintain the functionality of the record and to avoid the scenario in which it cannot be accessed because of software or hardware constraints. Copying a record to multiple locations ensures that a localized catastrophe does not spell the end of the record.

Emulation refers to software which recreates the functionality of an obsolete system. Old word processors, for example, can be run under an emulator and then used to access records which would otherwise be inaccessible. In this fashion obsolete records can be accessed or migrated to newer media. Emulation is still largely at a theoretical stage, although it has some prominent supporters, Jeff Rothenberg among them, and is gaining impetus in the literature. The recently launched modular emulator Dioscuri is designed to emulate a wide range of early computing applications and operating systems.[9]
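Refreshing and copying, as described above, both come down to moving the same bits to new storage and confirming that nothing changed on the way. The Python sketch below copies a file to several destinations and compares checksums after each copy. The function names, the use of SHA-256, and the idea of deleting a copy that fails verification are illustrative assumptions rather than an established procedure.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate(source, destinations):
    """Copy source to each destination and verify every copy bit for bit.

    Returns the digest of the source. Raises OSError if a copy does not
    match, after removing the bad copy.
    """
    source = Path(source)
    expected = file_digest(source)
    for destination in map(Path, destinations):
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, destination)  # copy2 also carries over timestamps
        if file_digest(destination) != expected:
            destination.unlink()
            raise OSError(f"copy to {destination} failed verification")
    return expected
```

A check made immediately after copying may be answered from the operating system's cache rather than from the new medium itself, so a later re-check of each copy is still worthwhile.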

Digitization for preservation is sound if the original analog record has deteriorated to such a degree that it is no longer feasible to maintain the record physically. The priority is to preserve records of archival or other cultural significance. Normally this entails producing high-resolution scans, “dark archive” master copies (often in TIFF), and derivatives (JPEGs, GIFs, etc.) for more general access. Great care must be taken to ensure that digitization does not become an editorial or creative process; the digital record should be as similar to the original as possible. Sharpening, light masks and other modifications should be used only to replicate the composition of the record more faithfully.[10]


[1] Brian F. Lavoie. PREMIS With a Fresh Coat of Paint. D-Lib Magazine. 2008. http://www.dlib.org/dlib/may08/lavoie/05lavoie.html

[2] Ibid.

[3] PREMIS: Preservation Metadata Maintenance Activity. http://www.loc.gov/standards/premis/

[4] Digital Preservation Coalition. What to preserve? Significant Properties of Digital Objects. http://www.dpconline.org/events/significant-properties.html

[5] Margaret Hedstrom. Digital preservation: a time bomb for Digital Libraries. http://www.uky.edu/~kiernan/DL/hedstrom.html

[6] David M. Levy and Catherine C. Marshall. Going digital: a look at assumptions underlying digital libraries. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.63.2540&rep=rep1&type=pdf

[7] University of Michigan. Digital Preservation Management: Implementing short-term solutions for long-term problems. http://www.icpsr.umich.edu/dpm/dpm-eng/contents.html

[8] Garrett, J., D. Waters, H. Gladney, P. Andre, H. Besser, N. Elkington, M. Hedstrom, P. Hirtle, K. Hunter, R. Kelly, D. Kresh, M. Lesk, M. Levering, W. Lougee, C. Lynch, C. Mandel, S. Mooney, A. Okerson, J. Neal, S. Rosenblatt, and S. Weibel. Preserving digital information: Report of the task force on archiving of digital information. 1996. http://www.rlg.org/legacy/ftpd/pub/archtf/final-report.pdf

[9] Jeffrey van der Hoeven. Dioscuri: emulator for digital preservation. D-Lib Magazine. 2007. http://www.dlib.org/dlib/november07/11inbrief.html

[10] Cornell University. Digital Imaging Tutorial. http://www.library.cornell.edu/preservation/tutorial/quality/quality-01.html