ERCIM News 80

January 2010
Special theme:
Digital Preservation

Trustworthy Preservation Planning with Plato

by Christoph Becker, Hannes Kulovits and Andreas Rauber

Digital content is short-lived, yet may prove to have value in the future. How can we keep it alive? Finding the right action to enable future access to our cultural heritage in a transparent way is the task of Plato.

Rapid changes in technology have considerably shortened the lifespan of digital objects. While analogue objects such as photographs or books directly represent their content, digital objects are useless without the technical environment for which they were designed. Unlike a book, a word-processor document cannot be read, a simulation cannot be re-run and re-evaluated, and sensor data cannot be interpreted without the right hardware, software and documentation. Digital objects are under threat at several levels: media failure, file format and tool obsolescence, and the loss of necessary metadata. Especially for born-digital material, this often means that the contained information is lost completely. Digital preservation has become a pressing challenge for any kind of IT-related operation.

Given that a digital object needs the correct environment in order to function, we can either recreate the original environment (emulation) or transform the object to work in different environments (migration). A growing number of tools that perform migration and emulation are available today, each with particular strengths and weaknesses, and often no single solution is optimal. Moreover, requirements vary across institutions and domains, and very specific constraints apply in each setting. The process of evaluating potential solutions against specific requirements and building a plan for preserving a given set of objects is called preservation planning. Preservation planning is the centerpiece of the reference model for an Open Archival Information System (OAIS, ISO Standard 14721:2003, see link below). So far, it has been a mainly manual, sometimes ad-hoc process with little or no tool support.

Figure 1: Preservation planning environment.

The planning tool Plato, developed as part of the Planets project (Preservation and Long-term Access through Networked Services) by the Digital Preservation Lab at the Vienna University of Technology, is a publicly available Web-based decision support tool that accesses a distributed architecture of preservation services. It implements a solid planning process and integrates a controlled environment for experimentation and automated measurement of outcomes. This enables trustworthy, evidence-based decisions, as required by the Trustworthy Repositories Audit & Certification Criteria (TRAC, currently under evaluation for ISO standardization).

Preservation Planning
To ensure digital content remains accessible to and authentic for future users, a plan must be created that takes into account legal and technical constraints such as storage space, infrastructure and delivery, copyright issues, costs, user needs and object characteristics.

A preservation plan defines a series of preservation actions to be taken by a responsible institution in response to an identified risk for a given set of digital objects or records (called a collection). The plan takes into account the preservation policies, legal obligations, organizational and technical constraints, user requirements and preservation goals. It describes the preservation context, the evaluated preservation strategies and the resulting decision for one strategy, including the reasoning behind that decision. It also specifies a series of steps or actions (called a preservation action plan), along with responsibilities, rules and conditions for their execution on the collection. Provided that the actions, their deployment and the technical environment allow it, this action plan is an executable workflow definition, such as a Planets workflow (see article "The Planets Interoperability Framework" in this issue).
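To make the parts of such a plan concrete, the following minimal Python sketch models them as a data structure. It is an illustration only and does not reflect Plato's actual plan format: the class names, field names and the consistency check are hypothetical, chosen to mirror the elements listed above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ActionStep:
    """One step of the preservation action plan (hypothetical structure)."""
    description: str      # what is done, e.g. run a migration tool
    responsible: str      # who is accountable for this step
    condition: str = ""   # rule or condition for execution, if any


@dataclass
class PreservationPlan:
    """Illustrative model of a preservation plan; not Plato's schema."""
    collection: str                   # the set of objects the plan covers
    identified_risk: str              # why action is needed
    constraints: List[str]            # policies, legal, organizational, technical
    evaluated_strategies: List[str]   # all candidates that were assessed
    chosen_strategy: str              # the resulting decision
    reasoning: str                    # justification for the decision
    action_plan: List[ActionStep] = field(default_factory=list)

    def problems(self) -> List[str]:
        """Return consistency problems; an empty list means none were found."""
        found = []
        if self.chosen_strategy not in self.evaluated_strategies:
            found.append("chosen strategy was not among the evaluated ones")
        if not self.reasoning.strip():
            found.append("decision has no documented reasoning")
        if not self.action_plan:
            found.append("no action plan steps defined")
        return found


# Made-up example values, for illustration only.
plan = PreservationPlan(
    collection="scanned newspaper pages",
    identified_risk="viewer software for the current format is being discontinued",
    constraints=["no loss of image resolution", "open, documented target format"],
    evaluated_strategies=["migrate to format X", "migrate to format Y",
                          "emulate original viewer"],
    chosen_strategy="migrate to format X",
    reasoning="best overall result in the evaluation experiments",
    action_plan=[ActionStep("convert all objects with the selected tool",
                            "repository manager",
                            "run after a successful backup")],
)
print(plan.problems())
```

Keeping the evaluated strategies and the reasoning next to the decision is what makes the plan auditable: a reviewer can see which alternatives were considered and why one was selected.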

The four-phase high-level workflow shown below can further be divided into fourteen steps. Evaluation of candidate actions uses controlled experiments and increasingly automated measurements.

Potential migration and emulation tools are applied to sample content and evaluated according to a hierarchy of requirements, based on Utility Analysis. A service-oriented framework greatly automates experiments and allows users to leverage various publicly available Web service registries that provide access to potential preservation action tools. Quality-aware services measure execution parameters and quality of the action tools, removing this burden from the experimenter.
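The evaluation idea can be sketched in a few lines of Python. This is a simplified illustration of utility analysis over a requirements hierarchy, not Plato's implementation: it assumes a utility scale from 0 to 5, sibling weights that sum to 1, and weighted-sum aggregation. The requirement names, weights, measurements and tool names are all made up.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Node:
    """One requirement in the hierarchy; leaves carry a utility function."""
    name: str
    weight: float  # relative weight among siblings (assumed to sum to 1.0)
    children: List["Node"] = field(default_factory=list)
    # Leaf only: maps a raw measurement to a utility value in [0, 5].
    to_utility: Optional[Callable[[object], float]] = None


def utility(node: Node, measurements: Dict[str, object]) -> float:
    """Weighted-sum utility of one candidate action for this subtree."""
    if not node.children:
        return node.to_utility(measurements[node.name])
    return sum(c.weight * utility(c, measurements) for c in node.children)


def rank(tree: Node, candidates: Dict[str, Dict[str, object]]):
    """Return (candidate, score) pairs, best first."""
    scores = {name: utility(tree, m) for name, m in candidates.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical requirements tree for an image migration scenario.
tree = Node("root", 1.0, [
    Node("object characteristics", 0.6, [
        Node("image width preserved", 0.5,
             to_utility=lambda ok: 5.0 if ok else 0.0),
        Node("colour depth preserved", 0.5,
             to_utility=lambda ok: 5.0 if ok else 0.0),
    ]),
    Node("process", 0.4, [
        Node("seconds per object", 1.0,
             to_utility=lambda s: max(0.0, 5.0 - s)),
    ]),
])

# Made-up measurements from experiments on sample content.
candidates = {
    "tool A": {"image width preserved": True,
               "colour depth preserved": True,
               "seconds per object": 3.0},
    "tool B": {"image width preserved": True,
               "colour depth preserved": False,
               "seconds per object": 1.0},
}

for name, score in rank(tree, candidates):
    print(f"{name}: {score:.2f}")
```

In this made-up example the slower tool ranks first because it preserves both object characteristics, which carry more weight than processing speed. Changing the weights changes the ranking, which is why the requirements and their weights need to be documented together with the decision.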

Figure 2: Visualization of results.

The result of using the tool is a complete preservation plan that can be deployed and executed.

Current and future work includes:

  • Repository integration: we are working on an integration of Plato with leading digital repository systems such as ePrints, RODA and other Fedora-based solutions to add preservation planning functionality to these systems.
  • Monitoring: continuous monitoring of repository operation is essential and should include monitoring preservation plans.
  • Proactive recommendation: by building recommender technology, we want to further increase the level of proactive planning in Plato.
  • Deployment: Plato is being evaluated and used by several institutions to assist in planning long-term preservation (including the British Library, the Royal Library of Denmark and the Bavarian State Library), with further case studies focusing specifically on non-heritage application domains such as the medical sector (medical imaging), e-Government, production processes and scientific data sets.
  • Compliance validation: with both service provision and trust gaining importance in the handling of digital content, full integration and validation within operational procedures are being evaluated in the context of respective international standardization initiatives.

Plato is publicly available free of charge at the project Web site.

Links:
Plato Project: http://www.ifs.tuwien.ac.at/dp/plato
ISO-14721:2003: OAIS, Blue-Book: http://public.ccsds.org/publications/archive/650x0b1.pdf
Planets Project: http://www.planets-project.eu
Digital Preservation Lab, Department of Software Technology and Interactive Systems, Vienna University of Technology: http://www.ifs.tuwien.ac.at/dp
Trusted Repositories Audit and Certification Checklist: http://www.crl.edu/sites/default/files/attachments/pages/trac_0.pdf
Full preservation plan definition: http://www.ifs.tuwien.ac.at/dp/plato/docs/plan-template.pdf

Please contact:
Christoph Becker
Technical University Vienna/AARIT, Austria
Tel: +43 1 58801 18818
