Digital Archaeology: Rescuing Neglected and Damaged Data Resources

(A JISC/NPO Study within the Electronic Libraries (eLib) Programme on the Preservation of Electronic Materials)

Purpose of the Study

The study will aim to examine the approaches to accessing digital materials where the media has become damaged (through disaster or age) or where the hardware and software are either no longer available or unknown. The summary of the problem given in the request for tender document under the Aim of the study provides a good sketch of the issue. Other examples might have included the rescue of the Stasi tapes by the German Archives, where the format of the information on the tapes was not readily known, the documentation was limited, and the hardware and software had to be identified (or constructed). The numerous examples of rescue from media in post-crisis situations (after fires or flooding) provide us with evidence as to how the process of rescue needs to be managed and of some of the obstacles that are encountered during a rescue. They also provide an indication of the labour investment, financial costs, and methods of working.

The project will examine in detail the issues included among the objectives listed in the request for tender. It will:

  • survey current activities, identify significant rescue projects (in terms of information value, complexity of rescue, and quantity of information recovered) and describe how the rescue issues were approached and what lessons were learned during the rescue activities;
  • examine the kinds of data formats and types that can be rescued from the vantage of hardware and software;
  • examine the issues of rescue from media whose format is unknown, where the hardware and software for reading the media no longer exist, and where the media has become damaged;
  • identify the technology preservation (e.g. museums, commercial retroconversion firms) and disaster recovery (or digital rescue) (e.g. public sector and commercial) services and companies;
  • describe the kinds of operation (technical and organisational) which are necessary to carry out this work, and address the question of whether it can be done on an ad hoc basis or whether it can only be done effectively by an established institution;
  • identify the issues that make the need to rescue inevitable and which increase the likelihood that rescue will not be successful;
  • identify any guidelines which might help us to avoid having to turn to the rescue path; and,
  • investigate the kinds of possible pilots that might be undertaken, if sufficient examples of rescue activities cannot be identified.

Method of approaching the problem

The project will begin with a literature review (including both online and print resources, and where possible the grey literature) and a review of marketing literature from disaster recovery companies, major storage vendors, and whatever limited information is available from companies which provide these kinds of services to the intelligence and law enforcement communities. This will be complemented by face-to-face discussions, telephone interviews and written exchanges with selected representatives from these sectors. The main information gathering exercise is likely to involve at least 20 face-to-face discussions, at least twice as many telephone interviews, and a full range of written exchanges. The initial port of call will be colleagues at the National Media Laboratory in the US who have experience in rescue activities. The project will also study the limited work which has been undertaken by National Archives, such as the work done by the Swedish National Archives and that undertaken by the German Archives in Koblenz (to rescue information from Stasi materials).