No viruses or checksum inconsistencies at the Digital Preservation Party please: validating, verifying and transferring born digital architecture records.

To acknowledge Digital Preservation Day, staff at Newcastle University Special Collections wanted to share some of their recent experiences in accessioning born digital material. The practical part of the digital preservation process has been on pause since the start of the pandemic, but now we are back in the office, interacting with our colleagues and scrutinising all the data involved in protecting the longevity of the bit stream.

Introducing Digital Preservation

Before we dive into what we have been doing, we thought it would be good to introduce our understanding of digital preservation: what it is and why it is considered increasingly important for archival collections. If you are already au fait with the theory of born digital material, feel free to skip to the next section.

Digital preservation can be defined as a structured set of activities designed to ensure continued access to born digital materials for as long as is deemed necessary. This includes maintaining the availability of compatible hardware-software relationships, trained practitioners, and appropriate workflows to select, transfer, ingest, and access ‘born digital’ files.

No single rigid workflow for digital preservation suits every organisation, but there are widely used software, helpful resources and models to aid you when formulating a managed set of activities tailored to your own institution. See here for a handy guide introducing you to the technical jargon. Born digital materials are an increasingly significant part of archival collections, especially when cataloguing and providing access to late twentieth century material. It follows that the knowledge and skills needed to preserve and provide access to them are also increasingly in demand.

Digital Preservation within Newcastle University Special Collections

A department-wide digital preservation strategy has been in place within Newcastle University Special Collections since 2019. This includes full workflows to accession, process and catalogue, access, and preserve born digital materials within the wider aims of facilitating research, encouraging collection transparency and maintaining a pathway to providing long-term access to heritage collections. As a result of these steps, users of Special Collections can now access digital material held within the Bloodaxe and Donaldson (Sir Liam) collections.

Newcastle University has recently become the recipient of the Farrell (Sir Terry) Archive, and one of our next priorities is to process the collection. You would expect a substantial architectural collection to contain voluminous quantities of physical building plans, and this collection is no exception. However, the archive also contains a substantial amount of born digital material in the form of CAD and drawing files, along with reports and contractual documentation from the professional practice of Sir Terry as a leading architect over the course of his professional life. These digital records are predominantly contained on CD-Rs and the occasional floppy disc.

A box of CD-Rs from the Farrell (Sir Terry) Archive.
The born digital materials, demonstrating that great things come in small packages.

Fresh from ‘Novice to Know-How’ training prepared by the National Archives, Senior Archives Assistant Jemma Singleton set about actioning the existing digital preservation strategy for this material. It was a way of testing out new-found knowledge in a practical way, and of exploring whether regular digital preservation can be conducted as a parallel exercise to more traditional archival cataloguing. The second part of this blog post details what we did, the challenges we faced, how we creatively solved these challenges, and what we learned during the process.

Accessioning Born Digital Architecture Files

The items selected for digital accession and file transfer involved Sir Terry Farrell’s work for the refurbishment of the Royal Institution, along with some video interviews. The item containers (e.g. the CD-Rs) had already been physically catalogued, but the contents needed to be virus checked, validated and transferred. The working practice we followed, supported by in-house demonstration and advice from Archivist Ruth Sheret, is detailed below.

  1. Assign a parent-code file ID and keep daily log of activities up to date for steps (2-5).
  2. Run each disc through virus checking software (Malwarebytes Premium), using an un-networked quarantine computer, and note the results.
  3. Validate digital item using DROID to check for encrypted files, unexpected file types, file structures, or files that are not wanted, and note the results.
  4. Transfer files to a local area, generating a SHA-256 checksum for the material once before transfer and once after it has been transferred to the local area. Check that the two match.
  5. Refile physical item and repeat for all desired material.
  6. Transfer files from local area to shared access area once computer is connected back to the network (but only if it remains virus free!).
Jemma at a computer conducting born digital material checks.
The feeling you want to have when everything checks out.
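
Step 4 of the workflow above could be scripted along the following lines. This is only a minimal sketch in Python, not the tooling used in our session: the function names (sha256_of_file, build_manifest, transfer_and_verify) are hypothetical, and it assumes that the checksum algorithm is SHA-256 and that both the disc contents and the local area appear as ordinary folders.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Reading in chunks keeps memory use low for large CAD or video files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def transfer_and_verify(source_dir, target_dir):
    """Checksum the source, copy it, checksum the copy, and compare.

    target_dir must not exist yet, because shutil.copytree creates it.
    """
    before = build_manifest(source_dir)
    shutil.copytree(source_dir, target_dir)
    after = build_manifest(target_dir)
    return before == after, before, after
```

Calling transfer_and_verify with the disc's folder and a new local folder returns True as its first value only if every file has the same digest before and after the copy. The two manifests are returned as well, so they can be noted in the daily log.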

Challenges and Workaround Solutions

Appropriate Equipment in the Right Location

The Farrell (Sir Terry) Archive is stored separately from the majority of Newcastle University Special Collections. The process of acquiring and setting up an appropriate virus checking computer is currently delayed with the IT department at Newcastle University. This meant that identified born digital items within the collection were transported to the main library, where an established quarantine computer was used. Although it would be ideal to have appropriately set up IT systems in the right location, transporting items to wherever the digital preservation resources are may prove to be an adequate long-term workaround. This solution had the serendipitous consequence of enabling other colleagues (Rachel Hawkes, Literary Archivist) working on other collections with born digital material to engage in training, enhancing the digital preservation skill set of the wider team.

Ruth and Rachel engaged in virus checking and checksum validation.

Appropriate Software

As we hadn’t used this computer for a while, one of the first things required was an update for all the software we use for transferring born digital material. One of our tools (DROID) was glitchy in the morning, so we considered two workarounds: either hold off on the validation process until a later date, or make manual checks. This was an easy decision to make for the material we had: Ruth checked the material and quickly decided that manual checks would be reliable and not very time consuming. We decided to harvest the metadata at a future stage in the whole digital preservation process.

Checksum Inconsistencies

This session was a virus-free party and special care was taken to make sure the hardware was always unplugged from the network. However, some checksum results before and after transfer did not match. This was often due to an inbuilt folder within the portable hardware storage device, acting as a secondary digital container but not a relevant digital record. Nevertheless, this required an in-depth, manual check of the files transferred from their physical format onto the local system.
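
When checksums do not match, a per-file comparison can show where the difference lies before any manual check begins. The sketch below is an illustration rather than the method used in our session: it assumes you already have two mappings of relative file path to checksum (one taken before transfer, one after), and the function name, example paths and shortened digests are all hypothetical.

```python
def compare_manifests(before, after):
    """Report which files are missing, extra, or changed after a transfer.

    before and after map a relative file path to its checksum string.
    """
    before_paths = set(before)
    after_paths = set(after)
    return {
        # On the source but not in the transferred copy.
        "missing": sorted(before_paths - after_paths),
        # In the transferred copy but not on the source.
        "extra": sorted(after_paths - before_paths),
        # Present in both, but with different checksums.
        "changed": sorted(
            p for p in before_paths & after_paths if before[p] != after[p]
        ),
    }


# Hypothetical example: the device carries a folder that was not transferred.
before = {
    "plans/site.dwg": "aa11",
    "report.pdf": "bb22",
    "device_folder/index.dat": "cc33",
}
after = {
    "plans/site.dwg": "aa11",
    "report.pdf": "bb22",
}
result = compare_manifests(before, after)
print(result)
```

If the only entries reported sit inside a device folder that is not a relevant digital record, the mismatch can be explained and logged without doubting the records themselves. Any entry under "changed" would still need a closer look.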

Time

Aspects of the virus checking and file transfer process can feel like lost hours, as they are out of your control once the computer processes have been initiated. The most significant factor to consider is how long it can take to transfer quantities of large files from a local offline environment to a shared networked workspace at the end of a digital preservation session. For this session there were 3851 files in total, which took 38 minutes to transfer over. Knowledge of how long different parts of the digital preservation process can take will come with experience, as will the confidence to conduct other workplace activities alongside digital preservation. Top tip – save enough time to do the offline to online file transfer at the end of the day, especially if you have a bus to catch, or a dentist appointment, or post-work fun times.

Digital Preservation Day Reflections

It is a positive step to begin implementing the long-term working strategy for born digital files from the Sir Terry Farrell archive, in line with the existing processes of Newcastle University Special Collections. It also felt good to create solutions to working obstacles that cropped up along the way. Future steps for this collection will be to formally catalogue the digital files and make access copies for users, along with incorporating regular digital preservation sessions into the cataloguing activities.

It was noted that much of the intellectual content of born digital materials transferred as part of the Farrell (Sir Terry) Archive already exists as a physical copy within the collection. This has raised a wider question: how many copies of an item should be kept, and in which format? Physical collections are space hungry but relatively stable to store, whereas born digital material is physically economical on space but costs money over time to host and maintain on a server. Then there is the umming and aaahhing about how an appropriate strategy is resourced and organised for the long-term accessibility of born digital materials that increasingly form the records of any modern organisation, and, specific to this blog, archives dealing with late twentieth/early twenty-first century material. But that may need to wait for another time: protect your bits and Happy Digital Preservation Day.

Sir Terry Farrell’s archive has been generously loaned to Newcastle University Library and is currently being catalogued. Once catalogued it will be made fully available to the public.  All rights held by The Terry Farrell Foundation. 
