Thursday, January 5, 2012

Class No. 2: Thinking Digital: A Practical Session to Help You Get Started

On Tuesday I began the new year by taking my first DAS course presented as an on-demand webinar: "Thinking Digital...A Practical Session to Help You Get Started," taught by Jessica Branco Colati and Greg Colati. This is one of the Foundational DAS courses, and it serves as a good overview of the decisions we make as creators and stewards of digital content. Unlike the first course I took, this was more focused on the digitization of traditional archives than on born-digital records.

The course was organized by the kinds of choices digital archivists must make about quality, metadata, management, storage, preservation, and delivery. Without repeating all of the information in the course, I'll just list the highlights as I saw them:
  • High quality digital objects adhere to five established principles: interoperability, reusability, sustainability, authenticity, and scalability. Better quality requires more time, more storage, and better equipment, but it also allows for a wider variety of uses. We should create the best quality digital objects we can afford now so that we have greater flexibility later.
  • Metadata allows for the identification, management, access, use, and preservation of a digital object. There are several different types of metadata: administrative, descriptive, preservation (including technical), and structural. Metadata should support local needs, but should also be standardized in order to enable interoperability. Controlled vocabularies should be supported. Keep in mind that metadata is never truly finished - there will always be changes or updates to make, or new information to capture.
  • The management of digital files must include all derivatives of the original object (and there could be hundreds) as well as the metadata about that object. Management must be built into your digitization workflow; it should not be a separate activity. There is no one digital asset management system (DAMS) that will solve all of your problems - you will most likely need an array of systems to accomplish all of your goals. In any DAMS, web delivery is only a small piece of the puzzle despite how important it is to users and probably to your management.
  • Storage choices will depend on the choices about quality you made earlier - the higher the quality of your files, the more storage you will need. While storage may be getting cheaper, backup and preservation services are getting more expensive. It might be best to consult an expert when it comes to storage.
  • Preservation starts at the point of creation of a digital object, which is also the point at which the creators of digital content probably don't want to be bothered with questions about preservation, so it's on us as archivists to maintain the focus on preservation concerns. The first stage in a successful preservation plan is simply to acknowledge that digital preservation is important (much like the first step in overcoming addiction is to acknowledge that you have a problem, I suppose). This is as far as we've gotten, to be honest, but we hope to move on to the next stage soon, which is to take action.
  • Delivery involves both discovery and access. Discovery is based on the indexing of your metadata and/or the full text of your scanned documents. Access is how users interact with your digital objects once they are discovered - are the objects simply viewed, or are they able to be manipulated or extracted by the user?
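
To make the storage point above a little more concrete, here is a rough sketch in Python of how scanning resolution and bit depth drive the size of an uncompressed image. The page size, resolutions, and function name are my own illustrative assumptions, not figures from the course, and real files will vary with format and compression.

```python
def uncompressed_image_bytes(width_in, height_in, ppi, bits_per_channel=8, channels=3):
    """Estimate the uncompressed size, in bytes, of a scanned image.

    width_in, height_in: physical size of the original in inches (assumed values)
    ppi: scanning resolution in pixels per inch
    bits_per_channel, channels: 8 bits x 3 channels is ordinary 24-bit RGB
    """
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    return width_px * height_px * channels * bits_per_channel // 8


if __name__ == "__main__":
    # Hypothetical example: a letter-size page scanned at two resolutions.
    for ppi in (300, 600):
        size = uncompressed_image_bytes(8.5, 11, ppi)
        print(f"{ppi} ppi: about {size / 1_000_000:.1f} MB uncompressed")
```

Doubling the resolution quadruples the pixel count, so under these assumptions the 600 ppi scan needs about four times the storage of the 300 ppi scan, before counting any derivatives or backup copies.
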
The first point the instructors made before delving into what I described above was that the skills we already have as archivists can be easily adapted to the digital environment. I find that this is particularly true when it comes to the following: 

  • Planning and prioritizing digitization workflow. This is no different from planning and prioritizing our processing workflow, and should be done in the same systematic way.
  • Creating descriptive metadata. Descriptive metadata is archival description, which means that we already know how to create it, and also that our finding aids are full of preexisting descriptive metadata.
  • Managing and preserving digital assets. We manage our physical holdings, whether through the use of a database or a paper shelf list, and we are responsible for their long-term preservation. This is true of digital files as well, whether they are born-digital records or digital surrogates of physical objects. Though digital files do present some specific challenges that will require more technical knowledge than we may start out with, the fundamental responsibility is the same.

One final comment: the on-demand courses are available for two months once you register for them, which is very convenient, but it turns out that this flexibility actually made it difficult for me to find time for the course when so many other things required immediate attention. I registered for this webinar back in November, and I was lucky to complete it just before the two months expired. I do plan to take additional on-demand courses, but in order to thwart my inner procrastinator I will try to schedule a specific day for them as if I were taking them live.

I will be taking another Foundational DAS course, "Digital Curation: Creating an Environment for Success," on January 18th in Boston, so I'll be posting again in a few weeks. Until then, thanks for reading!


  1. Hey Erica -- another interesting read. Thanks for sharing your experiences.

    One thing I was thinking about was the issue (and perhaps the necessity) of re-digitizing material in the future. It's great if institutions digitize material now from negatives, photographic prints, or other kinds of relatively stable analog media. But the advances in imaging science during the past 5 years have been sort of mind blowing (think of the best digital camera from 2007 and compare it to today's best). Also, the platforms that those items can be retrieved on (like social media/web 2.0 junk, etc.) have changed quickly. Obviously high quality is the best practice when trying to digitize items for the present, but how do you predict the future of imaging science with regard to an archival setting? It's almost like all the work that we think is high-quality enough now might become obsolete faster than we may realize, and that's a terrifying thought. One area, however, where it may not apply is sound. Humans can't hear the difference past a certain sample rate/bit depth when digitizing from analog, so the line can be drawn there. But what if someone invents something tomorrow that revolutionizes digitizing sound? How do you prepare for something that hasn't been invented yet? And do archives need to implement the most current digital technology to create a sort of "digital master" file from archival masters? And does any of this matter really, especially when it comes to born digital files?

    Also, I had a side thought about something you mentioned at the Simmons SCoSAA presentation.

    It might be worthwhile and interesting to look at the way relationships are mapped on social networks as some kind of mold for creating an EAC-CPF presidential libraries project, so the relationships between people would also connect between institutions. A lot of institutions have different things going on digitally, but one common language I bet they all share is HTML. I know FBML is Facebook's markup language that they developed for users to create applications that allow incorporation of Facebook's millions of user profiles into those applications. FBML eventually then sits on HTML (I believe). FBML integration transfers data and relationships between other "records" or profiles to a third party which integrates their specific content while pulling the profile information from Facebook itself. Facebook also developed XFBML which allows you as a web designer to incorporate little "Like" buttons and Newsfeeds into your HTML website. So what if there was a "network" developed with all the people from Presidential administrations, and each person had their own profile/record that held all conceivable metadata about them with relationships, etc. And then those profiles were privately hosted and never seen. Then the presidential libraries would be able to incorporate the networked profiles into their websites, which would act as third-party developers like you find on Facebook with a unifying markup language that would fit into an HTML web page? Too crazy?

    Thanks for another cool post.

  2. Hi Hannah,
    You bring up a good point; the technology changes so fast that the "high quality" files we create now will most likely seem outdated and inferior to us in the not-so-distant future. However, if we focus on that we become paralyzed, unable to settle on any format because we can't guarantee that it will last forever. I think the key to being able to move forward is to plan for the regular and inevitable migration of our digital files to the next format. Of course having such a plan means that we must first recognize that digitization is just the first step in the larger digital curation process, not an end unto itself.

    I think your idea for presidential libraries EAC-CPF profiles is an exciting one, and in fact we've been thinking about how to implement something like it. You should look at the Social Networks and Archival Context Project (SNAC) out of the University of Virginia, described by some as "Facebook for the dead," which uses EAC-CPF records generated from EAD finding aids to make connections between people, organizations, and families who appear in archival records. Here is the link to their search interface:

    and here is an article about the project:

    Thanks for your comment, and please keep them coming!