Heritage Collections in Crisis
Stating the Obvious: Lessons Learned Attempting Access to Archival
Archive of World Music, Harvard University
All of us have experienced compelling, even jolting, intellectual
awakenings when confronting primary audio and visual resources
that document the lives of people and societies. One of mine from
the past year occurred while being shown archival film footage
of Marian Anderson's performance at the Lincoln Memorial. An international
conference held in Europe brought another to my attention that
is personal to me. Norwegian Radio preserved the recording of the
Nazi officer announcing the takeover of Norway during World War
II, assuring citizens that "resistance was futile." For me,
an American of Norwegian descent whose great-aunt worked in the
resistance, this audio recording gave immediacy and chilling reality
to a history that I already knew rather well. Last week I received
a phone call from a former student who had taken a seminar at Harvard
in 1978. As part of a research paper, he had made a recording of
his Texan grandmother singing cowboy songs. Now, his college-age
daughter wants it for a project. It is at once a part of family
history (more valued now than it was when it was originally made
by a teenage college student); it also presents cowboy songs not
widely documented in the literature; and it is a source of American
vernacular music history. More than the simple entertainments that
some of these materials started out as being, the songs and tales,
speeches, performances and events recorded by participants and
observers have become treasures of collective memory and heritage.
In our universities, faculty, students and researchers increasingly
want to use these materials in teaching, learning and scholarly
production. Audio and visual materials are both by us and about
us in important ways. Families and local communities demand access
to materials that they often, with justification, consider their
own. Radio stations and museum exhibit curators want to use them.
All sorts of people want access to recordings and the materials
that accompany them--programs, program notes, field notes and other
documentation--in a convenient way.
Access to these collections, particularly unique archival collections,
has rarely been easy. Our fragile audio materials must be reformatted
for any kind of use. As special collections that present difficulties
in cataloging and housing, and as collections sometimes regarded
as highly specialized or ephemeral, they have not occupied high
places on the list of priorities for funding or for work. And we
have been plagued by the view that we as audio archivists have
no established standards for preservation and therefore should
not proceed with projects. Thus, a potential user typically has
had to find out somehow what is in a collection, then place a request
for the items he or she would like to hear well in advance so that
labor-intensive reformatting can take place, then travel to the
library during its business hours there to confront a plain-looking
audio cassette and some sort of photocopied list of its contents
or accompanying materials. While some institutions will mail copies
of materials to users, others cannot. Often, the cassette and photocopies
must be left in or returned to the institution.
Current expectations from our users contrast dramatically with
this practice. Many expect fast delivery of MP3 files with scanned
images of whatever accompanying documentation there may be. Users
expect access to contents of collections via free and well-maintained
online Web sites. Sitting in an institution to listen to materials,
not to mention waiting for them to be prepared, never enters their
minds as a reasonable avenue. As a faculty member, researcher,
and librarian, I know that, in our hearts, all of us want this
immediate access, even those of us who still prefer to read from
paper, take notes with pens, and buy books.
To state all of this, especially to a group such as that gathered
for this program, is to state the obvious, for all of you live
and work with these materials and demands every day. The question
is, how do we meet these needs? How do we overcome the multitude
of enormous problems that seem to attend our every effort at reasonable
access? Why is access so hard and what can be done, if anything,
to improve it? Of course, myriad technical and legal problems attend
online access, which I will leave to my colleagues to discuss.
Access to collections and information about them presents its own
challenges, some of which I will outline here.
My favorite library patrons will gesture wildly toward a part
of our collection and say, "of course, all this will be digitized
eventually." As someone working in a large collection, I find this
view variously hilarious, pitiable, or depressing. As a nation,
we have not managed to catalog our collective holdings. We have
not managed to complete online conversion of existing catalogs.
Retrospective conversion and even cataloging are generally less
labor-intensive than digitizing collections. Our chances for extensive,
let alone comprehensive, digitization of primary materials are
A useful starting point for discussion of paths of access may
be to acknowledge that not everything in our collections requires
the same system of access. Limited access to highly specialized
materials may be fine. In-library only access to sensitive or restricted
materials may be the best practice. Probably, one wants to offer
wide access to information about the contents of collections through
cataloging and inventories. Probably, one wants to offer international,
networked access to some parts of our collections. But the first
step toward an art of the possible in access to audio collections
is recognition that not everything needs to be treated in exactly
the same way. Starting from this point, and pursuing, in particular,
the issues surrounding networked digital access, what are the principal
An important barrier to any access project for archival collections
is that nearly every step of the work requires specialized skill.
Simply unpacking and sorting the Laura Boulton Collection required
that we identify which typed notebooks belonged to which recordings,
which notes were lecture notes derived from field notes, then which
tapes had been copied from earlier ones and where the other accompanying
documents belonged. Often, ethnic collections require highly specialized
subject and language skill to prepare even the most rudimentary
inventory. If the collection is to be cataloged in a standard library
catalog, then a skilled cataloger familiar with national utilities
such as OCLC and RLIN is needed. Preparing electronic documents
requires some command of mark-up language. Preparing and storing
digital images implicates another set of equipment and skills.
Working with digital audio is a bona fide specialization. For networked
resources to persist and remain viable, systems of metadata need
to be developed and used. Often, a computer programmer is necessary
for such tools as digital collection management programs. Our sources
of inexpensive labor--students, interns, volunteers and the like--may
be, but are not predictably, suited to this work, especially with
large collections that take many months to process.
Labor, in my experience, is always the most expensive component
of any initiative, certainly in the long run. Moreover, pleas for "more
staff" generally require extensive justification and are rarely
met by budget-conscious administrators who may be under the impression
that most work can now be automated and that little human intervention
is actually necessary. The expense of audio reformatting is phenomenal.
Getting the "last, best play" from a fragile recording may occupy
four hours of skilled labor for one hour of sound.
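To show what that ratio implies at the scale of a collection, here is a minimal Python sketch. The four-to-one ratio is the figure given above for fragile recordings. The collection size and hourly wage are hypothetical values chosen only for illustration, and the function names are my own.

```python
# Rough estimate of reformatting labor. Assumes the 4:1 ratio quoted in the
# text (four hours of skilled work per hour of sound) holds across a whole
# collection, which is an assumption: easier recordings will take less time.

LABOR_HOURS_PER_HOUR_OF_SOUND = 4.0  # figure cited in the text


def reformatting_labor_hours(hours_of_sound: float,
                             ratio: float = LABOR_HOURS_PER_HOUR_OF_SOUND) -> float:
    """Return the skilled labor hours needed to reformat the given hours of sound."""
    if hours_of_sound < 0:
        raise ValueError("hours_of_sound must be non-negative")
    return hours_of_sound * ratio


def reformatting_labor_cost(hours_of_sound: float, hourly_wage: float) -> float:
    """Return the labor cost for reformatting, given a (hypothetical) hourly wage."""
    return reformatting_labor_hours(hours_of_sound) * hourly_wage


if __name__ == "__main__":
    collection_hours = 500   # hypothetical collection size, in hours of sound
    wage = 30.0              # hypothetical hourly wage
    print(reformatting_labor_hours(collection_hours))       # labor hours
    print(reformatting_labor_cost(collection_hours, wage))  # labor cost
```

Even under these invented figures, a collection of modest size translates into a full working year of an engineer's time, which is why the cost is so hard to justify to administrators.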
One common solution to the cost of labor is to get a grant. Following
the investment of weeks or months of time preparing a compelling
argument for a necessarily trendy or attractive part of one's collection
and assembling the requisite budget, a granting agency may provide
the needed help. The problem is that, at the end of the grant,
project staff must depart, taking their skills with them, and one
is generally left to start all over in another part of the collection.
Maintenance of digital products created by grant-funded projects
may itself be a problem.
One might argue with some justification that some of the necessary
skills seem to be fast becoming common. Many of us can scan a document,
burn a CD and put together a Web site that is fine for rudimentary
purposes and may offer decent access to our collections. But what
if you want your access tools to persist, to be durable and refreshable?
One homemade compact disc probably will not meet this case nor
will it offer networked access. Hard links on Web sites eventually
lead to non-existent servers. CDRs made just a few years ago may
or may not play on every CD player.
Given the cost of labor and the value of our collections, our
products must last as long as possible. We cannot afford to make
and remake them, if, indeed, we have the opportunity to do so.
We need durable audio products. We have seen the failings of cassettes,
open-reel tape, CDRs and DATs. Our cataloging and other electronic
documents must be stored in a secure and widely accessible environment,
preferably one that can be searched internationally free of charge.
There is an important, qualitative difference between building
a Web site such as a course page (or even an institutional Web
site) and building an electronic resource such as a finding aid.
At our university, for example, our finding aid for the Laura Boulton
Collection differs from the course page for Professor Thomas Kelly's
well-known music course, "First Nights." Kelly customarily describes
his course page as a pile of rocks, that is, ideas that he and
his assistants have tried out, moved around, added, or eliminated
(thus changing the shape of the rock pile) in different versions
of the site. Mutability is critical to his use of his course site
as a dynamic aid to teaching. The Laura Boulton site, on the other
hand, is characterized by the goal of near-immutability. Unlike
teaching tools, library resources need to remain relatively stable
over time. We must construct a series of permanent resources. We
must finish one and move to another and so the revising and innovating
that is appropriate to the "First Nights" page would be inefficient
for our purposes. We want to select durable technologies and document
our choices and procedures well so that the processes of migration,
refreshing and so on can be conducted mechanically if possible.
Whereas we welcome the flexibility of electronic formats for adding
new data or correcting errors, we do not really want to constantly
change our pile of rocks.
Well-organized and accessible housing and storage of physical
materials can be expensive; digital storage is a major technological
and financial challenge. For the long run, digital objects and
metadata about them must be stored securely, preferably in a place
where migration and refreshing can be managed automatically. We
can learn from radio and national archives in Norway, Switzerland,
and Germany that have developed and are using such systems.
Metadata becomes critically important and we need all sorts of
it. We need descriptive metadata: what is it that is stored? We
need structural metadata: how do I find this virtual object and
what is its virtual format? And we need administrative metadata:
who reformatted this object and what equipment was used? Without
the metadata, we may as well not bother to create the digital object.
Without the metadata, we probably can't find it, let alone use
it or migrate it.
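The three kinds of metadata named above can be sketched as a simple record structure. What follows is a minimal Python illustration, not a description of any existing schema or of the system used at Harvard: every class name and field name is hypothetical, chosen only to mirror the three questions posed in the text.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class DescriptiveMetadata:
    """What is it that is stored?"""
    title: str
    performers: List[str]
    recorded: str  # free-text date, since field dates are often approximate


@dataclass
class StructuralMetadata:
    """How do I find this object and what is its format?"""
    identifier: str
    file_format: str
    sample_rate_hz: int
    bit_depth: int


@dataclass
class AdministrativeMetadata:
    """Who reformatted this object and what equipment was used?"""
    reformatted_by: str
    equipment: List[str]
    processing_notes: List[str] = field(default_factory=list)  # optional


@dataclass
class DigitalObjectRecord:
    descriptive: DescriptiveMetadata
    structural: StructuralMetadata
    administrative: AdministrativeMetadata


def missing_metadata(record: DigitalObjectRecord) -> List[str]:
    """Return the dotted names of required fields that were left empty."""
    missing = []
    for section, values in asdict(record).items():
        for name, value in values.items():
            if name == "processing_notes":
                continue  # optional field, may legitimately be empty
            if not value:
                missing.append(f"{section}.{name}")
    return missing
```

A check such as `missing_metadata` could run before a digital object is accepted into storage, so that an object lacking, say, the name of the person who reformatted it is caught while that information can still be recovered.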
Cataloging, of course, is a familiar form of metadata in which
we record information about the physical and intellectual characteristics
of our collections. I suspect that most of our archives produce
fairly good catalogs, given the staff to do so, and have done for
some time. Our issues in intellectual access involve searchability
across archives. In the first place, we need databases and library
catalogs that present users with familiar formats and familiar
mechanisms for finding out what we have. Even though we can now
potentially access each other's databases if they are online and
use them, I have never felt that inventing an idiosyncratic, stand-alone
database is a good idea. It seems to me that we need catalogs and
databases that are more or less standard, that look or feel similar
to each other. The Archives for Traditional Music at Indiana University
was the first such collection to enter its cataloging on OCLC.
Adjustments of standard library formats--particularly MARC--were
necessary, of course, but the result was widespread access to information
about the Archives that reached from the university into public
libraries and school systems. Non-specialists could find information
about the Archives' collection by using a standard library tool.
This is surely a good thing. Making use of existing practices,
adapting them if necessary, is an effective approach to access.
Unfortunately, it does not always work. Existing classification
systems and such common tools as the Library of Congress Subject
Headings, designed as they were for a limited repertory of European
arts, fail our highly differentiated, multi-cultural collections.
Developing new tools, such as thesauri, has proven complicated
by the different ways in which musicians, folklorists, anthropologists,
and local communities think about, name, and classify performances.
Creating thesauri on which any part of our community can agree
turns out to be very time-consuming and becomes work that moves
too slowly because few of us can devote the necessary time to such
a project. Hence, we lack consensus on genre terms and categories
for such common concepts as devotional music. What do we do about
Arab American Muslim communities that refer to their Sufi rituals
as dhikr, where their Turkish American co-religionists call
the same phenomenon zikr? In the Indian communities, we
find Sanskrit-derived names that are also written in Tamil script
and have English versions. Systematic transliterations of the Sanskrit
and Tamil names produce two different romanizations, and the English
version may be different still. We can decide to use AACR2 rules
to "establish" the name; however, who is going to verify that the
multiple variants represent the same person? Representing our various
local communities accurately is hard and searching is harder.
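The mechanical half of the searching problem, though not the intellectual half, can be sketched in a few lines of Python. The example below assumes that someone has already decided which variants belong together. The choice of "dhikr" as the preferred form is arbitrary and purely illustrative, and the function names are hypothetical. Deciding whether two variants really denote the same thing or the same person remains human work.

```python
import unicodedata
from typing import Dict, Iterable, Optional


def normalize(term: str) -> str:
    """Fold case and strip diacritical marks so variant spellings compare equal."""
    decomposed = unicodedata.normalize("NFKD", term)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold().strip()


def build_variant_index(headings: Dict[str, Iterable[str]]) -> Dict[str, str]:
    """Map every normalized variant to its preferred heading.

    Raises ValueError if one variant is claimed by two different headings,
    since that conflict needs a human decision.
    """
    index: Dict[str, str] = {}
    for preferred, variants in headings.items():
        for form in (preferred, *variants):
            key = normalize(form)
            existing = index.setdefault(key, preferred)
            if existing != preferred:
                raise ValueError(
                    f"{form!r} is listed under both {existing!r} and {preferred!r}"
                )
    return index


def lookup(index: Dict[str, str], query: str) -> Optional[str]:
    """Return the preferred heading for a query, or None if it is unknown."""
    return index.get(normalize(query))


if __name__ == "__main__":
    # Hypothetical authority list: preferred form -> known variants.
    headings = {"dhikr": ["zikr", "ḏikr"]}
    index = build_variant_index(headings)
    print(lookup(index, "Zikr"))
```

With such an index, a user who searches under either community's term is led to the same set of recordings, while the catalog record itself can still display the form that each community uses.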
Electronic finding aids constructed to the standards of Encoded
Archival Description (EAD) offer a looser, more narrative, and
adaptable format for inventorying collections than does standard
cataloging; they are a good alternative. Producing the proper diacritical
marks for the names and terms of a Vietnamese or Hmong community
in these documents is nearly impossible at present. Does this matter
Designations from the Human Relations Area Files have been useful
for organizing access to ethnic collections; however, these are
a bit old and sometimes incomplete. The terms can be too "purist" to
suit multi-cultural communities.
As archivists, we may easily feel "stuck"; everything we do, we
may feel, has something wrong with it. We make very little progress
in our collections without running into an insurmountable wall
that seems to preclude access to a collection.
Attempting to step out of the morass, I would like to describe
an initiative our library launched in 1999 that we called "Music
from the Archives" and that attempts preservation of, and access to,
some of our unique collections. I offer this not as a prescription
at all, but as an experience and as a set of decisions that might
start our discussion. "Music from the Archives" engages digital
technology to offer a model for access. It is not conceived as
a comprehensive program through which everything we have will be
digitized; rather, it tries to advance ways to offer as wide access as possible,
intellectually and virtually, to selected items from our collections.
Our selections proceed from the strengths of our collection, which
in turn proceeds from the priorities of our primary constituency,
which is the faculty and students of the Harvard Music Department
and the related larger research community.
The contents of a collection will be presented in an electronic
document that follows the format of the electronic finding aid.
It draws upon national standards and practices for the creation
of EAD documents and serves them from Harvard's online OASIS catalog,
which includes Harvard's other finding aids for archival collections
across the university. Audio files of selected performances and
image files of field notes and other documentation will be available
through links from the finding aid. What we are working toward
ultimately is a thoroughly integrated multi-media finding aid in
which the digital resource itself will be conceived as having multiple
manifestations. Whereas now we can move around from one set of
digital objects to another, our ultimate plan is to produce a more
flexible tool that will allow us to show relationships among parts
of our collection--for example, between a festival program book,
a photograph, a concert program, and a recording--that may not be
readily apparent to the user. We will thus be able to bring parts
of our collections to the attention of users quickly and graphically.
Digital standards and systems for metadata for our images have
been developed in consultation with the Harvard University Library
Digital Imaging Group. The Music Library did not try to develop
or invent these procedures itself. We did, however, develop our
own audio preservation studio, as we considered ourselves and you,
our colleagues, more reliable resources than any existing at Harvard.
Our studio is centered around a Sonic Solutions high density audio
workstation that allows us to sample at 88.2 kHz and to digitize
audio at 24 bits, which enables us to capture sound in superb detail
at the densest rate known in the audio industry. The engineer
typically reformats recordings onto recordable compact discs (for
users) and computer data tape (for storage). This form of tape
is much more robust than any other we have. Real Audio streaming
sound files are produced for networked use. Metadata is captured
about all processing performed on the file, so that it will be
possible to recreate the labor-intensive decisions made by the
One result of our project will be the production of research-intensive
tools. Our documents will have several important features: they
will offer entire musical sources rather than short samples. Researchers
will actually be able to conduct research, not simply browse collections
or sample holdings. Although not every item from every collection
will be networked, every item will be inventoried and we will be
able to add audio files upon request.
Secondly, our digital products will be durable. With very modest
investment of time and money, we can make two copies of the CD
using products from two different manufacturers and two copies
of the exabyte tape using two different lots of tape. While no
particular claims for longevity can be made for CDRs or computer
data tape (let alone Real Audio files), we feel some confidence
that one of the four exemplars we produce will persist until a
viable remote, robotic repository is available. Certainly these
formats are most convenient and accessible, and they may be hardier
than the open-reel tape of our originals.
We seek long term solutions to the problems of digitizing, storing,
refreshing, reformatting, and migrating digital objects over the
years. Beyond the creation of access to resources, we seek to regularize
the processes of work that are necessary to the production using
our existing permanent staff wherever possible. Creating a new
flow of work and bringing together regular library staff in the
production are goals as important as the three resources themselves.
We do not want to rely on temporary project staff for these productions
whose skill and training depart with them when the project is
over; we want to rely instead on permanent staff who can contribute
to this new kind of work over the long run. So to summarize our
goals, we seek to use digital technology to develop a new model
of access to rare audio collections, to produce tangible electronic
resources, and to institutionalize the process of work that emerges.
Durability is an important result. To achieve it, attention to
the choice of digital audio formats is critical. Once formats are
chosen, a durable system of identifying, characterizing and locating
them--that is, systems of metadata--must be constructed that will
function for as long a run as we can manage. I have sought ways
to develop this project for the better part of ten years. Only
recent circumstances and priorities in my institution have rendered
it finally possible. Our work is inextricably linked to the time
and place, the character of the institution in which we work. What
is possible in one place does not work in another, and our project
at Harvard may not make sense in other contexts. What broad ideas
from "Music from the Archives" might help us move beyond local
To make effective progress with our collections, selectivity based
on our collections and our constituencies may help. Each of us
working selectively from strength may produce a good corporate
result for access to our collections. We should probably work together
and rely on each other, as no one institution is likely to have
all the necessary expertise or facilities to provide all of its
own paths to access. For the short term, creating multiple digital
formats may answer our needs for access and persistence if we are
careful about the equipment and processes that we use. Most physical
formats have become inexpensive to use. For long-term digital access,
we need storage facilities. Might we work collectively to persuade
public and private agencies to build digital repositories that
we might all use? To make long-term use of such facilities, we
need to master systems of metadata. Certainly we need to re-tool
ourselves a bit for these tasks. However, we also need to find
ways to acquire or to share the services of specialists such as
audio engineers, computer programmers and subject specialists.
Most importantly, we need to fashion workable collaborations that
produce results and that do not produce years of committee meetings
that yield nothing we can actually use.