The Library of Congress - Cataloging

Organizing the Global Digital Library Conference

Library of Congress
Digital Library Visitors' Center
Madison Building

December 11, 1995

Sarah Thomas, chair of the conference, welcomed the participants and thanked the cosponsors of the event, the Council on Library Resources and the National Digital Library Program of the Library of Congress. The purpose of the conference was to form a consensus on a list of principles and assumptions regarding the nature of organization in the digital library of the future. The use of the word "organizing" rather than "cataloging" was a deliberate choice when titling the conference. With the critical mass of digital resources, we will have to make decisions regarding which resources warrant bibliographic "control" in the traditional sense and which can be organized in ways that users can access through gateways to Internet resources. Other goals of the meeting were to identify the challenges and difficulties which impede progress; promote a coalescence of current efforts regarding the organization of digital resources by drawing on the collective expertise of those gathered; identify the expectations for digital libraries in the future; and, ideally, develop an action plan for moving efforts forward. Although one of the goals of the conference was to develop the librarian/technologist partnership and identify areas for collaboration, the action items which developed are intentionally librarian-centric in order to provide avenues for the library community to become more involved in this arena.

The twenty-three participants were a diverse group of computer scientists, social scientists, consultants, representatives of funding agencies, and librarians associated with research and national libraries, with numerous connections to other task forces and organizations actively engaged in resolving issues closely related to the topic of the meeting. Among the groups or projects to which the participants had close links were the Association for Library Collections & Technical Services Task Force to Define Bibliographic Access in the Electronic Environment, the Coalition for Networked Information's group preparing a white paper on "Networked Information: Discovery and Retrieval," the Encoded Archival Description (EAD) project initiated by the University of California at Berkeley, the IFLA Study Group on the Functional Requirements of the Bibliographic Record, the Internet Engineering Task Force, the Joint Steering Committee for the Revision of AACR (the Anglo-American Cataloguing Rules), the LC National Digital Library Program, the National Digital Library Federation Planning Task Force, the National Science Foundation digital library projects, the OCLC InterCat project, and the Research Libraries Group's Digital Image Access Project. In addition, participants had organized, attended or been speakers at the American Library Association's preconference "AACR 2000," the Center for Electronic Texts in the Humanities Text Encoding Initiative May 1994 conference, and the OCLC Metadata Workshop. Representatives from the British Library and the National Library of Canada provided an international perspective.

Assumptions become Principles

In preparation for the conference, the participants had been invited to submit lists of "digital assumptions" which would challenge the thinking of the group and stimulate discussion. Some of these assumptions appeared often enough to be considered consensus statements. Upon further discussion and clarification of terminology, the following statements were accepted as principles:

  1. Libraries exist to provide value-added services to a wide variety of materials, including:
    • selection
    • organization
    • access
    • location information
    • delivery, and
    • preservation.

  2. Libraries will include a mix of traditional materials (print and non-print) and digital resources indefinitely.

  3. Library collections will continue to be only subsets of the universe of publications, resources, and information.

  4. Like traditional materials, digital resources will have more value and utility if they are organized, making resources known and available.

  5. Libraries should integrate access to digital resources with access to conventional materials.

  6. Genre is a more useful organizing principle than format.

  7. Information seekers benefit from self-indexing resources, producer-generated access, and librarian-generated access.

  8. Librarians will continue to use judgment in applying varying levels of description and access, as appropriate to each resource, in order to provide retrieval of relevant resources in a cost-efficient manner.

Important Themes

Several major themes emerged during the course of the conference. Some of the discussion points are identified below, although it should be noted that there may not have been a consensus among the participants regarding these points. Many elements of the themes overlap. They have been divided into the following categories for ease of presentation:

Integration

Although the title of the conference included the words "global digital library," it was noted that libraries in the future will not be digital only, but an integration of traditional library materials (print and non-print) with digital resources, and that the catalogs of the future should integrate access to all materials. The links and relationships between traditional and digital resources which librarians can provide should enable users, be they sophisticated or naive, to retrieve all relevant materials. These links are also essential in the context of developing a dynamic bibliographic model (under investigation by IFLA's Bibliographic Control Study Group on Functional Requirements of the Bibliographic Record) and informing users of the existence of multiple versions. Integration can also work in reverse, as users may also find segregation of resources to be an important approach. The library "catalog" of the future will be a collection of traditional bibliographic records and a gateway to networked information. What tools need to be developed to allow for integration of existing MARC-based records with networked information?

Selection

Just as libraries have only "collected" a subset of the universe of conventional library materials, they can reasonably be expected to select only a subset of available digital resources to organize and integrate, both because the economic resources available for organizing are limited and because the relevance of these materials to users, and to the collections of the different libraries with which they are meshing, is limited as well. As is true for print materials, libraries will need to rely on producers of digital information in order to identify resources for selection. Although concern is frequently heard about the tremendous volume of resources on the Net, many of these resources are of the type traditionally collected by archives while others are more library related, and some are intentionally ephemeral and not intended to be preserved for future use. However, we do not know what the digital resources of the future will look like, as new genres will continue to develop. Librarians will need to select and organize some resources individually, while "organization" for other materials may mean providing links from the catalog to network tools, such as indexes of home pages. There are likely to be more network resources than there were traditional materials; cataloging by a skilled cataloger may therefore cover a decreasing portion of the total materials.

Organizing for Access

Three major approaches for organizing emerged in this discussion:

  1. Augment cataloging data through human intervention. This value-added service which librarians can provide is most closely related to traditional cataloging, by augmenting the catalog with added fields and/or links and connections to resources. This approach enhances the traditional catalog by making it an entry point, or gateway, which will allow for stitching together, or meshing, traditional catalog records with finding aids, tiered access, new software tools, and services. There is growing consensus that the current cataloging tools (especially the Anglo-American Cataloguing Rules, 2nd ed., rev.) may not be adequate to describe all digital resources, but whether these tools will continue to "evolve" to cover digital resources or whether they will need to be heavily re-engineered is not yet apparent.

  2. Maximize use of Internet software tools. Libraries could improve catalog-type access to Internet resources by providing input to the software developers who are building tools such as those used to create, manage, and monitor links. We should find out what is missing (such as classification and controlled vocabularies) and provide expertise in these areas. It will be a challenge to recognize and integrate these tools.

  3. Find ways to expand the use of metadata that forms part of the digital object. To increase the self-indexing data available for manipulation, librarians should include metadata in digital resources and develop mechanisms for integrating different forms of metadata (MARC, TEI, EAD, etc.). Libraries should identify incentives (e.g., copyright, patent, revenue, prestige) for creators to produce useful metadata and provide feedback to those who develop and apply metadata. Although metadata efforts are more advanced for digital text material (such as those employing the TEI header), other digitized resources (such as text bit-mapped images) could also benefit from metadata schemes.
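
One way to picture the "mechanisms for integrating different forms of metadata" mentioned in the third approach is a crosswalk that maps scheme-specific fields onto a small set of common elements. The following Python sketch is illustrative only and was not presented at the conference. The common element names, the simplified TEI and EAD field keys, and the function names to_common and merge are all assumptions made for the example. The MARC tags used (245 for title, 100 for personal-name main entry, 650 for topical subject) are standard, but real records carry indicators and subfields that are ignored here.

```python
# Hypothetical, simplified crosswalk: scheme-specific field -> common element.
# The TEI and EAD keys are abbreviated stand-ins for the real element paths.
CROSSWALK = {
    "marc": {"245": "title", "100": "creator", "650": "subject"},
    "tei": {"titleStmt/title": "title", "titleStmt/author": "creator"},
    "ead": {"titleproper": "title", "author": "creator"},
}


def to_common(scheme, record):
    """Map one scheme-specific record (field -> list of values) to common elements."""
    mapping = CROSSWALK[scheme]
    common = {}
    for field, values in record.items():
        element = mapping.get(field)
        if element is None:
            continue  # fields without a mapping are dropped in this sketch
        common.setdefault(element, []).extend(values)
    return common


def merge(records):
    """Merge common-element records for one resource, skipping duplicate values."""
    merged = {}
    for rec in records:
        for element, values in rec.items():
            bucket = merged.setdefault(element, [])
            for value in values:
                if value not in bucket:
                    bucket.append(value)
    return merged
```

Given a MARC-style record and a TEI-style record for the same work, to_common would be applied to each and merge would combine the results, so that a title recorded in both schemes appears once in the merged description.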

Collection/Archiving

The traditional concept of "collection" is changing in the digital arena. Archiving digital resources is another important value-added service libraries must provide to guarantee a future of enduring access and to develop a culture of stability. The idea of cooperating in this collection task is more important in the digital environment, as we may need to entrust our digital future to the good will of those in the Internet community to preserve and store digital resources. These "pools of quiescence" where standards are emerging and practices are stabilizing are already being developed by libraries, organizations, and producers, but the issue of delegation of archival responsibilities is one which libraries need to resolve in cooperation with other stakeholders. As with conventional materials, some will have value over time, and the archives need to be available to mine in the future, although current and future uses may be different. Storage costs for digital archiving are dropping, making it reasonable to collect "copies" of some resources. Reformatting and refreshing of digital resources, however, is an expensive operation and tools are scarce.

Volatility/Stability

The often-discussed volatile nature of the Internet and its resources causes great alarm among librarians charged with cataloging these resources. Volatility can occur as changes to information resources (frequent iterations), although this can be beneficial in fields where researchers need to view and communicate about research in progress or where the research front advances rapidly or unevenly. Another area of volatility is the changing location of information resources. One of the value-added services which libraries should provide via selection, collection, and organization is stability, in the sense of a mechanism to resolve addresses of valued Internet sites to ensure long-term ready access. Volatility is nothing new to librarians (e.g., serial publications are constantly changing names and publication patterns); however, with on-line information, previous versions are sometimes deleted without warning and lost forever. Stability and change are two separate qualities (e.g., a newspaper is a stable publication, but it changes every day), and libraries should not eschew responsibility for digital materials merely because of their dynamic nature.
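
The idea of a mechanism that resolves addresses so that a valued resource stays reachable when its location changes can be sketched as a registry that maps a persistent name to a current location. The Python example below is a toy model under stated assumptions, not a description of any actual URN resolution service. The class NameResolver, its method names, and the "urn:example:" names are hypothetical.

```python
class NameResolver:
    """Toy registry mapping persistent names to current locations."""

    def __init__(self):
        self._locations = {}  # name -> current location
        self._history = {}    # name -> earlier locations, oldest first

    def register(self, name, location):
        """Assign a persistent name to a resource at its first known location."""
        if name in self._locations:
            raise ValueError(f"name already registered: {name}")
        self._locations[name] = location
        self._history[name] = []

    def relocate(self, name, new_location):
        """Record that the resource has moved; the name itself never changes."""
        old = self._locations[name]  # raises KeyError for unknown names
        self._history[name].append(old)
        self._locations[name] = new_location

    def resolve(self, name):
        """Return the current location for a persistent name."""
        return self._locations[name]

    def previous_locations(self, name):
        """Return a copy of the earlier locations, oldest first."""
        return list(self._history[name])
```

A catalog record that cites the persistent name rather than the location would not need to be edited when the resource moves; only the registry entry is updated through relocate.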

Libraries must also make use of methods to archive iterations in some digital resources in order to preserve these resources over time, and should also cooperate with the producers of digital resources to maximize stability. Libraries will have to take the lead in defining standards for stability and working with producers to achieve them. Capabilities and tools which currently exist (such as taking and storing snapshots of data, web pages, etc.) have not been used extensively by libraries but should be examined for their application; producer notification whenever a resource changes would be a useful addition to them. Fundamental issues related to "granularity" and "versioning" need to be examined in order to determine what constitutes a "work" or an "edition" in the digital environment.
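
The snapshot approach mentioned above can be illustrated with a small archive that stores a new iteration of a resource only when its content has changed since the last capture. This Python sketch is one possible reading, not a tool discussed at the conference. The class SnapshotArchive and its method names are hypothetical, and comparing SHA-256 digests is an assumed way of detecting change. It treats any byte-level difference as a new iteration, which sidesteps rather than answers the "granularity" question raised in the text.

```python
import hashlib


class SnapshotArchive:
    """Keeps successive iterations of a resource, skipping unchanged captures."""

    def __init__(self):
        self._versions = {}  # resource id -> list of (digest, content)

    def capture(self, resource_id, content):
        """Store content (bytes) if it differs from the last snapshot.

        Returns True when a new iteration was stored, False otherwise.
        """
        digest = hashlib.sha256(content).hexdigest()
        versions = self._versions.setdefault(resource_id, [])
        if versions and versions[-1][0] == digest:
            return False  # identical to the most recent snapshot
        versions.append((digest, content))
        return True

    def iterations(self, resource_id):
        """Return the stored iterations of a resource, oldest first."""
        return [content for _, content in self._versions.get(resource_id, [])]
```

Calling capture on a schedule would approximate the missing producer notification: the return value signals that a resource has changed since it was last seen.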

Issues for the Future

Those involved in digital library efforts need to plan for the digital library of the future, in addition to addressing immediate challenges. We don't necessarily know what digital resources will look like in the future, as "genres" are still developing and a culture of unstable transition needs to be considered in the context of the things libraries will continue to do, such as collect and organize print materials. Given that we will continue to have limited resources, how will we make the best use of these resources? Libraries are going to need to make choices--is there a right choice and a wrong choice? There was certainly consensus, often repeated, that the "wrong" choice is to do nothing.

Most of the discussion focussed on information created originally in digital format ("digital resources" in the sense of Internet information sites), but libraries are also playing a large role in "digitizing," or producing conversions from hard-copy to digital formats, or CD-ROM digital resources. There exists an unproven assumption that conversion to digital form will increase access. Selection procedures to determine what to digitize need to be developed, although they will vary institution by institution depending on specific priorities. One suggestion is to digitize certain resources "on demand" so as not to waste scarce resources.

Action Items and Recommendations

Libraries should:

  1. Lead by working with non-library groups and producers of digital resources to assure that developments progress with the needs of libraries in mind, particularly as related to self-indexing concepts. To begin with, libraries should contact the Internet Engineering Task Force (IETF) and find out how they can be more involved. [http://www.ietf.cnri.reston.va.us/home.html]

  2. Review the work of the six projects in the NSF/NASA/ARPA Digital Library Project. [http://www.grainger.uiuc.edu/dli/national.htm]

  3. Attend and participate in the Digital Libraries '96 conference sponsored by the Association for Computing Machinery, to be held March 20-23, 1996, Bethesda, MD. [http://fox.cs.vt.edu/DL96]

  4. Consider ways to make contact with the IETF planning groups and the technical community. Plan a workshop (under the auspices of the Corporation for National Research Initiatives [http://www.cnri.reston.va.us/]) which would involve librarians, those involved with the D-Lib group [http://www.dlib.org], producers of digital resources, and researchers, which would examine how the traditional skills of catalogers and developing computer tools can be coordinated.

  5. Participate in one of the Uniform Resource Name (URN) testbed projects which are commencing in the next few months (LC-NDL is participating in one of the testbeds). [contact [email protected]]

  6. Continue work to identify data elements which aid in machine-retrieval for automated metaindex searching, such as those projects at the University of Michigan [http://http2.sils.umich.edu/UMDL/HomePage.html], Stanford University [http://diglib.stanford.edu/diglib], and uses of the Dublin Core metadata set. Are new projects needed?

  7. Track the feedback on the field tests of current metadata schemes, such as OCLC's Spectrum project, and work at the National Library of Australia.

  8. Encourage libraries to experiment with the inclusion of metadata in their electronic publications and projects, such as LC's National Digital Library. Develop models as requirements for digitizing contractors.

  9. Identify areas where libraries can capture data to apply metadata schemes and/or other tools, such as:
    • finding aids
    • dissertations
    • university press publications.

  10. Make use of OCLC's InterCat [http://www.oclc.org:6990] database as a testbed to examine such things as URLs/URNs, MARC and non-MARC mixes, and mechanisms for capturing user-supplied data. Structure and implement user tests.

  11. Examine creative applications for displaying and/or mapping to MARC format (e.g., capture Lycos records and create rudimentary MARC records for display in catalogs, or SGML to MARC mapping).

  12. Agree upon methods for carrying metadata in digital resources (as the TEI header does for some SGML documents). Possibly issue informational Request for Comment (RFC) on the topic and shepherd through IETF or NISO. Also need to develop tools for application of the metadata schemes and make them available to those willing to test them.

  13. Explore continued relevance and application of cataloging practices and rules at meetings such as the OCLC InterCat Project Colloquium at ALA Midwinter in San Antonio [http://www.oclc.org/oclc/man/catproj/announce.htm] and the international conference of cataloging experts being planned for 1997 by the Joint Steering Committee for the Revision of AACR.

  14. Monitor D-LIB magazine [http://www.dlib.org] as a good source of information on current digital library research.
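
As an illustration of item 11, the following Python sketch turns a captured index entry into a rudimentary MARC-like record for display. It is a minimal example under stated assumptions and does not reflect how any existing system performs the mapping. The input keys (title, summary, url) and the function name rudimentary_marc are hypothetical. The tags used are standard MARC fields (245 title statement, 520 summary note, 856 electronic location and access), but indicators are omitted for brevity, so the output is a display aid rather than a valid MARC record.

```python
def rudimentary_marc(entry):
    """Build display lines for a bare-bones MARC-like record from an index entry."""
    # (input key, MARC tag, subfield code); the input keys are assumptions.
    field_map = [
        ("title", "245", "a"),
        ("summary", "520", "a"),
        ("url", "856", "u"),
    ]
    lines = []
    for key, tag, subfield in field_map:
        value = entry.get(key, "").strip()
        if value:  # skip fields the index entry does not supply
            lines.append(f"{tag} ${subfield} {value}")
    return lines
```

An entry with only a title and a location yields two display lines, one for field 245 and one for field 856; a cataloger could later enrich such a skeleton record by hand.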

Participants

Sarah Thomas (Chair)
Acting Director for Public Service Collections
Library of Congress

Duane Arenales
Chief, Technical Services Division
National Library of Medicine

Bill Arms
Corporation for National Research Initiatives

Ross Atkinson
Associate University Librarian
Cornell University

John Byrum
Chief, Regional & Cooperative Cataloging Division
Library of Congress

Alan Danskin
Head, Authority Control
British Library

Beth Davis-Brown
National Digital Library
Library of Congress

Peter Deutsch
President, Bunyip Information Systems Inc.

Stephen James
Chief, Humanities and Social Sciences Division
Library of Congress

Erik Jul
Manager, Customer Services
OCLC, Inc.

Glenn LaFantasie
Program Officer
Council on Library Resources

Sandy Lawson
Assistant to the Director for Public Service Collections
Library of Congress

David Levy
Member, Research Staff
Xerox Palo Alto Research Center

Clifford Lynch
Director, Library Automation
University of California

Carol Mandel
Deputy University Librarian
Columbia University

Susan Morris
Assistant to the Director for Cataloging
Library of Congress

Ingrid Parent
Director General, Acquisitions and Bibliographic Services
National Library of Canada

Brian Schottlaender
Associate University Librarian for Collections & Technical Services
University of California, Los Angeles

Barbara Tillett
Chief, Cataloging Policy and Support Office
Library of Congress

Linda West
Director for Member Services and Support
Research Libraries Group

Beacher Wiggins
Acting Director for Cataloging
Library of Congress

Jennifer A. Younger
Assistant Director for Technical Services and Liaison to the Regional Campus Libraries
The Ohio State University

Helena Zinkham
Head, Processing Section, Prints and Photographs Division
Library of Congress

revised 01-16-95


January 15, 2003
