PROPOSAL NO: 97-9

DATE: May 1, 1997
REVISED:

NAME: Renaming of subfield 856$u to accommodate URNs

SOURCE: National Digital Library Program

SUMMARY: This paper proposes a variety of approaches to recording a URN in a MARC bibliographic record. For the current need to record a handle, field 856 subfield $u is used. In the future, fields for other identifiers may be used (e.g. ISBN, ISSN, SICI), and at that time a content designator (perhaps subfield $u in the appropriate field) may be added to indicate that the identifier serves as a URN. This proposal suggests changing the name of 856$u to URI to accommodate both URLs and URNs and adding an indicator value blank (#) to show that the access method is not provided for use with a URN.

KEYWORDS: Field 856 (Bibliographic/Holdings/Classification/Community Information); Electronic Location and Access; Subfield $u, in field 856 (Bibliographic/Holdings/Classification/Community Information); URI; Uniform Resource Name

RELATED: DP96 (June 1996)

STATUS/COMMENTS:

5/1/97 - Forwarded to USMARC Advisory Group for discussion at the 1997 Annual MARBI meetings.

6/29/97 - Results of USMARC Advisory Group discussion - Approved as amended. A separate subfield should be used for the URN. LC should survey USMARC users to see if subfield $g has been used. If not, it can be reused for the URN; if so, another subfield will be chosen. Guidance should be given on how to code the indicator if there is both a URL and a URN in the field. It is important to note that this proposal approves a URN subfield at the 856 level, rather than the record level. As the URN standard moves forward, additional proposals will deal with the definition of a URN element when it relates to the entire resource described in the record.

8/21/97 - Result of final LC review - Agreed with the MARBI decision.


PROPOSAL NO. 97-9: Renaming of Field 856$u to accommodate URNs

1.      BACKGROUND

Discussion Paper No. 96 (Defining a Uniform Resource Name Field
in the USMARC Bibliographic Format) explored issues concerning
the definition of a data element in MARC for a Uniform Resource
Name (URN).  It considered the need to define a data element,
either in a new field in the standard number block of fields
(0XX) or as a subfield of field 856 (Electronic Location and
Access).  The Uniform Resource Name (URN) is the evolving
standard being developed by the Internet Engineering Task Force
(IETF) as one of the set of Uniform Resource Identification (URI)
standards which deals with naming conventions.  A URN is the name
of a resource that identifies a unit of information independent
of its location.  URNs improve upon URLs because they are
intended to provide a globally unique, location-independent
identifier that can be used to identify the resource and
thus to facilitate access both to metadata ("data about
data") about it and to the resource itself.  The URN refers to
the intellectual entity, while the URL refers to a particular
physical entity at a particular location.

When Discussion Paper No. 96 was discussed in July 1996, Clifford
Lynch, who is active in the IETF's URN group, reported on the
progress being made on the issue.  A new IETF working group had
recently been formed that would attempt to use a few resolution
schemes for experimenting with the URN framework.  The goal is to
attempt to disentangle operational issues from syntactical ones. 
Because of the multiplicity of naming assigners and complex
relationships between URNs and a bibliographic record, he said
that there may not be one clean answer concerning where to record
a URN.  MARBI was advised not to look to the IETF URN group to
solve naming problems.  In the future, identifiers already in
place may serve as URNs, e.g. ISBNs, so an existing field could
be used and a URN not necessarily carried explicitly.  MARBI
asked to be kept informed on URN developments to further consider
where to put them in MARC in the future.


2.      DISCUSSION

Since Discussion Paper No. 96 was discussed, substantial progress
has been made in the IETF on URN syntax and resolution.  Although
solutions have not been widely implemented, consensus has been
reached on many issues.  It is clear that any existing naming
(identification) schemes will be accommodated in any URN
solution.  In support of this point, a paper was recently written
by Cecilia Preston, Clifford Lynch and Ron Daniel on using
existing bibliographic identifiers as Uniform Resource Names; it
discusses fitting the ISBN, ISSN, and SICI into the URN framework
and syntax.
(URL: ftp://ds.internic.net/internet-drafts/draft-ietf-urn-biblio-00.txt)

Another URN candidate is the handle identifier that the
Corporation for National Research Initiatives (CNRI) is
developing for the Library of Congress under contract.  The
handle server provides a lookup service for electronic resources
that are part of the National Digital Library Program.  Each item
in a digital library is given a handle, which is a globally
unique, persistent, independent identifier.  The handle server
will then supply data that an online catalog system or other
access software needs to access the item.  The handle server thus
resolves the handle into a URL (or other form of physical
locator) to retrieve the resource.  The handle server system is
designed to be a globally interconnected system.  A catalog or
web-browser that can connect to any single handle-server can
request resolution of any handle, without having to know in
advance which handle server holds the registration information.

Recently ten institutions were awarded funds for digitization of
historical material to be included in the American Memory Project
through a grant from Ameritech.  These digitized items will be
distributed on the Internet in a manner that will augment the
collections of the National Digital Library Program at the
Library of Congress.  Access to the digital collection is to be
given through metadata in MARC, in non-MARC records or in finding
aids.  Some of the awardees were interested in using MARC records
for metadata and wanted to explore the use of URNs such as
handles as persistent identifiers linking the record to the item.

To facilitate this effort and other projects demonstrating
interoperability among catalogs and distributed digital
repositories, the National Digital Library Program is considering
how and where in the MARC record to record a handle.  Two
approaches might be used: 1) recording the "bare" handle (e.g.
hdl:loc.pp.detroit/4a3271t) or 2) combining the handle with the
Internet address of a proxy resolver into a "proxy URN" (e.g.
http://hdl.handle.net/loc.pp.detroit/4a3271t). This proxy URN is
a URL, which can be used by any system that can use a URL as a
link but is no longer truly location-independent and persistent,
because it relies on the continued existence of a particular
proxy handle server.  
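
The relationship between the two forms can be sketched in a few
lines of Python.  This is only an illustration, assuming the proxy
resolver address used in the example above (http://hdl.handle.net/)
and the "hdl:" prefix for the bare form; the function names are
hypothetical and are not part of any handle software.

```python
PROXY_BASE = "http://hdl.handle.net/"  # proxy resolver address from the example above
HANDLE_PREFIX = "hdl:"                 # prefix of the bare handle form


def handle_to_proxy_urn(bare_handle: str, proxy_base: str = PROXY_BASE) -> str:
    """Combine a bare handle with a proxy resolver address to form a proxy URN."""
    if not bare_handle.startswith(HANDLE_PREFIX):
        raise ValueError(f"not a bare handle: {bare_handle!r}")
    return proxy_base + bare_handle[len(HANDLE_PREFIX):]


def proxy_urn_to_handle(proxy_urn: str, proxy_base: str = PROXY_BASE) -> str:
    """Recover the bare handle from a proxy URN built on the given resolver."""
    if not proxy_urn.startswith(proxy_base):
        raise ValueError(f"not a proxy URN for {proxy_base}: {proxy_urn!r}")
    return HANDLE_PREFIX + proxy_urn[len(proxy_base):]


bare = "hdl:loc.pp.detroit/4a3271t"
proxy = handle_to_proxy_urn(bare)
print(proxy)
print(proxy_urn_to_handle(proxy))
```

The sketch makes the dependency visible: the proxy form embeds the
address of one resolver, so it stops working if that resolver
disappears, while the bare form carries no address at all.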

The proxy URN approach corresponds closely to OCLC's Persistent
URL (PURL) system; the term "PURL-handle" has sometimes been
used instead of "proxy URN."  OCLC's PURL system creates logical
addresses in the form of URLs which are translated through a PURL
resolver into the URL of the current physical location.  OCLC
runs a PURL resolver through which others may create PURLs for
their resources and offers the software free to any institution
wishing to run its own resolver.  This is a short-term partial
solution to the problem of names for Internet resources that
change location, a solution that can be supported now by all
browsers and catalogs that recognize URLs but relies on the
continued existence of the particular PURL resolver through which
the PURL has been registered.  The plan is for handle resolvers
to be networked in a manner to make the particular resolver
system approached irrelevant.

In the long run, recording a handle in its bare form is more
desirable than turning it into a URL, since the handle is truly
location-independent and therefore more persistent.  However,
until all catalogs, web-browsers, and other access systems
recognize URNs (and hence handles), there will be a transitional
need to record identifiers in both forms to support broad-based
experiments and allow the same MARC record to provide an
effective linking mechanism in several systems, some of which may
not be capable of recognizing handles.  Eventually the need to
record the URLs for items for which handles (or other forms of
URN) exist will disappear, but the transitional phase may take
many years.

The Research Libraries Group is looking at similar naming issues
in its "Studies in Scarlet" project.  It expects to register
PURL/handles for digitized items to function as URNs.  RLG could
also benefit from a resolution of the issue of where to put URNs
in the MARC record.


3.      FIELD 856 AND URNs

The National Digital Library Program (NDLP) would like to use
repeatable 856$u subfields to record URNs such as handles and the
URLs associated with them (either a proxy URN or other URL).
Because of the multiplicity of 856 fields that could exist in a
record (e.g., different formats of same document, mirror sites,
different subsets of same document, etc.), using repeated 856
fields would result in the need to link these repeated fields in
the record with subfield $8 since the URNs and URLs need to be
linked.  This is not desirable both because of its complexity and
because subfield $8 has not yet been implemented as a link in
systems.

Currently NDLP is experimenting with multiple subfield $u's
in field 856, some of which are handles.  The field could include
the following:
        - an http URL pointing to the item 
        - a proxy URN (which is an http URL) attached to a handle;
        this approach treats the handle server as if it were a PURL
        resolver and can be used now to access the item by the
        handle
        - a bare handle, which can currently only be used directly
        by systems that recognize handles and can communicate with a
        handle-server to request resolution.  Appropriate software
        exists to incorporate URNs into recent versions of the most
        popular Web browsers. 
The last approach offers all the advantages of URNs:
persistence, global uniqueness, and location independence.
Although it cannot yet be widely implemented, progress
has been made at LC toward its future use.  In addition, it is
desirable to record the handle for future use at the time the
bibliographic record is created.  Some of the other institutions
that have been awarded grants for digitization through the Ameritech
competition are interested in using URNs (either handles or
others).

In order to use subfield $u for URLs and URNs, the handle URN
would need to include the initial letters: "urn:".  This syntax
has been accepted within the Internet Engineering Task Force's
URN Working Group.

Example:
        245 00          $aGottscho Schleissner Collection (Library of
                        Congress) $h[graphic]
        260             $cca. 1896-1970, bulk 1935-1955
        300             $aca. 28,350 negatives :$bsafety film, some
                        nitrate ;$c5 x 7 in. (13 x 18 cm.) or smaller.
        300             $aca. 300 transparencies :$bfilm, color ; $c8 x 10
                        in. (21 x 26 cm.) or smaller
        300             $a11 albums of photographic or photomechanical
                        prints :$bsilver gelatin, some cyanotype, some
                        color ;$c16.5 x 14 in. (42 x 35 cm.) or smaller
        300             $aca. 275 photographic prints :$bb&w, silver
                        gelatin ;$c17 x 14 in. (43 x 36 cm.) or smaller.
        856 40          $u//lcweb2.loc.gov/ammem/gschtml/gotthome.html
                        $uhttp://hdl.handle.net/loc.test/gotthome
                        $uurn:hdl:loc.test/gotthome
        [Record is for original with 856 added for access to the
        digitized items.  First $u is a URL; second $u is a proxy
        URN; third $u is a bare handle. Three $u subfields have been
        shown here to cover the options described above.  In
        practice there would seldom be a reason to use all three
        except for experimental projects.  Two $u subfields would be
        more common.]
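
As an illustration of how a system might tell the three kinds of
$u apart, the following Python sketch splits the 856 field from the
example into subfields and labels each $u.  It is a sketch under
stated assumptions, not a MARC parser: the function names are
hypothetical, the field is treated as a display string with "$" as
the subfield delimiter, and the prefixes tested ("urn:hdl:" for a
bare handle, "http://hdl.handle.net/" for a proxy URN) are taken
from the example record only.

```python
PROXY_PREFIX = "http://hdl.handle.net/"  # proxy resolver address used in the example


def split_subfields(field_data):
    """Split a display string such as '$uAAA $uBBB' into [('u', 'AAA'), ('u', 'BBB')].

    Assumes '$' never occurs inside a subfield value.
    """
    subfields = []
    for chunk in field_data.split("$")[1:]:  # text before the first '$' is ignored
        if chunk:
            subfields.append((chunk[0], chunk[1:].strip()))
    return subfields


def classify_u(value):
    """Label a $u value using the prefixes seen in the example record."""
    if value.startswith("urn:hdl:"):
        return "bare handle"
    if value.startswith("urn:"):
        return "other URN"
    if value.startswith(PROXY_PREFIX):
        return "proxy URN"
    return "URL"


FIELD_856 = (
    "$u//lcweb2.loc.gov/ammem/gschtml/gotthome.html "
    "$uhttp://hdl.handle.net/loc.test/gotthome "
    "$uurn:hdl:loc.test/gotthome"
)

kinds = [classify_u(value) for code, value in split_subfields(FIELD_856) if code == "u"]
print(kinds)
```

A system that does not recognize handles could use such a test to
skip the bare handle and follow one of the other two $u values.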

If a bare handle were used in field 856 subfield $u, the following
changes would need to be made to the 856 field:  1) change the name
of $u to "Uniform Resource Identifier (URI)", the umbrella
term used for the various UR standards; 2) add an indicator value to
the first indicator (Access method) to show that a URN is in the
field.  Although the definition of a new indicator value for "URN"
could be considered, a more generic value might be added that shows
that an Internet access protocol is not retrieving the item, but
the name must be resolved into a location.  It may be preferable to
limit additions to the indicator values, since the values are
essentially redundant information that is also contained in the URN
or URL itself.  However, there are a few cases where the
information in the indicator is not explicit in the field, such as
resources accessible by email (e.g., listservs).  If repeatable
$u's are used as in the record above, guidelines could state that
if there is at least one http URL, indicator value 4 (HTTP) is
used.  
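
The guideline suggested above could be expressed as a simple rule.
The Python sketch below is one possible reading, not an agreed
coding practice: it returns 4 (HTTP) when at least one $u is an
http URL, returns blank (#) when every $u is a URN and so no access
method is stated, and leaves all other cases (e.g. email or FTP) to
existing practice.  The function name is hypothetical.

```python
def first_indicator(u_values):
    """Suggest a first indicator (Access method) for an 856 field.

    Returns '4' if any $u is an http URL, '#' (blank) if every $u is a
    URN, and None when this sketch does not cover the case.
    """
    values = [value.lower() for value in u_values]
    if any(value.startswith("http://") for value in values):
        return "4"  # at least one http URL: HTTP
    if values and all(value.startswith("urn:") for value in values):
        return "#"  # only names to be resolved: no access method given
    return None     # other access methods are outside this sketch


print(first_indicator([
    "http://hdl.handle.net/loc.test/gotthome",
    "urn:hdl:loc.test/gotthome",
]))
print(first_indicator(["urn:hdl:loc.test/gotthome"]))
```

Under this reading, a field carrying both a proxy URN and a bare
handle keeps indicator value 4, and only a field holding nothing
but URNs takes the blank value.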

Alternatively, a new subfield could be defined.  However, there are
only two subfields available: $e or $y.  Since a URN is intended to
be used for locating the item and systems are programmed to use $u
as a hot link, it may be more appropriate to also use it for URNs.

Using a standard number field in the 0XX block is not a workable
solution for all NDLP records.  There is no assurance that there
will be one URN per bibliographic record.  For instance, handles
are being assigned for different portions of the items, such as a
separate one for a picture from a digitized book.  In addition LC
expects to create handles for a finding aid which would be
referenced in the record along with a handle for the object itself. 
In the future different naming authorities may assign URNs or
handles according to different guidelines.  Thus, there may not be
one place in the record to contain a URN, but several, depending
upon the level at which the bibliographic description is given and
how the electronic resources are being referenced.

As other URN schemes are implemented, other proposals will be
brought forward as needed to accommodate them.  For instance, it is
likely that the existing 020 field (ISBN) and 024 (Other Standard
Number) could be used for URNs based on existing standard numbers
such as the ISBN and SICI.  A possible means to specify URNs based
on standard numbers is to add a $u subfield to any of these fields.
Specific encoding of the data will need to be worked out (i.e.,
whether to encode the complete URN as it is formulated, even though
some of the information would be redundant with what is contained
in $a, or only to indicate in some way that the number can also be
used as a URN).  


4.      PROPOSED CHANGES

The following is presented for consideration:

        *       In the USMARC Bibliographic/Holdings/Classification/
                Community Information Formats, change the name of
                subfield $u (Uniform Resource Locator) in Field 856 to
                Uniform Resource Identifier.

        *       In the USMARC Bibliographic/Holdings/Classification/
                Community Information Formats, define the following value
                in Field 856, First indicator:                          
                #       No information provided



Library of Congress
Library of Congress Help Desk (09/02/98)