EAD Application Guidelines for Version 1.0

Chapter 3. Creating Finding Aids in EAD: Continued

3.5. Building an EAD Finding Aid

The preceding sections of this chapter have attempted to lay the theoretical foundation for the more detailed examination of EAD elements in section 3.5. The order in which we discuss these elements does not necessarily correspond to the order in which you might draft various sections of a finding aid, nor does it follow strictly the structure of the EAD DTD, which requires that a valid EAD instance(54) begins with the outermost EAD <ead> tag, followed first by metadata (in the mandatory EAD Header <eadheader> and optional Front Matter <frontmatter> elements) and next by the Archival Description <archdesc> (see Figure 3.4a). We elected instead to begin with <archdesc>, since it embodies the bulk of the finding aid and contains the information archivists are most accustomed to including in their inventories and registers. Yet before we become immersed in <archdesc>, it may be helpful to say a few words about the other high-level elements, which would appear before <archdesc> in an encoded finding aid.

The document element <ead> encloses all other elements. It indicates to a computer that what follows is a machine-readable version of a finding aid that has been encoded using the SGML document type definition known as Encoded Archival Description. Setting the AUDIENCE attribute in <ead> to "external" will display the contents of all of the subelements, unless the attribute in an individual subelement is set to "internal." The element <ead> also has a RELATEDENCODING attribute that can be used to declare a descriptive encoding system, such as MARC, Dublin Core, or ISAD(G), to which many EAD elements can be mapped using the ENCODINGANALOG attribute. <eadheader> and <frontmatter>, the other two high-level elements inside <ead>, are discussed in detail in section 3.6 at the end of this chapter.

The required <eadheader> is an essential part of a properly encoded finding aid; it contains metadata about the title, author, and creation date of the finding aid, as well as information about the language in which the finding aid is written and details about its encoding. The optional <frontmatter> element includes subelements that can be used to create nicely formatted title pages and other publication-type prefatory material such as acknowledgements and introductions.
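
As a rough sketch of the order described above (not a complete or validated instance), the outermost elements of an encoded finding aid nest as follows. The comments stand in for content that is omitted here, and the attribute values are illustrative only:

	<ead audience="external" relatedencoding="marc">
	   <eadheader>
	      <!-- required: metadata about the finding aid itself -->
	   </eadheader>
	   <frontmatter>
	      <!-- optional: title page and other prefatory material -->
	   </frontmatter>
	   <archdesc level="collection">
	      <!-- the description of the archival materials -->
	   </archdesc>
	</ead>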

Since EAD permits a great deal of flexibility in the order of information within <archdesc>, these Guidelines discuss the <archdesc> subelements in an order that matches a suggested sequence for the information in an online finding aid. This order is unlikely to correspond exactly to the manner in which you collect and compile the information in your finding aids. As mentioned in section 3.2, information for portions of a finding aid may have been gathered over a long period of time. When processing a collection, particularly one that is large and complex, you may draft descriptions of discrete and possibly noncontiguous parts as you complete their organization and arrangement. Presenting this information in the most "logical" order for your researchers offers a challenge that these Guidelines aim to help you address.

3.5.1. Describing the "Whole": Collection-Level Information <archdesc>

As you read through the following sections that discuss application of specific EAD elements, it might be a good idea to have a copy of the EAD Tag Library close at hand. These Application Guidelines provide options and guidance for use of important elements and attributes, and the Tag Library supplements this by specifically defining each element, listing all possible attributes, and providing additional tagged examples. If you find the examples in the following sections difficult to follow, review the basic conventions in the "How to Use This Manual" section of these Guidelines.(55)

Note that some of the tagged examples in this chapter omit required elements when the additional tagging would obscure the point being made in the accompanying text. Whenever you have a question as to where an element is available or which parent elements are required, consult the EAD Tag Library.

As noted in section 3.5, the EAD element that encompasses the text of the archival finding aid is <archdesc>, within which are nested all other descriptive elements. The <archdesc> element is a wrapper; it holds the other elements together in a cohesive package. In addition, its required LEVEL attribute identifies the highest level of description represented in the finding aid, which is usually set to "fonds," "collection," or "record group." Occasionally a finding aid may relate to only one "series," "subgroup," "subseries," "file," or "item," and those alternative values could also be selected for the <archdesc> LEVEL attribute. Although EAD accommodates such instances, these Guidelines have been written with the assumption that most inventories and registers will capture at least a basic description of the highest-level fonds, collection, or record group before describing individual components, and that the <archdesc> LEVEL attribute should therefore reflect the highest tier in the collection's hierarchy. Lower tiers in the hierarchy are identified by setting the LEVEL attribute in the component descriptions, as explained in section 3.5.2.
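
The relationship between the LEVEL value in <archdesc> and the values set on components can be sketched as follows. This is a simplified illustration rather than a complete description: the titles are hypothetical, most <did> subelements are omitted, and the numbered component tags and the TYPE value on <dsc> are assumptions that should be checked against the Tag Library.

	<archdesc level="collection">
	   <did>
	      <unittitle>Mary Hutchinson Papers</unittitle>
	   </did>
	   <dsc type="combined">
	      <c01 level="series">
	         <did><unittitle>Correspondence</unittitle></did>
	         <c02 level="file">
	            <did><unittitle>Letters received</unittitle></did>
	         </c02>
	      </c01>
	   </dsc>
	</archdesc>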

There are several other attributes in <archdesc> that can be used to control or provide information for the entire finding aid, such as AUDIENCE, RELATEDENCODING, LANGMATERIAL, and LEGALSTATUS.

An <archdesc> start-tag that includes this full complement of attributes may look like this (the attributes may be arranged in any order):

	<archdesc audience="external" relatedencoding="marc"
	langmaterial="eng" legalstatus="public" level="fonds">

Having specified this control information in <archdesc>, the archivist then proceeds to identify some basic facts about the collection by using the Descriptive Identification <did> element.

Basic Description: The High-Level <did>


The elements available within the Descriptive Identification <did> element represent the basic building blocks for any level of description in a multilevel archival finding aid. These fundamental elements answer such questions as those below:

Because these questions apply to all levels of archival groupings, from the fonds, record group, or collection, down to the item, <did> is available at all levels of description. While a single occurrence of <did> is required (usually as the first element within <archdesc>), specific elements within <did> are not, because not all will be needed at every level of description. For example, once Origination <origination> or Repository <repository> data has been specified for an entire body of materials, it may not need to be repeated at the series or file level.

One of the advantages of bundling such information in <did> is that it serves as a wrapper for these essential pieces of information in an online environment, where retrieval of coherent chunks of descriptive information about a given archival unit is critically important to the end user's understanding of a search result. It may also encourage good descriptive practice by reminding archivists to include the same basic data at all levels of description.

The first occurrence of <did>, which represents the highest level of description for a given body of materials, should allow a researcher to determine whether the materials are pertinent to his or her line of inquiry without having to read far down into the finding aid. To facilitate this resource discovery and recognition, the first <did>, referred to hereafter as the "high-level <did>," should include elements such as <repository>, <origination>, <unittitle>, <physdesc>, and <abstract>, which are discussed in greater detail below.

At its most basic, the high-level <did> might therefore look like this (with the sequence of <did> subelements determined by the repository):

	<did>
	   <repository>Harry Ransom Humanities Research Center</repository>
	   <origination>Stoppard, Tom</origination>
	   <unittitle>Tom Stoppard Papers</unittitle>
	   <physdesc>68 boxes (28 linear feet)</physdesc>
	   <abstract>The papers of British playwright Tom Stoppard (b. 1937)
	   encompass his entire career and consist of multiple drafts of his plays,
	   from the well-known
	   <title render="italic">Rosencrantz and Guildenstern Are Dead</title>
	   to several that were never produced, correspondence, photographs, and posters, as
	   well as materials from stage, screen, and radio productions from around the world.</abstract>
	</did>

Other elements such as ID of the Unit <unitid> and Physical Location <physloc> should also be included if the repository assigns a unique identifier (perhaps an accession number) to the materials, and if a physical location of the entire collection is specified in the finding aid. It is also recommended that the high-level <did> be given a Heading <head> and be encoded fairly specifically with various subelements and ENCODINGANALOG attributes; this will enable search engines to retrieve a basic description of the collection and will facilitate the extrapolation of a skeletal MARC record. Again, each of these subelements and attributes is explained in the sections that follow. A fully encoded <did> might look like this:

	<did>
	   <head>Summary Description of the Tom Stoppard Papers</head>
	   <repository>
	      <corpname>The University of Texas at Austin
	      <subarea>Harry Ransom Humanities Research Center</subarea>
	      </corpname>
	   </repository>
	   <origination>
	      <persname source="lcnaf" encodinganalog="100">Stoppard, Tom</persname>
	   </origination>
	   <unittitle encodinganalog="245">Tom Stoppard Papers, </unittitle>
	   <unitdate type="inclusive">1944-1995</unitdate>
	   <physdesc encodinganalog="300">
	      <extent>68 boxes (28 linear feet)</extent>
	   </physdesc>
	   <unitid type="accession">R4635</unitid>
	   <physloc audience="internal">14E:SW:6-8</physloc>
	   <abstract>The papers of British playwright Tom Stoppard (b. 1937)
	   encompass his entire career and consist of multiple drafts of his plays,
	   from the well-known
	   <title render="italic">Rosencrantz and Guildenstern Are Dead</title>
	   to several that were never produced, correspondence, photographs,
	   and posters, as well as materials from stage, screen, and radio productions
	   from around the world.</abstract>
	</did>

Such detailed markup, which includes subelements and attributes, is recommended at the highest level of your multilevel description but may not be necessary or even desirable at the component level. Conversely, other <did> subelements such as Container <container>, Note <note>, Digital Archival Object <dao>, or Digital Archival Object Group <daogrp> are often unnecessary in the high-level <did>. The subelement <container> is discussed in section; the latter three elements are mentioned briefly in the discussion of the high-level <did> and are discussed more thoroughly in section and section 7.3.6.
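
By contrast, a <did> at the component level might be as sparse as the following sketch. The title, dates, and container numbers are hypothetical, and the TYPE values on <container> are assumed. The repository and creator are not repeated because they would already have been stated in the high-level <did>.

	<did>
	   <container type="box">12</container>
	   <container type="folder">4</container>
	   <unittitle>Correspondence, </unittitle>
	   <unitdate type="inclusive">1964-1966</unitdate>
	</did>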

All subelements within <did> have a LABEL attribute. This attribute functions somewhat like the Heading <head> element (which is used in lieu of LABEL for non-<did> elements; see section) in that it can be used to generate print or display constants. The <did> subelements carry a LABEL attribute instead of a <head> subelement primarily because the information contained in the <did> subelements tends to be brief (frequently only a few words), in contrast to such elements as <scopecontent> and <bioghist>, which tend to consist of longer narrative chunks of text. If each <did> subelement contained <head>, a Paragraph <p> would also be necessary in order to enter the text of the element; this would effectively double the amount of tagging for such small bits of information.

The LABEL attribute is especially useful at the highest-level <did> to aid readers in interpreting the collection summary description, while LABEL or <head> information may be less frequently necessary elsewhere in the finding aid. For example, this markup

	<did>
	   <repository label="Repository:">
	      <corpname>The University of Texas at Austin
	         <subarea>Harry Ransom Humanities Research Center</subarea>
	      </corpname>
	   </repository>
	   <origination label="Creator:">
	      <persname source="lcnaf" encodinganalog="100">Stoppard, Tom</persname>
	   </origination>
	   <unittitle label="Title:">Tom Stoppard Papers,
	      <unitdate type="inclusive">1944-1995</unitdate>
	   </unittitle>
	</did>

can generate the following display based on specifications in a stylesheet:

	Repository:	The University of Texas at Austin
			Harry Ransom Humanities Research Center
	Creator:	Stoppard, Tom
	Title:		Tom Stoppard Papers, 1944-1995

A stylesheet is a text file or output specification that is used by a processing system in conjunction with the encoded finding aid to control how the document will be displayed or formatted.(56) Stylesheets define the appearance of each element in each of its contexts within the document. Any element can be assigned specific display features, such as font size, style, and color. A stylesheet also can be used to insert preceding characters or spacing to an element, rather than using the LABEL attribute as noted in the example above. A stylesheet allows you to modify the element's display or formatting features in relation to where the element appears in the finding aid. For example, in certain contexts, such as in the high-level <did>, you may want the <unittitle> of the collection or fonds to appear on a separate line, in a certain font size and style, and preceded by the word "Title," followed by a colon. Elsewhere in the finding aid, you may want the <unittitle> of a component to appear inline and in a smaller font size than that of the higher-level <unittitle>. The stylesheet allows you to control those decisions so you need not hardwire formatting codes into a document in the same way that you do in a word processing or HTML document.

Omitting such "formatting" instructions from your data allows you to change the appearance of all your finding aids by simply changing the stylesheet; this also will simplify the encoding of each individual EAD finding aid. Keep in mind, however, that an end user can choose to replace the stylesheet you created with a different stylesheet. Use of the LABEL attribute may better ensure that words you designate will stay with the finding aid regardless of the stylesheet attached to it.

Note also that it is possible to create multiple stylesheets to use with your EAD finding aids. For example, you may create one stylesheet for online display and another for printed output. Creating your finding aid in EAD allows you to separate what the text actually is from how the text is rendered, thereby making it possible to process or format the same text in different ways.

Using the <did> Subelements

Repository <repository>

In the distributed online environment in which many EAD finding aids will reside, inclusion of the repository's name in the finding aid is critical. This information may or may not have been recorded in finding aids in the paper environment. The <repository> element identifies the institution or agency responsible for providing intellectual access to the materials being described. As shown in the Tom Stoppard example in section, the Corporate Name <corpname>(57) and Subordinate Area <subarea> elements may be used within <repository>. This more precise markup facilitates flexible display and retrieval of the information. (You may want your parent institution's name to display in 18 point bold Times New Roman and your repository's name in 14 point bold Times New Roman, or you may want to facilitate the ability to restrict searches to collections located in specific divisions or subunits of an institution.)

Note that in most instances the repository that provides intellectual access to the materials is also the institution that holds physical custody, but when that is not the case, the name and other pertinent information about the physical custodian should be encoded in the <physloc> element.

Origination <origination>

The <origination> element specifies the individual, family, or organization responsible for the creation, accumulation, or assembly of the described materials before their incorporation into an archival repository. As with <repository>, it is possible to embed specific name elements (Personal Name <persname>, Family Name <famname>, or Corporate Name <corpname>) within <origination> to improve specificity. Further, the ROLE attribute can be used to indicate whether the originator was a "creator" or "collector" or had some other function relative to the materials. It is also possible, as noted earlier in section, to supply such information by using the LABEL attribute in <origination> and having a stylesheet display or print the word "creator" or "collector" before or after the text that appears in <origination>. For example, the markup

	<origination label="creator">Mary Hutchinson</origination>

in combination with stylesheet instructions, could result in either of the following outputs:

	Creator:  Mary Hutchinson
	Mary Hutchinson, creator

In addition to the LABEL attribute, <origination>, like many EAD elements, has an ENCODINGANALOG attribute that enables you to specify a MARC or other encoding scheme field that relates to this element (in this case, the MARC 100 field). Either of the following options is valid in EAD, but the latter is more specific:

	<origination encodinganalog="100" label="creator">Mary Hutchinson</origination>

	<origination label="creator">
	   <persname encodinganalog="100">Mary Hutchinson</persname>
	</origination>

It would also be possible to invert the <persname> data (Hutchinson, Mary) to match the formatting of a MARC 100 field for retrieval purposes. Using the NORMAL attribute would accomplish the same purpose:

	<origination label="creator">
	   <persname encodinganalog="100" normal="Hutchinson, Mary">Mary Hutchinson</persname>
	</origination>
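
The ROLE attribute mentioned at the beginning of this section could be applied in a similar way. In this sketch it is placed on the nested <persname>, which is an assumption about where the DTD makes ROLE available; consult the Tag Library before relying on it.

	<origination>
	   <persname encodinganalog="100" role="creator">Mary Hutchinson</persname>
	</origination>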

For more information about "name" subelements, see the discussion of <controlaccess> in section 3.5.3.

Title of the Unit <unittitle>

Archivists typically assign titles to collections, since no bibliographic entity such as a title page exists from which to transcribe the information for descriptive purposes. The <unittitle> element is used to provide the title of the materials, either formal or supplied, at any level of description. National descriptive content standards such as APPM or RAD often provide the best guidance for archivists in determining how to construct supplied titles for archival materials.

In the high-level <did>, the <unittitle> is the title of the collection, or perhaps of a subgroup or series, depending on what the highest level of description in the finding aid is. If the title of the body of materials includes a formal title, it may be desirable to nest the Title <title> element within <unittitle> for display or retrieval purposes. For example:

		<unittitle encodinganalog="245">Stuart Johnson Collection of
		<title>Alice in Wonderland</title> Memorabilia</unittitle>

This markup would permit the display or printing of the title Alice in Wonderland in any fashion desired by the repository, including but not limited to italics, through use of the RENDER attribute in <title> or a stylesheet. In addition, it would facilitate the retrieval of the phrase "alice in wonderland" in a <unittitle> or <title> search, and, through the use of the ENCODINGANALOG attribute, the export of the text within the <unittitle> element to a MARC record for the archival collection.

Date of the Unit <unitdate>

The inclusive dates of a collection are considered a basic part of most finding aids. In EAD it is possible to embed the dates of the materials within <unittitle> by using <unitdate>. The <unitdate> element is also available outside of <unittitle>, so both of the examples shown below are valid. Note that <unitdate> has an optional TYPE attribute that allows you to specify whether the dates are inclusive dates, bulk dates, or a single date.

	<unittitle>Stuart Johnson Collection of <title>Alice in
	Wonderland</title> Memorabilia, <unitdate type="inclusive">
	1905-1928</unitdate></unittitle>

	<unittitle>Stuart Johnson Collection of <title>Alice in
	Wonderland</title> Memorabilia, </unittitle>
	<unitdate type="inclusive">1905-1928</unitdate>

Each repository should choose one of the above methods of encoding with regard to the relative placement of <unittitle> and <unitdate> and be consistent both within an individual finding aid and across all finding aids. National descriptive standards may provide guidance on this point. Archivists who catalog their materials using APPM are accustomed to thinking about span and bulk dates for a body of materials as part of the title, while RAD users treat such dates as a separate data element in the Dates of Creation Area.(58) In either case, <unittitle> and <unitdate> information may be displayed together.
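
The TYPE attribute distinguishes the kinds of dates mentioned above. In the following sketch the bulk dates are hypothetical and are included only to show the two attribute values side by side:

	<unitdate type="inclusive">1905-1928</unitdate>
	<unitdate type="bulk">(bulk 1910-1920)</unitdate>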

The element <unitdate> also has a NORMAL attribute that allows dates to be stated in a standardized form: YYYYMMDD. Use of this attribute would facilitate retrieval of date information if implemented consistently. However, date information is provided in numerous formats in finding aids, as in the following examples:

	March 17, 1946
	17 March 1946
	1946 March 17
	ca. 1946
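
With the NORMAL attribute, each of the first three forms above could carry the same standardized value while the displayed text stays as the archivist wrote it. This is only a sketch; the approximate date is left out because how to normalize it is a local decision.

	<unitdate normal="19460317">March 17, 1946</unitdate>
	<unitdate normal="19460317">17 March 1946</unitdate>
	<unitdate normal="19460317">1946 March 17</unitdate>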

Because of this variety, the additional markup required to supply normalized dates for searching purposes may be prohibitively time-consuming.

Physical Description <physdesc>

Archivists use many different expressions of extent for their holdings, such as linear or cubic feet, number of boxes or other containers, or perhaps number of items. Frequently this is a very simple expression; sometimes it is a series of statements delineating quantities of various formats of material within a collection. In other cases, the physical description of a collection may include information about the method of creation.

It is possible to put a basic statement of extent into the <physdesc> element without using any subelements:

	<physdesc>149 cubic feet</physdesc>

	<physdesc>3800 photographs</physdesc>

In many cases, this level of markup is sufficient. The use of subelements, such as Extent <extent> and Genre/Physical Characteristic <genreform>, as well as attributes in <physdesc> can, however, render the physical description much more specific:

	<physdesc encodinganalog="300">149 cubic feet</physdesc>

	<physdesc>
	   <extent>3800</extent> <genreform>black and white prints</genreform>
	</physdesc>

	<physdesc>
	   <extent>46</extent> <genreform>sound recordings</genreform>
	</physdesc>

What is the benefit of such additional encoding? Any time information is encoded at a more granular (detailed) level, the ability to manipulate and reuse the data is enhanced. For retrieval purposes, you might want to search for a particular type of photograph, such as albumen prints, salted paper prints, or hand-colored prints. While this can be done using a keyword search, searching for these types of terms within a <physdesc> or <genreform> element improves the relevance of the search result.

Abstract and Note <abstract> and <note>

In an online environment, it can be extremely helpful to the end user if a brief statement describing the context and content of a body of materials appears on the first or second screen of a finding aid. This can help the user immediately determine the relevance of the collection to his or her research. The <abstract> element was created for this purpose. In the high-level <did>, <abstract> may be a hybrid of a creator sketch and a scope and content note; in other words, it may contain a brief statement about the creator or collector of the materials as well as a very general summary of their scope. Summarize as briefly as possible, while still providing pertinent information; the Biography or History <bioghist> and Scope and Content <scopecontent> elements, described in section and section, are used for more expansive information.

	<abstract>The archive comprises records mainly from the pre-1837
	Archdeaconry before its removal from the jurisdiction of York to that of
	Lincoln. Most of the documents stem from the Archdeacon's twice-yearly
	visitations and the cases pursued in his court, the earliest dating from the
	16th century. The Archdeaconry of Nottingham joined with the county of Derby
	[from the Diocese of Lichfield] to form the Diocese of Southwell.</abstract>

The <abstract> element could easily be confused with <note>, which also is available within <did>. The <note> element should not be used for summary descriptive information, but rather to cite the source of a quotation (as in a footnote), provide a short explanatory statement or user directive, or for miscellaneous purposes such as to indicate the basis for an assertion. In general, the generic text element <note> should never be used when a more specific structural EAD element is more appropriate.

	<note><p>Note to researchers:  To request materials, please note both
	the location and box numbers shown below.</p></note>

In the high-level <did>, a <note> also could be used to alert the reader to the fact that the materials described in the high-level <did> are in fact a component of a larger body of materials that had to be described in separate EAD instances because of the difficulties encountered in parsing and downloading a single large finding aid file for the entire fonds or record group. The creation and linking of separate finding aids for a single large collection is discussed in greater detail in connection with the Archival Reference <archref> element in section 7.3.3.
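
Such a note might read as follows. The wording and the referenced finding aid are hypothetical, and the linking attributes that <archref> would need in order to function as a link are omitted here (see section 7.3.3):

	<note><p>This finding aid describes the correspondence series only.
	The remaining series are described in a separate finding aid:
	<archref>Mary Hutchinson Papers, Series 2-4</archref>.</p></note>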

Because of its utility as explanatory text, the <note> element is also available outside <did>, as explained in section

ID of the Unit <unitid>

Archivists often assign unique identifying numbers or alphanumeric strings to units of archival material for control and citation purposes. Such identifiers include accession numbers, lot numbers, classification numbers, or entry numbers in a bibliography or catalog. The <unitid> element is used to encode such numbers; do not confuse it with <physloc> or <container>, which encode information about the physical location or housing of the material.

Two attributes are available in <unitid> that are not available anywhere else in EAD and which should be used only in the high-level <did>: COUNTRYCODE and REPOSITORYCODE. COUNTRYCODE provides the unique code, taken from the ISO 3166 Codes for the Representation of Names of Countries, for the country in which the archival materials are held. REPOSITORYCODE contains another unique code, taken from the official repository code list for the country in which the repository is located, for the repository responsible for the intellectual control of the materials being described.(59) These two attributes relate specifically to the ISAD(G) reference codes in the Identity Statement Area(60) and guarantee uniqueness of the <unitid> in a multinational finding aid database. If desired, the attribute values could be manipulated by a stylesheet to display or print the name of the country and the name of the repository as part of the <unitid> information. At the highest level of description, a <unitid> might look like this:

	<unitid countrycode="gbr" repositorycode="067">ES</unitid>

Physical Location <physloc>

Some finding aids may include information about the location of materials within the repository. This information is encoded using the <physloc> element, which may refer to an actual shelf location or instead indicate that a collection is stored off-site, warning the researcher that the material may not be immediately available.


	<physloc>The Mary Hutchinson Papers are stored offsite, and 24-hour notice is
	required to retrieve the materials.</physloc>

<physloc> is repeatable, so both types of information can be provided when needed. If the repository chooses to include the shelf location in the finding aid for its own internal use, the information can be encoded but shielded from public access by using the AUDIENCE attribute (if your server is capable of suppressing information coded as "internal" when delivering your EAD files to users):

	<physloc audience="internal">14E:SW:6-8</physloc>

Digital Archival Object and Digital Archival Object Group <dao> and <daogrp>

One of the exciting features of EAD is its ability to connect the finding aid to electronic representations of the described materials. Two special linking elements, <dao> and <daogrp>, are used for this purpose. The <dao> element is used to point to individual images, and <daogrp> is used to bundle multiple versions of the same images (for example, a thumbnail and a reference copy). Since these elements may be used in many places throughout a finding aid, they are described in more detail, along with other widely available elements, in section. Aspects of their use are also covered in the discussion of linking elements in section 7.3.6.





  54. In SGML parlance, "instance" is the term used to refer to a particular SGML-encoded document, such as a single EAD-encoded finding aid.

  55. You should also consult later chapters of these Guidelines for certain technical issues affecting encoding. For example, see section 4.3.5 regarding the use of Headings <head>, whitespace, and punctuation.

  56. See section 5.3.3 for more information on stylesheets.

  57. See section 3.5.3 for a discussion of controlled vocabulary elements.

  58. Rules for Archival Description, section 1.4.

  59. If your country has no official list, do not use this attribute. For U.S. repositories, cite code from: USMARC Code List for Countries (Washington: Library of Congress Cataloging Distribution Service, 1993).

  60. ISAD(G), section 3.1.


Copyright Society of American Archivists, 1999.
All Rights Reserved.

Library of Congress Help Desk (11/01/00)