Audio-Visual Prototyping Project

Interim Naming System for Recorded Sound Collections
Names for Aggregate and Item Directories and for Digital Files
Assigned During Digital Reformatting


Introduction
Example 1 - Marine Corps Combat Recording
Example 2 - Spoken-word 78 rpm Disc
Example 3 - Long-playing Phono-album
Summary Comparison of the Examples in Table Form
Abbreviations and Conventions

Introduction

This document describes an interim naming system for recorded sound collections and items by providing three sets of examples. In actual practice, general instructions for naming will be established for a given collection by the staff of the Recorded Sound Processing Section of the Motion Picture, Broadcasting, and Recorded Sound Division (M/B/RS) at the Library of Congress. The names will be assigned to the files and directories by the engineers or image-makers who produce the essence files, i.e., the files that contain the bitstreams for the audio or images that reproduce elements in recorded sound items. These engineers and image-makers will be the staff of the M/B/RS Recording Laboratory or of a Library contractor.

The names "contain" certain types of non-descriptive metadata and are being employed in the year 2000 while a new non-descriptive metadata system is being developed. When the content that is digitized in 2000 is ingested into the M/B/RS prototype digital-content repository, the new metadata system will be in use and there will be no further need to rely upon the metadata contained within the directory and file names. Future production will move in parallel with the capture of non-descriptive metadata and M/B/RS will consider dropping this approach to naming directories and files.

The logical pathname used for the newly created essence files is aggregate/item/essence file. An aggregate is a logical grouping of items, e.g., a coherent part of a collection. An item is an entity as described in a bibliographic record, e.g., a phonograph record album. A single item may be reproduced by a set of related essence files, e.g., images of the album jacket and labels, and audio files for the two disc sides. The aggregate/item/essence pathname is used by those digitizing the content and in "holding" storage in a UNIX file system. When content is stored in servers, it may be useful to establish intervening directories called "convenience group directories" in order to prevent the quantity of files or extent of data (bytes) in single directories from becoming unmanageable. After the content has been ingested into the repository, which may have its own system for storage, this three-level filesystem structure may be discarded.
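The three-level pathname can be illustrated with a minimal sketch. The function name essence_path is hypothetical and not part of the proposal; the sample values come from Example 3 below.

```python
from pathlib import PurePosixPath


def essence_path(aggregate: str, item: str, essence_file: str) -> PurePosixPath:
    """Join the three levels into aggregate/item/essence-file (hypothetical helper)."""
    for part in (aggregate, item, essence_file):
        # Each level is a single directory or file name, never a nested path.
        if not part or "/" in part:
            raise ValueError(f"invalid path component: {part!r}")
    return PurePosixPath(aggregate) / item / essence_file


print(essence_path("lp0001", "mer39016", "0445am.wav"))
```

A pure POSIX path is used here because the holding storage is described as a UNIX file system; convenience group directories, if adopted, would add a level that this sketch does not model.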

Since the production life-cycle for content may require that files and directories be copied into and out of a DOS-based system or placed on CD-ROM discs, the names proposed here keep to the eight-plus-three-character convention. In general, the practice conforms to the ISO 9660 standard for CD-ROMs, except for the use of lower-case alphabetic characters. The Library uses lower-case characters in its content-holding UNIX filesystems and, in order to make copying from delivery media easy, requires the use of lower case throughout.
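A name can be checked against the eight-plus-three, lower-case convention with a short sketch such as the following. It assumes that only lower-case letters and digits are used, as in every example in this document; the function name is_valid_name is hypothetical.

```python
import re

# Assumption: one to eight lower-case letters or digits, optionally followed
# by a dot and an extension of one to three such characters.
_NAME_8_3 = re.compile(r"[a-z0-9]{1,8}(\.[a-z0-9]{1,3})?")


def is_valid_name(name: str) -> bool:
    """Return True if name fits the assumed lower-case 8.3 pattern."""
    return _NAME_8_3.fullmatch(name) is not None


for candidate in ("5555alsh.jpg", "mer39016", "Mer39016", "0445alshx.jpg"):
    print(candidate, is_valid_name(candidate))
```

Directory names such as mer39016 have no extension, which is why the extension part of the pattern is optional.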

The aggregate identifiers are generally mnemonic in some way pertaining to the collections. Item identifiers are "semi-mnemonic" for such things as phono-album release number. When possible, essence filenames for audio are also "semi-mnemonic" (e.g., for a record label number or a tape "cut" number) or echo the item name. The essence filenames for images of record labels follow the pattern for the audio files. We say "semi-mnemonic" because many of the identifiers are truncated forms of their "source" names in order to fit the four- or five-character slots available within the eight-character names. In contrast, the filenames for images of containers and booklets receive non-mnemonic "one-up" sequential numbers with the prefixes ct and bk. As explained in Abbreviations and Conventions, containers are defined as boxes, album jackets, dust sleeves, wire-recording cans, cassette J-cards, audio CD jewel-case printed matter, etc., and booklets are defined as any accompanying printed matter, except J-cards and jewel-case booklets, which are treated as parts of the container.

This proposal is intended to foster discussion; please send comments to Nancy Seeger ([email protected]) or Carl Fleischhauer ([email protected]).


Example 1: Marine Corps Combat Recording

Aggregate and Item Directories

5442001 - Aggregate identifier
Aggregate for first portion of Marine Corps Combat Recordings to be digitized. The assumption is that the digitization will proceed from the 10-inch analog preservation tapes. The identifier is based on the laboratory copy-project "LWO" number (LWO 5442). The added "001" digits permit creation of several aggregates if the file or byte quantities grow too large: 5442002, 5442003, etc. For delivery and/or server storage purposes, this is a high-level delivery directory that could be divided into convenience groups.

01b03 - Item identifier
The lower-level delivery directory with this item name will contain all files for the item. The item identifier for this illustrative recording is created as follows:

01 = reel number (reel 1)
b = side (side B)
03 = cut number (cut 3)

NOTE: In multi-cut instances, information about the first cut is used to create the identifier for the whole item. For example, this illustrative item contains three cuts: cuts 3 and 4 on side B of reel 1, and cut one on side A of reel 2.
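The rule in the NOTE can be sketched as follows. The sketch assumes the pattern that the filenames in this example suggest: a two-digit reel number, a side letter, and a two-digit cut number. The function names cut_id and item_id are hypothetical, and restricting sides to a and b is an assumption.

```python
def cut_id(reel: int, side: str, cut: int) -> str:
    """Build an identifier such as 01b03 from reel, side, and cut (assumed pattern)."""
    side = side.lower()
    if side not in ("a", "b"):  # assumption: tapes have sides A and B only
        raise ValueError(f"unexpected side: {side!r}")
    if not (1 <= reel <= 99 and 1 <= cut <= 99):
        raise ValueError("reel and cut must fit in two digits")
    return f"{reel:02d}{side}{cut:02d}"


def item_id(cuts: list) -> str:
    """The first cut names the whole item, as stated in the NOTE."""
    if not cuts:
        raise ValueError("an item needs at least one cut")
    return cut_id(*cuts[0])


# The illustrative item: cuts 3 and 4 on side B of reel 1, cut 1 on side A of reel 2.
cuts = [(1, "b", 3), (1, "b", 4), (2, "a", 1)]
print(item_id(cuts))
print([cut_id(*c) for c in cuts])
```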

Essence Filenames for Audio Files

01b03m.wav - filename for master manifestation for first cut
01b04m.wav - filename for master manifestation for second cut
02a01m.wav - filename for master manifestation for third cut

01b03s.wav - filename for WAVE service manifestation for first cut
01b04s.wav - filename for WAVE service manifestation for second cut
02a01s.wav - filename for WAVE service manifestation for third cut

01b03sh.ra (.ram) - filename for hifi RealAudio service manifestation for first cut
01b04sh.ra (.ram) - filename for hifi RealAudio service manifestation for second cut
02a01sh.ra (.ram) - filename for hifi RealAudio service manifestation for third cut

01b03sl.ra (.ram) - filename for lofi RealAudio service manifestation for first cut
01b04sl.ra (.ram) - filename for lofi RealAudio service manifestation for second cut
02a01sl.ra (.ram) - filename for lofi RealAudio service manifestation for third cut
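The four audio filenames for each cut follow mechanically from the cut identifier, as this minimal sketch shows. The function name audio_filenames and the table AUDIO_SUFFIXES are hypothetical; the companion .ram files are left out for brevity.

```python
# Manifestation suffix and file extension, in the order listed above.
AUDIO_SUFFIXES = [
    ("m", "wav"),   # master manifestation
    ("s", "wav"),   # WAVE service manifestation
    ("sh", "ra"),   # hifi RealAudio service manifestation
    ("sl", "ra"),   # lofi RealAudio service manifestation
]


def audio_filenames(base: str) -> list:
    """Return the four audio filenames for one cut or side identifier."""
    names = []
    for suffix, ext in AUDIO_SUFFIXES:
        stem = f"{base}{suffix}"
        if len(stem) > 8:  # keep to the eight-character limit
            raise ValueError(f"{stem!r} exceeds eight characters")
        names.append(f"{stem}.{ext}")
    return names


print(audio_filenames("01b03"))
```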

Essence Filenames for Image Files

This illustrative example assumes that the project goes back to the containers for the original items and, say, scans the boxes holding the original Amertape film recordings and/or the cans containing the wire recordings that were made as copies of some of the original film recordings. The prefix "ct" stands for container.

Image-makers are expected to assign the names in a logical sequence, i.e., "01" is the top with writing and the company logo, "02" is the end with a written identification, etc.

ct01m.tif - filename for master image of Amertape box top
ct02m.tif - filename for master image of Amertape box end
ct03m.tif - filename for master image of Amertape box side
ct04m.tif - filename for master image of Amertape box bottom
ct05m.tif - filename for master image of container label for the copy wire, if deemed relevant

ct01sh.jpg - filename for service (hi-res) image of Amertape box top
ct02sh.jpg - filename for service (hi-res) image of Amertape box end
ct03sh.jpg - filename for service (hi-res) image of Amertape box side
ct04sh.jpg - filename for service (hi-res) image of Amertape box bottom
ct05sh.jpg - filename for service (hi-res) image of wire container label

ct01sl.jpg - filename for service (lo-res) image of Amertape box top
ct02sl.jpg - filename for service (lo-res) image of Amertape box end
ct03sl.jpg - filename for service (lo-res) image of Amertape box side
ct04sl.jpg - filename for service (lo-res) image of Amertape box bottom
ct05sl.jpg - filename for service (lo-res) image of wire container label

ct01st.jpg - filename for service (thumbnail) image of Amertape box top
ct02st.jpg - filename for service (thumbnail) image of Amertape box end
ct03st.jpg - filename for service (thumbnail) image of Amertape box side
ct04st.jpg - filename for service (thumbnail) image of Amertape box bottom
ct05st.jpg - filename for service (thumbnail) image of wire container label


Example 2: Spoken-word 78 rpm Disc

Aggregate and Item Directories

780001 - Aggregate identifier
Aggregate for first portion of spoken-word 78s to be digitized. The proposal is that all digitized 78 rpm discs be placed in a series of "78" aggregates; the 0001 suffix means that there could be as many as 9,999 of these aggregates. For delivery and/or server storage purposes, this is a high-level delivery directory that may in turn be divided into convenience groups.
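The numbering of aggregates in a series can be sketched as a prefix followed by a zero-padded counter. The function name aggregate_id and its width parameter are hypothetical; a width of 4 matches the "78" and "lp" series, and a width of 3 matches the LWO-based identifier in Example 1.

```python
def aggregate_id(prefix: str, sequence: int, width: int = 4) -> str:
    """Build an aggregate identifier such as 780001 (hypothetical helper)."""
    if not 1 <= sequence < 10 ** width:
        raise ValueError(f"sequence must be between 1 and {10 ** width - 1}")
    name = f"{prefix.lower()}{sequence:0{width}d}"
    if len(name) > 8:  # directory names also keep to eight characters
        raise ValueError(f"{name!r} exceeds eight characters")
    return name


print(aggregate_id("78", 1))
print(aggregate_id("lp", 1))
print(aggregate_id("5442", 2, width=3))
```

With four digits the counter runs from 0001 to 9999, which is the limit of 9,999 aggregates mentioned above.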

vic5555a - Item identifier
The bibliographic data in the case of the spoken-word recordings is for "sides" and not for the original two-sided discs; the digital files will represent only one side of any given disc. The item identifier for this illustrative recording is created as follows:

Essence Filenames for Audio Files

In this example, the essence identifiers echo the item identifier.

5555am.wav - filename for master manifestation
5555as.wav - filename for WAVE service manifestation
5555ash.ra (.ram) - filename for hifi RealAudio service manifestation
5555asl.ra (.ram) - filename for lofi RealAudio service manifestation

Essence Filenames for Image Files

In this example, we are assuming that the original labels and dust jackets are located and scanned. The logical sequence for the two-sided dust jacket (container) is obverse first, reverse second.

5555alm.tif - filename for master image of disc label
5555alsh.jpg - filename for service (hi-res) image of disc label
5555alsl.jpg - filename for service (lo-res) image of disc label
5555alst.jpg - filename for service (thumbnail) image of disc label

ct01m.tif - filename for master image of dust jacket obverse for original disc
ct01sh.jpg - filename for service (hi-res) image of dust jacket obverse for original disc
ct01sl.jpg - filename for service (lo-res) image of dust jacket obverse for original disc
ct01st.jpg - filename for service (thumbnail) image of dust jacket obverse for original disc

ct02m.tif - filename for master image of dust jacket reverse for original disc
ct02sh.jpg - filename for service (hi-res) image of dust jacket reverse for original disc
ct02sl.jpg - filename for service (lo-res) image of dust jacket reverse for original disc
ct02st.jpg - filename for service (thumbnail) image of dust jacket reverse for original disc


Example 3: Long-playing Phono-album

Aggregate and Item Directories

lp0001 - Aggregate identifier
Aggregate for first portion of long-playing phonorecords (LPs) to be digitized. The proposal is that all digitized LPs be placed in a series of "lp" aggregates; the 0001 suffix means that there could be as many as 9,999 of these aggregates. For delivery and/or server storage purposes, this is a high-level delivery directory that may in turn be divided into convenience groups.

mer39016 - Item identifier
The bibliographic data describes the complete album, i.e., 3 discs, 6 sides (6 suites). This is the item in this illustration. The item identifier for this illustrative recording is created as follows:

Essence Filenames for Audio Files

The essence identifiers for these illustrative audio files are created as follows:

NOTE: 0445 is the last four digits of the matrix number for disc one. Mercury's matrix/label numbering, as applied to this series of albums, uses the suffix letters a and b to indicate sides A and B for each disc.

0445am.wav - filename for master manifestation of side A of first disc
0445as.wav - filename for WAVE service manifestation of side A of first disc
0445ash.ra (.ram) - filename for hifi RealAudio service manifestation of side A of first disc
0445asl.ra (.ram) - filename for lofi RealAudio service manifestation of side A of first disc

0445bm.wav - filename for master manifestation of side B of first disc
0445bs.wav - filename for WAVE service manifestation of side B of first disc
0445bsh.ra (.ram) - filename for hifi RealAudio service manifestation of side B of first disc
0445bsl.ra (.ram) - filename for lofi RealAudio service manifestation of side B of first disc

0446am.wav - filename for master manifestation of side A of second disc
0446as.wav - filename for WAVE service manifestation of side A of second disc
0446ash.ra (.ram) - filename for hifi RealAudio service manifestation of side A of second disc
0446asl.ra (.ram) - filename for lofi RealAudio service manifestation of side A of second disc

(0446bm.wav, 0446bs.wav etc.)

Essence Filenames for Image Files

0445alm.tif - filename for master image of label of side A of first disc
0445alsh.jpg - filename for service (hi-res) image of label of side A of first disc
0445alsl.jpg - filename for service (lo-res) image of label of side A of first disc
0445alst.jpg - filename for service (thumbnail) image of label of side A of first disc

0445blm.tif - filename for master image of label of side B of first disc
0445blsh.jpg - filename for service (hi-res) image of label of side B of first disc
0445blsl.jpg - filename for service (lo-res) image of label of side B of first disc
0445blst.jpg - filename for service (thumbnail) image of label of side B of first disc

0446alm.tif - filename for master image of label of side A of second disc
0446alsh.jpg - filename for service (hi-res) image of label of side A of second disc
0446alsl.jpg - filename for service (lo-res) image of label of side A of second disc
0446alst.jpg - filename for service (thumbnail) image of label of side A of second disc

(0446blm.tif, 0446blsh.jpg, etc.)

ct01m.tif - filename for master image of box cover (container)
ct01sh.jpg - filename for service (hi-res) image of box cover
ct01sl.jpg - filename for service (lo-res) image of box cover
ct01st.jpg - filename for service (thumbnail) image of box cover

There is no printing on the back of the box. The following names assume that there is printing worthy of capture on both sides of all of the sleeves, e.g., a libretto.

ct02m.tif - filename for master image of first side of sleeve for first disc
ct02sh.jpg - filename for service (hi-res) image of first side of sleeve for first disc
ct02sl.jpg - filename for service (lo-res) image of first side of sleeve for first disc
ct02st.jpg - filename for service (thumbnail) image of first side of sleeve for first disc

ct03m.tif - filename for master image of second side of sleeve for first disc
ct03sh.jpg - filename for service (hi-res) image of second side of sleeve for first disc
ct03sl.jpg - filename for service (lo-res) image of second side of sleeve for first disc
ct03st.jpg - filename for service (thumbnail) image of second side of sleeve for first disc

ct04m.tif - filename for master image of first side of sleeve for second disc
ct04sh.jpg - filename for service (hi-res) image of first side of sleeve for second disc
ct04sl.jpg - filename for service (lo-res) image of first side of sleeve for second disc
ct04st.jpg - filename for service (thumbnail) image of first side of sleeve for second disc

(ct05m.tif, ct05sh.jpg, etc.)

bk01m.tif - filename for master image of page 1 of booklet
bk01sh.jpg - filename for service image (hi-res) of page 1 of booklet
bk01sl.jpg - filename for service image (lo-res) of page 1 of booklet
bk01st.jpg - filename for service image (thumbnail) of page 1 of booklet

bk02m.tif - filename for master image of page 2 of booklet
bk02sh.jpg - filename for service image (hi-res) of page 2 of booklet
bk02sl.jpg - filename for service image (lo-res) of page 2 of booklet
bk02st.jpg - filename for service image (thumbnail) of page 2 of booklet

bk03m.tif - filename for master image of page 3 of booklet
bk03sh.jpg - filename for service image (hi-res) of page 3 of booklet
bk03sl.jpg - filename for service image (lo-res) of page 3 of booklet
bk03st.jpg - filename for service image (thumbnail) of page 3 of booklet

bk04m.tif - filename for master image of page 4 of booklet
bk04sh.jpg - filename for service image (hi-res) of page 4 of booklet
bk04sl.jpg - filename for service image (lo-res) of page 4 of booklet
bk04st.jpg - filename for service image (thumbnail) of page 4 of booklet


Summary Table
Key Elements in Naming System

Collection    Aggregate ID   Item ID       Audio ID        Label ID        Container ID
              (directory)    (directory)   (master file)   (master file)   (master file)

Marine Corps  5442001        01b03         01b03m.wav      N/A             ct01m.tif
78s           780001         vic5555a      5555am.wav      5555alm.tif     ct01m.tif
LPs           lp0001         mer39016      0445am.wav      0445alm.tif     ct01m.tif

Abbreviations and Conventions

Essence Files for Audio

Note: Files with the extension ram contain special "RealAudio metadata" and send clients from a web server (where they discovered the content item) to a special server with streaming software, where the actual ra (RealAudio) essence files are stored.

m.wav = master (preservation copy)
s.wav = WAVE service manifestation
sh.ra (.ram) = RealAudio hifi service manifestation
sl.ra (.ram) = RealAudio lofi service manifestation

Essence Files for Images

Labels:

The naming convention for disc labels should echo that of the audio since labels usually reflect the audio contained on individual discs or sides of discs.

Containers:

m.tif = master image (preservation copy)
sh.jpg = hi-res service image
sl.jpg = lo-res service image
st.jpg = thumbnail image
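Because the manifestation is encoded in the last one or two characters of the name plus the extension, a filename can be split back into its identifier and manifestation. The following is a minimal sketch; the function name parse_essence_name and the table MANIFESTATIONS are hypothetical, and .ram files are deliberately not handled.

```python
# (suffix, extension) pairs taken from the audio and image lists above.
MANIFESTATIONS = {
    ("m", "wav"): "master audio",
    ("s", "wav"): "WAVE service audio",
    ("sh", "ra"): "RealAudio hifi service audio",
    ("sl", "ra"): "RealAudio lofi service audio",
    ("m", "tif"): "master image",
    ("sh", "jpg"): "hi-res service image",
    ("sl", "jpg"): "lo-res service image",
    ("st", "jpg"): "thumbnail image",
}


def parse_essence_name(filename: str) -> tuple:
    """Split a filename into (identifier, manifestation description)."""
    stem, dot, ext = filename.rpartition(".")
    if not dot:
        raise ValueError(f"no extension in {filename!r}")
    # Try the two-letter suffixes before the one-letter ones.
    for length in (2, 1):
        key = (stem[-length:], ext)
        if key in MANIFESTATIONS and len(stem) > length:
            return stem[:-length], MANIFESTATIONS[key]
    raise ValueError(f"unrecognized essence filename: {filename!r}")


print(parse_essence_name("01b03s.wav"))
print(parse_essence_name("5555alst.jpg"))
```

Keying the lookup on both suffix and extension avoids ambiguity between, for example, the master suffix m on a .wav file and on a .tif file.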

Containers are defined as anything that houses recordings, such as jackets, sleeves, boxes, etc. If the annotating notes for an album are printed on a container (jacket, box cover, jewel case or J-card surface or pages), these are counted and numbered as container sequence images. The names will be prefixed with ct in the naming structure and given additional numerical designations in logical order. Logical in this context means an order that does a reasonable job of sequencing the images to represent the way a person approaching the original would encounter them. For example, a dust jacket might be called "ct01" and "ct02" for the front and back respectively; if the sleeve is scanned as well, then its front and back would be designated "ct03" and "ct04" respectively.
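The "one-up" numbering of container images can be sketched as follows, assuming the image-maker supplies the surfaces already sorted into logical order. The function name sequence_filenames and the table IMAGE_SUFFIXES are hypothetical; passing the prefix bk instead of ct would number booklet pages in the same way.

```python
# Manifestation suffix and extension for each image, as listed above.
IMAGE_SUFFIXES = [("m", "tif"), ("sh", "jpg"), ("sl", "jpg"), ("st", "jpg")]


def sequence_filenames(prefix: str, surfaces: list) -> list:
    """Pair each surface, in the order given, with its four image filenames."""
    if len(surfaces) > 99:
        raise ValueError("two-digit sequence numbers allow at most 99 surfaces")
    result = []
    for number, surface in enumerate(surfaces, start=1):
        base = f"{prefix}{number:02d}"
        names = [f"{base}{suffix}.{ext}" for suffix, ext in IMAGE_SUFFIXES]
        result.append((surface, names))
    return result


surfaces = ["jacket front", "jacket back", "sleeve front", "sleeve back"]
for surface, names in sequence_filenames("ct", surfaces):
    print(surface, names)
```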


(4/17/00)
Library of Congress
Comments: AV Prototype Coordinator ([email protected])
( August 31, 2010 )