NAME: Machine Generation Flag in USMARC Authority Records
SOURCE: Cooperative Cataloging Council, Series Authority Record Task Group
SUMMARY: This paper discusses options for flagging USMARC authority records that have been created or modified by machine.
KEYWORDS: Authority format; Field 008/08 (Authority); Field 008/33 (Authority); Level of establishment; Machine-generation flag
12/01/95 - Forwarded to the USMARC Advisory Group for discussion at the January 1996 MARBI meetings.
1/21/96 - Results of USMARC Advisory Group discussion - Several vendors and networks that did machine generation of authority records indicated that they marked the records with local content designation. It was asked whether such marking was useful for internal systems but unnecessary in the communications environment, and whether the value would mean different things in different systems, since generation mechanisms may vary and produce authority records with various levels of completeness and standardization. It was indicated that the PCC may establish three levels of machine-generated records, but that it did not request that the format reflect these. The USMARC Advisory Group requested more information from the PCC concerning:
DISCUSSION PAPER NO. 91: Flagging Machine Generation

1. BACKGROUND

This paper discusses flagging USMARC authority records that have been created or modified by machine. It presents the rationale for indicating that an authority record was machine generated and suggests several options for providing such a flag. The options draw on both new and existing authority record data elements. The need for such a flag is an outgrowth of a national effort to increase the amount of authority control provided in national databases.

2. DISCUSSION

In the spring of 1994 the Cooperative Cataloging Council (CCC) established the Series Authority Record Task Group to define the content and functional uses of series authority records. The creation of this group followed a two-year period (1993 to 1994) during which the Library of Congress considered making changes to the amount of authority work done for monographic series titles. A September 1994 report from the Task Group made suggestions and recommendations to the CCC about changes to the series authority record that it said were needed to support the goals of the Program for Cooperative Cataloging (PCC). As a result of PCC Executive Council review of the report at the ALA Midwinter Conference in 1995, one of the recommendations from the Task Group was forwarded to MARBI for consideration.

The recommendation forwarded to MARBI was that a data element should be made available in the USMARC Format for Authority Data for indicating that an authority record (for a series title or any other heading type) was initially generated by machine. The Task Group suggested this because it believed that this information was important in the context of computers being used to generate some records in the National Authority File (NAF) so that all headings used in access points could be under authority control.
In its proposal the Task Group suggested that a new code in an existing fixed-length data element, 008/33 (Level of establishment), could be used.

Current State of Machine Generation

Many library systems already provide for the automatic generation of authority records for headings in authority-controlled fields in bibliographic records. In most systems with this functionality, authority records are created for any heading not already covered by an existing authority record. The content of machine-generated authority records varies, but some systems are able to create records that contain as much information as a human would supply when simple headings are involved. Examples of the kinds of data elements supplied by machine include the 1XX (Heading) field, 670 (Sources Found Note), and certain control information. It is even possible for systems to provide some cross references, although in most cases this is left for humans to provide. Unfortunately, the USMARC Authority format does not include any data element designed to indicate creation or manipulation of a MARC record by machine.

Machine generation of authority records offers a means for libraries to provide full authority control while reducing individual effort. Both time and cataloging resources can be saved. Even if an authority record is later updated by a human to add references and other information, creation of a brief record by machine from data already keyed in a bibliographic record avoids rekeying and the cost connected to it. When multiplied by thousands of headings, the savings can be significant. Machine generation of an authority record from a heading in a bibliographic record also guarantees a match between the two. System validation of headings in bibliographic files against an authority file is often part of the process. With the functionality of library systems expanding, machine generation and manipulation of authority records is already widely available.
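The generation process described above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular system: the record layout (plain dictionaries and (code, value) pairs), the function name, the tag mapping, and the normalization rule are all assumptions made for the example.

```python
# Illustrative sketch only: the record layout, function name, tag mapping,
# and normalization rule are assumptions, not part of the USMARC formats.

# Hypothetical mapping from bibliographic heading tags to authority 1XX tags.
BIB_TO_AUTH_TAG = {
    "100": "100", "700": "100",   # personal names
    "110": "110", "710": "110",   # corporate names
    "111": "111", "711": "111",   # meeting names
    "130": "130", "730": "130",   # uniform titles
}


def generate_brief_authority(bib_tag, subfields, source_title, existing_headings):
    """Return a brief authority record for a heading, or None if one exists.

    subfields is a list of (code, value) pairs copied from the bibliographic
    field; source_title is the title of the item being cataloged;
    existing_headings is a set of normalized headings already under control.
    """
    auth_tag = BIB_TO_AUTH_TAG.get(bib_tag)
    if auth_tag is None:
        raise ValueError(f"tag {bib_tag} is not handled by this sketch")

    # Assumed normalization: join values, drop trailing punctuation, lowercase.
    key = " ".join(value for _, value in subfields).rstrip(" .,;:/").lower()
    if key in existing_headings:
        return None  # heading is already covered by an authority record

    existing_headings.add(key)
    return {
        auth_tag: list(subfields),      # heading copied, so nothing is rekeyed
        "670": [("a", source_title)],   # source in which the heading was found
    }


seen = set()
name = [("a", "Smith, John,"), ("d", "1900-1980.")]
first = generate_brief_authority("700", name, "Collected essays", seen)
second = generate_brief_authority("100", name, "Another title", seen)
print(first)
print(second)  # None: the heading was already covered by the first call
```

Because the heading is copied directly from the bibliographic field, the generated 1XX matches the bibliographic heading by construction, which is the match guarantee noted above. Nothing in such a record shows that it was produced by machine, which is the gap the flag is meant to fill.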
Task Group Requirements

The flagging of machine-generated records could meet several requirements. The Task Group suggests that a machine generation flag is needed for analytical purposes. It would facilitate the assessment of the effects of machine generation on the overall character of authority files. If defined adequately, it could also help to improve software that generates authority records automatically. A data element to signal machine generation is essential in order to identify records that have not been reviewed and updated by a human. In an environment where authority data is shared, it would allow systems to prioritize authority records, giving, perhaps, greater value to records created by humans than to those created by machine.

The SAR Task Group is of the opinion that it is important to identify machine-generated authority records as a distinctive group. It believes that in the future machine-generated authority records will reach such a level of sophistication in production that they will coexist with human-generated records in resource files including the National Authority File.

3. POSSIBLE OPTIONS

a) Make use of an existing fixed-length USMARC authority data element by validating a new value. The CCC recommended using field 008/33 (Level of establishment) to indicate that the record was machine generated. This would have the advantage of making use of an existing data element that could be easily and reliably coded by machine. The disadvantage of using 008/33 is that the data element as currently defined relates to the heading in a 1XX field, not necessarily the entire record. Even though a record may be machine generated, the heading might be "fully established" (one of the other currently valid codes defined for 008/33). Using a new code for machine generation would make it impossible to also record one of the other aspects handled by 008/33.
Field 008/29 (Reference evaluation) might be a more appropriate data element for which to define a new code. It is assumed that in the case of machine-generated records, the need for, and evaluation of, references is the area where catalogers would be likely to have the most concern. In most cases, particularly if the heading field were generated from a bibliographic record, the 1XX field would be reliably authoritative.

b) Make use of an existing variable-length USMARC data element. Field 042 (Authentication Code) might be ideal for this purpose. Since the data in this field is not often validated, using it would require the least change to implementations of the USMARC authority format. A special code or codes could be used to identify the lack of human authentication for the record. Field 040 (Cataloging Source) could also be used, although since none of the currently defined subfields would be appropriate for a machine-generation flag, a new subfield would be needed.

c) Define one of the available (undefined) field 008 positions (e.g., position 08) for a machine generation flag. The advantage of this option is that it does not confuse or eliminate the coding possible in other fixed-length data elements or variable fields. As a separate data element, several values could be defined to allow the quality/complexity of the machine generation to be specified more accurately (e.g., machine generated 1XX only, or 1XX and 670, or 1XX, 670, and obvious 4XX references based on computer algorithms). If field 008/08 were undesirable for some reason, field 008 positions 18-27 and 30 are also currently undefined.

4. QUESTIONS

The suggestion of defining or identifying an existing USMARC Authorities data element to flag machine generation raises several questions.

1) What function would the flag actually serve? Would USMARC users be likely to really use the information about machine generation to some end?
Some people worry that users would be doing a lot of coding that nobody would make much use of.

2) Would the machine-generation flag be permanent? If not, changing the flag to some other value would further burden catalogers who must already update authority records for other purposes.

3) What assumptions are there behind machine-generated records? Would a flag such as the one suggested in this paper imply certain characteristics in the record, for example, certain fields present, others lacking?

4) Is machine generation a concern if quality is not affected? Some have suggested that as many as 50% of authority records could be machine generated with equal content and quality because references are not involved. If this is true, would such records be better off without the machine-generation flag?

5) What is the analysis design behind the CCC request for a flag for machine generation? What kinds of analyses are likely to depend on it?

6) Is there a need to identify what pieces of an authority record were generated by machine, i.e., at the field/subfield level? (NOTE: Some cataloging agencies use a locally defined subfield to indicate machine manipulation of access points.)

7) What are the implications of a machine-generation flag for existing authority files that contain machine-generated records? None of the options can deal with the perhaps large number of machine-generated records that already exist.

8) How would a machine generation flag relate to other record-level flags in the Authority format? (record completeness in Leader/17; how heading was constructed in field 008/10 and /11; reference evaluation in field 008/29; level of establishment in field 008/33).
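
As an illustration of question 8 and of option c) in section 3, the following minimal Python sketch reads the record-level flags listed above together with a flag in field 008/08. It assumes a 24-character leader and a 40-character authority field 008. The function name, the dictionary keys, and the 008/08 code values are hypothetical; this paper proposes the position but defines no codes for it.

```python
# Illustrative sketch only. The 008/08 code values below are hypothetical:
# the paper suggests levels of machine generation but assigns no codes.
HYPOTHETICAL_008_08 = {
    " ": "not machine generated, or not coded",
    "a": "machine generated: 1XX only",
    "b": "machine generated: 1XX and 670",
    "c": "machine generated: 1XX, 670, and algorithmic 4XX references",
}


def record_level_flags(leader, field_008):
    """Extract the record-level flags named in question 8, plus 008/08."""
    if len(leader) != 24:
        raise ValueError("expected a 24-character leader")
    if len(field_008) != 40:
        raise ValueError("expected a 40-character authority field 008")
    return {
        "record_completeness": leader[17],          # Leader/17
        "heading_construction": field_008[10:12],   # 008/10 and 008/11
        "reference_evaluation": field_008[29],      # 008/29
        "level_of_establishment": field_008[33],    # 008/33
        "machine_generation": HYPOTHETICAL_008_08.get(
            field_008[8], "unknown code"            # proposed 008/08
        ),
    }


# Build sample data position by position so the offsets are easy to check.
leader = [" "] * 24
leader[17] = "n"
fixed = [" "] * 40
fixed[8], fixed[10], fixed[11], fixed[29], fixed[33] = "b", "c", "a", "n", "a"
flags = record_level_flags("".join(leader), "".join(fixed))
print(flags["machine_generation"])
```

Keeping machine generation in its own position leaves 008/29 and 008/33 free to carry their existing values, which is the advantage claimed for option c). Options a) and b) would instead require a new code in one of the existing elements read above.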