Sustainability of Digital Formats: Planning for Library of Congress Collections


SAS Version 5 Transport File Format (XPORT)


Identification and description

Full name SAS Transport File Format (XPORT) Version 5
Description

The SAS Transport File Format is an openly documented specification maintained by SAS, a commercial company with a variety of software products for statistics and business analytics. Its products include the application now known as SAS/STAT, which originated in the late 1960s at North Carolina State University as SAS (an acronym for Statistical Analysis System). The transport format was developed in the late 1980s, when the corporate entity was known as SAS Institute, Inc. and the software as SAS, to support data transfers between statistical software systems, especially between SAS applications running on different operating systems. SAS considers it non-proprietary. This description covers the original version, now termed Version 5. The format is referred to in several ways, including XPORT and XPT; in this description, "SAS_xport_5" is used. Version 8 was introduced in October 2012; see Usage Note 46944: New SAS transport format and tools available. References on the Web to the SAS transport format that do not specify a version should probably be assumed to refer to Version 5.

See SAS_xport_family for a summary of the structure that is common to both versions of the format.

SAS_xport_5 is subject to certain restrictions. The list below is adapted from Appendix A1 in the Stata manual section entitled Import and export datasets in SAS XPORT format:

  • The dataset may contain at most 9,999 variables, a limit imposed by the 4 decimal digits available in the NAMESTR header.
  • The names of the variables and value labels may not be longer than eight characters and are case insensitive; for example, myvar, Myvar, MyVar, and MYVAR are equivalent.
  • Variable labels may not be longer than 40 characters.
  • The contents of a variable may be numeric or string:
    • Numeric variables may be integer or floating point. Floating point variables may not have absolute values smaller than 5.398e-79 or greater than 9.046e+74. The range and precision are controlled by the IBM Double Precision (8-byte) numeric format. For more on how numeric formats are stored, see Numeric Precision in SAS Software.
    • String variable values may not exceed 200 characters. String variables are padded with blank/space characters to the fixed length declared in the descriptor for the variable. Hence, when variables are read, it cannot be determined whether the original variable value had trailing blanks.
    • When data is missing, a missing data value is stored in the first byte of the data location for the variable. The variable value is padded with Hex 0x00 bytes to the declared length for the variable.
  • Value labels are not written in the XPORT dataset. Suppose that you have a variable sex in the data with values 0 and 1, and the values are labeled for gender (0=male, and 1=female). When the dataset is written in SAS XPORT Transport format, you can record that the variable label gender is associated with the sex variable, but you cannot record the association with the value labels male and female. Value-label definitions are typically stored in a second XPORT dataset or in a text file containing SAS commands.
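
The restrictions above can be sketched as a small pre-export check. This is a minimal illustration, not part of any SAS or Stata tool: the `Variable` class and the function names are hypothetical, lengths are counted in characters on the assumption of single-byte text, and zero is assumed to be an acceptable floating point value even though it lies below the stated minimum magnitude.

```python
from dataclasses import dataclass

# Limits quoted from the restriction list above.
MAX_VARIABLES = 9999
MAX_NAME_LEN = 8
MAX_LABEL_LEN = 40
MAX_STRING_LEN = 200
MIN_ABS_FLOAT = 5.398e-79
MAX_ABS_FLOAT = 9.046e74


@dataclass
class Variable:
    """Hypothetical description of one dataset variable."""
    name: str
    label: str = ""


def check_variables(variables):
    """Return a list of problems; an empty list means no limit is violated."""
    problems = []
    if len(variables) > MAX_VARIABLES:
        problems.append(f"too many variables: {len(variables)}")
    seen = {}
    for var in variables:
        if len(var.name) > MAX_NAME_LEN:
            problems.append(f"name too long: {var.name!r}")
        if len(var.label) > MAX_LABEL_LEN:
            problems.append(f"label too long for {var.name!r}")
        key = var.name.upper()  # names are case insensitive
        if key in seen:
            problems.append(f"duplicate name: {var.name!r} vs {seen[key]!r}")
        else:
            seen[key] = var.name
    return problems


def check_value(value):
    """Return a problem string for one value, or None if it is within limits."""
    if isinstance(value, str):
        if len(value) > MAX_STRING_LEN:
            return "string longer than 200 characters"
        return None
    # Assumption: exact zero is allowed; only nonzero magnitudes are range-checked.
    if isinstance(value, float) and value != 0.0:
        magnitude = abs(value)
        if magnitude < MIN_ABS_FLOAT or magnitude > MAX_ABS_FLOAT:
            return "floating point value outside the representable range"
    return None


def pad_string(value, declared_length):
    """Pad a string with spaces to its declared fixed length."""
    return value.ljust(declared_length)
```

Because `pad_string("ab", 5)` and `pad_string("ab ", 5)` produce the same stored value, a reader of the file cannot tell whether the original string had trailing blanks, which is the ambiguity noted in the list.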
Production phase See SAS_xport_family.
Relationship to other formats
    Subtype of SAS_xport_family, SAS Transport File Format Family (XPORT)
    Has later version SAS_xport_8, SAS Version 8 Transport File Format (XPORT)

Local use

LC experience or existing holdings See SAS_xport_family.
LC preference See SAS_xport_family.

Sustainability factors

Disclosure Publicly documented format developed by SAS Institute, Inc. SAS considers it non-proprietary but controls the specification.
    Documentation Version 5 is documented in SAS Technical Paper TS-140: Record Layout of a SAS Transport Data Set.
Adoption

Since at least 1999, the U.S. Food and Drug Administration has required SAS_xport_5 as the format for datasets submitted in electronic form with new drug and new device applications. See Guidance for Industry: Providing Regulatory Submissions in Electronic Format - General Considerations, 1999. For more current information, see Electronic Regulatory Submissions and Review | Helpful Links; FDA | Study Data Specifications, Version 2, July 18, 2012; and FDA Data Standards Catalog v4.5.2 (04-13-2017) - Supported and Required Standards from the FDA. The Centers for Disease Control and Prevention (CDC) also uses the SAS transport format for distributing public data. See, for example, 2014 BRFSS Survey Data and Documentation, and NHANES Tutorial: Download Data Files.

For other aspects of adoption of this format, including support for import by other statistical software, and acceptability for deposit by data archives, see SAS_xport_family.

    Licensing and patents See SAS_xport_family.
Transparency See SAS_xport_family.
Self-documentation See SAS_xport_family.
External dependencies See SAS_xport_family.
Technical protection considerations See SAS_xport_family.

Quality and functionality factors

Dataset
Normal functionality See SAS_xport_family.

File type signifiers and format identifiers

Tag Value Note
Filename extension xpt
    The specification does not mandate a particular extension, but .xpt is most commonly used, particularly for Version 5. Listed in Gary Kessler's File Signatures. The FDA mandates the use of .xpt as the file extension for submitted datasets. See FDA Study Data Specifications.
Magic numbers
    ASCII: HEADER RECORD*******LIBRARY HEADER RECORD!!!!!!!
    Hex: 48 45 41 44 45 52 20 52 45 43 4F 52 44 2A 2A 2A 2A 2A 2A 2A 4C 49 42 52 41 52 59 20 48 45 41 44 45 52 20 52 45 43 4F 52 44 21 21 21 21 21 21 21
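
As an illustration of how the magic number listed above might be used, the following sketch tests whether a byte string or a file begins with those 48 bytes. The function names are hypothetical, and the check only compares against the signature given in this description; it does not validate the rest of the file or examine how other versions of the format begin.

```python
# The 48-byte signature as listed in this description (ASCII form).
XPORT_SIGNATURE = b"HEADER RECORD*******LIBRARY HEADER RECORD!!!!!!!"


def has_listed_signature(data: bytes) -> bool:
    """Return True if the data begins with the signature listed above."""
    return data.startswith(XPORT_SIGNATURE)


def file_has_listed_signature(path) -> bool:
    """Read only the first bytes of a file and test them against the signature."""
    with open(path, "rb") as handle:
        return has_listed_signature(handle.read(len(XPORT_SIGNATURE)))
```

For example, `file_has_listed_signature("dataset.xpt")` (with a hypothetical file name) reads just the opening 48 bytes, so it stays cheap even for large datasets.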
 

Notes

General See SAS_xport_family.
History See SAS_xport_family.

Format specifications


Useful references

URLs


Last Updated: 12/27/2022