Sustainability of Digital Formats: Planning for Library of Congress Collections

CDF, Common Data Format (multidimensional datasets)

Identification and description

Full name Common Data Format
Description CDF is a conceptual data abstraction for storing, manipulating, and accessing multidimensional data sets. The basic component of CDF is a software programming interface that is a device-independent view of the CDF data model. In addition to the actual data being stored, CDF also stores user-supplied descriptions of the data, known as metadata. This self-describing property allows CDF to be a generic, data-independent format that can store data from a wide variety of disciplines. The application developer is insulated from the actual physical file format for reasons of conceptual simplicity, device independence, and future expandability. CDF files created on any given platform can be transported to any other platform to which CDF is ported and used with any CDF tools or layered applications.

CDF versions 2.7 and later support Java Application Programming Interfaces (APIs), in addition to the C and Fortran APIs of earlier versions.

Production phase Generally used for middle- and final-state archiving.
Relationship to other formats
    Has subtype Has several versions not documented separately here.

Local use

LC experience or existing holdings None.
LC preference The Library of Congress Recommended Format Statement (RFS) lists CDF as an acceptable format for Datasets. The RFS does not specify a version of CDF.

Sustainability factors

Disclosure Fully documented. Specifications of the format and the APIs in Java, C, and Fortran are freely available. Source code for the CDF software package is also freely available.
    Documentation

Available from https://cdf.gsfc.nasa.gov/. Documentation includes CDF User's Guide and complete list of APIs and their descriptions in reference manuals for the supported programming languages. Maintained by the Space Physics Data Facility (SPDF) at NASA/Goddard Space Flight Center.

Adoption

In use in various versions since 1985. From CDF FAQ: "The CDF software package is used by hundreds of government agencies, universities, and private and commercial organizations as well as independent researchers on both national and international levels. CDF has been adopted by the International Solar-Terrestrial Physics (ISTP) project as well as the Central Data Handling Facilities (CDHF) as their format of choice for storing and distributing key parameter data."

CDF is supported by commercial and open source data analysis/visualization software such as IDL, MATLAB, and IBM's Data Explorer (XP).

    Licensing and patents None.
Transparency TBD
Self-documentation

CDF control information acts as an embedded data dictionary. Additional metadata appropriate for any particular dataset can be stored as attribute entries as part of the application data within the CDF. Guidelines for the Space Physics community are found at https://spdf.gsfc.nasa.gov/sp_use_of_cdf.html.

Accessibility Features

Accessibility features for datasets and databases typically involve conformance to W3C's guidelines for page structure, tables, and forms. In practical terms, pages (if applicable to the dataset) should be well structured, with regions and headings identified and content marked up or tagged using appropriate and meaningful elements; tables should be organized as logical grids with labeled header cells and data cells whose relationships are defined; and forms (if applicable to the dataset) should validate user input, offer options to undo changes and confirm data entry, notify users of successful task completion and of any errors, and provide instructions for correcting mistakes. Each of these criteria should be supported by text accessible to a screen reader.

CDF has the potential to support accessibility features. CDF contains user-supplied metadata about the CDF and the variables in the CDF, so users can create metadata that supports the accessibility of the dataset as a whole and of individual variables. CDF can also be marked up in XML using CDF Markup Language (CDFML). As described in the format description for XML, XML-based formats have good support for accessibility features and can include features that promote accessibility, depending on implementation. Comments welcome.

External dependencies None.
Technical protection considerations None.

Quality and functionality factors

Dataset
Normal functionality Good support. Structured representation of typed data.
Support for software interfaces (APIs, etc.) The basic component of CDF is a software programming interface that is a device-independent view of the CDF data model. Hence the specification focuses on an API rather than on organization of data in files. APIs in Fortran and C are available for all versions, in Java for version 2.7 and up.
Data documentation (quality, provenance, etc.) Capabilities for embedding user documentation for the dataset as a whole or for particular elements through a data dictionary can support documentation of precision, provenance, etc.
Beyond normal functionality

CDF is designed to support multi-dimensional data. The CDF structure is based on variable definitions (name, data type, number of dimensions, sizes, etc.), where a collection of data elements is defined in terms of a variable. The structure of CDF allows one to define an unlimited number of variables that are completely independent of one another (loosely coupled) and disparate in nature, a group of variables that show a strong dependency on one another (tightly coupled), or both simultaneously.
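
The variable-based structure described above can be pictured with a small in-memory model. This is only an illustrative sketch in Python, not the CDF API or file layout: the class names `Variable` and `Dataset`, the `depends_on` field, and the sample variable and attribute names are all hypothetical, chosen to show independent and dependent variables side by side.

```python
from dataclasses import dataclass, field


@dataclass
class Variable:
    """One variable definition (hypothetical model, not the CDF library)."""
    name: str
    data_type: str                      # a label such as "CDF_REAL4"; not validated here
    dim_sizes: tuple = ()               # empty tuple means one scalar value per record
    attributes: dict = field(default_factory=dict)  # user-supplied metadata
    depends_on: tuple = ()              # names of variables this one is tightly coupled to

    @property
    def num_dims(self):
        return len(self.dim_sizes)


@dataclass
class Dataset:
    """A collection of variables plus dataset-level metadata."""
    global_attributes: dict = field(default_factory=dict)
    variables: dict = field(default_factory=dict)

    def add(self, var):
        self.variables[var.name] = var

    def coupled_to(self, name):
        """Return the names of variables that declare a dependency on `name`."""
        return sorted(v.name for v in self.variables.values() if name in v.depends_on)


ds = Dataset(global_attributes={"Project": "example"})
ds.add(Variable("Epoch", "CDF_EPOCH"))
ds.add(Variable("B_field", "CDF_REAL4", dim_sizes=(3,),
                attributes={"UNITS": "nT"}, depends_on=("Epoch",)))
ds.add(Variable("Station_notes", "CDF_CHAR"))  # independent of the other variables
```

Here `B_field` is tightly coupled to `Epoch`, while `Station_notes` stands alone, so both kinds of variable coexist in one dataset.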

Compared to the HDF format, CDF permitted cross-linking of data from different instruments and spacecraft in ISTP with a single development effort, according to https://web.archive.org/web/20160801173718/http://nssdc.gsfc.nasa.gov/nost/fep/researcher-szabo-cdf.html (link via Internet Archive).


File type signifiers and format identifiers

Tag | Value | Note
Filename extension | cdf | From https://fileinfo.com/.
Other | NF00144 | See https://www.archives.gov/files/lod/dpframework/id/NF00144.ttl.
Pronom PUID | fmt/1948 | See http://www.nationalarchives.gov.uk/PRONOM/fmt/1948 for version 2.0-2.5.
Pronom PUID | fmt/1949 | See http://www.nationalarchives.gov.uk/PRONOM/fmt/1949 for version 2.6-2.7.
Pronom PUID | fmt/1950 | See http://www.nationalarchives.gov.uk/PRONOM/fmt/1950 for version 3.x.
Wikidata Title ID | Q1116060 | See https://www.wikidata.org/wiki/Q1116060.
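
The version ranges attached to the three PRONOM identifiers above can be expressed as a simple lookup. The following Python sketch is illustrative only: the function name `puid_for_cdf_version` is hypothetical, and it assumes the caller already knows the CDF version as a dotted string (it does not inspect a file).

```python
def puid_for_cdf_version(version):
    """Return the PRONOM PUID listed above for a CDF version string.

    `version` is a dotted string such as "2.6" or "3.9.1".
    Returns None for versions outside the listed ranges.
    """
    parts = version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0

    if major == 3:
        return "fmt/1950"          # version 3.x
    if major == 2 and 6 <= minor <= 7:
        return "fmt/1949"          # versions 2.6-2.7
    if major == 2 and 0 <= minor <= 5:
        return "fmt/1948"          # versions 2.0-2.5
    return None                    # not covered by the identifiers listed
```

For example, `puid_for_cdf_version("2.7")` maps to the identifier listed for versions 2.6-2.7.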

Notes

General In 2002, the CDF office developed an XML-based markup language called CDF Markup Language (CDFML) to describe CDF data and metadata.

Translators among various data formats, including CDF, are available at https://cdf.gsfc.nasa.gov/html/dttools.html.

History

CDF was designed and developed in 1985 by the National Space Science Data Center (NSSDC) at NASA/GSFC. CDF was originally written in FORTRAN and only ran in VAX/VMS environments.

CDF V3.0 was released on February 10, 2005. V3.0 is backward compatible with CDF V2.7, V2.6, and V2.5, but not vice versa. Libraries for CDF 3.0.0 and later can read a file created with the CDF 2.5, 2.6, or 2.7 library, and they save it in the version under which it was originally created (not 3.0). A file created from scratch with CDF 3.0.0 or later is stored in the new format, which earlier versions of the CDF library cannot read. As of January 2023, the latest version of the CDF library is 3.9.1.
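
The compatibility rules in the paragraph above can be summarized as a small decision function. This Python sketch only restates those rules and does not call the CDF library; the names `can_read` and `saved_format` are hypothetical, and the treatment of pre-3.0 libraries reading pre-3.0 files is an assumption, since the text addresses only their inability to read the 3.0 format.

```python
LEGACY_READABLE = {(2, 5), (2, 6), (2, 7)}  # versions a 3.0+ library is stated to read


def _major_minor(version):
    """Parse "3.9.1" into (3, 9); a missing minor part counts as 0."""
    parts = version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor)


def can_read(library_version, file_version):
    """True if a library of `library_version` can read a file of `file_version`."""
    lib = _major_minor(library_version)
    src = _major_minor(file_version)
    if lib >= (3, 0):
        return src >= (3, 0) or src in LEGACY_READABLE
    # Assumption: the text only says that pre-3.0 libraries cannot read 3.0 files.
    return src < (3, 0)


def saved_format(library_version, file_version=None):
    """Format a 3.0+ library writes: "3.0" for new files, else the file's own version."""
    if _major_minor(library_version) < (3, 0):
        raise ValueError("only the behavior of 3.0+ libraries is described")
    if file_version is None:
        return "3.0"                      # created from scratch: new format
    if not can_read(library_version, file_version):
        raise ValueError("file version not readable under the stated rules")
    src = _major_minor(file_version)
    # Existing files keep the version they were created under (not upgraded to 3.0).
    return "3.0" if src >= (3, 0) else f"{src[0]}.{src[1]}"
```

Under these rules, a file created with the 2.6 library and rewritten by a 3.x library stays in the 2.6 format, which is why older tools can continue to read it.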

Last Updated: 04/09/2025