Sustainability of Digital Formats: Planning for Library of Congress Collections

Java Virtual Machine Class File Format

Format Description Properties Explanation of format description terms

Identification and description

Full name Java Virtual Machine Class File Format
Description

The Java Virtual Machine Class (known as "class" in the lower case) file format is a proprietary binary format created and maintained by Oracle. Class files are a crucial component of the Java platform: each one represents a single Java class from a program’s source code as bytecode, the instructions executed by the Java Virtual Machine (JVM). Java source files typically have a java filename extension.

The Java programming language is human-readable and “write once, run anywhere” (WORA), meaning that Java code can be executed on any device that has a JVM. To be executable on the JVM, a java file must be compiled using “javac,” the Java Compiler tool in the Java Development Kit (JDK). This process transforms the Java source code into class files containing bytecode. For more on the JVM and javac, see: General Note.

Each class file corresponds to a Java class. A “class” in Java is a logical template used to create objects which share common properties and methods. Each class file contains a single Java class in bytecode. The class file bytecode is platform-independent and can be executed on any device with a compatible JVM. Each java file may contain multiple classes and would result in the creation of multiple associated class files after javac’s compilation.
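The one-class-per-file relationship can be observed directly. The following is a minimal sketch, assuming it runs on a JDK (not a bare JRE) so that the standard javax.tools compiler API is available. The class name MultiClassCompileDemo, the method name, and the sample source are hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

class MultiClassCompileDemo {

    // Writes the given source to a temporary directory, compiles it with the
    // system Java compiler, and returns the sorted names of the class files produced.
    // The temporary directory is left in place; a real tool would clean it up.
    static List<String> compileAndListClassFiles(String fileName, String source) {
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        if (javac == null) {
            throw new IllegalStateException("No system Java compiler found; a JDK is required");
        }
        try {
            Path dir = Files.createTempDirectory("classfile-demo");
            Path file = dir.resolve(fileName);
            Files.write(file, source.getBytes(StandardCharsets.UTF_8));

            // Passing null uses the default input, output, and error streams.
            int status = javac.run(null, null, null, "-d", dir.toString(), file.toString());
            if (status != 0) {
                throw new IllegalStateException("Compilation failed with status " + status);
            }
            try (Stream<Path> files = Files.list(dir)) {
                return files.map(p -> p.getFileName().toString())
                        .filter(name -> name.endsWith(".class"))
                        .sorted()
                        .collect(Collectors.toList());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // One java file that declares two classes.
        String source = "class Greeter { } class Helper { }";
        System.out.println(compileAndListClassFiles("Greeter.java", source));
    }
}
```

In this sketch a single java file declaring two classes yields two class files, one per class.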

Class files can be stored individually or bundled in a Java Archive (JAR) file with other package files. Some Java Integrated Development Environment (IDE) applications, such as Eclipse or NetBeans, can generate class files from JAR files. For more on JAR files, see: General Note.

Other programming languages, including Clojure, Groovy, Scala, Kotlin, JRuby, and Jython (formerly JPython), can be compiled to JVM bytecode and generate class files.

Structure

Class file structure is defined in Chapter 4 of the Java Virtual Machine specification:

“A class file consists of a stream of 8-bit bytes. All 16-bit, 32-bit, and 64-bit quantities are constructed by reading in two, four, and eight consecutive 8-bit bytes, respectively. Multibyte data items are always stored in big-endian order, where the high bytes come first. In the Java SE platform, this format is supported by interfaces java.io.DataInput and java.io.DataOutput and classes such as java.io.DataInputStream and java.io.DataOutputStream.”
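As a small illustration of the big-endian rule quoted above, the sketch below reads two-byte and four-byte quantities with java.io.DataInputStream, which the specification names as supporting this format. The class name BigEndianDemo and its helper methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class BigEndianDemo {

    // Reads an unsigned 32-bit quantity: four consecutive bytes, high byte first.
    static long readU4(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            // readInt() is signed, so mask to obtain the unsigned value.
            return in.readInt() & 0xFFFFFFFFL;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads an unsigned 16-bit quantity: two consecutive bytes, high byte first.
    static int readU2(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return in.readUnsignedShort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] magic = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE};
        System.out.println(Long.toHexString(readU4(magic)).toUpperCase()); // prints CAFEBABE
        System.out.println(readU2(new byte[] {0x00, 0x41}));               // prints 65
    }
}
```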

Versions

The class file version is subject to change with each release of the JDK. The release note for class file version 54 states that “the OpenJDK community has adopted a new time-based release model, in which major releases of the Java platform occur every 6 months. As a consequence, it is anticipated that class file changes will also occur more rapidly. To ensure predictability for the tooling that processes class file bytes, the class file version will be incremented every major release even if there are no other changes to the class file format. In effect, the class file version will be 44 + $FEATURE, where $FEATURE is the feature-release counter (previously referred to as the major number) of the Java SE Platform and the JDK version string.”

Class files are self-documenting: their structure records how they relate to their java source code. The structure contains 10 main components of data, described in Chapter 4.1 of the Java Virtual Machine Specification:
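The arithmetic in the release note can be sketched as follows. The class and method names are hypothetical, and the guard restricts the formula to class file version 54 and later, which is the range the quoted release note addresses.

```java
class ClassFileVersions {

    // First class file version covered by the quoted release note.
    static final int FIRST_TIME_BASED_MAJOR = 54;

    // Class file major version for a given feature release: 44 + $FEATURE.
    static int majorVersionFor(int featureRelease) {
        int major = 44 + featureRelease;
        if (major < FIRST_TIME_BASED_MAJOR) {
            throw new IllegalArgumentException("Formula is applied here only from version 54 onward");
        }
        return major;
    }

    // Inverse mapping: feature release for a given class file major version.
    static int featureReleaseFor(int majorVersion) {
        if (majorVersion < FIRST_TIME_BASED_MAJOR) {
            throw new IllegalArgumentException("Formula is applied here only from version 54 onward");
        }
        return majorVersion - 44;
    }

    public static void main(String[] args) {
        System.out.println(majorVersionFor(10));   // prints 54
        System.out.println(majorVersionFor(21));   // prints 65
        System.out.println(featureReleaseFor(54)); // prints 10
    }
}
```

Under this formula, a class file compiled for Java SE 21 would carry major version 65.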

  1. Magic Number: The file begins with a four-byte magic number (hexadecimal: CAFEBABE). This magic number identifies the file as a Java class file.
  2. Version Information: The next four bytes hold the minor and major version numbers (two bytes each) of the class file format. Together they determine which JVM releases can load the file.
  3. Constant Pool: Two bytes give the constant pool count, followed by the table itself. The table contains the class file’s constants, such as literals, symbolic references to classes and interfaces, field and method references, and other constant values. The constant pool is indexed from 1.
  4. Access Flags: Two bytes hold a mask of flags that denote access permissions to and properties of the class or interface (for example public, final, abstract, or interface).
  5. This class: Two bytes index into the constant pool and represent the class or interface name.
  6. Super class: Two bytes index into the constant pool and represent the name of the direct superclass of the class.
  7. Interfaces: Two bytes indicate the number of interfaces implemented by the class, followed by a two-byte index into the constant pool for each interface.
  8. Fields: Two bytes indicate the number of fields in the class, followed by detailed descriptions of each field including access flags, name, descriptor, and attributes.
  9. Methods: Two bytes indicate the number of methods in the class, followed by a detailed description of each method including access flags, name, descriptor, and attributes.
  10. Attributes: Two bytes indicate the number of attributes of the class itself, followed by the attributes. Fields and methods carry their own attribute tables. Attributes provide additional information, such as code for methods or line numbers for debugging.
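The access flags item is a bit mask rather than a single value. The sketch below decodes a few of the class-level flags; the constant values are taken from the class access flag table in Chapter 4.1 of the specification, while the class name ClassAccessFlags and the describe method are hypothetical. Only a subset of the defined flags is shown.

```java
import java.util.ArrayList;
import java.util.List;

class ClassAccessFlags {

    // A subset of the class-level flags defined in the JVM specification.
    static final int ACC_PUBLIC    = 0x0001;
    static final int ACC_FINAL     = 0x0010;
    static final int ACC_SUPER     = 0x0020;
    static final int ACC_INTERFACE = 0x0200;
    static final int ACC_ABSTRACT  = 0x0400;

    // Returns the names of the flags set in the two-byte access_flags value.
    static List<String> describe(int accessFlags) {
        List<String> names = new ArrayList<>();
        if ((accessFlags & ACC_PUBLIC) != 0)    names.add("public");
        if ((accessFlags & ACC_FINAL) != 0)     names.add("final");
        if ((accessFlags & ACC_SUPER) != 0)     names.add("super");
        if ((accessFlags & ACC_INTERFACE) != 0) names.add("interface");
        if ((accessFlags & ACC_ABSTRACT) != 0)  names.add("abstract");
        return names;
    }

    public static void main(String[] args) {
        System.out.println(describe(0x0021)); // prints [public, super]
        System.out.println(describe(0x0601)); // prints [public, interface, abstract]
    }
}
```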

The structure of a class file appears thus:

ClassFile {
    u4             magic;
    u2             minor_version;
    u2             major_version;
    u2             constant_pool_count;
    cp_info        constant_pool[constant_pool_count-1];
    u2             access_flags;
    u2             this_class;
    u2             super_class;
    u2             interfaces_count;
    u2             interfaces[interfaces_count];
    u2             fields_count;
    field_info     fields[fields_count];
    u2             methods_count;
    method_info    methods[methods_count];
    u2             attributes_count;
    attribute_info attributes[attributes_count];
}
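The first four items of the ClassFile structure can be read with a few calls. The following is a minimal sketch, not a full parser: it stops after constant_pool_count and does not decode the constant pool or anything after it. The class name ClassFileHeader and its fields are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;

class ClassFileHeader {
    final int minorVersion;
    final int majorVersion;
    final int constantPoolCount;

    private ClassFileHeader(int minorVersion, int majorVersion, int constantPoolCount) {
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
        this.constantPoolCount = constantPoolCount;
    }

    // Parses magic, minor_version, major_version, and constant_pool_count.
    static ClassFileHeader parse(byte[] classBytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            int magic = in.readInt();                 // u4 magic
            if (magic != 0xCAFEBABE) {
                throw new IllegalArgumentException("Not a class file: bad magic number");
            }
            int minor = in.readUnsignedShort();       // u2 minor_version
            int major = in.readUnsignedShort();       // u2 major_version
            int poolCount = in.readUnsignedShort();   // u2 constant_pool_count
            return new ClassFileHeader(minor, major, poolCount);
        } catch (IOException e) {
            // Includes EOFException for input shorter than the header.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ClassFileHeader <path-to-class-file>");
            return;
        }
        ClassFileHeader h = parse(Files.readAllBytes(Paths.get(args[0])));
        System.out.println("version " + h.majorVersion + "." + h.minorVersion
                + ", constant_pool_count " + h.constantPoolCount);
    }
}
```

Because the constant pool is indexed from 1, a constant_pool_count of n means the table holds entries 1 through n-1.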

Production phase This is an intermediate format. A class file is derived from Java source code, written as a file, and then executed by the Java Virtual Machine (JVM). It is typically produced while working with Java, on the way to creating a distributable package (for example, a JAR file). It is not well suited as an archival format, nor particularly important to keep, because it sits between the source (java or another language) and the packaged target (JAR or other).
Relationship to other formats
    Affinity to JAR. Not described separately at this time. Java Archive (JAR) is often the target file format for class files. Oracle Java documentation describes JAR as “a file format based on the popular ZIP file format and is used for aggregating many files into one.”
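Because JAR is based on ZIP, the class files inside a JAR can be enumerated with the standard java.util.zip classes. This is a minimal sketch under that assumption; the class name JarClassLister and its method are hypothetical, and it lists entry names only without reading the class bytes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

class JarClassLister {

    // Returns the names of all entries ending in ".class", in archive order.
    static List<String> listClassEntries(InputStream jarStream) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(jarStream)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.isDirectory() && entry.getName().endsWith(".class")) {
                    names.add(entry.getName());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java JarClassLister <path-to-jar>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            listClassEntries(in).forEach(System.out::println);
        }
    }
}
```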

Local use

LC experience or existing holdings The Library of Congress has a small number of class files in its varied collections.
LC preference The Library of Congress has not yet expressed any format preference for system files. See the Recommended Formats Statement for format preferences for software.

Sustainability factors

Disclosure Fully documented within Oracle's Java Virtual Machine specification.
    Documentation Defined as part of Oracle's Java Virtual Machine Specification. Documentation: "Chapter 4: The class File Format".
Adoption Used in Java applications and any programming language that compiles to the Java Virtual Machine.
    Licensing and patents The class files generated by a Java compiler and executed on the Java Virtual Machine (JVM) may be subject to Java Community Process (JCP) licenses and Oracle Binary Code License Agreement (BCLA). The BCLA is a licensing agreement that users must adhere to when using the JDK, which includes the Java Runtime Environment (JRE) and associated tools.
Transparency This is a binary file that requires the Java Virtual Machine to use.
Self-documentation

Class files contain supporting metadata in headers to easily identify and organize the data within. See Description or General for more information.

External dependencies

The JVM is required to execute class files.

Class files are created from source code by a compiler such as javac.

Some Java Integrated Development Environment (IDE) applications, such as Eclipse or NetBeans, can generate class files from JAR files.

Technical protection considerations None.

Quality and functionality factors

Other
Bundling/compression No inherent compression.
Support for error detection Class files contain an Exceptions attribute. According to Chapter 4.7.5 of the Java Virtual Machine specification, the Exceptions attribute indicates which checked exceptions a method may throw. It is part of the attributes section of the method_info structure in the class file format. It documents the exceptions that a method may throw, allowing the compiler to enforce proper exception handling. These requirements are enforced only at compile time, not by the JVM.

File type signifiers and format identifiers

Tag Value Note
Filename extension class

Chapter 4 of the Java Virtual Machine specification.

See also: PRONOM: https://www.nationalarchives.gov.uk/PRONOM/x-fmt/415

See also: Wikidata: https://www.wikidata.org/wiki/Q2193155

Internet Media Type application/java-vm
httpd MIME Types: https://svn.apache.org/repos/asf/httpd/httpd/tags/2.4.9/docs/conf/mime.types
Internet Media Type application/x-httpd-java
Cups.org “Using CGI Programs”: https://www.cups.org/doc/cgi.html
Magic numbers CAFEBABE

Chapter 4 of the Java Virtual Machine specification.

The history of why this was chosen as a magic number is detailed in this correspondence between James Gosling and Bill Bumgarner. James Gosling is the founder and lead developer of the Java programming language.

Pronom PUID x-fmt/415
PRONOM: https://www.nationalarchives.gov.uk/PRONOM/x-fmt/415
Wikidata Title ID Q2193155
Wikidata: https://www.wikidata.org/wiki/Q2193155
Uniform Type Identifier (Mac OS) com.sun.java-class
See: https://developer.apple.com/library/archive/documentation/Miscellaneous/Reference/UTIRef/Articles/System-DeclaredUniformTypeIdentifiers.html
Other NF00576
NARA File Format Preservation Plan ID. See: https://www.archives.gov/files/lod/dpframework/id/NF00576.ttl

Notes

General

Java Virtual Machine (JVM) and the Java Compiler (javac)

According to the 2023 Stack Overflow Developer Survey, Java is the seventh most commonly used programming language.

The JVM is a virtual environment that allows computers to execute not only Java programs but also programs written in other languages that have been compiled into bytecode. Java programs are compiled into bytecode using the Java Compiler (javac).

Javac is provided within the Java Development Kit (JDK). It is responsible for translating human-readable Java source code into bytecode that can be executed on the JVM. Programs written in other JVM languages are translated into bytecode by those languages' own compilers.

Each version of Java has its own specification for the JVM. These specifications are available on the official Oracle documentation page for Java. As of January 2024, the most recent version is Java SE 21, released in September 2023.

History

The history of class files is closely tied to the development and evolution of the Java programming language. The concepts of class files and bytecode have been part of Java since 1995. The specification discusses Java’s history.

1991-1995:

Java, initially called Oak, was developed by James Gosling and his team at Sun Microsystems in the early 1990s with the aim of creating software for electronic devices. In 1995, Java was officially announced, and its main focus shifted to being a platform-independent programming language for the emerging internet.

1995:

In 1995, Java introduced the JVM, the concept of bytecode, and class files.

1995-2000s:

Java gained rapid popularity in the 1990s, particularly in web development, due to its portability and security features.

2000:

The Java Community Process (JCP) is established to guide the evolution of the Java platform. The JCP is the mechanism by which standards for the Java programming language and the Java Platform are developed. It is an open and inclusive process that allows developers, organizations, and other stakeholders to participate in shaping the evolution of the Java ecosystem.

Class files are first documented in the JVM specification as part of JDK version 1.0.


Format specifications


Useful references

URLs


Last Updated: 03/12/2024