Sustainability of Digital Formats: Planning for Library of Congress Collections


Parity Volume Set File Format Family


Identification and description

Full name Parity Volume Set File Format Family
Description

The Parity Volume Set file format, as it is called in the file format specifications, is also known as Parity Archive Volume Set, Parchive, or simply PAR. According to the file format specifications, it is an open-source recovery file format designed to accompany a set of files intended for transfer or storage, verify the data integrity of those files, and, if needed, repair corrupted files or reconstruct missing files. PAR was originally developed in 2001 by a group known as the Parchive Project to solve a common problem among those who shared files on Usenet newsgroups, where uploaded files could become corrupted, or some files could be missing after download. See History for more information.

Although PAR was specifically designed with Usenet in mind, its usage isn't limited to Usenet. According to the Parchive website and specifications, PAR files can be created to back up a set of files and then used to verify the integrity of the set and restore corrupted or missing data. PAR files provide parity (i.e., redundant data) for a group of input files, called the "recovery set." Parity is calculated using Reed-Solomon codes. To generate and make use of PAR files, a program that can read and write PAR files (called a "PAR client") is needed.

Two versions of PAR have been released: v1.0 and v2.0. PAR v2.0 is still in use as of this writing in 2025. Development of PAR v3.0 has been attempted twice, but no v3.0 specification has been officially released. See Adoption and History for more information.

According to the file format specifications, a parity volume set consists of two parts. The first part is an index file. The index file contains the filenames of the files in the recovery set, their MD5 checksums, and their size. In PAR v2.0, the index file contains all main packets, file description packets, and input file checksum packets, as described in the file specification. A PAR client verifies that all files in a set are present and uncorrupted by calculating MD5 checksums for the files (or for slices of files in PAR v2.0) and comparing them to the MD5 checksums in the index file. If all files are present and valid, it isn’t necessary to utilize the parity volume files.
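The verification step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `expected` dictionary is a hypothetical stand-in for the filenames, sizes, and MD5 checksums that a real PAR client would parse from the index file, and the function names are invented for this example. It does not read actual PAR files.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_recovery_set(directory: Path, expected: dict) -> dict:
    """Classify each listed file as 'ok', 'missing', or 'damaged'.

    `expected` maps filename -> (size in bytes, hex MD5). It is a hypothetical
    stand-in for the information stored in a parity index file.
    """
    status = {}
    for name, (size, checksum) in expected.items():
        path = directory / name
        if not path.is_file():
            status[name] = "missing"
        elif path.stat().st_size != size or md5_of_file(path) != checksum:
            status[name] = "damaged"
        else:
            status[name] = "ok"
    return status
```

If every file is reported as "ok", the parity volume files are not needed, which matches the behavior described in the paragraph above.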

Parity volume files are the second part of the parity volume set. The parity volumes contain the redundant data that can be used to repair and reconstruct input files, as well as the same information contained in the index file. The redundant data are called "recovery slices" in PAR v2.0. Because the parity volume files contain the same information contained in the index file, it is not necessary to have the index file to recover data.

It is important to note that parity volume files on their own, without any of the input files in the recovery set, cannot recover the entire recovery set. A PAR client uses the available parity data and the uncorrupted parts of the recovery set to repair/reconstruct corrupted or missing parts of the recovery set. From the PAR v2.0 specification: "Reed-Solomon codes can do this recovery as long as the number of missing data blocks does not [outnumber] the recovery blocks."
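The dependence on surviving input data can be illustrated with a much simpler scheme than the one PAR uses. The sketch below uses plain XOR parity, not Reed-Solomon coding, and is an assumption-laden teaching example rather than a description of any PAR client: a single parity block can rebuild one missing data block, but only when all the other data blocks are still available. The function names are hypothetical.

```python
from typing import List, Optional


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equal-length blocks together byte by byte."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, value in enumerate(block):
            result[i] ^= value
    return bytes(result)


def recover_missing(blocks: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Rebuild at most one missing block (marked None) from survivors plus parity."""
    missing = [i for i, block in enumerate(blocks) if block is None]
    if len(missing) > 1:
        # Simplified analogue of the rule quoted above: missing data blocks
        # must not outnumber the available recovery blocks (here, exactly one).
        raise ValueError("one parity block can only rebuild one missing data block")
    repaired = list(blocks)
    if missing:
        survivors = [block for block in blocks if block is not None]
        repaired[missing[0]] = xor_blocks(survivors + [parity])
    return repaired
```

Reed-Solomon codes generalize this idea so that several recovery blocks can repair several missing data blocks, which simple XOR parity cannot do.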

The v2.0 specification addressed several limitations that existed with v1.0. Peter B. Clements, one of the authors of the v2.0 specification and the developer of the PAR client QuickPar, explained the differences between v1.0 and v2.0 on the QuickPar website: In v1.0, damaged files had to be completely reconstructed; all parity volume files were of equal size and contained enough recovery data to reconstruct the largest source file; damaged parity volume files could not be used to reconstruct input files and were, therefore, useless; creating PAR files for a small number of input files was inefficient and required the input files to be split into many pieces; and finally, v1.0 could not process more than 255 files.

In v2.0, damaged files can be repaired instead of having to be completely reconstructed; there is no relationship between the size of the input files and the size of the parity volume files; the undamaged parts of damaged parity volume files can be used to repair input files; PAR files can be generated for a small number of input files, or even a single input file, without needing to split the input files; and v2.0 can handle up to 32,768 files. All of these improvements are possible because PAR v2.0 virtually splits the input files into smaller slices, or blocks, of data, and then processes the slices in the same way that PAR v1.0 would process whole files. One disadvantage of v2.0 compared with v1.0, pointed out on the par2cmdline GitHub page, is that PAR clients take longer to create PAR v2.0 recovery files than PAR v1.0 recovery files.
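A rough sketch of why slicing allows repair rather than full reconstruction: if checksums are kept per slice, a client can tell which slices of a file are damaged and needs recovery data only for those. The Python below is a simplified illustration and not the PAR v2.0 algorithm; the slice size, the function names, and the use of a bare list of MD5 digests are assumptions made for this example, and it ignores the packet layout defined in the specification.

```python
import hashlib
from typing import List


def slice_checksums(data: bytes, slice_size: int) -> List[str]:
    """Split data into fixed-size slices and return the hex MD5 of each slice."""
    if slice_size <= 0:
        raise ValueError("slice_size must be positive")
    return [
        hashlib.md5(data[offset:offset + slice_size]).hexdigest()
        for offset in range(0, len(data), slice_size)
    ]


def damaged_slices(data: bytes, expected: List[str], slice_size: int) -> List[int]:
    """Return the indices of slices that are absent or whose checksum differs."""
    actual = slice_checksums(data, slice_size)
    return [
        index
        for index, checksum in enumerate(expected)
        if index >= len(actual) or actual[index] != checksum
    ]
```

In this sketch, a single flipped byte marks only one slice as damaged, so only one slice's worth of recovery data would be needed instead of the whole file.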

Production phase May be used at any lifecycle phase to be distributed or stored with the recovery set.
Relationship to other formats
    Subtype of Parity Volume Set File Format, Version 1.0, not described separately at this time.
    Subtype of Parity Volume Set File Format, Version 2.0, not described separately at this time.
    Subtype of Parity Volume Set File Format, Version 3.0, not described separately at this time. Though draft specifications have been written for PAR v3.0 on two occasions, most recently in 2022, the Parchive Project has never released PAR v3.0.
    May contain RAR_Family, RAR Archive File Format Family. According to multiple Usenet/PAR guides, such as those from Binaries4all, Newsgroup Reviews, and Harley Hahn’s Usenet Center, among others, a common workflow for file sharing on Usenet was to split/compress file(s) using a RAR client and then to use a PAR client to create PAR files for the RAR file(s). It is uncertain how widely this workflow continues to be used as of this writing in 2025. Comments welcome.

Local use

LC experience or existing holdings The Library of Congress has a small number of PAR files in its collections.
LC preference See the Library of Congress Recommended Formats Statement for format preferences for aggregation and/or transfer of datasets, software and video games, web archives, and email.

Sustainability factors

Disclosure Open standard, fully documented. As stated by the authors in v1.0 of the file format specification: "…the file format [specification] and [its] details can be used free by everyone for any type of software."
    Documentation The full file format specification is publicly available on the Parchive: Parity Archive Volume Set website, including v1.0, v2.0, and an alpha draft of v3.0.
Adoption

A variety of clients for PAR v1.0 and PAR v2.0 have been developed for Windows, Mac OS X, and Linux, including Parchive command line tools, MacPAR, and MultiPar, among others. Additionally, some Usenet newsreaders have built-in PAR capabilities and will automatically verify and attempt to repair corrupted files after download; for a list, see Wikipedia’s Comparison of Usenet newsreaders.

PAR v2.0 continues to be used within the Usenet community. According to the Parchive website: "Has been around for 20 years and there have been millions of 'par2' clients downloaded. 'par2' is the de facto standard for safely sending large files on Usenet (a.k.a., 'network news') and is built into many newsreaders. It is also used for backups, where people fear a CD/DVD/tape will get damaged or, when using multiple media, a few will fail entirely."

It is not clear how widely PAR is used outside of the Usenet community. Comments welcome.

    Licensing and patents

Licensing information has varied between versions of the file format specification, though all versions have been open.

In v1.0 of the specification, the authors stated that the specification can be used freely. Their only stipulation was that anyone wanting to make changes to the format must first discuss the proposed changes with the original specification authors in order to maintain compatibility.

A GNU Free Documentation License was applied to v2.0 of the specification. The GNU Free Documentation License allows users to copy, redistribute, and modify the specification, provided all copies and derivatives are also made available under the GNU Free Documentation License.

Transparency

The ability to use PAR files to repair and/or reconstruct original data files depends upon tools to read and make use of the PAR files. Because the format is fully open, several open-source PAR clients have been built for different operating systems and are freely available. This functionality is also built into many Usenet newsgroup readers.

According to v2.0 of the specification: "All strings in the 'core' spec are ASCII. This was chosen because Unicode is not sufficiently supported by tools. There exist optional portions of the spec that do support Unicode strings."

The Reed-Solomon algorithm that is used to produce the redundant data in the parity volumes is documented within the file format specifications. It is based on the article A Tutorial on Reed-Solomon Coding for Fault-Tolerance in RAID-like Systems by James S. Plank.

Comments welcome.

Self-documentation

Every PAR file contains information about the PAR client that created it. In v1.0, it contains the client name and version. In v2.0, the creator packet should contain the client name and a way to contact the client developer. This information is shown to the user if a PAR client is unable to process a recovery set, with the goal that any incompatibilities between PAR clients can be identified and resolved.

Parity index files contain information about the files that are part of the recovery set, including their filenames, MD5 checksums, and size. This information is repeated in each of the parity volume files.

Comments welcome.

Accessibility Features

No specific features in the file format. Features to support accessibility would be found in the bundled and compressed files (such as embedded captions and subtitles in audiovisual content, tagged and structured text in textual documents, and alt text for images). Aggregate files can also contain separate files for transcripts, timed text, or captions as part of the bundled package. See Relationship to other formats for details.

External dependencies

None, beyond the availability of software to create parity volume sets and to use parity volume sets to repair/reconstruct files from the recovery set. Clients exist for Windows, OS X, and Linux. Many Usenet newsreaders have the capability to verify and repair/reconstruct files with parity volume sets. Comments welcome.

Technical protection considerations

No specific features in the file format. Any features that protect intellectual (or other) property, such as encryption, would be inherent in the input files of the recovery set. Comments welcome.


Quality and functionality factors

Aggregate
Compression

PAR files are not natively compressed. If it is desirable for files in the recovery set to be compressed before distribution and/or storage, a variety of aggregate formats can be used, including RAR, GZIP, bzip2, and others. PAR files could then be created for the compressed files. Comments welcome.

Support for Error Detection

PAR files use MD5 hashes to detect whether any files in the recovery set have been corrupted, and Reed-Solomon codes to produce the redundant data in the parity volume files and to repair corrupted data. Comments welcome.

Beyond normal functionality

None. Comments welcome.


File type signifiers and format identifiers

Tag Value Note
Filename extension par
Extension for the index file for PAR v1.0.
Filename extension pxx
Extension for the parity volume files for PAR v1.0 (i.e., .p01, .p02, .p03, and so on, depending on the number of parity volume files).
Filename extension par2
Extension for index and parity volume files for PAR v2.0. If the file is a parity volume file, then ".par2" should be preceded by ".volXX-YY," where XX to YY is the range of exponents for the recovery slices.
Filename extension pa3

Extension for PAR v3.0, proposed in 2010, which was tested but never officially released. Comments welcome.

Filename extension par3

Extension for PAR v3.0, proposed in 2022. Still in development. Comments welcome.

Internet Media Type application/x-par
Not listed in IANA. From TriD via Wikidata. Used in the gpar2 client.
Internet Media Type application/x-par2
Not listed in IANA. From TriD via Wikidata. Used in the gpar2 client.
Internet Media Type application/x-par3
Not listed in IANA. From TriD via Wikidata.
Magic numbers Hex: 50 41 52 00 00 00 00 00
ASCII: PAR
The PAR v1.0 specification states: "Offset: 0x0000, Type (Size): 8 Bytes, Data: The string 'PAR' followed by 5 null bytes." The File Formats Wiki references the equivalent to the ASCII characters written as hex values.
Magic numbers Hex: 50 41 52 32 00 50 4B 54
ASCII: PAR2\0PKT
The PAR v2.0 specification states: "Length (bytes): 8, Type: byte[8], Description: Magic sequence. Used to quickly identify location of packets. Value = {'P', 'A', 'R', '2', '\0', 'P', 'K', 'T'} (ASCII)." The File Formats Wiki references the equivalent to the ASCII characters written as hex values.
Magic numbers Hex: 50 41 33 00
ASCII: PA3
The specification proposal for PAR v3.0 (2010) states: "Length (bytes): 4, Type: byte[4], Description: Magic sequence. Used to quickly identify location of packets. Value = {'P', 'A', '3', '\0'} (ASCII)." The File Formats Wiki references the equivalent to the ASCII characters written as hex values.
Magic numbers Hex: 50 41 52 33 00 50 4B 54
ASCII: PAR3\0PKT
The alpha draft specification for PAR v3.0 (2022) states: "Length (bytes): 8, Type: byte[8], Description: Magic sequence. Used to quickly identify location of packets. Value = {'P', 'A', 'R', '3', '\0', 'P', 'K', 'T'} (ASCII)." The hex value was inferred from the ASCII.
Other See note.  NARA File Format Preservation Plan ID has no corresponding entry as of January 2025.
Pronom PUID See note.  PRONOM has no corresponding entry as of January 2025.
Wikidata Title ID Q497118
See https://www.wikidata.org/wiki/Q497118 (Parchive). Note that this is the general Wikidata entry for PAR.
Wikidata Title ID Q28791524
See https://www.wikidata.org/wiki/Q28791524 (Parchive, version 1). Note that the Wikidata entry does not cite any sources. Refers to the same format and version as Q105866299.
Wikidata Title ID Q105866299
See https://www.wikidata.org/wiki/Q105866299 (Parity Archive Volume Set (Par1)). Refers to the same format and version as Q28791524.
Wikidata Title ID Q28791553
See https://www.wikidata.org/wiki/Q28791553 (Parchive, version 2).
Wikidata Title ID Q105865405
See https://www.wikidata.org/wiki/Q105865405 (Parity Archive Volume Set (Par3)). Note that the Wikidata entry mostly refers to the 2022 proposal for PAR v3.0, except for the file extension, .pa3, which was only used in the 2010 PAR v3.0 proposal and never released.
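
The magic numbers listed in the table above can be used for a simple identification check. The Python sketch below is a minimal illustration, assuming that the bytes passed in are the first bytes of a file and that the file begins with one of the listed sequences; the function name and labels are invented for this example and do not come from any PAR client. Longer signatures are checked before shorter ones.

```python
from typing import Optional

# Byte sequences copied from the magic number rows above; labels are descriptive only.
MAGIC_SIGNATURES = [
    (b"PAR2\x00PKT", "PAR v2.0"),
    (b"PAR3\x00PKT", "PAR v3.0 (2022 alpha draft)"),
    (b"PAR\x00\x00\x00\x00\x00", "PAR v1.0"),
    (b"PA3\x00", "PAR v3.0 (2010 proposal)"),
]


def identify_par(header: bytes) -> Optional[str]:
    """Return a label for the first matching magic sequence, or None."""
    for magic, label in MAGIC_SIGNATURES:
        if header.startswith(magic):
            return label
    return None
```

A caller would typically read the first eight bytes of a file and pass them to `identify_par`. A result of `None` means only that none of these four sequences was found at the start of the data.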

Notes

General For a helpful explanation on the many file formats that use .par, or variations of .par, as a file extension, see Tyler Thorsted's blog post on the topic. His post discusses Apache Parquet (.parquet), Solid Edge 3D models (.par), DVD Studio Pro parse files (.par), and Reflexw data-format (.par), in addition to the PAR files described in this FDD.
History

The idea for the PAR file format was born out of a need to be able to reliably post large files to Usenet newsgroups, where users often shared an archive in the form of several RAR files. RAR was used in order to split and compress files that were too large to be shared in their original form. According to Slyck's Guide To The Usenet Newsgroups: "During an archive's propagation throughout Usenet, it's not uncommon for an [archive] to suffer damage to a few files (or parts). Before there were PAR files, if an archive was damaged or a part was missing, the only recourse for the user was to request a repost. This approach had its problems. It could take a while for a repost to appear, or perhaps never happen. This was another contributing reason why the newsgroups remained an elusive method of information for a majority of users." Thus, there was a need to be able to verify the integrity of shared files and a way to repair damaged files and reconstruct missing files.

The original goal of the Parchive Project was to provide both a file format specification and a client to generate and make use of PAR files. On the first iteration of the developers' SourceForge website in 2001, Tobias Rieper was credited with coming up with the idea. The initial file specification was drafted by Stefan Wehlus (a.k.a. Beaker) with input from Roger Harrison (a.k.a. Kilroy Balore).

The Parity Volume Set Specification was first published on July 12, 2001, and was further refined as PAR clients began to be developed. Suggestions for improvements to the specification were shared on the Parchive Project’s File Format Specification discussion board and incorporated into the spec, such as Karl Vogel’s suggestion to calculate parity with Reed-Solomon coding instead of normal parity recovery, with additional contributions from Willem Monsuwe and Ryan Gallagher, the project's webmaster. The specification stabilized with revision 1.0 on October 14, 2001, with further suggestions considered for v2.0.

In parallel with refining the file specification, developers also worked on releasing PAR clients. Wehlus developed the Mirror client as a Win32 test implementation of the file format specification. Command-line clients were also developed for MS-DOS, Unix, and Mac OS X and hosted on the Parchive Project’s website. By the end of the year, additional clients were announced.

In 2002, work on v2.0 of the file format specification began immediately following the release and widespread adoption of 1.0 on Usenet, evidenced by conversations on the project discussion board. A first draft of Parity Volume Set Specification 2.0 was published on July 24 by Michael Nahas. Several revisions followed with additional contributions from Peter B. Clements, Paul Nettle (developer of the PAR client FSRaid), and Gallagher. A draft specification was announced in September for client developers to test. Several months later, on May 11, 2003, the final specification was released, along with two PAR2 clients developed by Clements, QuickPar and par2cmdline. Usenet users seemingly embraced v2.0, with reports on the developer mailing list of sightings of PAR2 files across various newsgroups and the Parchive Project reaching the top of SourceForge’s projects of the month, making it to their front page by the end of May.

In 2010, Yutaka Sawada, developer of the MultiPar client, and others wrote a proposal for v3.0 of the specification. Sawada created a test client for his PAR v3.0 specification in MultiPar v1.1.7.1, as announced on the MultiPar discussion board. His proposal never moved beyond testing, and the specification is not recommended for practical usage, which Sawada made clear on the MultiPar discussion board. Comments welcome.

In 2019, Nahas restarted conversation on developing a new version of PAR, also called v3.0, in an issue on the par2cmdline GitHub repository. In January 2022, a specification draft was made available in the Parchive GitHub. Sawada was working on developing a reference implementation in the par3cmdline GitHub repository, but no commits have been made since 2023. Comments welcome.

With the rise of other methods of file sharing and venues for discussion on the internet, Usenet has declined in popularity since the initial development of PAR in the early 2000s. PCMag reported that in 2024 Google Groups ended its support to post, subscribe, or view Usenet newsgroups. PAR2 seemingly remains in use for those who continue to share files via Usenet, as evidenced by the continued maintenance of PAR clients, such as MultiPar, and recent interest in the development of PAR v3.0. Comments welcome.



Last Updated: 01/21/2025