Sustainability of Digital Formats: Planning for Library of Congress Collections |
|
Full name | ChemDraw Exchange |
---|---|
Description |
CDX is the native file format of the program ChemDraw; it stores molecular data in a tagged binary format. FileInfo.com describes ChemDraw as a molecular editing program suite used for storing accurate chemical drawings. According to CDX File Format Documentation, "ChemDraw stores a document as a set of objects and properties. Objects are such things as atoms, bonds, fragments, arrows, and text. Properties are things like position, color, arrow type, and bond order." In the article A Brief Introduction to the ChemDraw CDX File Format, Richard L. Apodaca describes "ChemDraw as the industry standard tool for generating publication-quality chemical structure graphics."
ChemDraw Editor Software: As described on the ChemDraw Wikipedia page, ChemDraw is a molecule editor software program, first developed in 1985 by David A. Evans and Stewart Rubenstein. Cambridge Scientific Computing was launched in 1986 and eventually became CambridgeSoft Corporation. ChemDraw is now part of Revvity Signals Software's ChemOffice Suite of programs. According to Revvity Signals Software's blog, Back to School with ChemDraw, September 2022, "ChemDraw software is the most efficient way to draw and represent complex chemical structures and reaction schemes."
CDX File Format: The CDX Documentation states that "the CDX file format is a tagged file format, meaning that it consists of a series of objects, each of which is preceded by a tag that identifies what the object represents." The general structure of a CDX file, as described by the CDX Documentation, contains a header, objects, and properties, and uses little-endian byte order.
Text-Based CDXML File Format: As stated in the CDX Documentation, "CDXML is an XML encoding of CDX -- a variant of CDX that complies with the XML specification. It differs from CDX only in the details of its formatting, and it doesn't even differ by that much...This is a very important point: a document can be converted from binary CDX to text-based CDXML and back again with absolutely no loss of information." As stated by Richard L. Apodaca in the article A Brief Introduction to the ChemDraw CDX File Format, 2010, "Interconversion between the two formats is lossless; everything that can be represented as a binary CDX file can also be represented as an XML CDX file." According to CDX Format Wikipedia, CDXML is the XML version and the preferred version of CDX. The CDX Documentation describes the CDXML format as a file containing a header, followed by a series of tagged items and the end of file document object. CDXML files contain the same properties and objects as CDX files.
The CDX Documentation includes 'A Simple Example,' which presents a graphical drawing, the CDX binary version, the CDXML version, and a side-by-side comparison of the two file formats.
Uses of CDX Files: According to An Nguyen in the Journal of Cheminformatics, December 2019, "The file formats CDX and CDXML are often used for the capture of chemical information...In addition, the CDX format allows the embedding of chemical structures into the Word files DOC or DOCX while maintaining the consistency and the synchronization of the ChemDraw information...The content can be used to process and retain most of the important information that was generated via the ChemDraw editor. Both file formats contain chemical objects (e.g., atoms, bonds, reactions) and properties (e.g. charge, valence, atom number, bond order) as structure content." |
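Because CDXML is ordinary XML, a general-purpose XML parser is enough to read the objects and properties described above. The following is a minimal sketch in Python, not a complete reader. The sample document is hand-written for illustration, and the element and attribute names used (`n` for a node, `b` for a bond, `B`, `E`, `Order`, `Element`) as well as the default values are assumptions based on the public CDX Documentation and should be checked against it.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written CDXML fragment (not taken from a real file).
# Element and attribute names are assumptions based on the CDX Documentation.
SAMPLE_CDXML = """<CDXML>
  <page>
    <fragment>
      <n id="1" p="100 100"/>
      <n id="2" p="130 100" Element="8"/>
      <b id="3" B="1" E="2" Order="2"/>
    </fragment>
  </page>
</CDXML>"""


def summarize_cdxml(text):
    """Return (atoms, bonds) collected from every <n> and <b> element."""
    root = ET.fromstring(text)
    atoms = {}
    for node in root.iter("n"):
        # Assumed default: atomic number 6 (carbon) when "Element" is absent.
        atoms[node.get("id")] = int(node.get("Element", "6"))
    bonds = []
    for bond in root.iter("b"):
        # Assumed default: bond order "1" when "Order" is absent.
        bonds.append((bond.get("B"), bond.get("E"), bond.get("Order", "1")))
    return atoms, bonds


atoms, bonds = summarize_cdxml(SAMPLE_CDXML)
print(atoms)
print(bonds)
```

The bond order is kept as a string because it is not necessarily an integer; layout attributes such as `p` are ignored here.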
Production phase | Middle to final state. CDX files are mainly used for storage; the CDXML format is used for delivery. |
Relationship to other formats | |
Defined via | XML, Extensible Markup Language (XML). CDX Documentation, "A CDXML is a CDX file specially formatted so that it conforms to the XML specification." |
LC experience or existing holdings | None |
---|---|
LC preference | The Library of Congress has not yet expressed any format preference for scientific data. |
Disclosure |
Standard format, partially documented. According to the CDX Documentation, because of its ability to incorporate custom information, and because it is in the public domain, CDX has been adopted by the U.S. Patent Office as its standard chemical format. In the Depth-First.com article An Introduction to the ChemDraw CDXML Format, Richard L. Apodaca states, "The authoritative specification from Perkin Elmer (PE) offers a starting point for understanding CDXML...Although this documentation is mostly complete, several items are missing." Note: The authoritative specification from Perkin Elmer (PE) is the same as the CDX Documentation referenced in this document. |
---|---|
Documentation |
CDX File Format Documentation (http://www.cambridgesoft.com/services/documentation/sdk/chemdraw/cdx/) |
Adoption |
According to CDX Wikipedia, the CDX file format is used across Windows, Mac, and Linux distributions. ChemDraw supports the system clipboard, allowing users to copy and paste CDX files from ChemDraw to either Mac or Windows clipboards. Richard L. Apodaca, in the article A Brief Introduction to the ChemDraw CDX File Format, 2010, describes CDX files, stating, "Chemists rarely save CDX files to disk themselves. Instead, ChemDraw content is copied from a drawing tool and pasted into Microsoft Office documents (particularly Word). This embedded CDX then gets saved along with the rest of the document into a single file. Extracting this embedded CDX content requires an Office file API."
CDX Office File API Example:
CDX Readers and Writers:
C++ header file - human-readable enumerations of all CDX object/property values, for writing third-party CDX readers.
CDX is also supported by Wolfram Research's Mathematica application. |
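As a rough illustration of locating ChemDraw content embedded in a Word document, the sketch below treats a DOCX file as the ZIP container it is and flags embedded parts that contain what is assumed to be the CDX header signature. The signature value `VjCD0100` and the `word/embeddings/` location are assumptions, not statements from the sources quoted above. The approach only identifies candidate parts: embedded objects are usually wrapped in an OLE compound file, so recovering the exact CDX byte stream would require an OLE parser, and legacy DOC files are not handled at all.

```python
import io
import zipfile

CDX_MAGIC = b"VjCD0100"              # assumed CDX header signature
EMBEDDINGS_DIR = "word/embeddings/"  # assumed location of embedded objects


def find_embedded_cdx(docx_bytes):
    """Return names of embedded DOCX parts that appear to contain CDX data."""
    hits = []
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        for name in archive.namelist():
            if name.startswith(EMBEDDINGS_DIR) and CDX_MAGIC in archive.read(name):
                hits.append(name)
    return hits


def build_demo_docx():
    """Build a tiny synthetic ZIP that mimics a DOCX layout (illustration only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", "<w:document/>")
        archive.writestr("word/embeddings/oleObject1.bin", b"wrapper" + CDX_MAGIC + b"\x00" * 20)
        archive.writestr("word/embeddings/oleObject2.bin", b"unrelated payload")
    return buffer.getvalue()


print(find_embedded_cdx(build_demo_docx()))
```

For a real file, the bytes could be read with `open(path, "rb").read()` and passed to `find_embedded_cdx`.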
Licensing and patents |
According to FairSharing.org, "Because of its ability to incorporate custom information, and because it is in the public domain, CDX has been adopted by the U.S. Patent Office as its standard chemical format." |
Transparency |
Depends on the format and available software. According to iChemLabs.com's news article, Read and Write ChemDraw Files, January 2010, "ChemDraw has two formats that need to be considered, the ChemDraw Exchange format (CDX) and it's xml sister (CDXML). The CDX format is a pure binary format (users won't be able to make sense of the objects inside when users open it in a text editor) while the CDXML format is text based and can be coherently read. Both formats are structurally identical and completely describe any group of ChemDraw objects. CambridgeSoft has been urging users to switch to the CDXML format due to its ease of use, but there are some drawbacks to the XML version due to its inherently larger size." |
Self-documentation |
Supports the inclusion of metadata. As stated on CDX Format Wikidata, "file format for two-dimensional atomic coordinates, chemical bond information and metadata." As explained in the CDX Documentation, CDX binary files consist of a fixed header, followed by tagged items/objects which can have properties (attributes) applied to them. "Properties, also called attributes, are self-contained. A property applies to the object which logically contains it. It may also describe other objects contained within the object which logically contains the property." |
External dependencies |
None beyond availability of supporting software. |
Technical protection considerations |
None found. |
Text | |
---|---|
Normal rendering |
Some support. CDX Documentation states, "The CDX file format (binary) is a tagged file format, meaning that it consists of a series of objects, each of which is preceded by a tag that identifies what the object represents. Tagged file formats in general are very flexible. Readers of a tagged file can efficiently skip over parts they aren't interested in or do not recognize...This flexibility means that a tagged file format can be expanded without invalidating any existing files...In the simplest view, a CDX file consists of a document header followed by a stream of tagged items followed by the end of the Document...Nesting can be difficult to see in a raw binary file." CDX Documentation, "A CDXML (XML) is a CDX file specially formatted so that it conforms to the XML specification. We expect that anyone who manipulates a CDXML file will be familiar with the general XML specifications." See XML for more information. |
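The ability of a reader to skip items it does not recognize can be sketched as follows. This is a minimal illustration under stated assumptions, not an implementation of the full specification: it assumes a 28-byte header beginning with `VjCD0100`, 16-bit little-endian tags, a high bit that marks object tags (followed by a 4-byte object ID), a zero tag that ends the current object, and properties that carry a 2-byte length before their data. Extended-length encodings are ignored, and the sample bytes and tag values are synthetic.

```python
import struct

MAGIC = b"VjCD0100"   # assumed signature at the start of the header
HEADER_LEN = 28       # assumed fixed header size in bytes
OBJECT_FLAG = 0x8000  # assumed: high bit set means the tag starts an object


def walk_cdx(data):
    """Yield (depth, kind, tag, payload) for each item in a CDX-like stream."""
    if not data.startswith(MAGIC):
        raise ValueError("missing CDX signature")
    pos, depth = HEADER_LEN, 0
    while pos + 2 <= len(data):
        (tag,) = struct.unpack_from("<H", data, pos)
        pos += 2
        if tag == 0:
            # A zero tag closes the innermost open object.
            depth -= 1
            yield depth, "end", tag, None
            if depth <= 0:
                break
        elif tag & OBJECT_FLAG:
            (object_id,) = struct.unpack_from("<I", data, pos)
            pos += 4
            yield depth, "object", tag, object_id
            depth += 1
        else:
            # The length prefix lets a reader step over unknown properties.
            (length,) = struct.unpack_from("<H", data, pos)
            pos += 2
            yield depth, "property", tag, data[pos:pos + length]
            pos += length


# Synthetic stream: a document object holding one unknown property and one
# nested object. The tag values 0x7ABC and 0x8004 are hypothetical.
SAMPLE = (
    MAGIC + bytes([4, 3, 2, 1]) + bytes(16)
    + struct.pack("<HI", 0x8000, 0)
    + struct.pack("<HH", 0x7ABC, 2) + b"\x01\x02"
    + struct.pack("<HI", 0x8004, 1)
    + struct.pack("<H", 0)
    + struct.pack("<H", 0)
)

for depth, kind, tag, payload in walk_cdx(SAMPLE):
    print("  " * depth + f"{kind} 0x{tag:04X} {payload!r}")
```

Indenting by depth makes visible the nesting that is otherwise hard to see in a raw binary file.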
Integrity of document structure |
Good support. CDX files have a defined general structure; see Description. |
Integrity of layout and display |
Little to no information about CDX layout and display. In An Introduction to the ChemDraw CDXML Format, 2021, Richard L. Apodaca states, "CDX/ML is an odd cheminformatics file format in that is mixes a molecular graph encoding system with visual elements and styling. For example, a given CDX/ML file may contain a chemical structure together with a TLC plate. Each individual bond can be colored, and the text on atom labels can bear custom colors, fonts, and layout instructions. Sometimes visual elements can carry chemical meaning. For example, an arrow may be part of a reaction scheme. Likewise, a bracket may surround the repeating unit of a polymer. This broad scope, in which chemically meaningful elements are mixed with visual layout and arbitrary vector graphics, makes CDX/ML one the most complicated file formats in cheminformatics. Coupled with the useful, but incomplete PE specification, CDX/ML is not an easy format to understand or use." CDX saves ChemDraw drawings without loss of data. |
Support for mathematics, formulae, etc. |
Little to no information on support of mathematics, chemical formulae, diagrams, etc. |
Functionality beyond normal rendering |
According to Richard L. Apodaca in the article An Introduction to the ChemDraw CDXML Format, 2021, it is "more common to find them (CDX/ML files) embedded in Microsoft Office or ChemOffice documents. Often, CDX/ML makes its way from these embedded environments to the outside world via the system clipboard." As stated in the CDX Documentation, "When an object is copied, ChemDraw puts a CDX binary file directly on the clipboard. The data placed on the clipboard is exactly the same as would be written to a file, so once users retrieve it from the clipboard in the first place, users can process it exactly as the user would process a disk-based file." |
Tag | Value | Note |
---|---|---|
Filename extension | cdx | CDX Documentation, "CDX is the native file format of ChemDraw." |
Filename extension | cdxml | CDX Documentation, "CDXML is a variant of CDX that complies with the XML specification...Everything that can be stored in a CDX file can also be stored in a CDXML and vice versa." |
Internet Media Type | chemical/x-cdx | The National Archives, Chemical Draw Exchange Format. See https://www.nationalarchives.gov.uk/pronom/fmt/378. |
Pronom PUID | fmt/378 | The National Archives, Chemical Draw Exchange Format. See https://www.nationalarchives.gov.uk/pronom/fmt/378. |
Wikidata Title ID | Q5010021 | CDX Format, file format for two-dimensional atomic coordinates, chemical bond information and metadata, ChemDraw Exchange, CDX. See https://www.wikidata.org/wiki/Q5010021 |
Wikidata Title ID | Q5010020 | CDXML, CDX file specially formatted so that it conforms to the XML specification, ChemDraw Exchange XML format, CDXML. See https://www.wikidata.org/wiki/Q5010020 |
Wikidata Title ID | Q898716 | ChemDraw, software for chemical structure drawing. See https://www.wikidata.org/wiki/Q898716 |
Wikidata Title ID | Q105850644 | ChemDraw Template, file format. See https://www.wikidata.org/wiki/Q105850644 |
General | |
---|---|
History |
Bethany Halford's article, Reflections on ChemDraw, describes how ChemDraw was developed through a collaboration between Stewart Rubenstein and David and Sally Evans. Cambridge Scientific Computing was launched in 1986 and eventually became CambridgeSoft Corporation. CambridgeSoft Corporation was acquired by PerkinElmer, Inc. in 2011 and later became PerkinElmer Informatics. As of May 9, 2023, PerkinElmer Informatics is Revvity Signals Software. According to ChemDraw Wizard Pierre Morieux, Ph.D., in the article, Back to School with ChemDraw, September 2022, "ChemDraw has been the software application chemists use to draw chemical structures since 1985. It has long since become the industry standard and is packed with features that make it easy to create publication-ready drawings." |
|