Compliance and Metadata Recommendations
Grid Data Recommendations
Swath Data Recommendations
Not-a-Number (NaN) Value —
Recommendation Details: The Institute of Electrical and Electronics Engineers (IEEE) floating-point standard defines the NaN (Not-a-Number) bit patterns to represent the results of illegal or undefined operations. Unless carefully handled, arithmetic operations involving NaN values can trigger floating-point exceptions that halt a program. Furthermore, every ordered relational operator (e.g., ==, <, >=) with at least one NaN operand evaluates to False; the only exception is the inequality operator (!=), which evaluates to True. These properties make NaN values difficult to handle in numerical software and reduce the interoperability of datasets that contain NaN.
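The comparison semantics described above can be verified directly. A minimal sketch in Python using only the standard `math` module:

```python
import math

nan = math.nan

# All ordered comparisons with a NaN operand are False under IEEE 754,
# so even a NaN value does not compare equal to itself.
print(nan == nan)   # False
print(nan < 1.0)    # False
print(nan >= 1.0)   # False

# The one exception: inequality evaluates to True.
print(nan != nan)   # True

# Because `x == nan` can never succeed, the robust way to detect
# NaN values is an explicit predicate.
print(math.isnan(nan))  # True

# Arithmetic on a quiet NaN silently propagates NaN rather than raising.
print(math.isnan(nan + 1.0))  # True
```

This self-inequality (`x != x`) is the classic portable NaN test, and it is exactly the behavior that surprises software which assumes all values compare sensibly.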
When to Employ Packing Attributes —
Recommendation Details: Packing refers to a lossy means of data compression that typically works by converting floating-point data to an integer representation that requires fewer bytes for storage. The packing attributes scale_factor and add_offset are the netCDF (and CF) standard names for the parameters of the packing and unpacking algorithms. If scale_factor is 1.0 and add_offset is 0.0, the packed value and the unpacked value are identical, although their datatypes (float or integer) may differ. Unfortunately, many datasets annotate floating-point variables with these attributes, apparently for completeness, even though the variables have not been packed and remain floating-point values. Attaching packing attributes to data that have not been packed is a misuse of the packing standard and should be avoided: data analysis software that encounters packing attributes on unpacked data is liable to be confused and perform in unexpected ways. Packed data must be represented as integers, and only integer types should carry packing attributes.
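The netCDF/CF unpacking rule is `unpacked = packed * scale_factor + add_offset`, with packing as its inverse. A minimal pure-Python sketch of packing a floating-point field into 16-bit integers (the temperature data and parameter values are illustrative, not part of the standard):

```python
def pack(values, scale_factor, add_offset):
    """Pack floats into integers per the netCDF convention:
    packed = round((unpacked - add_offset) / scale_factor)."""
    return [round((v - add_offset) / scale_factor) for v in values]

def unpack(packed, scale_factor, add_offset):
    """Invert the packing: unpacked = packed * scale_factor + add_offset."""
    return [p * scale_factor + add_offset for p in packed]

# Example: pack temperatures (kelvin) into the int16 range.
temps = [250.0, 273.15, 300.5, 320.0]
scale_factor, add_offset = 0.01, 285.0
packed = pack(temps, scale_factor, add_offset)

# The packed values fit in int16 (2 bytes each instead of 8 for doubles).
assert all(-32768 <= p <= 32767 for p in packed)

restored = unpack(packed, scale_factor, add_offset)

# The round trip is lossy, but accurate to about scale_factor / 2.
assert all(abs(a - b) <= scale_factor / 2 + 1e-9
           for a, b in zip(temps, restored))
```

Note that the precision loss is bounded by half of `scale_factor`, which is why choosing the scale to match the data's physical range and required precision is the key design decision when packing.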
Distinguish clearly between HDF and netCDF packing conventions —
Recommendation Details: Earth Science observers and modelers often employ a technique called "packing" (a.k.a. "scaling") to make their product files smaller. "Packed" datasets must be correctly "unpacked" before they can be used properly. Confusingly, non-netCDF (e.g., HDF4_CAL) and netCDF algorithms both store their parameters in attributes with the same or similar names, and unpacking data written under one convention with the other convention's algorithm produces incorrect values. Many netCDF-based tools are unaware of the non-netCDF (e.g., HDF4_CAL) packing cases and so interpret all readable data using the netCDF convention. Unfortunately, few users are aware that their datasets may be packed, and fewer still know the details of the packing algorithm employed. This is an interoperability issue because it hampers data analysis performed on heterogeneous systems.
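The two conventions invert differently. Under the netCDF/CF convention, `unpacked = packed * scale_factor + add_offset`; under the HDF4 calibration convention (the parameters returned by `SDgetcal`), `unpacked = cal * (packed - cal_offset)`. A sketch showing that applying the wrong formula to the same stored integer and the same parameter values yields silently different results (the numbers are illustrative):

```python
def unpack_netcdf(packed, scale_factor, add_offset):
    # netCDF/CF convention: multiply first, then add the offset.
    return packed * scale_factor + add_offset

def unpack_hdf4(packed, cal, cal_offset):
    # HDF4 calibration convention: subtract the offset first, then scale.
    return cal * (packed - cal_offset)

stored = 1000            # integer value as stored in the file
scale, offset = 0.5, 10.0  # same parameter values, two interpretations

print(unpack_netcdf(stored, scale, offset))  # 510.0
print(unpack_hdf4(stored, scale, offset))    # 495.0

# Same attributes, same stored integer, different physical values.
# A reader that assumes the wrong convention gets no error or warning,
# only wrong numbers.
```

Because both readers succeed without complaint, the error surfaces only downstream, in science results; this is why the recommendation asks producers to state unambiguously which convention a file uses.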
Use Only Officially Supported Compression Filters on NetCDF4 and NetCDF4-Compatible HDF5 Data —
Recommendation Details: NetCDF4 has enabled access to non-default (i.e., non-DEFLATE) HDF5 compression filters starting from version 4.7.0. However, the filter identification and access are currently obscure (~five digit IDs) and non-portable (no guarantees client software will be able to decompress them). DEFLATE is currently the only compression filter that is guaranteed to work with default (non-customized) netCDF4 installations, and so DEFLATE is the only compression filter that should be used in interoperable Earth Science data products in netCDF4 or netCDF4-compatible HDF5 formats. Use of the shuffle filter is not prohibited since it is not a compression filter and is supported by the netCDF4 default installation. Combining the shuffle and the DEFLATE filters can noticeably improve the data compression ratio.
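The benefit of pairing shuffle with DEFLATE can be illustrated without an HDF5 library. The shuffle filter reorders the bytes of an array so that byte i of every element is stored contiguously; for smoothly varying data this groups the nearly constant high-order bytes into long runs that DEFLATE compresses well. A minimal pure-Python sketch using the standard `zlib` and `struct` modules (the data are illustrative):

```python
import struct
import zlib

def shuffle(raw: bytes, elem_size: int) -> bytes:
    """Byte-shuffle: gather byte i of every element together,
    as the HDF5 shuffle filter does before compression."""
    n = len(raw) // elem_size
    return bytes(raw[e * elem_size + b]
                 for b in range(elem_size)
                 for e in range(n))

# Smoothly varying 8-byte floats: the high-order bytes barely change
# from element to element.
data = [273.0 + 0.001 * i for i in range(4096)]
raw = struct.pack(f"<{len(data)}d", *data)

plain = zlib.compress(raw, 6)                  # DEFLATE alone
shuffled = zlib.compress(shuffle(raw, 8), 6)   # shuffle, then DEFLATE

# Shuffling typically improves the compression ratio noticeably
# for data of this kind.
print(len(raw), len(plain), len(shuffled))
```

Shuffle itself stores no fewer bytes; it only rearranges them, which is why it is classified as a pre-compression filter rather than a compression filter and why it is safe to combine with DEFLATE in default netCDF4 installations.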
Make HDF5 files netCDF4-Compatible and CF-compliant within Groups —