Recommendations

Compliance and Metadata Recommendations

Grid Data Recommendations

Swath Data Recommendations

NetCDF Recommendations

  • Page:
    When to Employ Packing Attributes

    We recommend that packing attributes (i.e., scale_factor and add_offset) be employed only when data are packed as integers.

    Recommendation Details: Packing is a lossy form of data compression that typically converts floating-point data to an integer representation requiring fewer bytes of storage. The packing attributes scale_factor and add_offset are the netCDF (and CF) standard names for the parameters of the packing and unpacking algorithms. If scale_factor is 1.0 and add_offset is 0.0, the packed and unpacked values are identical, although their datatypes (float or integer) may differ. Unfortunately, many datasets annotate floating-point variables with these attributes, apparently for completeness, even though the variables have not been packed and remain floating-point values. Adding packing attributes to data that have not been packed is a misuse of the packing standard and should be avoided. Data analysis software that encounters packing attributes on unpacked data may misinterpret the values and behave in unexpected ways. Packed data must be represented as integers, and only integer types should have packing attributes.
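
As an illustration of the packing arithmetic, the following is a minimal sketch in Python. It assumes the netCDF convention, in which the unpacked value is the packed value times scale_factor plus add_offset. The function names (pack, unpack) and the sample numbers are hypothetical and chosen only for this example.

```python
# Minimal sketch of netCDF-style packing; names and values are illustrative.

def pack(value, scale_factor, add_offset):
    """Convert a float to a packed integer (lossy: rounds to the nearest step)."""
    return round((value - add_offset) / scale_factor)


def unpack(packed, scale_factor, add_offset):
    """Recover an approximate float using the netCDF unpacking formula."""
    return packed * scale_factor + add_offset


# Hypothetical packing parameters for a temperature field in kelvin.
scale_factor = 0.01
add_offset = 273.15

temperature = 287.456
packed = pack(temperature, scale_factor, add_offset)
restored = unpack(packed, scale_factor, add_offset)

print(packed, restored)
```

The packed value is an integer small enough for a 16-bit type, and the restored value differs from the original by at most half of scale_factor, which is the precision lost to packing.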

  • Page:
    Not-a-Number (NaN) Value

    We recommend that Earth Science data products avoid using Not-a-Number (NaN) in any field values or as an indicator of missing or invalid data.

    Recommendation Details: The Institute of Electrical and Electronics Engineers (IEEE) floating-point standard defines the NaN (Not-a-Number) bit patterns to represent the results of illegal or undefined operations. Unless software is carefully written, an arithmetic operation involving NaN values can halt a program. Furthermore, every relational comparison with at least one NaN operand, other than "not equal", must evaluate to False. These properties make NaN values difficult to handle in numerical software and reduce the interoperability of datasets that contain NaN.
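
The comparison behavior described above can be seen in a short Python sketch that uses only the standard library. The sample values are hypothetical. Note that "not equal" is the one comparison that evaluates to True when an operand is NaN.

```python
import math

nan = math.nan

# Ordered and equality comparisons with a NaN operand evaluate to False;
# only "not equal" evaluates to True.
comparisons = {
    "nan == nan": nan == nan,
    "nan < 1.0": nan < 1.0,
    "nan > 1.0": nan > 1.0,
    "nan != nan": nan != nan,
}

# Hypothetical data with one NaN used as a missing-value marker.
values = [1.0, 2.0, nan, 4.0]

# A single NaN propagates through arithmetic and contaminates the sum.
total = sum(values)

# Software must filter NaN explicitly before computing statistics.
valid = [v for v in values if not math.isnan(v)]
valid_mean = sum(valid) / len(valid)

print(comparisons, total, valid_mean)
```

Every consumer of the data has to remember the explicit math.isnan filter, which is one reason NaN reduces interoperability.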

  • Page:
    Standardize File Extensions for HDF5/netCDF Files

    We recommend using standardized file name extensions for HDF5 and netCDF files, as follows:

    • .h5 for files created with the HDF5 API;
    • .nc for files created with the netCDF API; and
  • Page:
    Distinguish Clearly Between HDF and netCDF Packing Conventions

    We recommend that datasets with non-netCDF packing be clearly distinguished from datasets that use the netCDF packing convention.

    Recommendation Details: Earth Science observers and modelers often employ a technique called "packing" (a.k.a. "scaling") to make their product files smaller. "Packed" datasets must be correctly "unpacked" before they can be used properly. Confusingly, non-netCDF (e.g., HDF4_CAL) and netCDF algorithms both store their parameters in attributes with the same or similar names, and unpacking data with the wrong algorithm results in incorrect conversions. Many netCDF-based tools are unaware of the non-netCDF (e.g., HDF_CAL) packing cases and so interpret all readable data using the netCDF convention. Unfortunately, few users are aware that their datasets may be packed, and fewer still know the details of the packing algorithm employed. This is an interoperability issue because it hampers data analysis performed on heterogeneous systems.
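
To show why mixing the two conventions yields incorrect values, here is a small Python sketch. It assumes the commonly documented forms of the two formulas: netCDF unpacks as packed times scale_factor plus add_offset, while the HDF4 calibration convention unpacks as scale_factor times the difference of packed and add_offset. The function names and sample numbers are hypothetical, and the formulas should be checked against the documentation of the product in question.

```python
# Sketch comparing two unpacking conventions applied to the same stored value.
# Function names and sample numbers are illustrative assumptions.

def unpack_netcdf(packed, scale_factor, add_offset):
    """netCDF convention: scale first, then add the offset."""
    return packed * scale_factor + add_offset


def unpack_hdf(packed, scale_factor, add_offset):
    """HDF4 calibration convention: subtract the offset first, then scale."""
    return scale_factor * (packed - add_offset)


# The same stored integer and attribute values, read under each convention.
packed = 1500
scale_factor = 0.01
add_offset = 100.0

as_netcdf = unpack_netcdf(packed, scale_factor, add_offset)
as_hdf = unpack_hdf(packed, scale_factor, add_offset)

print(as_netcdf, as_hdf)
```

The two results differ widely even though the attribute names and values are identical, so a tool that assumes the wrong convention produces plausible-looking but incorrect data without raising any error.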

  • Page:
    Make HDF5 files netCDF4-Compatible and CF-compliant within Groups

    We recommend that all HDF5 Earth Science product files be made netCDF4-compatible and CF-compliant within groups.

    Recommendation Details:

  • Page:
    Character Set for User-Defined Group, Variable, and Attribute Names

    We recommend that user-defined group, variable, and attribute names follow the Climate and Forecast (CF) convention's specification. The names shall comply with this regular expression: [A-Za-z][A-Za-z0-9_]*. System-defined names for any of these objects that are required by various APIs or conventions are exempt.
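
A name can be checked against this regular expression with a few lines of Python. This is a minimal sketch; the helper name is_valid_name and the sample names are hypothetical.

```python
import re

# The pattern from the recommendation: a letter, then letters, digits, or underscores.
CF_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_name(name):
    """Return True if the whole name matches the recommended pattern."""
    return CF_NAME.fullmatch(name) is not None


# Hypothetical user-defined names to check.
for candidate in ["sea_surface_temperature", "2m_temperature", "wind-speed", "_private"]:
    print(candidate, is_valid_name(candidate))
```

The sketch uses fullmatch rather than match so that a name with a valid prefix but an invalid trailing character (such as the hyphen in wind-speed) is rejected.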
