Recommendation:

We recommend that datasets with non-netCDF packing be clearly distinguished from datasets that use the netCDF packing convention.

Recommendation Details: Earth Science observers and modelers often employ a technique called "packing" (a.k.a. "scaling") to make their product files smaller. "Packed" datasets must be correctly "unpacked" before they can be used properly. Confusingly, non-netCDF (e.g., HDF4_CAL) and netCDF algorithms both store their parameters in attributes with the same or similar names, and unpacking data with the wrong algorithm yields incorrect values. Many netCDF-based tools are unaware of the non-netCDF (e.g., HDF_CAL) packing cases and so interpret all readable data using the netCDF convention. Unfortunately, few users are aware that their datasets may be packed, and fewer still know the details of the packing algorithm employed. This is an interoperability issue because it hampers data analysis performed on heterogeneous systems.

  • One widely used HDF4 "packing" convention uses the following "unpacking" equation: unpacked = scale_factor x (packed - add_offset). We shall refer to this convention as "non-netCDF".
  • The standard netCDF "packing" convention uses the following "unpacking" equation: unpacked = scale_factor x packed + add_offset.
  • To disambiguate the various packing conventions, we recommend that two new attributes be included in NASA Earth Science data products, especially if something other than the netCDF convention is used:
    • packing_convention="non-netCDF"
    • packing_convention_description="unpacked = scale_factor x (packed - add_offset)"
    or
    • packing_convention="netCDF"
    • packing_convention_description="unpacked = scale_factor x packed + add_offset"
  • Future packing implementations should use scale_factor and add_offset only if these adhere to the netCDF packing convention.
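
The two unpacking equations and the proposed packing_convention attribute can be illustrated with a minimal Python sketch. The function name unpack, the plain dictionary standing in for a variable's attributes, and the choice to fall back to the netCDF convention when packing_convention is absent are all assumptions made for illustration. They are not part of the recommendation or of any particular library.

```python
def unpack(packed, attrs):
    """Unpack values using the convention named in attrs["packing_convention"].

    `attrs` is a plain dict standing in for a variable's attributes
    (hypothetical; a real reader would obtain these from the file).
    """
    scale = attrs.get("scale_factor", 1.0)
    offset = attrs.get("add_offset", 0.0)
    # Assumption: treat a missing attribute as the netCDF convention,
    # mirroring how netCDF-based tools behave by default.
    convention = attrs.get("packing_convention", "netCDF")

    if convention == "netCDF":
        # unpacked = scale_factor x packed + add_offset
        return [scale * p + offset for p in packed]
    if convention == "non-netCDF":
        # unpacked = scale_factor x (packed - add_offset)
        return [scale * (p - offset) for p in packed]
    raise ValueError(f"unknown packing_convention: {convention!r}")


# The same packed values and parameters give different results
# depending on which convention is declared.
packed_values = [20, 30]
params = {"scale_factor": 0.5, "add_offset": 10.0}

as_netcdf = unpack(packed_values, {**params, "packing_convention": "netCDF"})
as_non_netcdf = unpack(packed_values, {**params, "packing_convention": "non-netCDF"})
```

With these illustrative numbers, the netCDF equation gives 20.0 and 25.0 while the non-netCDF equation gives 5.0 and 10.0, which shows why a reader that applies the wrong convention silently produces incorrect values.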