Earth Science Data and Information Systems (ESDIS) Project, Code 423

 

 

Unified Metadata Model - Variable (UMM-Var)

 


 

 

 

Signature/Approval Page

 

 

 

Prepared by:

 

 

 

 

 

 

 

Name

 

Date

Title/Role

 

 

Organization

 

 

 

 

 

Reviewed by:

 

 

 

 

 

 

 

Name

 

Date

Title/Role

 

 

Organization

 

 

 

 

 

Approved by:

 

 

 

 

 

 

 

Name

 

Date

Title/Role

 

 

Organization

 

 

 

 

 

Concurred by:

 

 

 

 

 

 

 

Name

 

Date

Title/Role

 

 

Organization

 

 

 

[Electronic] Signatures available in B32 Room E148

online at: https://ops1-cm.ems.eosdis.nasa.gov/cm2/


Preface

This document is under ESDIS Project configuration control. Once this document is approved, ESDIS approved changes are handled in accordance with Class I and Class II change control requirements described in the ESDIS Configuration Management Procedures, and changes to this document shall be made by change bars or by complete revision.

 

Any questions should be addressed to: esdis-esmo-cmo@lists.nasa.gov

ESDIS Configuration Management Office (CMO)
NASA/GSFC

Code 423

Greenbelt, Md. 20771


Abstract

This document describes the Unified Metadata Model for Variables (UMM-Var) to be used by the National Aeronautics and Space Administration (NASA) Earth Science community and addresses the need for describing the types of variables that exist within data products that are described by the Unified Metadata Model for Granules (UMM-G) metadata records. Developers, engineers and architects should reference this document and the Unified Metadata Model (UMM) as a guide while implementing Common Metadata Repository (CMR) components, CMR clients or services that make use of the CMR or CMR clients. Data providers should use this model as a guide during metadata generation.

 

This version of the variable model focuses on the minimum variable metadata needed to support an improved User Interface/User Experience (UI/UX). Since there will be many thousands of variables in the CMR, the model also supports auto-population of variable metadata records, which aims to reduce the workload of metadata curators who manage variable metadata over time.

 

 

Keywords: UMM-Var, UMM-S, UMM-G, Variables, NASA Earthdata Search, EOSDIS, ESDIS, CMR


Change History Log

 

Revision

Effective Date

Description of Changes

(Reference the CCR & CCB Approval Date)

 

 

CCR 423-ESDIS-XXX; CCB Approved

Pages:

Table of Contents

1 Abstract

2 Change Explanation

3 Introduction

3.1 Purpose

3.2 Scope

3.3 Related Documentation

3.3.1 Applicable Documents

3.3.2 Reference Documents

3.4 Impact

3.5 Copyright Notice

3.6 Feedback

3.7 Document Conventions

4 Unified Metadata Model - Variables

4.1 Variable Context Diagram and Metadata Model Relationships

4.2 Use Cases

4.2.1 Browse Variables of a Collection

4.2.2 Faceted Browse

4.2.3 Update Variable Associations

4.2.4 Search Relevancy Ranking

4.2.5 Cross-site Data Subsetting

4.2.6 Access Variables, including Ancillary Variables (extension of the Cross-Site Subsetting Use Case).

4.2.7 Integrating Global Imagery Browse Services (GIBS) with Web-Based Clients

4.2.8 Measurement Comparison of two in-situ measurements of Species X

4.2.9 Size Estimation

4.3 UMM-Var Metadata Model

4.3.1 Name [R]

4.3.2 Alias

4.3.3 LongName [R]

4.3.4 Definition [R]

4.3.5 Unit [R]

4.3.6 DataType [R]

4.3.7 Scale and Offset

4.3.8 VariableType

4.3.9 VariableSubType

4.3.10 Characteristics

4.3.10.1 GroupPath

4.3.10.2 IndexRanges

4.3.11 ScienceKeywords

4.3.12 FillValues

4.3.13 Dimensions [R]

4.3.14 ValidRange

4.3.15 SizeEstimation

4.3.16 MeasurementIdentifiers

4.3.16.1 MeasurementName

4.3.16.2 MeasurementSource [R]

4.3.17 Sets

4.3.18 SamplingIdentifiers

4.3.19 AcquisitionSourceName [R]

5 Appendix A Tags Glossary

6 Appendix B Keywords and Measurements Governance Structure

7 Appendix C Analysis of CSDMS and CF Standard Names as a Source of Tagging

8 Appendix D Definitions of Terms

9 Appendix E Abbreviations and Acronyms


 

 


1           Introduction

The NASA Earth Observing System Data and Information System (EOSDIS) generates, archives, and distributes massive amounts and a large variety of Earth Science data via twelve Distributed Active Archive Centers (DAACs). Reliable, consistent and high-quality metadata are essential to enable cataloging and proper use of these data. To improve the quality and consistency among its metadata holdings, EOSDIS has developed models for metadata that it archives and maintains. This model aims to document vital elements that may be represented across various data models and standards and unify them through mainstream fields useful for data discovery, data use, and service invocations. This unified model, aptly named the Unified Metadata Model (UMM), will be used by the CMR to drive searches of the metadata cataloged within that system and the retrieval of data discovered through such searches.

 

This document describes the Unified Metadata Model for Variables (UMM-Var). It includes the use cases for the UMM-Var model itself and its relationship with other UMM models, element descriptions, and examples.

 

Listed below are some definitions with examples that will help the reader understand this model.

  • Measurement: The act or process of measuring an observable property, usually geophysical, geo-biophysical, physical, or chemical. In the case of air temperature, for instance, the object of the measurement is air and the property being measured is temperature. For models, it is a simulated observable property.
    • Using Scott Peckham's model as a basis for a measurement naming convention, Measurement names can be expressed as <<object, quantity>>, for example object = "Aerosol", quantity = "Optical Depth".
    • Examples: Aerosol Optical Depth, Air Temperature, Surface Albedo, Solar Irradiance, Surface Reflectance, Atmospheric Moisture, Methane Concentration, Sulphur Dioxide Concentration, Ozone Concentration.
  • Variable: A named set of data that contains the recorded values of a measurement. In this context, the variable is described by its name and characteristics. For instance, a variable contained within the MYD08_M3V5 dataset is called: Optical_Depth_Land_Maximum. There are other variables in the set, including variables which contain information about geographic position and quality.
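As an illustration only, the <<object, quantity>> naming convention described above can be sketched in Python. The class name MeasurementName and its field names are assumptions made for this example; they are not elements of the UMM-Var model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MeasurementName:
    """Hypothetical helper: a measurement name as an (object, quantity) pair."""

    object_name: str    # what is being measured, e.g. "Aerosol"
    quantity_name: str  # the property being measured, e.g. "Optical Depth"

    def display(self) -> str:
        # Join the two parts into the human-readable measurement name.
        return f"{self.object_name} {self.quantity_name}"


aod = MeasurementName(object_name="Aerosol", quantity_name="Optical Depth")
air_temperature = MeasurementName(object_name="Air", quantity_name="Temperature")
print(aod.display())              # Aerosol Optical Depth
print(air_temperature.display())  # Air Temperature
```

Keeping the object and the quantity as separate fields would let a client group measurements by either part, which a single free-text name does not allow.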

 

The description of the variable may include what was intended to be measured, i.e., the observable property, and how the variable was measured, such as the measurement technique and the instrument used.

 

Variables may be classified as science variables, quality variables and ancillary variables (or other, when one of these classifications cannot be used). A variable can also be the output of a model.

  • Examples: Aerosol Optical Depth 550nm (Dark Target), Aerosol Optical Depth 550nm (Deep Blue, Land Only), Air Temperature (Daytime/Ascending), Air Temperature at 2m, Air Temperature at Surface (Daytime/Ascending), Air Temperature at Surface (Nighttime/Descending), Relative Humidity (Daytime/Ascending), Relative Humidity (Nighttime/Descending), Water Vapor Mass Mixing Ratio (Daytime/Ascending), Water Vapor Mass Mixing Ratio (Nighttime/Descending), Methane Total Column (Nighttime/Descending), SO2 Column Mass Density, SO2 Column Mass Concentration, Ozone - reported in parts per billion by volume.

 

  • Sample Illustration:
    • Example 1:
      • Measurement:
        • Aerosol Optical Depth
      • Variables:
        • Aerosol Optical Depth 550nm (Dark Target) - this is a science variable example
        • Aerosol Optical Depth 550nm (Deep Blue, Land Only) - this is another science variable example
        • Deep_Blue_Aerosol_Optical_Depth_550_Land_QA - this is a quality variable example
        • Deep_Blue_Algorithm_Flag_Land - this is an ancillary variable example
    • Example 2:
      • Measurement:
        • Ozone Mixing Ratio reported in parts per billion by volume
      • Variable:
        • O3_ppbv - this is a science variable example
    • Example 3:
      • Measurement:
        • Integrated Column NO2 loading
      • Variables:
        • NO2_Column - this is a science variable example
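The sample illustration above can also be written down as data. The following is a minimal sketch, assuming a simple list of (measurement, variable, variable type) triples; the structure and the function name variables_by_type are hypothetical and only serve to show how science, quality and ancillary variables relate to one measurement.

```python
from collections import defaultdict

# (measurement, variable, variable type) triples copied from Examples 1 to 3 above.
SAMPLE_ILLUSTRATION = [
    ("Aerosol Optical Depth", "Aerosol Optical Depth 550nm (Dark Target)", "science"),
    ("Aerosol Optical Depth", "Aerosol Optical Depth 550nm (Deep Blue, Land Only)", "science"),
    ("Aerosol Optical Depth", "Deep_Blue_Aerosol_Optical_Depth_550_Land_QA", "quality"),
    ("Aerosol Optical Depth", "Deep_Blue_Algorithm_Flag_Land", "ancillary"),
    ("Ozone Mixing Ratio reported in parts per billion by volume", "O3_ppbv", "science"),
    ("Integrated Column NO2 loading", "NO2_Column", "science"),
]


def variables_by_type(measurement):
    """Group the variables of one measurement by their variable type."""
    grouped = defaultdict(list)
    for name, variable, variable_type in SAMPLE_ILLUSTRATION:
        if name == measurement:
            grouped[variable_type].append(variable)
    return dict(grouped)


aod_variables = variables_by_type("Aerosol Optical Depth")
print(aod_variables["quality"])
```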

 

The term "Measurement" refers to the act or process of measuring an observable property, and is most likely to be used as a search term, as an alternative to the Science Keywords. The term "Variable" refers to an artifact that represents a Measurement. The UMM-Var model is not interested in the direct measurement that the instrument made; it is the "feature of interest" and the "observed property" represented by the data that are of interest. The Variable class will be used to store metadata about each variable. The Variable metadata will consist of its name and other characteristics. The CMR Variable class can be utilized to simplify search and retrieval of data products at the variable level.

 

In terms of the data product and its file structure, variables such as Aerosol Optical Depth 550nm (Dark Target) and Aerosol Optical Depth 550nm (Deep Blue, Land Only) are stored within a data granule, along with their associated data quality variables and ancillary variables, such as latitude and longitude information.

 

1.1          Purpose

The purpose of UMM-Var is to express a variable model, applicable to the CMR, for storing variable metadata. In addition, the UMM-Var model is related to the other CMR metadata models, such as UMM-S, which supports the specification of variables which have associated services.

 

Note: the previous variable design principally addressed the concept of parameters. The parameter version of this model, known as UMM-P, sought to bridge the divide between variables and collection-level additional attributes. However, this new model, UMM-Var, considers variables in their own right. Variables can now be stored and discovered in the ways described by a new set of use cases. Granule data may be subsetted by variable, or transformed in other ways, as supported by services. The user experience guides what selections and choices a user makes at the UI for typical data transformations such as spatial subsetting, reprojection, reformatting, etc. The user is concerned only with what choices are available for a specific data set, and the back-end services take care of any needed processing.

 

This document provides information to the NASA Earth Science community. Distribution is unlimited.

 

1.2          Scope

This document describes the Unified Metadata Model - Variables (UMM-Var) model.

 

1.3          Related Documentation

The latest versions of all documents below should be used. The latest ESDIS Project documents can be obtained from Uniform Resource Locator (URL): https://ops1-cm.ems.eosdis.nasa.gov. ESDIS documents have a document number starting with either 423 or 505. Other documents are available for reference in the ESDIS project library website at: http://esdisfmp01.gsfc.nasa.gov/esdis_lib/default.php unless indicated otherwise.

 

1.3.1         Applicable Documents

The following documents are referenced within, are directly applicable, or contain policies or other directive matters that are binding upon the content of this document.

 

Document Number

Document Title

N/A

CMR Life Cycle

https://wiki.earthdata.nasa.gov/display/CMR/CMR+Documents

N/A

CMR End-To-End Services Study (Task 25) EED2-TP-025

https://wiki.earthdata.nasa.gov/download/attachments/83624411/EED2-TP-025_CMR%20End-To-End%20Services%20Study.pdf?api=v2

N/A

Scale Calibration Attributes

https://support.hdfgroup.org/release4/doc/UG_PDF.pdf (Section 3.10.6 Calibration Attributes)

N/A

Scale Attribute Conventions

https://cdn.earthdata.nasa.gov/conduit/upload/495/netcdf_UG_3.6.3.pdf (See Appendix B Attribute Conventions)

 

1.3.2         Reference Documents

The following documents are not binding on the content but referenced herein and amplify or clarify the information presented in this document.

 

Document Number

Document Title

N/A

Tags

http://en.wikipedia.org/wiki/Tag_%28metadata%29

N/A

XPath

XPath is a language for addressing parts of an XML document, designed for use with XSLT.

 

1.4          Impact

This document outlines a model intended to be compatible with existing NASA Earth Science metadata implementations within the CMR. It will impact providers from NASA Distributed Active Archive Centers (DAACs), non-DAAC data providers, instrument Principal Investigators (PI), CMR client developers, metadata catalog developers, and users. Users will be impacted specifically in terms of data discovery and data use. This is very important for science research purposes.

 

1.5          Copyright Notice

The contents of this document are not protected by copyright in the United States and may be used without obtaining permission from NASA.

 

1.6          Feedback

Questions, comments and recommendations on the contents of this document should be directed to support@earthdata.nasa.gov

 

1.7          Document Conventions

There are two main sections in the rest of this document: the use cases and the detailed description of the metadata model. The use case section describes the use cases used to create the metadata model. Each use case section contains the following information:

  • Scenarios: One or more related scenarios are described in this section.
  • Outcomes: A description of what the system provides the user as a result of the scenarios.
  • Use Case Diagram: A diagram that highlights the actor's interaction with the system.
  • Activity Diagram: A diagram that shows the flow of data in terms of the user experience.
  • Sequence Diagram: A diagram which shows the key components of the system and the sequences of actions within the system.

 

The detailed description of the metadata model section of this document describes each element within the model. Variable model elements are documented in the following way:

  • Element Name: Specifies the element name.
  • Element Specification: Provides the sub-elements, cardinality of the sub-elements within (), any valid values within <>, applicable comments and notes within {}, and any other major factors that make up the element.
  • Description: Provides background information on the purpose of the element and how it should be used. Any notes about the current usage of this element are documented here as well as any recommendations for usage or unresolved issues.
  • Tags: Provides specific, related categorical values associated with this element, which are defined in Appendix A: Tags Glossary.

 

With the exception of Element Name, the headings of each element's included sections are listed in bold to make it easier for the reader to distinguish between the element's section headings and the descriptions.

 

Table 1 . Cardinality

Value    Description

1        Exactly one of this element is required

0..N     This element is optional; up to and including N of this element may be present

0..*     This element is optional; many of this element may be present

1..*     At least one of this element is required; many may be present
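The cardinality values in Table 1 can be checked mechanically. The function below is a minimal sketch and not part of any CMR tooling; the name satisfies_cardinality is hypothetical, and it assumes that the placeholder N in "0..N" has been replaced by a concrete number such as "0..3".

```python
def satisfies_cardinality(count: int, cardinality: str) -> bool:
    """Return True if `count` occurrences are allowed by a Table 1 cardinality value."""
    if ".." not in cardinality:
        # A bare number such as "1" means exactly that many occurrences.
        return count == int(cardinality)
    lower_text, upper_text = cardinality.split("..", 1)
    lower = int(lower_text)
    if upper_text == "*":
        # "*" means there is no upper bound.
        return count >= lower
    return lower <= count <= int(upper_text)


print(satisfies_cardinality(1, "1"))     # True
print(satisfies_cardinality(0, "1..*"))  # False
print(satisfies_cardinality(3, "0..3"))  # True
```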

 

Interaction diagrams presented in this document are based on the Unified Modeling Language (UML) notation.


2           Unified Metadata Model - Variables

 

2.1          Variable Context Diagram and Metadata Model Relationships

Figure 1 shows the UMM-Var metadata model at a high level and its relationships with other key models: Collection (UMM-C), Granule (UMM-G), and Service (UMM-S). It is a high-level diagram, with abbreviated models, showing the key associations between the new UMM-Var and the other models in the UMM. Note that the figure specifically highlights the Variable model's role in the context of the Unified Metadata Model.

 

Figure 1 . Variable (UMM-Var) as part of the UMM context diagram.

 

2.2          Use Cases

This section provides information about use cases identified for the UMM-Var. In keeping with the UML methodology, each use case is presented with a use case diagram showing the actor's interaction with the system, an activity diagram showing the flow of data in terms of the user experience, and a sequence diagram showing the key components of the system and the sequences of actions within it.

 

2.2.1         Browse Variables of a Collection

Scenario: A user starts with a collection and wants to know what variables it includes.

 

Outcomes: Enables a user without any knowledge of variable names to search for collections, select one, and be presented with a list of variables for that collection, grouped by measurement.
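The outcome above, a list of variables for one collection grouped by measurement, can be sketched as follows. The records, the collection names COLLECTION_A and COLLECTION_B, and the function browse_variables are all hypothetical stand-ins for metadata held in the CMR; no CMR API is used here.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for variable metadata.
# Each record is (collection, variable name, measurement).
VARIABLE_RECORDS = [
    ("COLLECTION_A", "O3_ppbv", "Ozone Mixing Ratio"),
    ("COLLECTION_A", "NO2_Column", "Integrated Column NO2 loading"),
    ("COLLECTION_A", "O3_ppbv_error", "Ozone Mixing Ratio"),
    ("COLLECTION_B", "sea_surface_temperature", "Sea Surface Temperature"),
]


def browse_variables(collection):
    """Return the variables of one collection, grouped by measurement."""
    grouped = defaultdict(list)
    for record_collection, variable, measurement in VARIABLE_RECORDS:
        if record_collection == collection:
            grouped[measurement].append(variable)
    return dict(grouped)


for measurement, variables in browse_variables("COLLECTION_A").items():
    print(measurement, variables)
```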

 

See the use case in Figure 2 below.

Figure 2 . Use Case: Browse Variables of a Collection

 

See the user experience activity diagram in Figure 3 below.

Figure 3 . Activity Diagram: Browse Variables of a Collection

 

See the system workflow sequence diagram in Figure 4 below.

Figure 4 . Sequence Diagram: Browse Variables of a Collection

2.2.2         Faceted Browse

Scenario [a]: A user of a search tool, e.g., the Earthdata Search Client (EDSC, https://search.earthdata.nasa.gov), can get a list of Measurement facets from the CMR.

Scenario [b]: A user of the EDSC can click on a "Measurement" facet value and constrain the lists to the collections that match the selected Measurement type and any other constraints (e.g., spatial, temporal) that have been selected.

 

Outcomes: A user of the EDSC, with no knowledge of the types of Measurements available within the CMR, can get a list of types of Measurements, and can further constrain the lists to Collections which match by selecting that type of Measurement, and any other constraints.

Note: The user can select a Measurement type to see the list of Variables available for that Measurement type, and an associated list of quality and ancillary variables. The association between Measurement types and Variables may be made previously by the Metadata Curator using the Metadata Management Tool (MMT), or a client specific to that provider.

 

Note: Scenarios [a] and [b] are related, in that the first represents a starting point for faceted browse, whereby the user can see a list of facets. Typically, the user wants to narrow the search criteria as much as possible, so starting with the list of facets (scenario [a]), the user selects one or more facets (scenario [b]) and, via the EDSC UI, this yields a search result which displays those variables available in association with a specific Measurement type.
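Scenarios [a] and [b] can be sketched as two small functions: one that counts facet values and one that narrows the collection list by the selected values. The collection records and the function names measurement_facets and constrain are assumptions made for this example and do not describe the CMR or EDSC implementation.

```python
from collections import Counter

# Hypothetical collection records: a name plus the Measurement types
# tagged on the collection's variables.
COLLECTIONS = [
    {"name": "COLLECTION_A", "measurements": {"Ozone Concentration", "Air Temperature"}},
    {"name": "COLLECTION_B", "measurements": {"Ozone Concentration"}},
    {"name": "COLLECTION_C", "measurements": {"Surface Albedo"}},
]


def measurement_facets(collections):
    """Scenario [a]: count how many collections carry each Measurement facet value."""
    counts = Counter()
    for collection in collections:
        counts.update(collection["measurements"])
    return counts


def constrain(collections, selected):
    """Scenario [b]: keep only the collections that match every selected facet value."""
    return [c for c in collections if selected <= c["measurements"]]


facets = measurement_facets(COLLECTIONS)
matches = constrain(COLLECTIONS, {"Ozone Concentration"})
print(facets["Ozone Concentration"])  # 2
print([c["name"] for c in matches])   # ['COLLECTION_A', 'COLLECTION_B']
```

Selecting a second facet value simply adds it to the selected set, which narrows the result further.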

 

See the use case diagram in Figure 5 below.

Figure 5 . Use Case: Faceted Browse

 

See the user experience activity diagram in Figure 6 below.

Figure 6 . Activity Diagram: Faceted Browse

 

See the workflow sequence diagram in Figure 7 below.

Figure 7 . Sequence Diagram: Faceted Browse

2.2.3         Update Variable Associations

Scenario [a]: A user of the CMR client, such as the Metadata Management Tool (MMT) or other suitable metadata curation tool, can associate multiple variables with a collection.

Scenario [b]: A user of the CMR client can submit multiple collections and all of the variables listed for each collection.

Scenario [c]: A metadata curator can populate the list of valid measurements associated with a variable with selections from the Science Keywords hierarchy, the Community Surface Dynamics Modeling System (CSDMS) or the NetCDF Climate and Forecast (CF) metadata convention.

 

Outcomes: The curator can seed the CMR with new valid measurements chosen from the Science Keywords hierarchy, and editors can maintain variable and collection associations.
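Scenarios [a] and [b] amount to maintaining a many-to-many association between collections and variables. The class below is a minimal in-memory sketch under that assumption; AssociationStore, its methods and the collection names are hypothetical and are not the CMR or MMT interface.

```python
from collections import defaultdict


class AssociationStore:
    """Hypothetical in-memory store of collection-to-variable associations."""

    def __init__(self):
        self._variables_by_collection = defaultdict(set)

    def associate(self, collection, variables):
        """Scenario [a]: associate multiple variables with one collection."""
        self._variables_by_collection[collection].update(variables)

    def submit(self, listing):
        """Scenario [b]: submit several collections, each with its variable list."""
        for collection, variables in listing.items():
            self.associate(collection, variables)

    def variables_of(self, collection):
        """Return the variables associated with a collection, sorted by name."""
        return sorted(self._variables_by_collection.get(collection, set()))


store = AssociationStore()
store.associate("COLLECTION_A", ["O3_ppbv", "NO2_Column"])
store.submit({"COLLECTION_A": ["O3_ppbv"], "COLLECTION_B": ["sea_surface_temperature"]})
print(store.variables_of("COLLECTION_A"))  # ['NO2_Column', 'O3_ppbv']
```

Using a set per collection makes repeated submissions of the same association harmless.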

 

See the use case diagram in Figure 8 below.

 

Figure 8 . Use Case: Update Variable Associations

 

See the user experience activity diagram in Figure 9 below.

Figure 9 . Activity Diagram: Update Variable Associations

 

See the workflow sequence diagram in Figure 10 below.

Figure 10 . Sequence Diagram: Update Variable Associations

 

2.2.4         Search Relevancy Ranking

Scenario: As a search engine (CMR), I can rank collections with a high relevance ranking when one or more of the search words appear in the measurement terms associated with the variables in the collection, as opposed to more generic fields such as the summary or references.

 

Outcomes: Returns to the user a list of Collections ranked by relevancy to the words used in the search term matching a measurement term.
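The ranking idea in this use case, that a match in the measurement terms counts for more than a match in a generic field, can be sketched with per-field weights. The weights, field names and collection records below are assumptions chosen for illustration; the source does not specify how the CMR computes relevancy.

```python
# Assumed weights: a hit in the measurement terms outranks a hit in the summary.
FIELD_WEIGHTS = {"measurements": 3.0, "summary": 1.0}

# Hypothetical collection records.
COLLECTIONS = [
    {"name": "B", "measurements": "air temperature", "summary": "temperature product with ozone correction"},
    {"name": "C", "measurements": "surface albedo", "summary": "daily land product"},
    {"name": "A", "measurements": "ozone concentration", "summary": "daily atmospheric product"},
]


def relevance(collection, query):
    """Sum the field weight for every query word found in that field."""
    words = query.lower().split()
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        field_words = set(collection.get(field, "").lower().split())
        score += weight * sum(1 for word in words if word in field_words)
    return score


def rank(collections, query):
    """Return the collections ordered from most to least relevant."""
    return sorted(collections, key=lambda c: relevance(c, query), reverse=True)


ranked = rank(COLLECTIONS, "ozone")
print([c["name"] for c in ranked])  # ['A', 'B', 'C']
```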

 

See the use case diagram in Figure 11 below.

Figure 11 . Use Case: Search Relevancy Ranking

 

See the user experience activity diagram in Figure 12 below.

Figure 12 . Activity Diagram: Search Relevancy Ranking

See the workflow sequence diagram in Figure 13 below.

Figure 13 . Sequence Diagram: Search Relevancy Ranking

2.2.5         Cross-site Data Subsetting

Scenario: As a subsetting GUI, I can present the variables for a given collection in a logically categorized way, such as by measurement, and further subset the data into more specific groups based on additional criteria.

 

Outcomes: Enables users of a subsetting GUI to perform cross-site subsetting of variables based on the selection of a collection and categorized by measurement. Cross-site subsetting occurs when a variable (by its association with a granule) can exist in more than one collection and these collections may be sourced from multiple sites (e.g., Goddard Earth Sciences Data and Information Services Center (GES DISC), Level-1 and Atmosphere Archive and Distribution System (LAADS)). The CMR can perform a cross-site search since it houses metadata from all sites. This use case enables a user to go on to perform subsetting via a GUI.

 

Note: In the example shown below, the measurement term used was "Ozone". This resulted in three collections being returned from the search: AIRX2RET v005, OMDOAO3 v003, and MOD08 v006. In the subsetting GUI, variables are shown grouped for each collection. The user will be able to subset the variable fields for specific granules of interest.

 

See the use case diagram in Figure 14 below and user interface examples in Figures 15 through 17.

Figure 14 . Use Case: Cross-site Data Subsetting

 

Figure 15 . User interface view for a user to choose a subset of variables for the AIRX2RET collection

Figure 16 . User interface view for a user to choose a subset of variables for the OMDOAO3 collection

Figure 17 . User interface view for a user to choose a subset of variables for the MOD08 collection
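The cross-site search in this use case can be sketched as a lookup that groups matching variables by collection while recording the hosting site. The collection names are those given in the note above; the site labels SITE_1 and SITE_2, the variable names and the function cross_site_search are placeholders invented for this sketch and do not reflect the actual contents of these collections.

```python
# Collection names come from the note above. The site labels and variable
# names are placeholders invented for this sketch, not real metadata.
ASSOCIATIONS = [
    ("ozone_variable_1", "Ozone", "AIRX2RET v005", "SITE_1"),
    ("ozone_variable_2", "Ozone", "OMDOAO3 v003", "SITE_1"),
    ("ozone_variable_3", "Ozone", "MOD08 v006", "SITE_2"),
    ("temperature_variable_1", "Air Temperature", "AIRX2RET v005", "SITE_1"),
]


def cross_site_search(measurement):
    """Return {collection: {"site": ..., "variables": [...]}} for one measurement term."""
    grouped = {}
    for variable, term, collection, site in ASSOCIATIONS:
        if term != measurement:
            continue
        entry = grouped.setdefault(collection, {"site": site, "variables": []})
        entry["variables"].append(variable)
    return grouped


results = cross_site_search("Ozone")
for collection, entry in results.items():
    print(collection, entry["site"], entry["variables"])
```

Because the metadata for all sites is held in one place, a single lookup returns collections from more than one site, which is what allows the GUI to present them together.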

 

See user experience activity diagram in Figure 18 below.

 

Figure 18 . Activity Diagram: Cross-site Data Subsetting

 

The following depicts the case where collections are searched for across DAACs. See the workflow sequence diagram in Figure 19 below.

 

Figure 19 . Sequence Diagram: Cross-site Data Subsetting

 

2.2.6         Access Variables, including Ancillary Variables (extension of the Cross-Site Subsetting Use Case).

Scenario: The user starts with a list of science variables, such as {"sea surface height", "10.7 micron band", "NIR radiances", ... }, and wants to know which collections contain the specified variables (and may also want to know about data quality, instrument calibration, brightness temperature). These variables may be needed to fully understand the data.

 

Outcomes: Enables a user without any knowledge of the available variables to locate the variables of interest and to constrain the collections that contain them by collection metadata, such as calibration, spatial extent, temporal extent, and spacecraft orbit location. Each matching collection contains a variable with properties selected from the initial list of properties. Allows the subsequent discovery of associated variables, such as ancillary, calibration, geolocation, and data quality variables, which are directly related to the science variable selected.
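The two steps in this use case, finding the collections that contain each requested science variable and then discovering the variables related to it, can be sketched as follows. The catalog contents, collection names and function names are hypothetical; only the science variable names are taken from the scenario above.

```python
# Hypothetical catalog: collection -> {science variable -> related variables}.
CATALOG = {
    "COLLECTION_A": {"sea surface height": ["quality_flag", "latitude", "longitude"]},
    "COLLECTION_B": {
        "NIR radiances": ["calibration_coefficients"],
        "sea surface height": [],
    },
}


def find_collections(wanted):
    """Map each requested science variable to the collections that contain it."""
    return {
        variable: sorted(name for name, variables in CATALOG.items() if variable in variables)
        for variable in wanted
    }


def related_variables(collection, science_variable):
    """Return the quality, ancillary or other variables tied to a science variable."""
    return CATALOG[collection][science_variable]


found = find_collections(["sea surface height", "10.7 micron band", "NIR radiances"])
print(found)
print(related_variables("COLLECTION_A", "sea surface height"))
```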

 

See the use case diagram in Figure 20 below.

 

Figure 20 . Use Case: Access Variables including Ancillary Variable

 

See the user experience activity diagram in Figure 21 below.

 

Figure 21 . Activity Diagram: Access Variables including Ancillary Variable

 

See the workflow sequence diagram in Figure 22 below.

 

Figure 22 . Sequence Diagram: Access Variables, including Ancillary Variable

2.2.7         Integrating Global Imagery Browse Services (GIBS) with Web-Based Clients

Scenario: As a user of a GIBS client (EDSC, Worldview, GloVIS - https://glovis.usgs.gov/), I can view a pre-generated visualization for a specific granule or a daily composite of multiple granules with a specific collection and variable. Through CMR, I can locate the granules, the corresponding data variable from which the layer was generated, and any ancillary variables that need to go along with that variable (geolocation, data quality, etc.). Ideally, I can transform that information into a set of subsetting request URLs that will fetch just those data variables from the appropriate granules.

 

Outcomes: Allow users of a GIBS client to fetch data subsets based on their layer selections, and any associated variables.

 

Note: It may be a prerequisite for this use case to have a way to invoke "show layers" for a collection and selected variable(s), per the GIBS client. Also, providers may provide a single image per granule to GIBS, for example for an L3/L4 product, but for L1/L2 products this is not always the case. For one provider, LP DAAC, the only non-composited L1/L2 granule imagery available is for the AST_L1T product.

 

See the use case diagram in Figure 23 below.

 

Figure 23 . Use Case: Integrating GIBS with web-based clients

 

See the user experience activity diagram in Figure 24 below.

 

Figure 24 . Activity Diagram: Integrating GIBS with web-based clients

 

See the workflow sequence diagram in Figure 25 below.

 

Figure 25 . Sequence Diagram: Integrating GIBS with web-based clients

2.2.8         Measurement Comparison of two in-situ measurements of Species X

Scenario: Measurement Comparison of two in-situ measurements of Species X, which is critical to the understanding of tropospheric chemistry. In this scenario, the measurement techniques are not well established. Find collections (and variables) which are tagged with measurement terms for Species X (X = Nitrous Oxide).

 

Outcomes: The user obtains collections (and variables) for Principal Investigator (PI)-related data files containing Species X from instrument A and instrument B.

 

See the use case diagram in Figure 26 below.

 

Figure 26 . Use Case: Measurement Comparison of two in-situ measurements of Species X

 

See the user experience activity diagram in Figure 27 below.

 

Figure 27 . Activity Diagram: Measurement Comparison of two in-situ measurements of Species X

 

See the workflow sequence diagram in Figure 28 below.

 

Figure 28 . Sequence Diagram: Measurement Comparison of two in-situ measurements of Species X

2.2.9         Size Estimation

The EDSC needs to provide an estimate of the size of an order for the constraints provided whenever spatial subsetting is invoked.

 

Scenario [a]: The user needs to know which variables, collections and formats are available, and ultimately the Size Estimate for this request.
Scenario [b]: EDSC displays the relevant variables, collections and formats, based on selection criteria, via the UI modal.
Scenario [c]: EDSC makes a request to the CMR (micro-service) to compute the Size Estimate for this request, passing on the user selections for variables, collections and formats.
Scenario [d]: CMR (micro-service) gets the variables for this collection and the format for delivery (say ASCII has been selected). It then gets the AverageSizeOfGranulesSampled and AverageCompressionRateASCII values and computes the Size Estimate for this collection.
Scenario [e]: CMR (micro-service) returns the computed Size Estimate to the EDSC.
Scenario [f]: EDSC displays the computed Size Estimate to the user, via the UI modal, for user action.

 

Outcomes: The user knows the computed Size Estimate for a given request, based on the selection of variables, collections, formats and any selection criteria, whenever spatial subsetting is invoked.
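Scenario [d] can be sketched as a single computation. The source names the inputs AverageSizeOfGranulesSampled and AverageCompressionRateASCII but does not state the formula, so the multiplication below, the granule count parameter and the function name estimate_order_size are assumptions made purely for illustration.

```python
def estimate_order_size(granule_count, average_size_of_granules_sampled, average_compression_rate):
    """Assumed formula: granules x average size per granule x compression rate.

    The result is in the same unit as average_size_of_granules_sampled.
    """
    if granule_count < 0:
        raise ValueError("granule_count must not be negative")
    return granule_count * average_size_of_granules_sampled * average_compression_rate


# Hypothetical inputs: 10 granules, 2.0 size units each, and a rate of 0.5
# for the selected delivery format (ASCII in the scenario above).
estimate = estimate_order_size(10, 2.0, 0.5)
print(estimate)  # 10.0
```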

 

See the use case diagram in Figure 29 below.

Figure 29 . Use Case: Size Estimation

 

See the user experience activity diagram in Figure 30 below.

Figure 30 . Activity Diagram: Size Estimation

 

See the workflow sequence diagram in Figure 31 below.

Figure 31 . Sequence Diagram: Size Estimation

 

2.3          UMM-Var Metadata Model  

As shown in Figure 32, the UMM-Var Metadata Model asserts that a Variable metadata instance is related to one or more Collections, one or more Granules, and one or more Variables (a Science Variable may have a related Quality or Ancillary Variable). The remaining classes (Characteristics, ScienceKeywords, Measurements, Sets, FillValues, Dimensions and SizeEstimation) are discussed in more detail throughout the remainder of this document. Each class and relationship expresses a different type of information conveyed by the variable.

Figure 32 . UMM-Var Metadata Model

 

The author of a Variable metadata record should be cognizant of the following:

 

  1. A Collection has zero or more Variables.
  2. A Granule aggregates one or more Variables. Note the Variable class lifecycle is dependent on the Granule class instance lifecycle, meaning that a Granule is defined by the Science Team first, and then each Variable which is aggregated by the Granule is defined. The metadata within the CMR simply reflects this.
  3. A Variable may be related to zero or more Variables. For example, a Variable with VariableType: Science may have a related Variable(s) with VariableType: Quality and/or a Variable with VariableType: Ancillary.
  4. The elements within the Characteristics section apply to a Variable. Not all Variables have all the elements contained within the Characteristics class.
  5. The elements of the ScienceKeywords section also apply to a Variable. The ScienceKeywords may be sourced from Global Change Master Directory (GCMD) Keywords. (See Appendix B).
  6. The elements of the Measurement section apply to a Variable. The Measurement names may be sourced from the CSDMS standard names or the CF Convention standard names. This process will be dictated by a GCMD-style Governance process. (See Appendix C).
  7. Information in the Characteristics section should be derived only from the Granule's data file. The Granule selected should be from a collection that is associated with the variable.
  8. A Variable record may be created / updated via the MMT GUI or XML file.
  9. A Variable's record should answer all parts of the following question: What measurement type, collections, variables, granules, visualizations are associated with the Variable?
  10. A mechanism within the UMM-S model enables the association of Services with specific Variables.
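The first three relationships in the list above can be sketched as plain data classes. This is an illustrative sketch only: the class and field names are assumptions and do not reproduce the UMM-Var schema, and the granule and collection names are placeholders. The variable names are taken from the VIIRS_SST_NPP L3C-GHRSST-SST examples described later in this document.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Variable:
    name: str
    variable_type: str  # e.g. "Science", "Quality" or "Ancillary"
    # Item 3: a Variable may be related to zero or more Variables.
    related: List["Variable"] = field(default_factory=list)


@dataclass
class Granule:
    name: str
    variables: List[Variable]

    def __post_init__(self):
        # Item 2: a Granule aggregates one or more Variables.
        if not self.variables:
            raise ValueError("a Granule aggregates one or more Variables")


@dataclass
class Collection:
    name: str
    # Item 1: a Collection has zero or more Variables.
    variables: List[Variable] = field(default_factory=list)


quality = Variable("quality_level", "Quality")
science = Variable("sea_surface_temperature", "Science", related=[quality])
granule = Granule("example_granule", [science, quality])
collection = Collection("EXAMPLE_COLLECTION", [science, quality])
print([v.name for v in science.related])  # ['quality_level']
```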

 

All of the variable elements will be described next. The simple elements will be described first followed by the elements that contain sub-elements. For the simple elements, the required fields of Name, LongName, and DataType are derived from the variable fields in the data set. The non-mandatory elements of Units, ValidRange, Scale and Offset are derived from the data set, if available. The Definition, VariableType and VariableSubType elements are set by the metadata curator via the MMT client, or other suitable metadata curation tool.

 

2.3.1         Name [R]  

Element Specification

Name (1)

 

Description

A variable short name given by the data provider.

 

Variables are available in a wide range of forms. Variable names tend to be similar within a family of collections, but they differ considerably between unrelated collections. The variety of variables is illustrated using examples from a sample of collections in the figures below, beginning with Figure 33.

 

The VIIRS_SST_NPP L3C-GHRSST-SST Data Set structure is represented in Figure 33.

Figure 33. The sea_surface_temperature Variable Highlighted within the VIIRS_SST_NPP L3C-GHRSST-SST Data Set

The highlighted sea_surface_temperature variable structure is shown in Figure 34 with a plot in Figure 35. Note the dimensionality of the variable is: time=1, nj=3072 and ni=4096.

Figure 34. The sea_surface_temperature Variable Structure

Figure 35. A sea_surface_temperature Variable Plot

The corresponding data quality variable is shown in Figure 36. Note the dimensionality of the variable is: time=1, nj=3072 and ni=4096.

Figure 36. The quality_level Variable Structure

By contrast, the LST variable contained within the MOD11A1 Data Set structure is shown in Figure 37.

Figure 37. The LST_Day_1km Variable Highlighted within the MOD11A1 Data Set

The LST_Day_1km variable structure is shown in Figure 38 with a plot in Figure 39. Note the dimensionality of the variable is: YDim=1200 and XDim=1200.

Figure 38. The LST_Day_1km Variable Structure

Figure 39. An LST_Day_1km Plot

The corresponding quality variable is shown in Figure 40. Note the dimensionality of the variable is: YDim=1200 and XDim=1200.

Figure 40. The QC_Day Variable Structure

 

The CER_BDS_Aqua-FM3_Edition1 Data Set structure is represented in Figure 41.

Figure 41. The CERES_SW_Filtered_Radiances_Upwards Variable Highlighted within the CER_BDS_Aqua-FM3_Edition1 Data Set Structure

The selected CERES_SW_Filtered_Radiances_Upwards variable structure is represented in Figure 42. Note the dimensionality of the variable is: Records=13091 and Samples=660.

Figure 42. The CERES_SW_Filtered_Radiances_Upwards Variable Structure

The CERES_Solar_Zenith_at_Surface variable structure is represented in Figure 43.

Figure 43. CERES_SYN_1km Data Set Structure

The SW_TOA_Clear-Sky variable is highlighted within the CERES_SYN_1km Data Set structure as shown in Figure 44.

Figure 44. The SW_TOA_Clear-Sky Variable Highlighted within the CERES_SYN_1km Data Set Structure

The SW_TOA_Clear-Sky variable structure is shown in Figure 45 and its plot in Figure 46. Note the dimensionality of the variable is: Mean_&_Stdev=2, Synoptic_Hours_(1, 4, 7, 10, 13, 16, 19, 22)=8, 1.0_deg.regional_colat.zones=180 and 1.0_deg._regional_long._zones=360.

Figure 45. The SW_TOA_Clear-Sky Variable Structure

Figure 46. A SW_TOA_Clear-Sky Variable Plot

The AIRS.2012.02.09.L3.CO2Std008 data set structure is represented in Figure 47.

Figure 47. The mole_fraction_of_carbon_dioxide_in_free_troposphere Variable Highlighted within the AIRS.2012.02.09.L3.CO2Std008 Data Set

The highlighted mole_fraction_of_carbon_dioxide_in_free_troposphere variable structure is shown in Figure 48 and its plot in Figure 49. Note the dimensionality of the variable is: LatDim=91 and LonDim=144.

Figure 48. The mole_fraction_of_carbon_dioxide_in_free_troposphere Variable Structure

Figure 49. A mole_fraction_of_carbon_dioxide_in_free_troposphere Plot

 

Sample values (the Name is listed first, followed by its long name in parentheses):

sea_surface_temperature (sea surface temperature)

quality_level (quality level of the sea surface temperature)

LST_Day_1km (daily daytime 1km grid land surface temperature)

QC_Day (quality control for daytime LST and emissivity)

CERES_SW_Filtered_Radiances_Upwards (CERES SW filtered radiances, upwards)

CERES_Solar_Zenith_at_Surface (CERES solar zenith at surface)

SW_TOA_Clear-Sky (1 degree regional month observed TOA fluxes)

mole_fraction_of_carbon_dioxide_in_free_troposphere (mole fraction of carbon dioxide in free troposphere)

psl (mean sea level pressure)

O3_ppbv (ozone mixing ratio reported in parts per billion by volume)

Scat_550 (total dry aerosol scattering coefficient at 550 nm)

Sur_Refl_b01 (surface reflectance band 1)

WDB_L3MCA10 (Aerosol Optical Depth 550nm (Land Only))

 

Tags

Required, Free Text Search

 

2.3.2         Alias

Element Specification

Alias (0)

 

Description

The alias for the name of a variable. The alias may be used by the size estimation service. This service is still evolving and examples will be provided when available.

 

2.3.3         LongName [R]

Element Specification

LongName (1)

 

Description

The expanded or long name given by the data provider.

 

Sample values (the LongName is given in parentheses after each variable name):

sea_surface_temperature (sea surface temperature)

quality_level (quality level of the sea surface temperature)

LST_Day_1km (daily daytime 1km grid land surface temperature)

QC_Day (quality control for daytime LST and emissivity)

CERES_SW_Filtered_Radiances_Upwards (CERES SW filtered radiances, upwards)

CERES_Solar_Zenith_at_Surface (CERES solar zenith at surface)

SW_TOA_Clear-Sky (1 degree regional month observed TOA fluxes)

mole_fraction_of_carbon_dioxide_in_free_troposphere (mole fraction of carbon dioxide in free troposphere)

psl (mean sea level pressure)

O3_ppbv (ozone mixing ratio reported in parts per billion by volume)

Scat_550 (total dry aerosol scattering coefficient at 550 nm)

Sur_Refl_b01 (surface reflectance band 1)

WDB_L3MCA10 v004 (Aerosol Optical Depth 550nm (Land Only))

 

Tags

Required, Free Text Search

 

2.3.4         Definition [R]

Element Specification

Definition (1)

 

Description

The meaning of the variable given by the data provider. This can typically be found in the Collection User Guide corresponding to the variable. Ideally, it should include the details of what is being measured, the scope of the measurement, and any other information to assist a scientist's understanding of the variable's particularities. See the SamplingIdentifiers class for details about the sampling method and the measurement and reporting conditions.

 

Sample value: "Angstrom Exponent is an exponent that expresses the spectral dependence of aerosol optical thickness (τ) with the wavelength of incident light (λ). The spectral dependence of aerosol optical thickness can be approximated (depending on size distribution) by τa = β λ^(-α), where α is the Angstrom exponent (β = aerosol optical thickness at 1 μm)".

 

Tags

Required

 

2.3.5         Unit [R]

Element Specification

Unit (1)

 

Description

The unit used to report the variable. The list of units will be sourced from the Dataset Interoperability Working Group (https://wiki.earthdata.nasa.gov/display/ESDSWG/Dataset+Interoperability+Working+Group) and will be managed in KMS.

 

Sample values:

 

Table 2. Example Values for the Variable's Unit

| Coordinate Variable | Unit Value | Examples |
|---|---|---|
| latitude | degrees_north | 89.9 degrees_north |
| longitude | degrees_east | -179.9 degrees_east |
| pressure | Pa or hPa | 50 Pa |
| height (depth) | meter (m) or kilometer (km) | 10,000 m |
| time | Seconds, minutes, hours, days, etc., since a specific starting point in time, often (but not always) representing a canonical time (1 Jan 1970, TAI93, start of mission, etc.). | Time is in International Organization for Standardization (ISO)-8601 format: seconds since 1992-10-08T15:15:42.5-6:00; days since 1970-01-01T00:00:0 |

 

Tags

Required, Controlled Vocabulary

 

2.3.6         DataType [R]

Element Specification

DataType (1) <"byte", "float", "float32", "float64", "double", "ubyte", "ushort", "uint", "uchar", "string", "char8", "uchar8", "short", "long", "int", "int8", "int16", "int32", "int64", "uint8", "uint16", "uint32", "uint64", "OTHER">

 

Description

Specifies the basic computer science data type of a variable. These types can be either short, long, character, binary, etc. Table 3 and Table 4 list some data types from the Hierarchical Data Format (HDF) version 4 and 5 specifications.

 

Table 3. HDF4 User Guide as a Possible Source

| HDF Data Type | Data Type Flag and Value | Description |
|---|---|---|
| char8 | DFNT_CHAR8 (4) | 8-bit character type |
| uchar8 | DFNT_UCHAR8 (3) | 8-bit unsigned character type |
| int8 | DFNT_INT8 (20) | 8-bit integer type |
| uint8 | DFNT_UINT8 (21) | 8-bit unsigned integer type |
| int16 | DFNT_INT16 (22) | 16-bit integer type |
| uint16 | DFNT_UINT16 (23) | 16-bit unsigned integer type |
| int32 | DFNT_INT32 (24) | 32-bit integer type |
| uint32 | DFNT_UINT32 (25) | 32-bit unsigned integer type |
| float32 | DFNT_FLOAT32 (5) | 32-bit floating-point type |
| float64 | DFNT_FLOAT64 (6) | 64-bit floating-point type |

 

Table 4. HDF5 User Guide as a Possible Source

| HDF5 Data Type | Data Type Flag and Value | Description |
|---|---|---|
| string | NC_STRING | string type |
| char | NC_CHAR | character type |
| ubyte | NC_UBYTE | unsigned byte type |
| ushort | NC_USHORT | unsigned short type |
| uint | NC_UINT | unsigned integer type |
| uint64 | NC_UINT64 | 64-bit unsigned integer type |
| byte | NC_BYTE | byte type |
| short | NC_SHORT | short type |
| int | NC_INT | integer type |
| int64 | NC_INT64 | 64-bit integer type |
| double | NC_DOUBLE | double type |

 

Sample value: "float".

 

Tags

Required, Controlled Vocabulary

 

2.3.7         Scale and Offset

Element Specification

Scale (0..1)

Offset (0..1)

 

Description

The Scale is the numerical factor by which all values in the stored data field are multiplied in order to obtain the original values. The Offset is the value which is either added to or subtracted from all values in the stored data field in order to obtain the original values. Scale and Offset may be used together. The formula by which the Scale and Offset are applied is usually one of the following:

  1. Additive Offset formula: actual data value = (scale factor * scaled value) + offset
  2. Subtractive Offset formula: actual data value = scale factor * (scaled value - offset)

 

Note: the additive offset formula is the standard one; the subtractive formula is non-standard and rarely used. Exceptions that use the subtractive offset formula include science variables from the following Moderate Resolution Imaging Spectroradiometer (MODIS) products: MOD08_M3 (MODIS/Terra Aerosol Cloud Water Vapor Ozone Monthly L3 Global 1Deg CMG) and MCD43A4 (MODIS/Terra+Aqua BRDF/Albedo Nadir BRDF-Adjusted Ref Daily L3 Global - 500m).
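
The two formulas above can be sketched in a few lines of Python. This is a minimal illustration rather than part of the UMM-Var model: the function names and the numeric values are hypothetical, and which formula applies to a given product has to be taken from that product's documentation.

```python
def unpack_additive(stored_value, scale=1.0, offset=0.0):
    """Additive Offset formula: actual = (scale * stored) + offset."""
    return scale * stored_value + offset


def unpack_subtractive(stored_value, scale=1.0, offset=0.0):
    """Subtractive Offset formula: actual = scale * (stored - offset)."""
    return scale * (stored_value - offset)


# Hypothetical stored value, Scale and Offset, chosen only to show that
# the two formulas disagree whenever the Offset is non-zero.
stored = 100
print(unpack_additive(stored, scale=0.5, offset=10.0))     # (0.5 * 100) + 10
print(unpack_subtractive(stored, scale=0.5, offset=10.0))  # 0.5 * (100 - 10)
```

When the Offset is 0.0, both formulas reduce to scale * stored value and give the same result.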

 

Scale Sample value: 0.00100000004749745

Offset Sample value: 0.0

 

Tags

Recommended

 

2.3.8         VariableType

Element Specification

VariableType (0..1) <"SCIENCE_VARIABLE", "QUALITY_VARIABLE", "ANCILLARY_VARIABLE", "OTHER">

 

Description

This element is controlled and specifies the basic type of a variable. These types can be either: "SCIENCE_VARIABLE", "QUALITY_VARIABLE", "ANCILLARY_VARIABLE", or "OTHER".

 

Sample value: "SCIENCE_VARIABLE".

 

Tags

Recommended, Controlled Vocabulary

 

2.3.9         VariableSubType

Element Specification

VariableSubType (0..1) <"SCIENCE_SCALAR", "SCIENCE_VECTOR", "SCIENCE_ARRAY", "SCIENCE_EVENTFLAG", "OTHER">

 

Description

This element is controlled and specifies the sub type of a variable. There are different types of science variables and this information is variable specific and important for data use. The sub-types can be used in the following way: SCIENCE_SCALAR (e.g., O3, NO, NO2, CH2O, CN, etc.); SCIENCE_VECTOR (e.g., wind direction); SCIENCE_ARRAY (e.g., radiation spectrum, aerosol number size distribution); SCIENCE_EVENTFLAG (e.g., cloud flag, pollution plume). There are other types of variables not included here.

 

Sample value: "SCIENCE_SCALAR".

 

Tags

Recommended, Controlled Vocabulary

 

2.3.10     Characteristics

Elements

Characteristics (0..1)

Characteristics/GroupPath

Characteristics/IndexRanges

 

Description

The elements of this section apply to a Variable.

 

Tags

Recommended

 

2.3.10.1    GroupPath

Element Specification

Characteristics/GroupPath (0..1)

 

Description

The full path to the variable within the Granule structure. The main purpose of this field is to capture the full path of the variable from within the granule file structure, so that sets of variables which are nested one or more levels below the '/' root level can be located correctly. In the example shown here, the set named '/Data_Fields' is nested in a path called '/MODIS_Grid_Daily_1km_LST'. Recording the path ensures that this important structural information is not lost once the Variable records have been ingested into the CMR.

 

Sample Values:  

'/MODIS_Grid_Daily_1km_LST/Data_Fields'

'/'
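
As a rough sketch of how a curation tool might populate GroupPath, the snippet below splits a full in-file path into the group path and the variable name. It assumes, as an illustration only, that the full path is the GroupPath followed by '/' and the variable Name; the function name is hypothetical.

```python
import posixpath


def split_variable_path(full_path):
    """Split a full in-file variable path into (GroupPath, Name).

    Assumption: the path uses '/' separators, as in HDF and NetCDF-4 groups.
    """
    group_path, name = posixpath.split(full_path)
    # A variable stored directly under the root has GroupPath '/'.
    return (group_path or "/", name)


print(split_variable_path("/MODIS_Grid_Daily_1km_LST/Data_Fields/LST_Day_1km"))
print(split_variable_path("/sea_surface_temperature"))
```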

 

Tags

Recommended

 

2.3.10.2    IndexRanges

Element Specification

Characteristics/IndexRanges (0..1)

Characteristics/IndexRanges/LatRange (1)

Characteristics/IndexRanges/LonRange (1)

 

 

Description

Describes the spatial index ranges of a variable, which consist of a LatRange as a pair of values and a LonRange as a pair of values. If the IndexRanges element is used, the LatRange and LonRange sub-elements are required. In the example shown below, the index ranges represent the ranges of the latitude and longitude respectively of the variable. Each range is described by a pair of values.

 

Sample Values:  

Characteristics/IndexRanges/LatRange: [89.5, -89.5]

 

Characteristics/IndexRanges/LonRange: [-179.5, 179.5]
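
The rule that LatRange and LonRange are both required, and that each is a pair of values, could be checked as in the sketch below. The function name and the dictionary layout are assumptions made for illustration; they are not a CMR API.

```python
def validate_index_ranges(index_ranges):
    """Return a list of problems found in an IndexRanges dictionary."""
    problems = []
    for key in ("LatRange", "LonRange"):
        pair = index_ranges.get(key)
        if pair is None:
            problems.append(f"{key} is required when IndexRanges is used")
        elif len(pair) != 2:
            problems.append(f"{key} must be a pair of values")
    return problems


# The sample values from this section pass the check.
sample = {"LatRange": [89.5, -89.5], "LonRange": [-179.5, 179.5]}
print(validate_index_ranges(sample))
```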

 

Tags

Recommended

 

2.3.11     ScienceKeywords

Elements

ScienceKeywords (0..*)

ScienceKeywords/Category [R]

ScienceKeywords/Topic [R]

ScienceKeywords/Term [R]

ScienceKeywords/Variable_Level1

ScienceKeywords/Variable_Level2

ScienceKeywords/Variable_Level3

ScienceKeywords/Detailed_Variable

 

Description

Science Keywords are derived from the ESDIS keyword management system. The keywords are provided to enable better searches by the use of human-readable measurement terms. Note that the keywords have a more complex structure than the Measurement class. ScienceKeywords are hierarchical: the higher-level keywords (Category, Topic, and Term) are required, and the lower-level keywords (Variable_Level1, Variable_Level2, Variable_Level3, and Detailed_Variable) are optional. It is important to recognize that the measurement terms are sometimes used in any one of the lower-level keywords. For example, the measurement term "Methane" may be entered into the Detailed_Variable element for a collection which possesses Methane variables, such as AIRX3STD.006.

 

Science Keywords search is offered as the primary way to discover variables. ScienceKeywords and Measurements could be used interchangeably for faceted browse in search clients. Elements in this category are used for search and faceting purposes.
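
The split between required and optional keyword levels can be illustrated with a small check. This is a sketch only: the helper name is hypothetical, and the keyword is represented as a plain dictionary using the element names listed in this section.

```python
REQUIRED_LEVELS = ("Category", "Topic", "Term")


def missing_required_levels(keyword):
    """Return the required keyword levels that are absent or empty."""
    return [level for level in REQUIRED_LEVELS if not keyword.get(level)]


keyword = {
    "Category": "EARTH SCIENCE",
    "Topic": "ATMOSPHERE",
    "Term": "ATMOSPHERIC CHEMISTRY",
    "Variable_Level1": "NITROGEN COMPOUNDS",  # optional lower-level keyword
}
print(missing_required_levels(keyword))
print(missing_required_levels({"Category": "EARTH SCIENCE"}))
```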

 

Sample Values:

"Category": "EARTH SCIENCE", "Topic": "ATMOSPHERE", "Term": "ATMOSPHERIC CHEMISTRY", "Variable_Level1": "NITROGEN COMPOUNDS", "Variable_Level2": "Peroxyacyl Nitrate".

 

Tags

Recommended, Controlled Vocabulary

 

2.3.12     FillValues

Element Specification

FillValues (0..*)

FillValues/Value (1)

FillValues/Type (1) <"SCIENCE_FILLVALUE", "QUALITY_FILLVALUE", "ANCILLARY_FILLVALUE", "OTHER">

FillValues/Description

 

Description

The fill value of the variable in the data file. It is generally a value which falls outside the valid range; for example, if the valid range is '0, 360', the fill value may be '-1'. Some exceptions to this rule have been reported. The fill value type is data provider-defined.

 

Sample values:

Value: -1

Type: SCIENCE_FILLVALUE

Description: "Valid Science Fill Value"

 

Value: -9999

Type: QUALITY_FILLVALUE

Description: "Valid Quality Fill Value"
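
A client that reads a variable's data typically needs to screen out fill values before using the data. The sketch below shows one way this could be done with the FillValues metadata; the function name and the data values are hypothetical, and FillValues is represented as a list of dictionaries.

```python
def mask_fill_values(values, fill_values):
    """Replace every value that matches a declared fill value with None."""
    fills = {entry["Value"] for entry in fill_values}
    return [None if value in fills else value for value in values]


fill_values = [
    {"Value": -1, "Type": "SCIENCE_FILLVALUE", "Description": "Valid Science Fill Value"},
]
# Hypothetical data with a valid range of 0 to 360.
print(mask_fill_values([12, -1, 359, -1], fill_values))
```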

 

Tags

Recommended, Controlled Vocabulary

 

2.3.13     Dimensions [R]

Element Specification

Dimensions (1..*)

Dimensions/Name (1)

Dimensions/Size (1)

Dimensions/Type (1) <"LATITUDE_DIMENSION", "LONGITUDE_DIMENSION", "PRESSURE_DIMENSION", "HEIGHT_DIMENSION", "DEPTH_DIMENSION", "TIME_DIMENSION", "OTHER">

 

Description

A variable consists of one or more dimensions. An example of a dimension name is 'XDim'. An example of a dimension size is '1200'. For the example where time=1: Name = time, Size = 1, and Type = TIME_DIMENSION. Variables are rarely one dimensional. More commonly, they are two or three dimensional.

 

Sample values:

 

For the sea_surface_temperature variable, the dimensionality is: time=1, nj=3072 and ni=4096.

For the quality_level variable, the dimensionality is: time=1, nj=3072 and ni=4096.

For the LST_Day_1km variable, the dimensionality is: YDim=1200 and XDim=1200.

For the QC_Day variable, the dimensionality is: YDim=1200 and XDim=1200.

For the CERES_SW_Filtered_Radiances_Upwards variable, the dimensionality is: Records=13091 and Samples=660.

For the SW_TOA_Clear-Sky variable, the dimensionality is: Mean_&_Stdev=2, Synoptic_Hours_(1, 4, 7, 10, 13, 16, 19, 22)=8, 1.0_deg.regional_colat.zones=180 and 1.0_deg._regional_long._zones=360.

For the mole_fraction_of_carbon_dioxide_in_free_troposphere variable, the dimensionality is: LatDim=91 and LonDim=144.
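
To show how the dimensionality listed above maps onto the Dimensions element, the sketch below builds Name/Size/Type entries from a name-to-size mapping. The helper name is hypothetical, and assigning "OTHER" to nj and ni is an assumption made for illustration; a curator may choose a more specific Type.

```python
import math


def build_dimensions(sizes, types=None):
    """Build a list of Dimensions entries from a {name: size} mapping.

    Dimensions without an explicit type default to "OTHER".
    """
    types = types or {}
    return [
        {"Name": name, "Size": size, "Type": types.get(name, "OTHER")}
        for name, size in sizes.items()
    ]


sst_dimensions = build_dimensions(
    {"time": 1, "nj": 3072, "ni": 4096},
    types={"time": "TIME_DIMENSION"},
)
for dimension in sst_dimensions:
    print(dimension)

# Total number of elements in the variable: the product of the sizes.
print(math.prod(d["Size"] for d in sst_dimensions))
```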

 

Tags

Required, Controlled Vocabulary

 

2.3.14     ValidRange

Element Specification

ValidRange (0..1)

ValidRange/Max (0..1)

ValidRange/Min (0..1)

ValidRange/CodeSystemIdentifierMeaning (0..*)

ValidRange/CodeSystemIdentifierValue (0..*)

 

Description

ValidRange specifies the minimum and maximum valid values of the variable represented in the data field. Optionally, if the valid range is not continuous, a code system can be defined. If there is a discrete number system used for the data values, then there needs to be a code system identifier. The CodeSystemIdentifierMeaning element can be used to specify a code system identifier meaning. For example, 'Open Shrubland' corresponds to the value of '7'. The CodeSystemIdentifierValue element describes the textual or numerical value assigned to each meaning. The number of code system identifier meanings must match the number of values. Other examples include cloud masks, land surface classification variables, etc.

 

Sample values:

ValidRange/Max: 5000

ValidRange/Min: -100

 

Sample values:

ValidRange/CodeSystemIdentifierMeaning: <no_data, bad_data, worst_quality, low_quality, acceptable_quality, best_quality>

ValidRange/CodeSystemIdentifierValue: <0B, 1B, 2B, 3B, 4B, 5B>
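
Because the meanings and values are stored as two parallel lists, a client has to pair them up by position. The sketch below does this and enforces the rule that the two lists have the same length. The function name is hypothetical and ValidRange is represented as a plain dictionary.

```python
def code_system_lookup(valid_range):
    """Pair each CodeSystemIdentifierValue with its meaning, by position."""
    meanings = valid_range.get("CodeSystemIdentifierMeaning", [])
    values = valid_range.get("CodeSystemIdentifierValue", [])
    if len(meanings) != len(values):
        raise ValueError("the number of meanings must match the number of values")
    return dict(zip(values, meanings))


quality_range = {
    "CodeSystemIdentifierMeaning": [
        "no_data", "bad_data", "worst_quality",
        "low_quality", "acceptable_quality", "best_quality",
    ],
    "CodeSystemIdentifierValue": ["0B", "1B", "2B", "3B", "4B", "5B"],
}
lookup = code_system_lookup(quality_range)
print(lookup["5B"])
```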

 

Tags

Recommended

 

2.3.15     SizeEstimation

Elements

SizeEstimation (0..1)

SizeEstimation/AverageSizeOfGranulesSampled (0..1)

SizeEstimation/AverageCompressionInformation (0..*)

SizeEstimation/AverageCompressionInformation/Rate (0..1)

SizeEstimation/AverageCompressionInformation/Format (0..1) <ASCII, NetCDF-4, ESRI Shapefile, GeoTIFF, Native>

 

Description

When users search for variables in the EDSC, they would like to know what is available and the estimated download size of the resulting product. The SizeEstimation element and its sub-elements provide metadata to support the size estimation capability.

 

Tags

Recommended

 

2.3.16     MeasurementIdentifiers

Elements

MeasurementIdentifiers (0..*)

MeasurementIdentifiers/MeasurementName (0..1)

MeasurementIdentifiers/MeasurementName/Object (1)

MeasurementIdentifiers/MeasurementName/Quantity (0..1)

MeasurementIdentifiers/MeasurementSource (1)

 

Description

Elements in this category are used for search purposes. The measurement name is structured according to the form defined by Scott Peckham in the CSDMS Naming Convention: <<object, quantity>>. It is for this reason that the MeasurementName element contains the Object and Quantity sub-elements. Every standard name has an object part that describes a particular object and a quantity part that describes a particular attribute of the object that can be quantified. These names are sorted alphabetically and other sorting methods can be added later. More discussion on MeasurementName valid values is given in the MeasurementName object specification. The source of the names can be identified by the MeasurementSource element. When the measurement names are sourced from the CSDMS, use "CSDMS". When they are sourced from other sources such as the CF convention, use "CF".

 

In consultation with the GCMD team, it is recommended that MeasurementIdentifier's valid values should be enumerations in KMS and not keywords. It is also recommended that the MeasurementIdentifier's valid values are managed via the current ESDIS Standards Office (ESO) process but not until the valid values have matured.

 

Tags

Recommended, Controlled Vocabulary

 

2.3.16.1    MeasurementName

Element Specification

MeasurementIdentifiers/MeasurementName (0..1)

MeasurementIdentifiers/MeasurementName/Object (1)

MeasurementIdentifiers/MeasurementName/Quantity (0..1)

 

Description

The names of the measurement may be taken from a variety of sources. These include, but are not limited to, the Community Surface Dynamics Modeling System (CSDMS) Cross Domain Naming Conventions, the Climate and Forecast (CF) Standard Name Convention, and the British Oceanographic Data Centre (BODC). According to the CSDMS Basic Rules, every standard name has an object part that describes a particular object and a quantity part that describes a particular attribute of that object that can be quantified with a number. These names are sorted alphabetically, but other sorting methods can be added later.

 

  • Names are of the form: <object>__<quantity>.
  • Names shall contain only lowercase letters and numbers along with the Standard Names separator characters (_, -, ~, __).
  • The Standard Names separators:
    • _: separate the words of a name.
    • -: join multi-word objects, quantities, adjectives, etc.
    • ~: join an adjective to a noun (the noun comes first, followed by one or more adjectives).
    • __: separate an object from a quantity.
    • _of_: apply a math operation to the subsequent quantity.
  • Qualifiers that make an object or quantity more specific are added to the left of the base object or quantity (with increasing specificity).
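
The naming rules listed above lend themselves to a simple parser that splits a name into the Object and Quantity sub-elements. The sketch below is an illustration under stated assumptions: the function name is hypothetical, the example name is used only to show the <object>__<quantity> form, and only the character set and the double-underscore separator are checked, not the full CSDMS rule set.

```python
import re

# Lowercase letters, digits and the separator characters _, - and ~.
_ALLOWED = re.compile(r"^[a-z0-9_~-]+$")


def split_standard_name(name):
    """Split '<object>__<quantity>' into Object and Quantity parts."""
    if not _ALLOWED.match(name):
        raise ValueError(f"illegal characters in standard name: {name!r}")
    obj, separator, quantity = name.partition("__")
    if not separator or not obj or not quantity or "__" in quantity:
        raise ValueError(f"expected exactly one '__' separator: {name!r}")
    return {"Object": obj, "Quantity": quantity}


print(split_standard_name("atmosphere_bottom_air__temperature"))
```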

 

CSDMS Standard Names may be further grouped by category: Atmosphere, Oceans, Radiation, Sea Ice, Soil, Snow, Topography. This is defined more fully in the CSDMS Wiki: https://csdms.colorado.edu/wiki/CSN_Basic_Rules. Please see Appendix B for more details.

 

Sample Values:

  • "land_subsurface_water_sat-zone_top", "CSDMS"
  • "land_surface", "CSDMS"
  • "land_surface_air", "CSDMS"
  • "land_surface_air_flow", "CSDMS"
  • "land_surface_air_heat-incoming-latent", "CSDMS"
  • "land_surface_air-incoming-sensible", "CSDMS"
  • "specific_humidity", "CSDMS"
  • "specific_humidity-standard_error", "CSDMS"
  • "specific_humidity-detection_minimum", "CSDMS"

 

This list may be supplemented further by standard names sourced from the CF Standard Names: http://cfconventions.org/Data/cf-standard-names/docs/guidelines.html. These can also be expressed in the form <object>__<quantity>, with care. Species, such as vapor and sulfur, can be quantified by the following terms: at_cloud_top, at_convective_cloud_top, at_cloud_base, at_convective_cloud_base, at_freezing_level, and at_ground_level. Fluxes, such as radiative_flux, can be quantified by the terms at_top_of_atmosphere_model and at_sea_level and can be expressed as "radiative_flux-at_top_of_atmosphere_model" with "CF" and "radiative_flux-at_sea_level" with "CF". Physical quantities such as temperature, pressure, humidity, and entropy, which are commonly used in mathematics, science, and engineering, can be expressed using the CF convention names, such as electrical charge or its scientific symbol, q, and quantified by terms such as error_limit and detection_limit. These can be expressed as "q-error_limit" with "CF" and "q-detection_limit" with "CF". MeasurementName values will come from the KMS.

 

Tags

Recommended, Controlled Vocabulary

 

2.3.16.1.1             Object [R]

Element Specification

MeasurementIdentifiers/MeasurementName/Object (1)

 

Description

The name of the object of measurement. The object part describes a particular object which is being measured.

 

Sample Values:

land_subsurface_water_sat, land_surface, land_surface_air, land_surface_air_flow, land_surface_air_heat, specific_humidity, radiative_flux, q.

 

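The following represent the named object term in the <<object, quantity>> structure. The object part is given first, followed by the full name in parentheses where the name also has a quantity part:

  • "land_subsurface_water_sat" (in "land_subsurface_water_sat-zone_top")
  • "land_surface"
  • "land_surface_air"
  • "land_surface_air_flow"
  • "land_surface_air_heat" (in "land_surface_air_heat-incoming-latent")
  • "land_surface_air" (in "land_surface_air-incoming-sensible")
  • "specific_humidity"
  • "specific_humidity" (in "specific_humidity-standard_error")
  • "specific_humidity" (in "specific_humidity-detection_minimum")
  • "radiative_flux" (in "radiative_flux-at_top_of_atmosphere_model")
  • "radiative_flux" (in "radiative_flux-at_sea_level")
  • "q" (in "q-error_limit")
  • "q" (in "q-detection_limit")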

 

Tags

Required

 

2.3.16.1.2             Quantity

Element Specification

MeasurementIdentifiers/MeasurementName/Quantity (0..1)

 

Description

The name of the quantity of measurement. The quantity part describes a particular attribute of that object that can be quantified with a number.

 

Sample Values:

zone_top, incoming-latent, incoming-sensible, standard_error, detection_minimum, at_top_of_atmosphere_model, at_sea_level, error_limit, detection_limit.

 

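These represent the named quantity term in the <<object, quantity>> structure. The quantity part is given first, followed by the full name in parentheses:

  • "zone_top" (in "land_subsurface_water_sat-zone_top")
  • "incoming-latent" (in "land_surface_air_heat-incoming-latent")
  • "incoming-sensible" (in "land_surface_air-incoming-sensible")
  • "standard_error" (in "specific_humidity-standard_error")
  • "detection_minimum" (in "specific_humidity-detection_minimum")
  • "at_top_of_atmosphere_model" (in "radiative_flux-at_top_of_atmosphere_model")
  • "at_sea_level" (in "radiative_flux-at_sea_level")
  • "error_limit" (in "q-error_limit")
  • "detection_limit" (in "q-detection_limit")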

 

Tags

Recommended

 

2.3.16.2    MeasurementSource [R]

Element Specification

MeasurementIdentifiers/MeasurementSource (1) <"CSDMS", "CF", "BODC">

 

Description

Sources of measurement names include, but are not limited to, the Community Surface Dynamics Modeling System (CSDMS) Cross Domain Naming Conventions, the Climate and Forecast (CF) Standard Name Convention, and the British Oceanographic Data Centre (BODC). See Appendix C for more on the sources of measurement names and the recommended governance approach.

 

Sample Values:

CSDMS, CF, BODC

 

Tags

Required, Controlled Vocabulary

 

2.3.17     Sets

Elements

Sets (0..*)

Sets/Name (1)

Sets/Type (1)

Sets/Size (1)

Sets/Index (1)

 

Description

Typically, science variables have quality variables associated with them and can also include other types. This element allows variables to be grouped together as a set. The set is defined by the name, type, size, and index. The Set class is flexible enough to also include compound variables (a variable that groups related variables together to describe a phenomenon). The data provider supplies the set name; the set type, which is usually the theme of the group or the default string "General"; the set size, which is the total number of variables in the set; and the index, which is the position of the variable in the set's numbering scheme.

 

Sample Values:

This example shows what a variable set would look like for variables common to the 'Data_Fields' group, within the MOD11A1 collection. The set class would be populated in the following way for the variable named 'LST_Day_1km'.

"Sets": [{

  "Name": "Data_Fields",

  "Type": "MODIS 1km gridded",

  "Size": 15,

  "Index": 7

}]

 

This example shows what a variable set would look like for variables common to the 'AIRX3STD gridded data field group', within the AIRX3STD collection. The set class would be populated in the following way for the variable named 'EmisIR_A_ct'.

 

"Sets": [{

  "Name": "AIRX3STD",

  "Type": "AIRS+AMSU Level 3 Gridded",

  "Size": 867,

  "Index": 13

}]

 

Each variable in the set is identified by its Index and by the Size of the set. In this example, if the numbering starts at 0, this is the 14th variable in a set of 867 variables.

 

For a phenomenon example, take the MOD08 v006 collection shown in Figure 50.

 

Figure 50. Subset variable choices for the MOD08 collection

 

The variables that pertain to 'Total Ozone' can be grouped in the following way:

 

Variable: {"Name": "Total_Ozone_Confidence_Histogram",

...

"Sets": [{"Name": "Total Ozone","Type": "Data_Field","Size": 9,"Index": 0}],

...

}

Variable: {"Name": "Total_Ozone_Histo_Intervals",

...

"Sets": [{"Name": "Total Ozone","Type": "Data_Field","Size": 9,"Index": 1}],

...

}

Variable: {"Name": "Total_Ozone_Histogram_Counts",

...

"Sets": [{"Name": "Total Ozone","Type": "Data_Field","Size": 9,"Index": 2}],

...

}

Variable: {"Name": "Total_Ozone_Maximum",

...

"Sets": [{"Name": "Total Ozone","Type": "Data_Field","Size": 9,"Index": 3}],

...

}

Variable: {"Name": "Total_Ozone_Mean",

...

"Sets": [{"Name": "Total Ozone","Type": "Data_Field","Size": 9,"Index": 4}],

...

}

Variable: {"Name": "Total_Ozone_Minimum",

...

"Sets": [{"Name": "Total Ozone","Type": "Data_Field","Size": 9,"Index": 5}],

...

}

Variable: {"Name": "Total_Ozone_QA_Mean",

...

"Sets": [{"Name": "Total Ozone","Type": "Data_Field","Size": 9,"Index": 6}],

...

}

Variable: {"Name": "Total_Ozone_QA_Standard_Deviation",

...

"Sets": [{"Name": "Total Ozone","Type": "Data_Field","Size": 9,"Index": 7}],

...

}

Variable: {"Name": "Total_Ozone_Standard_Deviation",

...

"Sets": [{"Name": "Total Ozone","Type": "Data_Field","Size": 9,"Index": 8}],

...

}

 

In general, variables are organized in a specific way within the structure of a data set. The examples shown above are for HDF4 structures. The arrangement of these structures varies considerably between HDF4, HDF5, NetCDF-4, and NetCDF-CF.

In the following HDF5 examples, the variables for the SMAP_L3_SM_P data set are organized into two sets. The first set, shown in Figure 51, contains the variables representing the morning (AM) crossing and the second set, shown in Figure 52, contains variables representing the afternoon (PM) crossing.

Figure 51. SMAP_L3_SM_P Variables Representing the Morning (AM) Crossing

Figure 52. SMAP_L3_SM_P Variables Representing the Afternoon (PM) Crossing

 

The benefit of using the Set class is to enable the CMR to preserve the order of the variables within the structure of the granule file.
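
As an illustration of how the order could be recovered from the Sets metadata, the sketch below sorts the members of a named set by their Index. The function name is hypothetical, the variable records are abbreviated to the fields needed here, and only three of the nine 'Total Ozone' variables are included.

```python
def names_in_set_order(variables, set_name):
    """Return the names of the variables in a set, ordered by Sets/Index."""
    members = []
    for variable in variables:
        for entry in variable.get("Sets", []):
            if entry["Name"] == set_name:
                members.append((entry["Index"], variable["Name"]))
    return [name for _, name in sorted(members)]


# Records deliberately listed out of order.
variables = [
    {"Name": "Total_Ozone_Mean",
     "Sets": [{"Name": "Total Ozone", "Type": "Data_Field", "Size": 9, "Index": 4}]},
    {"Name": "Total_Ozone_Confidence_Histogram",
     "Sets": [{"Name": "Total Ozone", "Type": "Data_Field", "Size": 9, "Index": 0}]},
    {"Name": "Total_Ozone_Maximum",
     "Sets": [{"Name": "Total Ozone", "Type": "Data_Field", "Size": 9, "Index": 3}]},
]
print(names_in_set_order(variables, "Total Ozone"))
```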

 

Tags

Recommended

 

2.3.18     SamplingIdentifiers

Elements

SamplingIdentifiers (0..*)

SamplingIdentifiers/SamplingMethod (1)

SamplingIdentifiers/MeasurementConditions (1)

SamplingIdentifiers/ReportingConditions (0..1)

 

Description

Elements in this category are used for capturing information associated with sampling, including the method of sampling and the conditions at the time of measurement and reporting. SamplingMethod describes the name of the sampling method used for the measurement. An example of a SamplingMethod is 'radiometric detection within the visible and infra-red ranges of the electromagnetic spectrum'. MeasurementConditions and ReportingConditions are useful metadata for field campaign data sets. MeasurementConditions describes the conditions at the time the observation or measurement was recorded, and ReportingConditions describes the conditions over which the observation or measurement is valid. For example, MeasurementConditions could be 'Sampled Particle Size Range: 90 - 600 nm' and ReportingConditions could be 'STP: 1013 mb and 273 K'.

 

Tags

Recommended

 

2.3.19     AcquisitionSourceName [R]

Elements

AcquisitionSourceName (1)

 

Description

This element documents the instrument short name or simulation short name that recorded the values for this variable. This element along with Name, Units, and Dimensions is used to establish uniqueness between variables for a data provider.
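
One way to picture the uniqueness rule is as a composite key built from the four elements named above. The sketch below is an assumption about how such a key might be formed, not a description of how the CMR implements it. The function name, the use of "Unit" as the dictionary key, and the sample values for Unit and AcquisitionSourceName are all hypothetical.

```python
def uniqueness_key(variable):
    """Build a hashable key from Name, Unit, Dimensions and AcquisitionSourceName."""
    dimensions = tuple(
        (d["Name"], d["Size"]) for d in variable.get("Dimensions", [])
    )
    return (
        variable["Name"],
        variable.get("Unit"),
        dimensions,
        variable["AcquisitionSourceName"],
    )


record = {
    "Name": "LST_Day_1km",
    "Unit": "K",                       # hypothetical value
    "Dimensions": [
        {"Name": "YDim", "Size": 1200, "Type": "OTHER"},
        {"Name": "XDim", "Size": 1200, "Type": "OTHER"},
    ],
    "AcquisitionSourceName": "MODIS",  # hypothetical value
}
# Same variable name and shape, but a different acquisition source.
other = dict(record, AcquisitionSourceName="VIIRS")
print(uniqueness_key(record) == uniqueness_key(other))
```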

 

Tags

Required


Appendix A                   Tags Glossary

 

Table 5 lists all tags used in this model and provides a description of the tags' usage.

 

Table 5. Tags Glossary

| Tag Name | Description |
|---|---|
| Required | This element is required. |
| Free Text Search | This element will be indexed by the CMR as part of the Free Text Search. |
| Controlled Vocabulary | This element will have a vocabulary that will be used to validate the value. This will most likely be done via a vocabulary management service. |
| Recommended | This element is recommended. |

 


Appendix B                   Keywords and Measurements Governance Structure

 

The Governance Structure shown in Figure 53 is recommended for the selection of keywords and measurements. ESDIS chairs each of the measurement or keyword selection councils and provides overall science guidance, and the DAAC/Data Providers serve as the decision authority for the metadata associated with data sets sourced from their DAAC/Project.

 

Figure 53. Suggested Governance Structure

 

Adding Keywords or Measurements is expected to be done by the Metadata Curator via a GUI. Keywords are to be sourced from the GCMD Keywords and are controlled. What is proposed here is not very different from the existing method used in the EDSC UI, except that the Keyword will be used for discovery at the Variable level rather than at the Collection or Granule level, as is currently the case. The challenge with Measurements is that they are uncontrolled. The concept is to start with a pre-seeded list of suggested measurements and, over time, to build up an alphabetically ordered list as certain users add measurements through their use of a metadata management tool. The suggested steps below serve as guidelines for adding keywords or measurements.

 

Keywords

Keywords may be selected from the GCMD Keywords set.

The GCMD Keywords are already subject to a strict governance process:

  1. Review the controlled keyword/guidelines located at: http://gcmd.nasa.gov/learn/rules.html
  2. Verify that the keyword does not already exist.
  3. Map these to the appropriate variables.
  4. Include a definition of the controlled keyword.

 

Measurements

Measurements may be selected from an array of standard sources, such as CSDMS, CF Conventions, etc.

The process by which measurements may be selected is simple.

  1. Determine the level, e.g. Atmosphere, Oceans, or Land (highest); Atmosphere Air Temperature (mid); or Atmosphere Air Temperature Saturated Adiabatic Lapse Rate (lowest).
  2. Determine whether the measurement is missing and a new one is needed. For example, if the list contains Atmosphere Air Column Water Vapor and the next entry is Atmosphere Air Flow Azimuth Angle of Bolus Velocity, then Atmosphere Air Carbon Dioxide (and its derivatives) is missing.
  3. Select the most appropriate measurement to suit the need. If the measurement does not exist, apply a crosswalk to another standard, for example from CSDMS to CF Standard Names.
  4. Add to the measurements list stored in the CMR so that all future users can use this measurement.
  5. Map these to the appropriate variables.
  6. Include a definition of the uncontrolled measurement.
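
Steps 2 and 4 above, checking whether a measurement is missing and adding it to the shared list, can be sketched as follows. This is a minimal illustration that assumes the measurements list is held as an alphabetically sorted Python list; the function name `add_measurement` is hypothetical and does not correspond to a CMR interface.

```python
import bisect


def add_measurement(measurements, name):
    """Insert name into the sorted list unless present; return True if added."""
    pos = bisect.bisect_left(measurements, name)
    if pos < len(measurements) and measurements[pos] == name:
        return False  # already in the list, nothing to add
    measurements.insert(pos, name)  # insertion keeps alphabetical order
    return True


# Pre-seeded list using the entries named in step 2.
measurements = [
    "Atmosphere Air Column Water Vapor",
    "Atmosphere Air Flow Azimuth Angle of Bolus Velocity",
]
added = add_measurement(measurements, "Atmosphere Air Carbon Dioxide")
```

Calling the function a second time with the same name leaves the list unchanged, which mirrors the check for an existing entry before a new one is created.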


Appendix C                   Analysis of CSDMS and CF Standard Names as a Source of Tagging

Analysis of CSDMS Standard Names as a source of tagging

The Community Surface Dynamics Modeling System (CSDMS) modeling framework provides mechanisms that allow models and data sets from different contributors (i.e. from different geoscience domains such as hydrology, oceanography, meteorology, and seismology) to coexist. Each geoscience domain has its own vocabulary for describing specific variables. This presents a problem when trying to use the data sets together or when trying to discover the same variables under different names: how does one find a variable by name when one knows it by a different name? The framework takes a unique and holistic approach to this semantic mediation problem by offering a set of standardized and precise descriptions for each variable and by giving a number of options for resolving which names and abbreviations are to be used for a variable.

 

The naming conventions of the CSDMS Standard Names are based on object-oriented principles and group the variable names into the following categories: Atmosphere, Atoms, Automobiles, Basins, Bedrock, Channel, Chocolate, Compounds and Mixtures, Earthquakes, Glaciers, Materials, Models, Molecules, Oceans, Planets, Projectiles, Radiation, River Deltas, Sea Ice, Snow, Soil, Sea Floor Debris, Topography and Water Tank. Only a subset of these categories is suitable for the Earth Observing System (EOS) mission: Atmosphere, Basins, Channel, Earthquakes, Glaciers, Oceans, Radiation, River Deltas, Sea Floor Debris, Sea Ice, Soil, Snow, and Topography. These Standard Names can be chosen as the primary source of tagging, since each group is highly relevant to the science domains covered by the EOS data sets and those likely to be covered in the future. A CSDMS Standard Name is structured as an Object Name plus a Model Name Pattern. An example of an Object Name is atmosphere_water. Examples of the corresponding Model Name Patterns are: domain time integral of precipitation leq volume flux, icefall mass per volume density, precipitation duration, precipitation leq volume flux, and precipitation mass flux. Combining the Object Name with the first Model Name Pattern yields the standard name: atmosphere water domain time integral of precipitation leq volume flux. An example of a missing name would be atmosphere water precipitation or, the more common term, precipitation.
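
The Object Name plus Model Name Pattern structure can be illustrated with a short sketch. It reproduces the space-separated rendering used in the paragraph above and does not attempt to reproduce the delimiter rules of the CSDMS specification itself; the function name `standard_names` is hypothetical.

```python
OBJECT_NAME = "atmosphere_water"
MODEL_NAME_PATTERNS = [
    "domain time integral of precipitation leq volume flux",
    "icefall mass per volume density",
    "precipitation duration",
    "precipitation leq volume flux",
    "precipitation mass flux",
]


def standard_names(object_name, patterns):
    """Combine one Object Name with each Model Name Pattern."""
    # Underscores are shown as spaces to match the rendering in the text.
    readable_object = object_name.replace("_", " ")
    return [f"{readable_object} {pattern}" for pattern in patterns]


names = standard_names(OBJECT_NAME, MODEL_NAME_PATTERNS)

# A candidate is "missing" when no combination produces it.
is_missing = "atmosphere water precipitation" not in names
```

The membership test at the end shows why a plain term such as precipitation is not found among the generated names, which motivates a supplementary source of tags.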

 

Analysis of CF Standard Names as a source of tagging

The CF (Climate and Forecast) metadata conventions are designed to promote the processing and sharing of files created with the netCDF Application Programmer Interface. The CF conventions generalize and extend the Cooperative Ocean/Atmosphere Research Data Service (COARDS) conventions. Most of the CF standard names have been derived from guidelines that draw on the European Centre for Medium-Range Weather Forecasts (ECMWF) and National Centers for Environmental Prediction GRIdded Binary file format (NCEP GRIB) tables, the Program for Climate Model Diagnosis and Intercomparison (PCMDI), and the GCMD. CF standard names consist of lowercase letters, digits, and underscores, and begin with a letter. Upper case is not used, and US spelling is used, e.g. vapor, sulfur. The CF Standard Names can be chosen as a source of tagging supplementary to CSDMS. For example, precipitation amount, precipitation flux, and precipitation flux onto canopy are included in the CF Standard Names but not in the CSDMS Standard Names. In this simple example, both CSDMS and CF Standard Names may be used as a source of tagging for search terms to locate all variables associated with the measurement precipitation.
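
The lexical rule stated above (lowercase letters, digits, and underscores, beginning with a letter) can be expressed as a regular expression. The sketch below checks only that form; it does not check whether a name actually appears in the CF Standard Name table, and the function name `is_cf_style` is hypothetical.

```python
import re

# A lowercase letter followed by any number of lowercase letters, digits
# or underscores.
CF_NAME_FORM = re.compile(r"[a-z][a-z0-9_]*")


def is_cf_style(name):
    """Return True if name has the lexical form of a CF standard name."""
    return CF_NAME_FORM.fullmatch(name) is not None


candidates = [
    "precipitation_amount",
    "precipitation_flux",
    "precipitation_flux_onto_canopy",
    "Precipitation_Flux",   # upper case is not allowed
    "2m_air_temperature",   # must begin with a letter
]
well_formed = [name for name in candidates if is_cf_style(name)]
```

Only the first three candidates pass, since the fourth contains upper-case letters and the fifth begins with a digit.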


Appendix D                   Definitions of Terms

CONCEPT :

 

Measurement  

 

DEFINITION :

 

The act or process of measuring an observable property, usually geophysical, geo-biophysical, physical, or chemical. In the case of air temperature, for instance, the object of the measurement is air and the property being measured is temperature. For models, it is a simulated observable property.
Using Scott Peckham's model as a basis for a measurement naming convention, Measurement names can be expressed as <<object, quantity>>. For example, object = "Aerosol" and quantity = "Optical Depth" give the Measurement name Aerosol Optical Depth.

Examples: Aerosol Optical Depth, Air Temperature, Surface Albedo, Solar Irradiance, Surface Reflectance, Atmospheric Moisture, Methane Concentration, Sulphur Dioxide Concentration, Ozone Concentration.
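
The <<object, quantity>> convention can be sketched as a small data structure. This is an illustration only; the class and attribute names are hypothetical and are not taken from the UMM-Var schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """A measurement expressed as an <<object, quantity>> pair."""
    object_name: str  # the thing being measured, e.g. "Aerosol"
    quantity: str     # the property being measured, e.g. "Optical Depth"

    @property
    def name(self):
        """Return the Measurement name formed from object and quantity."""
        return f"{self.object_name} {self.quantity}"


aerosol_optical_depth = Measurement("Aerosol", "Optical Depth")
air_temperature = Measurement("Air", "Temperature")
```

Keeping the object and the quantity as separate fields allows measurements to be grouped by either part, for instance all quantities measured for Air.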

 

CONCEPT :

 

Variable

 

DEFINITION :

 

A named set of data that contains the recorded values of a measurement. In this context, the variable is described by its name and characteristics. For instance, a variable contained within the MYD08_M3V5 dataset is called: Optical_Depth_Land_Maximum. There are other variables in the set, including variables which contain information about geographic position and quality.

 

The description of the variable may include what was intended to be measured, i.e., the observable property, and how the variable was measured, such as measurement technique and the instrument used.

 

Variables may be classified as science variables, quality variables and ancillary variables (or other, when one of these classifications cannot be used). A variable can also be the output of a model.

 

Examples: Aerosol Optical Depth 550nm (Dark Target), Aerosol Optical Depth 550nm (Deep Blue, Land Only), Air Temperature (Daytime/Ascending), Air Temperature at 2m, Air Temperature at Surface (Daytime/Ascending), Air Temperature at Surface (Nighttime/Descending), Relative Humidity (Daytime/Ascending), Relative Humidity (Nighttime/Descending), Water Vapor Mass Mixing Ratio (Daytime/Ascending), Water Vapor Mass Mixing Ratio (Nighttime/Descending), Methane Total Column (Nighttime/Descending), SO2 Column Mass Density, SO2 Column Mass Concentration, Ozone - reported in parts per billion by volume.


 

Appendix E                   Abbreviations and Acronyms

AESIR

Application friendly EOSDIS Science Information Retriever

BODC

British Oceanographic Data Centre

CF

Climate and Forecast metadata

CMR

Common Metadata Repository

COARDS

Cooperative Ocean/Atmosphere Research Data Service

CSDMS

Community Surface Dynamics Modeling System

DAAC

Distributed Active Archive Center

ECMWF

European Centre for Medium-Range Weather Forecasts

ECS

EOSDIS Core System

EDSC

Earthdata Search Client

EED

EOSDIS Evolution and Development

EOS

Earth Observing System

EOSDIS

Earth Observing System Data and Information System

ESDIS

Earth Science Data and Information System

ESO

Earth Science Office

GCMD

Global Change Master Directory

GES DISC

Goddard Earth Sciences Data and Information Services Center

GIBS

Global Imagery Browse Services

GRIB

GRIdded Binary file format

HDF

Hierarchical Data Format

ISO

International Organization for Standardization

KMS

Keyword Management System

LAADS

Level-1 and Atmosphere Archive and Distribution System

MMT

Metadata Management Tool

MODIS

Moderate Resolution Imaging Spectroradiometer

NASA

National Aeronautics and Space Administration

NCEP

National Centers for Environmental Prediction

PCMDI

Program for Climate Model Diagnosis and Intercomparison

PI

Principal Investigator

UI/UX

User Interface/User Experience

UML

Unified Modeling Language

UMM

Unified Metadata Model

UMM-C

Unified Metadata Model - Collections

UMM-G

Unified Metadata Model - Granules

UMM-S

Unified Metadata Model - Services

UMM-Var

Unified Metadata Model - Variables

URL

Uniform Resource Locator

XML

Extensible Markup Language

XPath

XML Path Language

XSLT

Extensible Stylesheet Language Transformations