Interim Collection Comparison Report for Big Earth Data Initiative
Metadata Source: CMR Metadata Collections
Metadata Dialect: ISO 19115-2
Evaluation Target: UMM-Collection metadata profile
The Unified Metadata Model Collection (UMM-Collection) profile describes documentation concepts that are considered important for collection-level metadata. The profile includes three documentation levels (Required, Recommended and Optional).
Metadata Collections
- Alaska Satellite Facility (ASF)
- Crustal Dynamics Data Information System (CDDIS)
- Global Hydrology Resource Center (GHRC)
- Goddard Earth Sciences Data and Information Services Center (GES_DISC)
- Level 1 and Atmosphere Archive and Distribution System (LAADS)
- Land, Atmosphere Near real-time Capability for EOS (LANCEMODIS)
- Land, Atmosphere Near real-time Capability for EOS (LANCEAMSR2)
- Langley Research Center (LARC)
- Langley Research Center Atmospheric Science Data Center (LARC_ASDC)
- Land Processes DAAC - EOS Core System (LPDAAC_ECS)
- National Snow and Ice Data Center Version 0 (NSIDCV0)
- National Snow and Ice Data Center EOS Core System (NSIDC_ECS)
- Ocean Biology Processing Group (OBPG)
- Oak Ridge National Laboratory (ORNL)
- Physical Oceanography DAAC (PODAAC)
- Socioeconomic Data and Applications Center (SEDAC)
- U.S. Geological Survey Earth Resources Observation Systems (USGS_EROS)
- (AU_AADC)
- ESA
- EUMETSAT
- ISRO
- JAXA
- LM_FIRMS
- NOAA_NCEI
- USGS_LTA
Overview
We examined over 15,000 metadata records from 26 collections extracted from the Common Metadata Repository (CMR) during April 2017. The links below connect to tables in Google Sheets that give the average number of occurrences of each UMM-Collection element per record in each of these collections. A value of 1 or more typically (although not necessarily) indicates that the element is included one or more times in each record in a collection. A value below 1.0 is typically the fraction of records in a collection that include the metadata element. Cells with pink backgrounds indicate values of 0, meaning the element is completely missing from the collection.
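The sketch below illustrates how the table values can be read. It is a minimal example with made-up per-record counts, not the code used to produce the tables; the function names and sample lists are hypothetical. It shows why an average of 1 or more only "typically" means that every record includes the element.

```python
def average_occurrences(counts_per_record):
    """Mean number of times an element appears per record in one collection."""
    return sum(counts_per_record) / len(counts_per_record)


def fraction_with_element(counts_per_record):
    """Fraction of records that include the element at least once."""
    present = sum(1 for count in counts_per_record if count > 0)
    return present / len(counts_per_record)


# Hypothetical counts of one element in four records of a collection.
every_record = [1, 1, 1, 1]     # element appears once in each record
half_of_records = [1, 0, 1, 0]  # element appears in half of the records
skewed = [4, 0, 0, 0]           # element repeated in a single record

for counts in (every_record, half_of_records, skewed):
    print(average_occurrences(counts), fraction_with_element(counts))
# 1.0 1.0
# 0.5 0.5
# 1.0 0.25
```

In the last case the average is 1.0 even though only a quarter of the records include the element, which is the exception the "although not necessarily" caveat allows for.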
The table below shows how CMR metadata collections document contact information. The following contact types are considered: Metadata Contact, Point of Contact, Resource Contact, Distributor Contact and Processor.
Table 1: Contact Information in the CMR
Observations
- The Metadata Contact organization exists in all CMR metadata collections
- The Point Of Contact organization exists in all CMR metadata collections
- Resource Contact information is missing from the majority of CMR metadata collections.
- Distributor Contact information is included in 72% of NASA metadata collections.
- Processor Contact information is included in 72% of NASA metadata collections.
The table below shows where Identifiers are used in CMR metadata collections. Identifiers enable specific metadata concepts such as platforms, instruments, sensors, etc. to be unambiguously identified.
Table 2: Identifiers in the CMR
Observations
- All CMR metadata collections include a Resource Identifier
- All CMR metadata collections include Instrument and Platform Identifiers.
- Associated resource identifiers (AggregationInfo) are included to some degree in 5 CMR metadata collections
The table below shows where Citations are used in CMR metadata collections.
Table 3:
Citations in the CMR - No Attributes
Citations in the CMR with Attributes
Observations
- Instrument Citations and Resource Citations exist in all CMR metadata collections
The table below shows where Online Resource elements are used in CMR metadata collections.
Table 4: OnlineResources in the CMR
Observations
- Online Resource URLs for distributorTransferOptions and Keywords exist in all CMR metadata collections
The table below shows where geographic elements are used in CMR metadata collections.
Table 5:
GeographicElements in the CMR - No Attributes
GeographicElements in the CMR - With Attributes
Observations
- Bounding Box extent elements are used in all CMR metadata collections with the exception of OMINRT
- 18 out of 26 CMR collections are using the extentTypeCode element with a value of 1
- 8 NASA collections are using the gmd:geographicIdentifier/gmd:description element. Commonly recurring values include:
  - 'HORIZONTALTILENUMBER - Horizontal tile number of a grid, which increases from left to right'
  - 'VERTICALTILENUMBER - Vertical tile number of a grid, which increases from top to bottom'
  - 'TileID - MODIS Land tile identification number which represents a geographical area on the surface of the Earth bounded by latitude and longitude coordinates'
  - 'Universal Transverse Mercator (UTM)ZoneIdentifier'
- 7 NASA collections include a Geographic Identifier code with the most common value 'Universal Transverse Mercator (UTM)'
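The recurring description values listed above could be tallied with a sketch like the following. It assumes the standard ISO gmd and gco namespaces and assumes that the description sits inside a gmd:MD_Identifier under gmd:geographicIdentifier; the sample fragment, its code values and the function name are hypothetical and are not taken from a CMR record.

```python
import xml.etree.ElementTree as ET
from collections import Counter

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Hypothetical fragment with two geographic identifiers (values are made up).
SAMPLE = """\
<gmd:EX_Extent xmlns:gmd="http://www.isotc211.org/2005/gmd"
               xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:geographicElement>
    <gmd:EX_GeographicDescription>
      <gmd:geographicIdentifier>
        <gmd:MD_Identifier>
          <gmd:code><gco:CharacterString>08</gco:CharacterString></gmd:code>
          <gmd:description><gco:CharacterString>HORIZONTALTILENUMBER</gco:CharacterString></gmd:description>
        </gmd:MD_Identifier>
      </gmd:geographicIdentifier>
    </gmd:EX_GeographicDescription>
  </gmd:geographicElement>
  <gmd:geographicElement>
    <gmd:EX_GeographicDescription>
      <gmd:geographicIdentifier>
        <gmd:MD_Identifier>
          <gmd:code><gco:CharacterString>05</gco:CharacterString></gmd:code>
          <gmd:description><gco:CharacterString>VERTICALTILENUMBER</gco:CharacterString></gmd:description>
        </gmd:MD_Identifier>
      </gmd:geographicIdentifier>
    </gmd:EX_GeographicDescription>
  </gmd:geographicElement>
</gmd:EX_Extent>
"""


def identifier_descriptions(xml_text):
    """Count the description strings of all geographic identifiers."""
    root = ET.fromstring(xml_text)
    # Assumed element path; adjust if the records nest the description differently.
    path = (".//gmd:geographicIdentifier/gmd:MD_Identifier"
            "/gmd:description/gco:CharacterString")
    return Counter((el.text or "").strip() for el in root.findall(path, NS))


print(identifier_descriptions(SAMPLE).most_common())
```

Running the same tally over every record in a collection would give the most common description values for that collection.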
The table below shows where temporal elements are used in CMR metadata collections.
Table 6:
TemporalElements in the CMR - No Attributes
TemporalElements in the CMR - With Attributes
Observations
- All CMR collections include a TimePeriod begin position element, whereas 24 of 28 CMR collections include a TimePeriod end position element
- 7 CMR collections include a TimePeriod endPosition element with an attribute value of indeterminatePosition='Now'
- 1 CMR collection (SEDAC) is using the timePosition element. Its most common @frame value is 'Eastern-Daylight', and the element value is a DateTime that is likely the begin time.
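As an illustration of the open-ended time periods noted above, the sketch below finds TimePeriod elements whose endPosition carries indeterminatePosition set to 'now'. It assumes the GML 3.2 namespace and compares the attribute case-insensitively, since the observation above writes the value as 'Now'; the sample fragment and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

GML = "http://www.opengis.net/gml/3.2"  # assumed GML namespace
NS = {"gml": GML}

# Hypothetical time period with an open end (dates are made up).
SAMPLE = """\
<gml:TimePeriod xmlns:gml="http://www.opengis.net/gml/3.2" gml:id="tp1">
  <gml:beginPosition>2002-07-04T00:00:00Z</gml:beginPosition>
  <gml:endPosition indeterminatePosition="now"/>
</gml:TimePeriod>
"""


def open_ended_periods(xml_text):
    """Return the begin positions of periods whose end is marked as 'now'."""
    root = ET.fromstring(xml_text)
    begins = []
    # iter() also yields the root element itself if it is a TimePeriod.
    for period in root.iter(f"{{{GML}}}TimePeriod"):
        end = period.find("gml:endPosition", NS)
        if end is None:
            continue
        if end.get("indeterminatePosition", "").lower() == "now":
            begins.append(period.findtext("gml:beginPosition", "", NS))
    return begins


print(open_ended_periods(SAMPLE))  # ['2002-07-04T00:00:00Z']
```

A record with no matching endPosition simply yields an empty list, so the function can be applied to every record in a collection and the non-empty results counted.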
The table below shows where date elements are used in CMR metadata collections.
Table 7:
Dates in the CMR - No Attributes
Dates in the CMR - With Attributes
Observations
- Online Resource URLs for distributorTransferOptions and Keywords exist in all CMR metadata collections