
Description of Problem

  • Roles:
    • pick lists include roles that apply to organizations and roles that apply to individuals
    • role values are not the same for DIF and for ECHO
    • current collection records in CMR have non-standardized values for roles, including different variations of the same word (ARCHIVER, archive, archive data center) and different words for the same concept (PRODUCER, PROCESSOR?)
    • ECHO 10 records have Organization role built into field names (Processing Center, Archive Center) rather than as field values
    • EDSC shows Processing Center and Archiving Center on Collection Information panel, but these are blank for DIF collections
  • Contact Types:
    • contact type lists associated with organizations and personnel are not standardized
    • contact type lists associated with organizations and personnel do not include 'modern' values
  • Normalization:
    • organization and personnel information are repeated in multiple CMR collection records; in GCMD, they are normalized

                  (The normalization issues will be addressed later, in ECSE-91 and ECSE-99.)


JIRA Linkage

ECSE-92

ECSE-75


Background

  


 

Approach

For UMM and CMR, we propose the following fundamental tenets to guide the resolution of issues related to additional attributes and acquisition information:

  • Focus on the CMR community and their metadata uses. Metadata standards, recommendations, and guidance only make sense in the context of a community and its use cases for the metadata. When deciding between options for a UMM metadata structure, we choose the one that makes sense for the CMR community. We learn from other communities, such as the ISO community, but our focus is on the metadata needs of the CMR community.

  • Learn from ISO, but don’t blindly reuse ISO metadata structures. Whenever possible, have a single UMM metadata structure that might map to multiple locations in ISO. We don’t need the flexibility and complexity of ISO to support the rich metadata needs of the CMR.

  • Whenever possible, specific typed metadata structures should be created from the classification of additional attributes. For example, in UMM-C-136 our approach would move information from additional attributes to enhance the structure of Platform/Instrument characteristics instead of moving the characteristics into new additional attributes. Over time, this tenet will produce a richer UMM structure that evolves as our understanding of metadata, applications, and missions evolves.

Recommendations

  1. Roles
    a. RECOMMENDATION 1:  Distinguish between Roles for Organizations (call them Data Centers) and Roles for Personnel or Personnel Groups (call them Data Contacts), i.e.,
    change ResponsibilityRoleEnum to:

    DataCenterRoleEnum and

    DataContactRoleEnum

    The current list in the UMM-Common schema ResponsibilityRoleEnum mixes the three:

    "enum": ["RESOURCEPROVIDER", "CUSTODIAN", "OWNER", "USER", "DISTRIBUTOR", "ORIGINATOR", "POINTOFCONTACT", "PRINCIPALINVESTIGATOR", "PROCESSOR", "PUBLISHER", "AUTHOR", "SPONSOR", "COAUTHOR", "COLLABORATOR", "EDITOR", "MEDIATOR", "RIGHTSHOLDER", "CONTRIBUTOR", "FUNDER", "STAKEHOLDER"]

    (NOTE:  This corresponds to the CI_RoleCode code list in the UMM-Common document (Figure 3).)

     

    See also http://www.ngdc.noaa.gov/metadata/published/xsd/schema/resources/Codelist/gmxCodelists.xml#CI_RoleCode

     

     

     


    b. RECOMMENDATION 2: Assign the VALUES to DataCenterRoleEnum and DataContactRoleEnum as follows:

     

    i.  For Organizations (Data Centers), use:  Archiver, Processor, Distributor, and Originator

    i.e., DataCenterRoleEnum = "enum": ["ARCHIVER", "DISTRIBUTOR", "ORIGINATOR", "PROCESSOR"]

    (Note:  The GCMD Organization Type ENUM is:

    <xs:enumeration value="DISTRIBUTOR"/>

    <xs:enumeration value="ARCHIVER"/>

    <xs:enumeration value="ORIGINATOR"/>

    <xs:enumeration value="PROCESSOR"/>)

    CMR-2706:  Error because Organization was set to ARCHIVER.

    ii.  For Personnel (DataContact), use:

     

    DataContactRoleEnum = "enum": ["DATA CENTER CONTACT", "TECHNICAL CONTACT", "SCIENCE CONTACT", "INVESTIGATOR","METADATA AUTHOR", "USER SERVICES", "SCIENCE SOFTWARE"] 
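As a minimal sketch of how the two proposed enums could be enforced, the Python below restates them as sets and checks a role value against the list that matches the kind of party. The function name `validate_role` and its `kind` argument are hypothetical, introduced only for illustration; the role values are the ones proposed above.

```python
# Proposed role lists from Recommendation 2 (values copied from the text above).
DATA_CENTER_ROLES = {"ARCHIVER", "DISTRIBUTOR", "ORIGINATOR", "PROCESSOR"}
DATA_CONTACT_ROLES = {
    "DATA CENTER CONTACT", "TECHNICAL CONTACT", "SCIENCE CONTACT",
    "INVESTIGATOR", "METADATA AUTHOR", "USER SERVICES", "SCIENCE SOFTWARE",
}


def validate_role(kind: str, role: str) -> bool:
    """Return True if `role` is allowed for `kind` ("DataCenter" or "DataContact").

    Hypothetical helper: the comparison is exact, so variants such as
    "archiver" or "Archive Center" are rejected rather than normalized.
    """
    allowed = {"DataCenter": DATA_CENTER_ROLES, "DataContact": DATA_CONTACT_ROLES}
    if kind not in allowed:
        raise ValueError(f"unknown party kind: {kind!r}")
    return role in allowed[kind]
```

Under this check, ARCHIVER is accepted for a Data Center but rejected for a Data Contact, which is the separation Recommendation 1 asks for.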

...


ECSE-75

UMMC-412

UMMC-435

 

 

d. RECOMMENDATION 4:

Map ECHO 10 Organization metadata to UMM-C as follows:

/Collection/ProcessingCenter = <value> maps to DataCenter/Role = PROCESSOR, DataCenter/ShortName = <short name corresponding to the Processing Center value>, and DataCenter/LongName = <long name corresponding to the Processing Center value>

/Collection/ArchiveCenter = <value> maps to DataCenter/Role = ARCHIVER, DataCenter/ShortName = <short name corresponding to the Archive Center value>, and DataCenter/LongName = <long name corresponding to the Archive Center value>

CMR-600

ISSUE:  Write a ticket to determine how to map the <value> for Processing Center or Archive Center in the ECHO 10 record to the corresponding ShortName and LongName.  (This needs a list of the actual values in ECHO 10 records for Processing Center and Archive Center.)

 EXAMPLES:

In UAT, for Collection with Collection Shortname AQUARIOUS_L4_OISSS_IPRC_7DAY_V4, Version 1:

<ArchiveCenter>PO.DAAC</ArchiveCenter>  should translate to DataCenter/Role=ARCHIVER and DataCenter/ShortName = NASA/JPL/PODAAC

<ProcessingCenter>IPRC/SOEST University of Hawaii, Manoa</ProcessingCenter> should translate to DataCenter/Role=PROCESSOR and DataCenter/ShortName = UHI/SOEST/IPRC
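The ECHO 10 translation above could be sketched roughly as follows. This is an illustration only: the lookup table `CENTER_SHORT_NAMES` and the function `echo10_data_centers` are hypothetical names, the table holds just the two pairs from the example, and long names are omitted because the source of those values is the open issue noted above.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional

# Hypothetical lookup from ECHO 10 center values to Data Center short names.
# Only the two pairs from the example are included; the real list is still
# to be determined (see the ISSUE above).
CENTER_SHORT_NAMES: Dict[str, str] = {
    "PO.DAAC": "NASA/JPL/PODAAC",
    "IPRC/SOEST University of Hawaii, Manoa": "UHI/SOEST/IPRC",
}

# ECHO 10 field name -> UMM-C DataCenter role (Recommendation 4).
FIELD_TO_ROLE: Dict[str, str] = {
    "ProcessingCenter": "PROCESSOR",
    "ArchiveCenter": "ARCHIVER",
}


def echo10_data_centers(collection_xml: str) -> List[Dict[str, Optional[str]]]:
    """Build DataCenter entries from an ECHO 10 <Collection> fragment."""
    root = ET.fromstring(collection_xml)
    centers = []
    for field, role in FIELD_TO_ROLE.items():
        value = root.findtext(field)
        if value is None:
            continue  # the field is absent from this record
        centers.append({
            "Role": role,
            # None marks a center value with no known short name yet.
            "ShortName": CENTER_SHORT_NAMES.get(value.strip()),
        })
    return centers
```

Returning `None` for an unmapped value, rather than copying the raw ECHO 10 string into ShortName, keeps unreconciled records easy to find; that choice is an assumption of this sketch, not a decision recorded in this document.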

 

 

 e. RECOMMENDATION 5:

Map DIF 10 metadata to UMM-C as follows:

UMM-C DataCenter/Role  maps to DIF 10 Organization/Organization_Type

UMM-C DataCenter/ShortName maps to DIF 10 Organization/Organization_Name/Short_Name

 

EXAMPLE:

<Organization> <Organization_Type>DISTRIBUTOR</Organization_Type> <Organization_Name> <Short_Name>CA/EC/MSC</Short_Name> </Organization_Name> </Organization> maps to

 DataCenter/Role=DISTRIBUTOR and DataCenter/ShortName = CA/EC/MSC
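A minimal sketch of the DIF 10 mapping, assuming the Organization element arrives as a standalone, namespace-free XML fragment (real DIF 10 records may declare a namespace, which this sketch does not handle). The function name `dif10_data_center` is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def dif10_data_center(organization_xml: str) -> Dict[str, str]:
    """Map a DIF 10 <Organization> fragment to DataCenter Role and ShortName."""
    root = ET.fromstring(organization_xml)
    return {
        # Organization/Organization_Type -> DataCenter/Role
        "Role": (root.findtext("Organization_Type") or "").strip(),
        # Organization/Organization_Name/Short_Name -> DataCenter/ShortName
        "ShortName": (root.findtext("Organization_Name/Short_Name") or "").strip(),
    }
```

Because both paths are one-to-one, the same table can be read in the other direction when writing DIF 10 from UMM-C.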

 

 

 

2.   Contact Type  

RECOMMENDATION:  Create a schema ENUM list which combines the current GCMD list with three additional contact types proposed by MMT developers:  Email, Facebook, Twitter 

 a.  The GCMD has the following ENUM list for Contact Type. 

<xs:enumeration value="Direct Line"/>           

<xs:enumeration value="Primary"/>            

<xs:enumeration value="Telephone"/>            

<xs:enumeration value="Fax"/>            

<xs:enumeration value="Mobile"/>            

<xs:enumeration value="Modem"/>            

<xs:enumeration value="TDD/TTY Phone"/>            

<xs:enumeration value="U.S. toll free"/>            

<xs:enumeration value="Other"/>
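For illustration, the combined list could be assembled as below. This is a sketch under assumptions: the function name `contact_type_schema` is hypothetical, and the order of the values (GCMD values first, then the three MMT additions) is a choice made here, not something specified above.

```python
from typing import Dict, List

# GCMD Contact Type values, copied from the list above.
GCMD_CONTACT_TYPES: List[str] = [
    "Direct Line", "Primary", "Telephone", "Fax", "Mobile",
    "Modem", "TDD/TTY Phone", "U.S. toll free", "Other",
]

# Additional contact types proposed by MMT developers.
MMT_CONTACT_TYPES: List[str] = ["Email", "Facebook", "Twitter"]


def contact_type_schema() -> Dict[str, object]:
    """Return a JSON-Schema-style enum fragment combining both lists."""
    combined: List[str] = []
    for value in GCMD_CONTACT_TYPES + MMT_CONTACT_TYPES:
        if value not in combined:  # guard against accidental duplicates
            combined.append(value)
    return {"type": "string", "enum": combined}
```

The result has the same `"enum": [...]` shape used for the role lists earlier in this page.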

 

This recommendation has already been implemented on the MMT using MMT-538.

 

3.  Normalization

RECOMMENDATION:  Ultimately, store all Data Center and Data Contact information once in the CMR, rather than in every Collection record for those Data Centers and Data Contacts.  Flesh out this recommendation in the existing ECSE tickets:

ECSE-91

ECSE-99


 Changes to UMM-C fields and UMM-Common fields:

 

See the proposed class diagram for the element definitions, their types and cardinality in Lucidchart (you may need to log in to Lucidchart).

...

UMMC-336

 

Mappings to DIF, ECHO, ISO

The DIF-UMM-ECHO_Mapping.xlsx file can be found here: https://git.earthdata.nasa.gov/projects/EMFD/repos/unified-metadata-model

ECSE-45

 

Interoperability Considerations

During the original UMM-C review, it was decided that Organization and Personnel elements would be combined into a merged element called Responsible Party, similar to the ISO 19115-2 field CI_ResponsibleParty.  Later, during the UMM-Common review, it was decided to separate role from party, to allow for components or xlinks and to make the party element reusable.

...

Mappings from DataCenter and DataContact elements to ISO elements will be provided in the UMM-C/UMM-Common documents.

 

Changes to CMR

The new elements will need to be added and, if the old elements were indexed, possibly indexed as well.  Several element names have changed, so translation code will have to be implemented.

...

CMR-3233

 

What should the MMT forms look like

 

MMT-489

...

MMT-690

 

 

How should the values in these fields be presented to the user on the EDSC

 

EDSC-999

...

b. Also display ServiceHours, Contact_Instructions, and RelatedURL for each Data Contact, if these fields are present

 

How should pick lists / controlled vocabulary be handled?

 

| Data Center Field | Current Source of Pick list values (MMT) | Proposed Source of Pick list values | Is this field used for EDSC Faceted Search? |
|---|---|---|---|
| DataCenter/Role | Schema ENUM - current ResponsibilityRoleEnum | Schema ENUM - proposed DataCenterRoleEnum ["ARCHIVER", "DISTRIBUTOR", "ORIGINATOR", "PROCESSOR"] | no |
| DataCenter/ShortName | KMS (Data Center) | KMS (Data Center) | yes |
| DataCenter/LongName | KMS, or auto fill from KMS after selecting the short name | KMS, or auto fill from KMS after selecting the short name | no |
| DataCenter/ContactInformation/ContactMechanism/Type | GCMD list plus current MMT values | Schema ENUM - proposed (GCMD list plus current MMT values) | no |
| DataCenter/ContactInformation/Address/Country | ISO | ISO | no |
| DataCenter/ContactInformation/Address/StateProvince | ISO | ISO | no |
| DataCenter/ContactInformation/RelatedURLs | auto fill from KMS after selecting short name or long name | auto fill from KMS after selecting short name or long name | no |

| Data Contact Field | Current Source of Pick list values (MMT) | Proposed Source of Pick list values | Is this field used for EDSC Faceted Search? |
|---|---|---|---|
| DataContact/Role, DataCenter/DataContact/Role | Schema ENUM - current ResponsibilityRoleEnum | Schema ENUM - proposed DataContactRoleEnum | no |
| DataContact/ContactInformation/ContactMechanism/Type, DataCenter/DataContact/ContactInformation/ContactMechanism/Type | GCMD list plus current MMT values | Schema ENUM - proposed (GCMD list plus current MMT values) | no |
| DataContact/ContactInformation/Address/Country, DataCenter/DataContact/ContactInformation/Address/Country | ISO | ISO | no |
| DataContact/ContactInformation/Address/StateProvince, DataCenter/DataContact/ContactInformation/Address/StateProvince | ISO | ISO | no |

Reconciliation of Existing Metadata values with new rules

UMMC-427

...

Correct organization roles that are really personnel roles

 

Impact of Changes on Ingest of Granule Metadata 

 None - Organization and Personnel information can be changed in a Collection record without impacting the Collection's granule records.

 

Approvals