Element Description
A link used to directly obtain the data. This is different from a GET SERVICE URL, which relates to methods of sub-setting and/or transforming the data before obtaining it. For details concerning GET SERVICE URLs, please see the GET SERVICE wiki page.
Best Practices
The GET DATA Related URL metadata element allows for the linkage of a metadata record to a location on the web where data may be directly accessed. As mentioned on the Related URLs wiki page, there are several sub-elements which are used to identify the purpose of the URL. For GET DATA links specifically, best practices for these elements include:
URL Content Type: The URL Content Type is a keyword which, at a high level, describes the content of a link. This is a controlled vocabulary field maintained as an enumeration list within the UMM-Common schema. For GET DATA URLs, the URL Content Type should always be "DistributionURL".
URL Type: The URL Type is a keyword which specifies the content of a link. URL Type keywords are maintained in the Keyword Management System (KMS). For GET DATA URLs, the URL Type should always be "GET DATA".
URL Subtype: The URL Subtype is a keyword which further specifies the content of a link. Together, the URL Type and Subtype keywords create a keyword hierarchy which is used to identify the URL. Providing a Subtype for GET DATA URLs is optional, but should be used when applicable. Currently (as of 7/25/2018) the following Subtype keywords are valid under GET DATA: <insert list> Should we include examples/ a brief description of each subtype?
Description: While not required, it is highly recommended that a description be provided for each URL in the metadata. The description should be brief and should tell the user that the link leads to a data access point. Descriptions should be unique to the link: while a description can be repeated for the same type of URL across different metadata records, it is generally advised that the same description not be repeated within a single metadata record. In other words, the description should be used to differentiate two GET DATA URLs that share the same URL Type and Subtype.
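The fixed values described above can be captured in a small helper. The following is a minimal sketch, assuming a Related URL is represented as a Python dict whose keys mirror the UMM element names (URL, URLContentType, Type, Subtype, Description). The function names `make_get_data_url` and `repeated_descriptions` are hypothetical and are not part of any CMR tooling.

```python
from collections import Counter


def make_get_data_url(url, description=None, subtype=None):
    """Build a Related URL entry for a GET DATA link (hypothetical helper)."""
    entry = {
        "URL": url,
        "URLContentType": "DistributionURL",  # fixed value for GET DATA links
        "Type": "GET DATA",                   # fixed value for GET DATA links
    }
    # Subtype and Description are optional, so only add them when supplied.
    if subtype is not None:
        entry["Subtype"] = subtype
    if description is not None:
        entry["Description"] = description
    return entry


def repeated_descriptions(related_urls):
    """Return descriptions used more than once within a single record."""
    counts = Counter(e["Description"] for e in related_urls if "Description" in e)
    return [text for text, n in counts.items() if n > 1]
```

Running `repeated_descriptions` over all Related URLs in one record gives a quick way to spot descriptions that fail to distinguish otherwise similar links.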
A GET DATA URL is required for all NASA data sets. For NASA EOSDIS data, data access should be behind URS authentication, and it is recommended that data access be provided via HTTPS rather than FTP.
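The protocol recommendation can be checked mechanically. This is a minimal sketch using only the Python standard library; the function name `uses_recommended_protocol` is hypothetical, and the check looks only at the URL scheme, not at whether the link is behind authentication.

```python
from urllib.parse import urlparse


def uses_recommended_protocol(url):
    """Return True when the URL uses HTTPS, the protocol preferred over FTP."""
    return urlparse(url).scheme.lower() == "https"
```

A reviewer could apply this to every GET DATA URL in a record and flag any link for which it returns False.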
There are several sub-elements specifically designated for GET DATA URLs. The following provides best practices for each of the sub-elements:
RelatedUrls/GetData/Format: The format of the data provided via the associated URL. Providing the format is required??. Format is a controlled vocabulary field and should be chosen from the <insert GCMD data format keyword list>. If the data is provided in a compressed file format, it is recommended that the format listed be that of the data once it is uncompressed.
RelatedUrls/GetData/MimeType: The mime type of the associated URL. Mime Type is a controlled vocabulary field and should be chosen from the <insert GCMD mime type keyword list>. Providing a Mime Type is optional.
RelatedUrls/GetData/Size: Really only makes sense to provide if the link is a direct download. Is this going to remain a required field?
RelatedUrls/GetData/Unit: Really only makes sense to provide if the link is a direct download. Is this going to remain a required field?
RelatedUrls/GetData/Fees: The fee (if any) for ordering the data. The fee should be a number in U.S. dollars. This is an optional field.
RelatedUrls/GetData/Checksum: Does it only make sense to provide a checksum if the link directly downloads a file? Or should you provide one if the link just takes you to another web page e.g. a 'data tree' type of page? Would like to provide some guidelines for when use of this field is encouraged.
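For direct downloads, the Size and Unit pair can be derived from a file's byte count. The sketch below is illustrative only: the helper name `size_and_unit` is hypothetical, the unit list is taken from the Unit enumeration (KB, MB, GB, TB, PB), and the choice of 1024 bytes per KB is an assumption, since this page does not say whether units are binary or decimal.

```python
UNITS = ["KB", "MB", "GB", "TB", "PB"]  # values allowed by the Unit enumeration


def size_and_unit(num_bytes, base=1024):
    """Convert a byte count to a (size, unit) pair.

    base=1024 is an assumption; pass base=1000 for decimal units.
    """
    size = num_bytes / base
    for unit in UNITS:
        # Stop at the first unit that keeps the number below one "base",
        # or at the largest available unit.
        if size < base or unit == UNITS[-1]:
            return round(size, 1), unit
        size /= base
```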
Examples:
URL: https://hydro1.gesdisc.eosdis.nasa.gov/data/FLDAS/FLDAS_VIC025_C_EA_M.001/
URL Content Type: DistributionURL
URL Type: GET DATA
URL Subtype: DATA TREE
Description: Use the link to access the data via HTTPS. Files are organized by date.
Format: NetCDF-4
Mime Type: text/html
Size: Is this element getting updated?
Unit: MB
Fees: 0
URL: https://daac.ornl.gov/cgi-bin/download.pl?ds_id=465&source=dsviewer
URL Content Type: DistributionURL
URL Type: GET DATA
URL Subtype: DIRECT DOWNLOAD
Description: Downloads the NPP Boreal Forest: Canal Flats, Canada, 1984, R1 data set directly to your workstation.
Format: Text File
Size: 91.8
Unit: KB
Fees: 0
Checksum: f2 aa 78 d6 82 5e c4 2d 78 35 81 a8 d5 ea 1f 68
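The checksum in the second example is written as 16 space-separated hexadecimal bytes. That length matches an MD5 digest, but this page does not name the algorithm, so treating it as MD5 is an assumption. Under that assumption, a value in the same layout could be produced as follows (the function name `md5_hex_pairs` is hypothetical):

```python
import hashlib


def md5_hex_pairs(data: bytes) -> str:
    """Return the MD5 digest of data as space-separated hex byte pairs."""
    digest = hashlib.md5(data).hexdigest()  # 32 hex characters
    # Split into two-character pairs to match the layout used in the example.
    return " ".join(digest[i:i + 2] for i in range(0, len(digest), 2))
```

A digest formatted this way is 47 characters long, which fits within the 1 - 50 character limit on the Checksum element.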
Element Specification
An unlimited number of Related URLs may be listed (Cardinality: 0..*)
Model | Element | Type | Usable Valid Values | Constraints | Required? | Cardinality | Notes |
---|---|---|---|---|---|---|---|
UMM-Common | RelatedUrls/URL | String | n/a | 1 - 1024 characters | Yes | 1 | The GET DATA URL should point the user to a location where data files may be directly downloaded. |
UMM-Common | RelatedUrls/Description | String | n/a | 1 - 4000 characters | No | 0..1 | It is strongly recommended that a description be provided for each URL. |
UMM-Common | RelatedUrls/URLContentType | Enumeration | CollectionURL PublicationURL DataCenterURL DistributionURL DataContactURL VisualizationURL | n/a | Yes | 1 | "DistributionURL" is the only valid option for links used to obtain the data. |
UMM-Common | RelatedUrls/Type | String | KMS controlled | n/a | Yes | 1 | "GET DATA" should be provided as the Type. |
UMM-Common | RelatedUrls/Subtype | String | KMS controlled | n/a | No | 0..1 | The Type and Subtype are part of a keyword hierarchy specified in the KMS. Any Subtype listed under GET DATA in the keyword list is a valid option. If none of the available Subtypes are appropriate for the URL, then it is okay to leave the Subtype field blank. |
UMM-Common | RelatedUrls/GetData/Format | String | KMS controlled | n/a | Yes | 1 | Are we keeping this field? The format of the data provided via the associated URL. |
UMM-Common | RelatedUrls/GetData/MimeType | String | KMS controlled | n/a | No | 0..1 | The mime type of the associated URL. |
UMM-Common | RelatedUrls/GetData/Size | Number | n/a | n/a | Yes | 1 | Really only makes sense to provide if the link is a direct download. Is this going to remain a required field? The size of the data obtained via the associated URL. |
UMM-Common | RelatedUrls/GetData/Unit | Enumeration | KB MB GB TB PB | n/a | Yes | 1 | Really only makes sense to provide if the link is a direct download. Is this going to remain a required field? Unit is required if information is provided in the 'Size' element. |
UMM-Common | RelatedUrls/GetData/Fees | String | n/a | 1 - 80 characters | No | 0..1 | The fee (if any) for ordering the data. The fee should be a number in U.S. dollars. |
UMM-Common | RelatedUrls/GetData/Checksum | String | n/a | 1 - 50 characters | No | 0..1 | Does it only make sense to provide a checksum if the link directly downloads a file? Or should you provide one if the link just takes you to another web page e.g. a 'data tree' type of page? Would like to provide some guidelines for when use of this field is encouraged. |
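The constraints in the table above lend themselves to a simple automated check. The following is a rough sketch, not the CMR's own validation: it assumes a Related URL is held as a Python dict with a nested `GetData` dict whose keys mirror the element paths in the table, and the name `check_get_data_url` is hypothetical. It deliberately does not enforce whether Format, Size, and Unit are required, because the table marks those as open questions, and it does not check Type or Format values against the KMS.

```python
GET_DATA_UNITS = {"KB", "MB", "GB", "TB", "PB"}


def check_get_data_url(entry):
    """Return a list of problems found in one GET DATA Related URL entry."""
    problems = []

    def check_length(value, name, low, high):
        # Length limits come from the Constraints column of the table.
        if not isinstance(value, str) or not (low <= len(value) <= high):
            problems.append(f"{name} must be a string of {low}-{high} characters")

    check_length(entry.get("URL"), "URL", 1, 1024)
    if "Description" in entry:
        check_length(entry["Description"], "Description", 1, 4000)
    if entry.get("URLContentType") != "DistributionURL":
        problems.append('URLContentType must be "DistributionURL"')
    if entry.get("Type") != "GET DATA":
        problems.append('Type must be "GET DATA"')

    get_data = entry.get("GetData", {})
    # Unit is needed whenever a Size is supplied.
    if "Size" in get_data and get_data.get("Unit") not in GET_DATA_UNITS:
        problems.append("Unit must be one of KB, MB, GB, TB, PB when Size is given")
    if "Fees" in get_data:
        check_length(get_data["Fees"], "GetData/Fees", 1, 80)
    if "Checksum" in get_data:
        check_length(get_data["Checksum"], "GetData/Checksum", 1, 50)
    return problems
```

An empty list means the entry passed every check in the sketch; each string in a non-empty list names one constraint that was not met.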
Metadata Validation and QA/QC
All metadata entering the CMR goes through the process below to ensure that metadata quality requirements are met. All records undergo CMR validation before entering the system. The QA/QC process differs slightly for NASA and non-NASA data providers. Non-NASA providers include interagency and international data providers and are referred to as the International Directory Network (IDN).
Please see the expandable sections below for flowchart details.
Dialect Mappings
UMM Migration
None
Future Mappings
History
UMM Versioning
Version | Date | What Changed |
---|---|---|
1.10.0 | 5/2/2018 | <> |
1.9.0 | | |
ARC Documentation
Version | Date | What Changed | Author |
---|---|---|---|
1.0 | 6/13/18 | Recommendations/priority matrix transferred from internal ARC documentation to wiki space | Jeanne' le Roux |