Chapter 2: Getting Started
This chapter discusses:
- creating and managing user accounts
- CMR session management: creating, using, and deleting tokens to provide authorization
If you only need to search for and retrieve publicly available data, you can skip this chapter and go straight to Chapter 3.
User Accounts
User accounts are used to access restricted data, manage privileges, and interact with other services and tools provided by the CMR or ECHO. User accounts for the CMR system are created and managed by the EARTHDATA Login (URS) system. If you need an account and don't already have one, go to EARTHDATA Login to create one. Once it is created, you can return to EARTHDATA Login at any time to manage it. If you are part of a Data Provider group or other team, the team administrator can set up permissions for you to access the team's restricted data. If you need special privileges, contact the CMR operational team at support@earthdata.nasa.gov.
Creating and Managing CMR Sessions
The CMR uses tokens in request messages (the HTTP calls to the CMR) to validate, on each request, who the requester is and what privileges they have. For most searches a token is not needed because the metadata records are open to everyone. When metadata records are restricted, a token is needed so that privileged users can see and access those records. A session is nothing more than a series of requests that use the same token, meaning that you can use the same token for many requests before you delete it. All tokens expire at the end of a fixed time period; at the time of this writing the duration is 30 days. Because the token is used to track your session, client applications must protect it with the same level of security that you use for your login name and password.
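Because tokens expire, a client that keeps a token for a long-running session may want to estimate when it will stop working. The following is a minimal sketch, assuming the 30-day duration quoted above still applies and that the client recorded when the token was created; the class and method names are illustrative and not part of the CMR API.

```java
import java.time.Duration;
import java.time.Instant;

// Illustrative helper, not part of the CMR API.
final class TokenLifetime {
    // Assumption: the 30-day duration quoted in this guide. Treat it as configurable,
    // because the guide only states the value "at the time of this writing".
    static final Duration ASSUMED_LIFETIME = Duration.ofDays(30);

    private TokenLifetime() {}

    /** Estimates when a token created at the given instant will expire. */
    static Instant estimatedExpiry(Instant createdAt) {
        return createdAt.plus(ASSUMED_LIFETIME);
    }

    /** Returns true once the estimated expiry time has been reached. */
    static boolean isLikelyExpired(Instant createdAt, Instant now) {
        return !now.isBefore(estimatedExpiry(createdAt));
    }
}
```

A client could call `TokenLifetime.isLikelyExpired(createdAt, Instant.now())` before each request and create a new token when it returns true. The CMR remains the authority on whether a token is still valid.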
To conduct a session, the normal steps are:
- Create a token
- Do one or more of the following in any order:
  - Search for records
  - Retrieve records
- Delete the token
Create a Token
Now that the client partner has created a token, they can search for and retrieve records and use other functionality through the CMR or ECHO APIs. This functionality is covered in later chapters of this document. Once the client has finished interacting with the CMR, the token can be deleted.
Delete the Token
CMR Environment URLs
All of the examples provided below use the Systems Integration Test (SIT) environment. To run the commands in another environment, replace the SIT base URL with the UAT or OPS base URL.
CMR Environment | Base API URL |
---|---|
Operational (OPS) | |
User Acceptance Test (UAT) | |
Systems Integration Test (SIT) | |
Interacting with the CMR
The REST way
You can interact with any CMR-REST resource as a guest by simply issuing a GET, POST, PUT, or DELETE request against the resource. Some of these operations will fail if the guest user does not have the authority to perform them. When you want to act as a registered user, acquire a CMR token as described above and attach it as a header to each request you make. For example:
    Request headers:
        Content-Type: application/xml
        CMR-Token: CMR-TOKEN-ID
    Request:
        GET https://api.cmr.nasa.gov/cmr-rest/providers
    Response headers:
        Status Code: 200 OK
    Response Body:
        <?xml version="1.0" encoding="UTF-8"?>
        <providers type="array">
          <provider> ... </provider>
        </providers>
The SOAP way
Interacting with the rest of the CMR API follows the same pattern as logging in and logging out except that it requires that you pass a valid CMR token to each operation. The following example shows logging in, retrieving the version number of the CMR system and logging back out.
    // Login
    String token = authenticationService.login("jdoe", "mypass",
            new ClientInformation("A Client", "192.168.1.1"), null, null);

    // Print out CMR version number.
    System.out.println("CMR's version number: "
            + authenticationService.getCMRVersion(token));

    // Logout using token from previous login
    authenticationService.logout(token);
    token = null;
Chapter 3: Searching for earth science metadata
There are several ways to search in the CMR, including the RESTful service API and the Alternative Query Language (AQL). The most popular way is to use the RESTful API. The API documentation is located at https://cmr.earthdata.nasa.gov/search/site/search_api_docs.html.
Headers
Headers are a part of HTTP requests. For the CMR they provide information such as the content type of the message (Content-Type), a token to allow increased privileges (Echo-Token), and the format of the data that gets returned (Accept). Content-Type is a standard HTTP header that specifies the content type of the body of the request for POST messages. Ingest supports the following content types for ingesting metadata.
Content-Type headers
Standard | Content-Type |
---|---|
DIF 10 | application/dif10+xml |
DIF 9 | application/dif+xml |
ECHO 10 | application/echo10+xml |
ISO 19115 (MENDS) | application/iso19115+xml |
ISO 19115 SMAP | application/iso:smap+xml |
If the caller wishes to control in what format - XML or JSON - the data gets returned they can use the Accept header. The following table lists the valid values. If this header is not used, XML results will be returned.
Accept Headers
Type Received | Accept Header Value |
---|---|
XML | application/xml |
JSON | application/json |
The Echo-Token header allows the CMR to know who is making a request. The token is in the format XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX. A token must first be generated as described in the next section. Once the requester has the token, it can be placed in the HTTP headers of the API calls that need it.
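Before sending a request, a client may want to catch obviously malformed tokens. The sketch below checks only the 8-4-4-4-12 shape described above, assuming each X is a hexadecimal digit (as in the sample token used in the curl examples in this section). The class and method names are illustrative and not part of the CMR API, and a well-formed token may still be expired or unknown to the CMR.

```java
import java.util.regex.Pattern;

// Illustrative helper, not part of the CMR API.
final class TokenFormat {
    // Assumption: each X in XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX is a hexadecimal digit.
    private static final Pattern SHAPE = Pattern.compile(
            "[0-9A-Fa-f]{8}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{12}");

    private TokenFormat() {}

    /** Returns true if the text has the expected shape; says nothing about validity. */
    static boolean looksLikeToken(String text) {
        return text != null && SHAPE.matcher(text).matches();
    }
}
```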
The following examples show how to use the headers. Only the header options (-H) are explained in this section; the rest of each command is described later:
The following curl command requests that a metadata record called something1 be validated. The Content-Type header states that the something1.xml file being validated uses the ECHO 10 specification in XML format. The Accept header states that we want the results in XML format. The Echo-Token header allows the CMR to know who is making the request for authorization purposes.
    curl -v -XPOST -H "Content-type: application/echo10+xml" -H "Accept: application/xml" -H "Echo-Token: 75E5CEBE-6BBB-2FB5-A613-0368A361D0B6" \
      -d @../../Downloads/records/something1.xml https://cmr.sit.earthdata.nasa.gov/ingest/providers/<Provider ID>/validate/collection/something1
The following curl command requests that the same record, something1, be validated against a different specification. The Content-Type header states that the something1.xml file being validated uses the ISO 19115 SMAP specification in XML format. The Accept header states that we want the results in JSON format. The Echo-Token header allows the CMR to know who is making the request for authorization purposes.
    curl -v -XPOST -H "Content-type: application/iso:smap+xml" -H "Accept: application/json" -H "Echo-Token: 75E5CEBE-6BBB-2FB5-A613-0368A361D0B6" \
      -d @../../Downloads/records/something1.xml https://cmr.sit.earthdata.nasa.gov/ingest/providers/<Provider ID>/validate/collection/something1
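The same three headers can be attached from Java. The following is a minimal sketch using the standard `java.net.http` client (Java 11 or later); it only builds the request and does not send it. The class name, the method name, and the provider ID `PROV1` used in the usage note are hypothetical, and the URL layout is copied from the curl examples above.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Illustrative sketch; builds, but does not send, a collection validation request.
final class ValidateRequestSketch {
    private ValidateRequestSketch() {}

    static HttpRequest build(String providerId, String recordName,
                             String metadataXml, String token) {
        // URL layout taken from the curl examples (SIT environment).
        URI uri = URI.create("https://cmr.sit.earthdata.nasa.gov/ingest/providers/"
                + providerId + "/validate/collection/" + recordName);
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", "application/echo10+xml") // format of the body
                .header("Accept", "application/xml")              // format of the response
                .header("Echo-Token", token)                      // who is asking
                .POST(HttpRequest.BodyPublishers.ofString(metadataXml))
                .build();
    }
}
```

A caller might build the request with `ValidateRequestSketch.build("PROV1", "something1", xml, token)` and send it with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`.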
Table 2: Query Return Result Types
Value | Description |
---|---|
RESULTS | Returns the detailed metadata for items that match the query directly in the response. When using this option, you may choose to limit the actual metadata values returned. In addition, the CMR will return a result set identifier (ResultSetGUID) for subsequent retrievals of the results or for paging through the result list. Note that CMR Operations limits the maximum number of items returned to 2,000 at a time. The complete results are stored in your result set which you can retrieve by using the GetQueryResults operation. |
RESULT_SET_GUID | Returns the result set guid of the results that are stored on the server. The CMR will generate a result set but will not return any results. You must subsequently retrieve the results using the GetQueryResults operation. |
HITS | Returns the number of hits (matches) to the query and a ResultSetGUID for the results stored on the server. The CMR will generate a result set but will not return any results. For large result sets, the number of records may be statistically estimated. You must subsequently retrieve the results using the GetQueryResults operation. HITS is a relatively expensive operation; if the client only needs to know whether some data exists, it is faster to query for ITEM_GUIDS with a small iterator size. |
ITEM_GUIDS | Returns the Catalog Item GUIDs that match the specified query. Note: No ResultSetGUID is returned since results do not persist in the system. All the GUIDs of the granules/collections that satisfy the query are returned to the client. It is the client's responsibility to request the metadata for each individual granule/collection using the GetCatalogItemMetadata operation discussed later. |
Formatting Your Query Results
You can specify a subset of the information in the result set by using different parameters for the operations ExecuteQuery and GetQueryResults.
The following elements are used to specify the format and content of a result set:
Table 3: Result Set Content Elements
Argument | Description |
---|---|
IteratorSize | Specifies the number of results to be returned from a single operation. This does not limit the number of items a query may match (see MaxResults) but limits to 2,000 the number of matching items returned in the result set, starting from the Cursor position. This field is only used if the result type is set to RESULTS. |
Cursor | Specifies the index of the first record to be returned in the result set. For example, a value of 5 will return results starting from the fifth record. If none is specified, it defaults to 1. If you repeat the same query later, use the same Cursor value. This field is only used if the result type is set to RESULTS. |
MetadataAttributes | Specifies which fields of the CMR Metadata you actually want to return. By only requesting the parts of the metadata you are interested in, you can increase query performance substantially. By default, the CMR returns all of the metadata for each item. This field is only used if the result type is set to RESULTS. |
Metadata results are returned as XML that conforms to one of the following DTDs:
- Granule metadata conforms to the Granule Results DTD. Refer to Appendix F, Results DTDs (also located at: http://api.cmr.nasa.gov/cmr/dtd/CMRGranuleResults.dtd).
- Collection metadata conforms to the Collection Results DTD. Refer to Appendix F, Results DTDs (also located at: http://api.cmr.nasa.gov/cmr/dtd/CMRCollectionResults.dtd).
Metadata attributes are made up of two values: the XML metadata attribute name and a primitive type name. The CMR currently ignores the type name. The allowable metadata attribute names are specified in the appropriate DTD (CMRGranuleResults.dtd for granules and CMRCollectionResults.dtd for collections). If you specify a metadata attribute name that has sub-attributes, all of the sub-attributes will be included as well. For example, if you specify Platform, the following elements will be included in the metadata:
- Platform
- PlatformShortName
- Instrument
- InstrumentShortName
- Sensor
- SensorShortName
- SensorCharacteristics
- SensorCharacteristicName
- SensorCharacteristicValue
- OperationMode
When you specify a sub-attribute, the CMR will return the "parent" attribute in the hierarchy as well as the sub-attribute. This allows you to ensure that data are correctly scoped. For example, if you specify Sensor, the following elements will be included in the metadata:
- Platform
- PlatformShortName
- Instrument
- InstrumentShortName
- GranuleUR
- GranuleURMetaData
Detailed spatial attributes cannot be used as MetadataAttributes; only their containing element may be specified. For example, you cannot use BoundingBox as a MetadataAttribute, but you can use HorizontalSpatialDomainContainer. The following spatial elements cannot be specified as MetadataAttributes:
- Point
- Circle
- BoundingRectangle
- GPolygon
- Polygon
- PointLongitude
- PointLatitude
- CenterLongitude
- CenterLatitude
- Radius
- WestBoundingCoordinate
- NorthBoundingCoordinate
- EastBoundingCoordinate
- SouthBoundingCoordinate
- Boundary
- ExclusiveZone
- SinglePolygon
- MultiPolygon
- OutRing
- InnerRing
Specifying GranuleURMetaData as a MetadataAttribute is equivalent to not specifying any MetadataAttributes; the result set includes all the elements in the result DTD.
The following code snippet shows how to execute a query for all of the metadata for matching items, and then how to run the same query while restricting the returned metadata to a single attribute.
    // Execute a query and return all of the metadata for each matching item
    QueryResponse response = catalogService.executeQuery(userToken, queryString,
            ResultType.RESULTS,
            10,    // iterator size
            0,     // cursor
            3000,  // max results
            null); // no metadata attributes specified

    // Execute the same query, returning only the HorizontalSpatialDomainContainer element
    MetadataAttribute[] attributes = new MetadataAttribute[] {
        new MetadataAttribute("HorizontalSpatialDomainContainer", null)
    };
    QueryResponse spatialResponse = catalogService.executeQuery(userToken, queryString,
            ResultType.RESULTS,
            10,    // iterator size
            0,     // cursor
            3000,  // max results
            attributes);
Handling Large Result Sets
Given the CMR's large store of Earth Science data, it is possible for queries to return very large result sets. The CMR supports retrieving the results from a query in a number of ways. The simplest is to ask the CMR to return the results directly from the ExecuteQuery request by passing RESULTS as the ResultType. However, to prevent a single query from monopolizing CMR resources, the CMR limits the number of results available in response to a query. By default, this limit is 2,000 items. CMR Operations may change this limit depending on CMR usage patterns.
For larger results, the CMR supports a paging mechanism. This allows you to page through the available data in page sizes that you select (up to the CMR Operations configurable limit). For all ResultTypes, the CMR will create and store a result set and return the corresponding GUID. You can page through the result set using the GetQueryResults operation. The arguments to GetQueryResults are similar to ExecuteQuery with the exception that you specify the result set GUID rather than a new query.
Result sets may change after they are created because providers are continually changing the data they have registered in the CMR: records may be added to or removed from a result set. Because of this, you should watch the fields Cursor and CursorAtEnd when paging through a large result set:
- Cursor specifies the index of the first record to be returned in the result set. For example, a value of 5 will return results starting from the fifth record. If none is specified, it defaults to 1. If you repeat the same query later, use the same Cursor value.
- Use CursorAtEnd to determine when you have reached the end of the result set. This Boolean field is TRUE if the returned results were the last available results in the result set.
The following code illustrates paging through a result set and displaying it to the user.
    final int ITERATOR_SIZE = 10;
    try {
        CatalogServiceLocator catalogServiceLocator = new CatalogServiceLocator();
        CatalogServicePort catalogService = catalogServiceLocator.getCatalogServicePort();
        QueryResponse response = catalogService.executeQuery(userToken, userQuery,
                ResultType.RESULT_SET_GUID, 0, 0, 1000, null);
        String resultSetGuid = response.getResults().getResultSetGuid();

        // Begin paging through results
        int cursor = 1;
        boolean atEnd = false;
        while (!atEnd) {
            // Get next ITERATOR_SIZE results
            QueryResults results = catalogService.getQueryResults(userToken,
                    resultSetGuid, null, ITERATOR_SIZE, cursor);
            // Print out results
            System.out.println(results.getReturnData());
            // Set cursor to next index
            cursor = results.getCursor();
            // Check if at end of result set
            atEnd = results.isCursorAtEnd();
        }
        System.out.println("All results retrieved");
    } catch (CMRFault e) {
        e.printStackTrace();
    } catch (ServiceException e) {
        e.printStackTrace();
    } catch (RemoteException e) {
        e.printStackTrace();
    }
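The cursor bookkeeping in the loop above can be exercised without a CMR connection. The sketch below pages through an in-memory list that stands in for a stored result set. The `Page` type and the `getPage` method are hypothetical stand-ins rather than CMR client classes, and the sketch assumes that the cursor is a 1-based index and that each page reports the cursor to use for the next request.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative simulation of cursor-based paging; not CMR client code.
final class PagingSketch {
    /** Hypothetical stand-in for one page of results. */
    static final class Page {
        final List<String> items;
        final int nextCursor;      // 1-based index of the first item of the next page
        final boolean cursorAtEnd; // true if this page reached the end of the result set

        Page(List<String> items, int nextCursor, boolean cursorAtEnd) {
            this.items = items;
            this.nextCursor = nextCursor;
            this.cursorAtEnd = cursorAtEnd;
        }
    }

    private PagingSketch() {}

    /** Returns up to iteratorSize items starting at the 1-based cursor position. */
    static Page getPage(List<String> resultSet, int iteratorSize, int cursor) {
        if (iteratorSize < 1 || cursor < 1) {
            throw new IllegalArgumentException("iteratorSize and cursor must be >= 1");
        }
        int from = Math.min(cursor - 1, resultSet.size());
        int to = Math.min(from + iteratorSize, resultSet.size());
        List<String> items = new ArrayList<>(resultSet.subList(from, to));
        return new Page(items, to + 1, to >= resultSet.size());
    }

    /** Pages through the whole result set, mirroring the while loop above. */
    static List<String> fetchAll(List<String> resultSet, int iteratorSize) {
        List<String> all = new ArrayList<>();
        int cursor = 1;
        boolean atEnd = false;
        while (!atEnd) {
            Page page = getPage(resultSet, iteratorSize, cursor);
            all.addAll(page.items);
            cursor = page.nextCursor;
            atEnd = page.cursorAtEnd;
        }
        return all;
    }
}
```

With 25 items and an iterator size of 10, the loop makes three requests (cursors 1, 11, and 21), and only the last page reports that the cursor is at the end.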
Like ExecuteQuery, GetQueryResults takes an array of MetadataAttributes. Internally, the CMR stores in a result set only the item IDs that match a given query. This means that you can pull different metadata from a single result set with each call by varying what you pass in the MetadataAttribute array, without needing to re-query the CMR. It is highly recommended that you use the MetadataAttribute array to restrict the information the CMR returns and thus improve performance.
Visibility of Results
When you execute a query, the query is applied to all the data in the CMR. However, when the results are retrieved, you may not see all of the items. What you can see depends on the rules defined by the Data Partners and the privileges granted to you.
Restricted Items
If a particular item in your result set is restricted for you (i.e., you are not allowed to see it), based on your privileges, it will not be returned.
Deleted Items
It is possible that between the time you execute a query and the time you view the results some of the matched items may have been deleted from the CMR or restricted due to a request from the Data Partner who owns the metadata. In that case, the item will not be returned in your result set. For more information about notification of deleted or restricted order items, refer to section 7.8.1, Restricted or Deleted Order Items.
Querying for Orderable Data
The CMR allows you to exclude from your query data that cannot be ordered. Refer to section 1.1.1.
Searching for Orbit Data
4.4.1 Backtrack Orbit Search Algorithm
Orbit searching is by far the most accurate way to search for level 0-2 orbital swath data. Unfortunately, orbital mechanics is a difficult field, and the best-known orbit model, the NORAD Propagator, is quite complex. The NORAD Propagator is designed to work with a wide range of possible orbits, from circular to extremely elliptical, and consequently requires quite a bit of information about the orbit to model it well.
To facilitate earth science, the orbits of satellites gathering earth science data are quite restricted compared to the variety of orbits the NORAD Propagator is designed to work with. Generally, the earth science community would like global coverage, with a constant field of view, at the same time every day. For this reason, most earth science satellites are in a sun-synchronous, near-polar orbit. Even missions that are not interested in global coverage, e.g., the Tropical Rainfall Measuring Mission (TRMM), are still interested in having a constant field of view so the coverage of the sensor is at a constant resolution. For this reason, ALL earth science satellites are in circular orbits.
The Backtrack Orbit Search Algorithm, designed and developed by Ross Swick, exploits this fact to simplify the orbit model by modeling an orbit as a great circle under which the Earth rotates. This reduces the number of orbital elements required for the model from 22 to three. Moreover, the NORAD Propagator is designed to predict future orbits based on current status, and consequently must be reinitialized periodically to correct for cumulative error as the model spins forward. As the name implies Backtrack spins the orbit backwards, and in practice spins backwards at most one orbit, so there is no cumulative error.
For more information on Backtrack, please see http://geospatialmethods.org/bosa/.
Figure 2. Typical Orbit Path Represented on a Globe and the same Path on a Map
Backtrack orbit model
Three parameters to define an orbit:
- Instrument swath width (in kilometers)
- Satellite declination or inclination (in degrees)
- Satellite period (in minutes)
Orbit data representation
Three parameters to represent orbit data:
- Equatorial crossing longitude
- Start circular latitude (or start latitude and start direction)
- End circular latitude (or end latitude and end direction)
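As a rough illustration of treating an orbit as a great circle under which the Earth rotates, the sketch below estimates how far west the equatorial crossing longitude moves from one orbit to the next, given only the satellite period. This is not the Backtrack algorithm itself. It assumes the Earth turns 360 degrees in a 1,440-minute day, which ignores the difference between solar and sidereal days and any precession of the orbit, and the class and method names are illustrative.

```java
// Illustrative arithmetic only; not the Backtrack Orbit Search Algorithm.
final class OrbitSketch {
    // Assumption: one full Earth rotation per 1,440-minute day.
    static final double MINUTES_PER_DAY = 1440.0;

    private OrbitSketch() {}

    /** Degrees of longitude the Earth rotates during one orbit of the given period. */
    static double westwardShiftPerOrbitDegrees(double periodMinutes) {
        if (!(periodMinutes > 0)) {
            throw new IllegalArgumentException("period must be positive");
        }
        return 360.0 * periodMinutes / MINUTES_PER_DAY;
    }

    /** Next equatorial crossing longitude, normalized to the range [-180, 180). */
    static double nextCrossingLongitude(double crossingLongitude, double periodMinutes) {
        double lon = crossingLongitude - westwardShiftPerOrbitDegrees(periodMinutes);
        return ((lon + 180.0) % 360.0 + 360.0) % 360.0 - 180.0;
    }
}
```

For a sample period of 98.88 minutes the shift is about 24.72 degrees per orbit, so a crossing at 170 degrees west would be followed by one near 165.28 degrees east.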
How the CMR Searches for Orbit Data
- The user specifies a regular spatial window
Figure 3. Spatial Window
    <granuleCondition>
      <spatial>
        <IIMSPolygon>
          <IIMSLRing>
            <IIMSPoint long="-90" lat="49"/>
            <IIMSPoint long="-90" lat="39"/>
            <IIMSPoint long="-70" lat="39"/>
            <IIMSPoint long="-70" lat="49"/>
            <IIMSPoint long="-90" lat="49"/>
          </IIMSLRing>
        </IIMSPolygon>
        <SpatialType>
          <list>
            <value>ORBIT</value>
          </list>
        </SpatialType>
      </spatial>
    </granuleCondition>
- Backtrack then calculates, for both ascending and descending passes, the equatorial crossing longitudes and the start/end circular latitudes that correspond to the user's query window.
Sample queries
The following are sample queries that you can execute against the CMR. Note that the provider and the datasets used in these samples are representative only; you should modify the query to suit your needs.
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE query PUBLIC "-//CMR CatalogService (v10)//EN" "http://api.cmr.nasa.gov/cmr/dtd/IIMSAQLQueryLanguage.dtd">
    <!-- Search for collections from ORNL_DAAC that have a parameter value that contains 'Imagery' -->
    <query>
      <for value="collections"/>
      <dataCenterId>
        <list>
          <value>ORNL_DAAC</value>
        </list>
      </dataCenterId>
      <where>
        <collectionCondition>
          <parameter>
            <textPattern>'%Imagery%'</textPattern>
          </parameter>
        </collectionCondition>
      </where>
    </query>

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE query PUBLIC "-//CMR CatalogService (v10)//EN" "http://api.cmr.nasa.gov/cmr/dtd/IIMSAQLQueryLanguage.dtd">
    <!-- Search for collections from GSFCECS and ORNL_DAAC that do not have processing level 1A or 2 (the condition is negated) -->
    <query>
      <for value="collections"/>
      <dataCenterId>
        <list>
          <value>GSFCECS</value>
          <value>ORNL_DAAC</value>
        </list>
      </dataCenterId>
      <where>
        <collectionCondition negated="y">
          <processingLevel>
            <list>
              <value>'1A'</value>
              <value>'2'</value>
            </list>
          </processingLevel>
        </collectionCondition>
      </where>
    </query>

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE query PUBLIC "-//CMR CatalogService (v10)//EN" "http://api.cmr.nasa.gov/cmr/dtd/IIMSAQLQueryLanguage.dtd">
    <!-- Search for collections from ORNL_DAAC with:
         temporal range: periodic range between Jan. 1, 1990 and Dec. 31, 1998,
         from the 1st to the 300th day of each year, AND
         spatial extent: polygon from 10W to 10E and from 85N to 89N. -->
    <query>
      <for value="collections"/>
      <dataCenterId>
        <value>ORNL_DAAC</value>
      </dataCenterId>
      <where>
        <collectionCondition>
          <temporal>
            <startDate>
              <Date YYYY="1990" MM="01" DD="01"/>
            </startDate>
            <stopDate>
              <Date YYYY="1998" MM="12" DD="31"/>
            </stopDate>
            <startDay value="1"/>
            <endDay value="300"/>
          </temporal>
        </collectionCondition>
        <collectionCondition negated="n">
          <spatial operator="RELATE">
            <IIMSPolygon>
              <IIMSLRing>
                <IIMSPoint long='-10' lat='85'/>
                <IIMSPoint long='10' lat='85'/>
                <IIMSPoint long='10' lat='89'/>
                <IIMSPoint long='-10' lat='89'/>
                <IIMSPoint long='-10' lat='85'/>
              </IIMSLRing>
            </IIMSPolygon>
          </spatial>
        </collectionCondition>
      </where>
    </query>

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE query PUBLIC "-//CMR CatalogService (v10)//EN" "http://api.cmr.nasa.gov/cmr/dtd/IIMSAQLQueryLanguage.dtd">
    <!-- Search for collections from ORNL_DAAC with:
         temporal range: periodic range between Jan. 1, 1990 and Dec. 31, 1998,
         from the 1st to the 300th day of each year, AND some days of January,
         source name: L7 or AM-1, AND
         spatially covering any 'temperate' region or USA -->
    <query>
      <for value="collections"/>
      <dataCenterId>
        <list>
          <value>ORNL_DAAC</value>
        </list>
      </dataCenterId>
      <where>
        <collectionCondition>
          <temporal>
            <startDate>
              <Date YYYY="1990" MM="01" DD="01"/>
            </startDate>
            <stopDate>
              <Date YYYY="1998" MM="12" DD="31"/>
            </stopDate>
            <startDay value="1"/>
            <endDay value="300"/>
          </temporal>
        </collectionCondition>
        <collectionCondition negated='n'>
          <sourceName>
            <list>
              <value>'L7'</value>
              <value>'AM-1'</value>
            </list>
          </sourceName>
        </collectionCondition>
        <collectionCondition>
          <spatialKeywords>
            <list>
              <value>'temperate'</value>
              <value>'USA'</value>
            </list>
          </spatialKeywords>
        </collectionCondition>
        <collectionCondition>
          <temporalKeywords>
            <textPattern>'%january%'</textPattern>
          </temporalKeywords>
        </collectionCondition>
      </where>
    </query>

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE query PUBLIC "-//CMR CatalogService (v10)//EN" "http://api.cmr.nasa.gov/cmr/dtd/IIMSAQLQueryLanguage.dtd">
    <query>
      <for value="collections"/>
      <dataCenterId>
        <value>ORNL_DAAC</value>
      </dataCenterId>
      <where>
        <collectionCondition>
          <additionalAttributeNames>
            <list>
              <value>'Provider_Specific_Attribute_1'</value>
              <value>'Provider_Specific_Attribute_3'</value>
            </list>
          </additionalAttributeNames>
        </collectionCondition>
      </where>
    </query>
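
Client code often assembles queries like these from user input rather than storing them as fixed text. The following is a minimal sketch that builds a collection query filtered by data center and processing level, using only element names that appear in the samples above. The class and method names are illustrative, and because this guide does not describe how AQL escapes quotes, the sketch simply rejects values that contain a single quote.

```java
import java.util.List;

// Illustrative query builder; element names are taken from the sample queries.
final class AqlQuerySketch {
    private static final String HEADER =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
            + "<!DOCTYPE query PUBLIC \"-//CMR CatalogService (v10)//EN\" "
            + "\"http://api.cmr.nasa.gov/cmr/dtd/IIMSAQLQueryLanguage.dtd\">\n";

    private AqlQuerySketch() {}

    /** XML-escapes a value; rejects single quotes (quote handling is an assumption). */
    static String escape(String text) {
        if (text.indexOf('\'') >= 0) {
            throw new IllegalArgumentException("single quotes are not supported: " + text);
        }
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    private static void appendList(StringBuilder out, String indent,
                                   List<String> values, boolean quoted) {
        out.append(indent).append("<list>\n");
        for (String value : values) {
            String text = quoted ? "'" + escape(value) + "'" : escape(value);
            out.append(indent).append("  <value>").append(text).append("</value>\n");
        }
        out.append(indent).append("</list>\n");
    }

    /** Builds a query for collections from the data centers with any of the levels. */
    static String collectionsByProcessingLevel(List<String> dataCenters, List<String> levels) {
        StringBuilder out = new StringBuilder(HEADER);
        out.append("<query>\n");
        out.append("  <for value=\"collections\"/>\n");
        out.append("  <dataCenterId>\n");
        appendList(out, "    ", dataCenters, false); // data center IDs are unquoted
        out.append("  </dataCenterId>\n");
        out.append("  <where>\n");
        out.append("    <collectionCondition>\n");
        out.append("      <processingLevel>\n");
        appendList(out, "        ", levels, true);   // condition values are single-quoted
        out.append("      </processingLevel>\n");
        out.append("    </collectionCondition>\n");
        out.append("  </where>\n");
        out.append("</query>\n");
        return out.toString();
    }
}
```

For example, `AqlQuerySketch.collectionsByProcessingLevel(List.of("GSFCECS", "ORNL_DAAC"), List.of("1A", "2"))` produces a query that selects collections from those two data centers with processing level 1A or 2.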