This document provides a look and guide to the Common Metadata Repository (CMR) from the perspective of the Client Partner.
Programming examples use a fixed-width font, are set off from the rest of the text by lines above and below, and appear in a distinct font color.
Comments (denoted by // within examples)
Best practices or warnings appear in italicized, boxed text.
The primary reason for designing the CMR was to increase access to Earth Science data and services by providing a system with a machine-to-machine interface, that is, an Application Programming Interface (API). The CMR functions as a clearinghouse of Earth Science metadata for a wide variety of partners, enabling the science community to exchange information. Data Partners provide the Earth Science community with metadata representing their Earth Science data holdings. CMR technology in turn provides services for Client Partners and Data Partners and supports efficient discovery of and access to Earth Science data. The CMR also functions as an order broker for the data and offers services applied to that data. Finally, the CMR functions as an internet portal where CMR clients can search the metadata for information they wish to order or retrieve.
Client applications can access data holdings via order distribution or online access. Data Partners retain complete control over what metadata are represented in the CMR, including inserting new metadata, modifying existing metadata, removing old metadata, and controlling access to their metadata.
Usually performed in the order shown below:
In Chapter 4
In Chapter 7
In Chapter 8
Since the CMR uses platform-independent web service definitions for its API, there are no requirements for a client programming language. All examples in this document are Java snippets; however, the code samples provided could be translated to any language capable of calling web services.
As a CMR Data Partner, you need to be familiar with basic software development and Service Oriented Architecture (SOA) concepts such as:
As a REST-API user, you will need:
As a SOAP-API user, you will need:
NASA's Earth Science Data and Information System (ESDIS) has built the CMR based on Extensible Markup Language (XML) and Web Service technologies. The CMR interfaces with clients and users through its series of Application Programming Interfaces (APIs). The CMR is an open system with published APIs available to the CMR Development and User community.
As the CMR is a middleware application, interacting with it means interacting with the CMR API. There is typically a user-focused client application interacting with CMR's API on behalf of an end user. This client may be a generic, query and order-based client, or may be specific to an end user's research, mission, or general area of interest. The CMR incorporates a Universal Description, Discovery, and Integration (UDDI) registry to facilitate registration, discovery, and invocation of services related to the ECHO holdings.
Internally, the CMR specifies APIs and provides middleware components, including data and service search and access functions, in a layered architecture. The figure below depicts the ECHO system context in relation to its public APIs.
All CMR metadata are stored in an Oracle database with spatial extensions. The metadata model is derived primarily from that used by the Earth Observing System Data and Information System (EOSDIS) Core System (ECS). For more details about the CMR model, refer to the Earth Science Metadata Model chapter of the Data Partner Guide.
Key features of the CMR architecture are:
Oracle's spatial extensions, together with business logic that understands how to interact with that metadata, enable the CMR system to work with spatially enabled Earth science metadata. In addition, a second CMR interface (Ingest) allows metadata updates to go directly into the database, bypassing the message-passing API. The File Transfer Protocol (FTP) server is configured to receive these update files, which are expressed in XML conforming to three schemas: one for granules (or inventory), one for collections (or datasets), and one for browse. Note: For ECHO 10.0 and later, these formats are schemas; for legacy ECHO, these formats are DTDs.
The schemas are defined on the ECHO 10.10 Ingest DTD/Schemas page of the ECHO website: http://api.echo.nasa.gov/echo/apis.html
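Because update files must conform to a schema, a partner may want to validate a document locally before delivering it. The following is a minimal sketch using the standard Java XML validation API. The class name `IngestPrecheck` and the tiny `DEMO_XSD` schema are hypothetical and exist only for illustration; they are not the published ECHO ingest schemas, which should be obtained from the page referenced above.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

/** Sketch: check an update document against a schema before delivery. */
final class IngestPrecheck {

    // Hypothetical, deliberately tiny schema used only for illustration.
    // It is NOT one of the published ECHO ingest schemas.
    static final String DEMO_XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"Granule\">"
      + "<xs:complexType><xs:sequence>"
      + "<xs:element name=\"GranuleUR\" type=\"xs:string\"/>"
      + "<xs:element name=\"Collection\" type=\"xs:string\"/>"
      + "</xs:sequence></xs:complexType>"
      + "</xs:element>"
      + "</xs:schema>";

    /** Returns true when the XML text validates against the schema text. */
    static boolean isValid(String xsd, String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema =
                factory.newSchema(new StreamSource(new StringReader(xsd)));
            schema.newValidator()
                  .validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            // SAXException signals a malformed schema or a validation failure.
            return false;
        }
    }
}
```

In practice the schema text would be loaded from the downloaded schema file rather than from an inline string.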
Oracle's spatial capabilities support queries for CMR metadata whose spatial extents are described within the system. A Data Partner can define the spatial extent of a granule or a collection with different spatial constructs (for example: point and polygon). A Client Partner can then construct a search using a point, a line, or a polygon (or multiple polygon) spatial type, and the CMR responds with data whose spatial regions intersect the described region. The CMR provides services for interacting with its Catalog of metadata. Queries can be performed in a number of ways; result formats can be specified, and the resulting data sets can be incrementally accessed so that large return sets can be handled gracefully. The CMR also supports constructing, submitting, and tracking orders for the data that the metadata represents. The CMR supports both an embedding of a Uniform Resource Locator (URL) within the metadata for accessing the data (which the client simply accesses via Hypertext Transfer Protocol [HTTP]), and a more complicated order process in which quotes and order options are accommodated.
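The intersection semantics described above can be illustrated with a simplified sketch. It assumes every spatial extent is an axis-aligned longitude/latitude box that does not cross the antimeridian, which is far cruder than what the CMR's Oracle-backed spatial queries support. The names `SpatialSketch`, `Box`, and `search` are hypothetical and are not part of the CMR API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Sketch: return the items whose spatial extent intersects a search region. */
final class SpatialSketch {

    /** Axis-aligned box in degrees; antimeridian crossing is not handled. */
    static final class Box {
        final double west, south, east, north;

        Box(double west, double south, double east, double north) {
            this.west = west;
            this.south = south;
            this.east = east;
            this.north = north;
        }

        /** A point is modeled as a box of zero width and height. */
        static Box point(double lon, double lat) {
            return new Box(lon, lat, lon, lat);
        }

        /** Two boxes intersect when they overlap (or touch) on both axes. */
        boolean intersects(Box other) {
            return west <= other.east && other.west <= east
                && south <= other.north && other.south <= north;
        }
    }

    /** Returns the IDs whose extents intersect the region, in map order. */
    static List<String> search(Map<String, Box> extents, Box region) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Box> entry : extents.entrySet()) {
            if (entry.getValue().intersects(region)) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }
}
```

A point search is then just a search with a degenerate box, for example `SpatialSketch.search(extents, SpatialSketch.Box.point(5, 5))`.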
The CMR incorporates the ECS concept of granules and collections and defines separate DTDs for updating each, under the assumption that granules will indicate which collection is considered their "primary" collection. "Primary collection" means the collection that owns the granule.
A collection is a grouping of granules that all come from the same source, such as a modeling group or institution. Collections have information that is common across all the granules they "own" and a template for describing additional attributes not already part of the metadata model.
A granule is the smallest aggregation of data that can be independently managed (described, inventoried, and retrieved). Granules have their own metadata model and support values associated with the additional attributes defined by the owning collection.
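The relationship between collections and granules can be sketched as a small data model. This is only an illustration of the ownership and additional-attribute rules described above; the class and field names (`CollectionRecord`, `GranuleRecord`, `additionalAttributeNames`) are hypothetical and do not mirror the actual CMR metadata model.

```java
import java.util.Map;
import java.util.Set;

/** Sketch: a granule belongs to one primary collection and may only supply
 *  values for additional attributes that the collection has defined. */
final class MetadataModelSketch {

    static final class CollectionRecord {
        final String id;
        // Template of additional attributes the collection defines.
        final Set<String> additionalAttributeNames;

        CollectionRecord(String id, Set<String> additionalAttributeNames) {
            this.id = id;
            this.additionalAttributeNames = Set.copyOf(additionalAttributeNames);
        }
    }

    static final class GranuleRecord {
        final String id;
        final CollectionRecord primaryCollection; // the collection that owns it
        final Map<String, String> additionalAttributes;

        GranuleRecord(String id, CollectionRecord primaryCollection,
                      Map<String, String> additionalAttributes) {
            // Reject values for attributes the owning collection never defined.
            for (String name : additionalAttributes.keySet()) {
                if (!primaryCollection.additionalAttributeNames.contains(name)) {
                    throw new IllegalArgumentException(
                        "Attribute not defined by collection "
                            + primaryCollection.id + ": " + name);
                }
            }
            this.id = id;
            this.primaryCollection = primaryCollection;
            this.additionalAttributes = Map.copyOf(additionalAttributes);
        }
    }
}
```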
A third type of metadata is browse metadata, which provide a high-level view of granule or collection metadata and cross-referencing to other granules or collections. Browse metadata are not spatially enabled but are still useful.
The CMR system supports Secure Sockets Layer (SSL)-based communication, which a client must use to pass passwords or other sensitive information securely. Internally, the systems are firewalled to prevent unintended access.
The CMR system supports clients capable of initiating an HTTP connection from a variety of programming languages.
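As an illustration of a client enforcing SSL before sending anything sensitive, the sketch below builds an HTTP request only when the target uses the `https` scheme. It relies on the standard `java.net.http` package (Java 11 or later). The class name `SecureRequestSketch`, the `Accept` header value, and any URL passed in are assumptions made for the example, not documented CMR endpoints or requirements.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Sketch: refuse to build a request for an endpoint that is not SSL-protected. */
final class SecureRequestSketch {

    /** Builds a GET request, rejecting any URL whose scheme is not https. */
    static HttpRequest buildRequest(String url) {
        URI uri = URI.create(url);
        if (!"https".equalsIgnoreCase(uri.getScheme())) {
            throw new IllegalArgumentException(
                "Refusing non-SSL endpoint: " + url);
        }
        return HttpRequest.newBuilder(uri)
            .header("Accept", "application/xml") // assumed response format
            .GET()
            .build();
    }
}
```

The resulting request could then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`.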
The CMR provides an infrastructure that allows various communities to share tools, services, and metadata. As a metadata clearinghouse, it supports many data access paradigms such as navigation and discovery. As an order broker, the CMR forwards orders for data discovered through the metadata query process to the appropriate Data Partners for order fulfillment. As a service broker, the CMR decentralizes end user functionality and supports interoperability of distributed functions.
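The order-broker role, forwarding each part of an order to the Data Partner that holds the data, can be pictured as grouping order items by provider. The sketch below is a conceptual illustration only; `OrderBrokerSketch`, `OrderItem`, and `splitByProvider` are hypothetical names and do not describe how the CMR is implemented.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch: split one client order into per-provider lists of granule IDs. */
final class OrderBrokerSketch {

    static final class OrderItem {
        final String granuleId;
        final String providerId; // the Data Partner that fulfills this item

        OrderItem(String granuleId, String providerId) {
            this.granuleId = granuleId;
            this.providerId = providerId;
        }
    }

    /** Groups granule IDs by provider, preserving first-seen provider order. */
    static Map<String, List<String>> splitByProvider(List<OrderItem> items) {
        Map<String, List<String>> byProvider = new LinkedHashMap<>();
        for (OrderItem item : items) {
            byProvider
                .computeIfAbsent(item.providerId, key -> new ArrayList<>())
                .add(item.granuleId);
        }
        return byProvider;
    }
}
```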
Although this Guide focuses on the needs of Client Partners, the CMR supports the following different, nonexclusive types of Partners:
To address the CMR system vision, the CMR has responded to a set of system drivers, that is, reasons for upgrading. These drivers, derived from functional, organizational, and operational concerns expressed by the user community, determined the architectural approach and the types of technical solutions used in building the CMR system.
The primary goal of ECHO is to enable organizations to participate in making their resources and capabilities available to the Earth Science community. To facilitate participation by these organizations, ECHO has:
While aggressive in the capabilities it is targeted to support, ECHO minimizes the Cost to Field by continually evaluating performance and functionality against costs, for example, licensing of Commercial Off-the-Shelf (COTS) applications, amount of custom code required, hardware platform requirements, and complexity of networking and installation.
Once fielded, ECHO seeks to minimize the cost to operate the system by making it easier to use, thereby minimizing the load on operations staff.
ECHO is being built with long-term extensibility foremost in mind. To enable emerging techniques and strategies for Earth Science research, ECHO has:
There are three ECHO Systems that you, as a Client Partner, have access to:
ECHO Operations: This is the current operational system for ECHO and is available to all users.
Location: http://api.echo.nasa.gov/echo/index.html
ECHO Partner Test: This is an operational system used only by ECHO partners, where they can test their data and services before making final changes in the operational system.
Location: http://api-test.echo.nasa.gov/echo/index.html
ECHO Testbed: This is a test system used by partners and ECHO testers for testing before changes to the ECHO system go operational.