The COG data format builds on the established GeoTIFF format, which has been accepted as an ESCO standard. Just as GeoTIFF built on the existing TIFF image standard by adding georeferencing information, COG builds on GeoTIFF by adding the features needed to optimize data use in a cloud-based environment.
Cloud Optimized GeoTIFF relies on two complementary pieces of technology.
The first is the ability of a GeoTIFF to not only store the raw pixels of the image, but to also organize those pixels in ways that are more efficient for cloud storage and retrieval.
[Figure: traditional row-by-row organization compared with COG tile organization]
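To make the difference between the two layouts concrete, the sketch below estimates how many uncompressed pixels a reader has to fetch to cover a small window under each one. The function names, the 512-pixel tile size, and the one-row strip are illustrative assumptions, not values taken from any particular dataset.

```python
def strips_touched(win_row, win_height, rows_per_strip=1):
    """Number of full-width strips a window intersects in a row-by-row layout."""
    first = win_row // rows_per_strip
    last = (win_row + win_height - 1) // rows_per_strip
    return last - first + 1


def tiles_touched(win_col, win_row, win_width, win_height, tile_size=512):
    """Number of square tiles a window intersects in a tiled layout."""
    first_col = win_col // tile_size
    last_col = (win_col + win_width - 1) // tile_size
    first_row = win_row // tile_size
    last_row = (win_row + win_height - 1) // tile_size
    return (last_col - first_col + 1) * (last_row - first_row + 1)


def pixels_read_strips(image_width, win_row, win_height, rows_per_strip=1):
    """Pixels fetched when every intersected strip spans the full image width."""
    return strips_touched(win_row, win_height, rows_per_strip) * rows_per_strip * image_width


def pixels_read_tiles(win_col, win_row, win_width, win_height, tile_size=512):
    """Pixels fetched when every intersected tile is read in full."""
    return tiles_touched(win_col, win_row, win_width, win_height, tile_size) * tile_size * tile_size


if __name__ == "__main__":
    # A 512 x 512 window from an image that is 10,000 pixels wide (assumed sizes).
    print("row-by-row:", pixels_read_strips(10_000, win_row=2048, win_height=512))
    print("tiled:     ", pixels_read_tiles(1024, 2048, 512, 512))
```

With these assumed sizes, the row-by-row layout reads every full-width row that crosses the window, while the tiled layout reads only the tiles the window overlaps. The estimate ignores compression and shorter strips at the image edge.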
The second is HTTP GET range requests, which let clients ask for just the portions of a file that they need. Together, these two features enable fully online processing of data by COG-aware clients, which can stream the right parts of the GeoTIFF as they need them instead of downloading the whole file.
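As a minimal sketch of what such a request looks like from the client side, the following uses only the Python standard library. The helper names `range_header` and `fetch_range` are hypothetical, and any URL passed in is assumed to point at a server that honors range requests.

```python
import urllib.request


def range_header(offset, length):
    """Build the value of an HTTP Range header for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # Byte ranges are inclusive on both ends, hence the -1.
    return f"bytes={offset}-{offset + length - 1}"


def fetch_range(url, offset, length, timeout=30):
    """Fetch part of a remote file; raise if the server ignores the Range header."""
    request = urllib.request.Request(url, headers={"Range": range_header(offset, length)})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # 206 Partial Content means only the requested bytes were returned.
        if response.status != 206:
            raise RuntimeError(f"server did not honor the range request (status {response.status})")
        return response.read()
```

A COG-aware client would typically issue a first request like `fetch_range(url, 0, 16384)` to read the header and tile index, then request only the byte ranges of the tiles it needs. The 16 KiB figure here is an assumption for illustration.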
A COG is a regular GeoTIFF file with three features enabled:
1. Internal overviews added
2. Internal tiling is enabled and tile size is properly defined
3. Internal compression is enabled
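One way to produce a file with these three features is the `gdal_translate` command-line tool with GDAL's COG driver, which tiles the output and builds overviews by default. The sketch below only assembles and runs that command. The function names, file names, and the chosen `DEFLATE` compression and 512-pixel block size are assumptions, and a GDAL installation recent enough to include the COG driver is assumed.

```python
import subprocess


def build_cog_command(src, dst, compress="DEFLATE", blocksize=512):
    """Return the gdal_translate argument list for writing `src` as a COG at `dst`."""
    return [
        "gdal_translate",
        "-of", "COG",                      # use GDAL's COG output driver
        "-co", f"COMPRESS={compress}",     # internal compression
        "-co", f"BLOCKSIZE={blocksize}",   # internal tile size in pixels
        src,
        dst,
    ]


def convert_to_cog(src, dst, **options):
    """Run gdal_translate; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_cog_command(src, dst, **options), check=True)


if __name__ == "__main__":
    # Print the command rather than running it, so no GDAL install is needed here.
    print(" ".join(build_cog_command("input.tif", "output_cog.tif")))
```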
COG is currently an OGC draft candidate standard (http://docs.ogc.org/DRAFTS/21-026.html).
Options to Validate files:
COG Validator tool
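A full validation checks tiling, overviews, and the internal layout of the file, which is the job of a dedicated validator. As a much smaller first step, the sketch below only confirms that the leading bytes of a file form a valid TIFF or BigTIFF header and reports where the first image directory starts. It is not the COG Validator tool, and the function name `read_tiff_header` is hypothetical.

```python
import struct


def read_tiff_header(data):
    """Parse the first bytes of a TIFF/BigTIFF file; raise ValueError if they are not valid."""
    if len(data) < 8:
        raise ValueError("need at least 8 bytes")
    order = data[:2]
    if order == b"II":
        endian = "<"   # little-endian
    elif order == b"MM":
        endian = ">"   # big-endian
    else:
        raise ValueError("not a TIFF file: unknown byte-order mark")
    (magic,) = struct.unpack(endian + "H", data[2:4])
    if magic == 42:    # classic TIFF: 32-bit offset to the first IFD
        (ifd_offset,) = struct.unpack(endian + "I", data[4:8])
        return {"bigtiff": False, "byte_order": order.decode(), "first_ifd_offset": ifd_offset}
    if magic == 43:    # BigTIFF: 64-bit offset to the first IFD
        if len(data) < 16:
            raise ValueError("need at least 16 bytes for a BigTIFF header")
        offset_size, reserved = struct.unpack(endian + "HH", data[4:8])
        if offset_size != 8 or reserved != 0:
            raise ValueError("malformed BigTIFF header")
        (ifd_offset,) = struct.unpack(endian + "Q", data[8:16])
        return {"bigtiff": True, "byte_order": order.decode(), "first_ifd_offset": ifd_offset}
    raise ValueError("not a TIFF file: unexpected magic number")
```

Because only the first 16 bytes are needed, this check can be run on the result of a small range request before deciding whether to read further.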
Support from commonly used tools
COG-aware software is available for both data providers and data users. Tools that currently offer COG support include:
GeoServer: An open source geospatial data server. It offers a COG plugin that adds COG support to existing data stores. The plugin allows for image index configuration as well as range reading over HTTP and some standard cloud architectures.
QGIS: An open source desktop Geographic Information System (GIS). QGIS can stream COG files directly from their online location without downloading the files first. An example of this can be seen in this tutorial.
GDAL: An open source translator library for raster and vector data formats. It offers a built-in COG driver with read and write support, which software developers can use to build their own applications.
NASA Implementations of COG
From LP DAAC:
HLSL30.002 - https://doi.org/10.5067/HLS/HLSL30.002 (C2021957657-LPCLOUD)
HLSS30.002 - https://doi.org/10.5067/HLS/HLSS30.002 (C2021957295-LPCLOUD)
ECO_L2T_LSTE.002 - https://doi.org/10.5067/ECOSTRESS/ECO_L2T_LSTE.002 (C2076090826-LPCLOUD)
ECO_L1CT_RAD - https://doi.org/10.5067/ECOSTRESS/ECO_L1CT_RAD.002 (C2595678301-LPCLOUD)
OPERA_L3_DIST-ALERT-HLS_PROVISIONAL_V0 - https://doi.org/10.5067/SNWG/OPERA_L3_DIST-ALERT-HLS_PROVISIONAL_V0.000 (C2517904291-LPCLOUD)
Efficient Imagery Data Access
COG-aware software can stream just the portion of data that it needs, improving processing times and enabling real-time workflows that were not previously possible.
Reduced Duplication of Data
Accessing COGs with cloud workflows lets diverse software access a single file online instead of copying and caching the data. Users who are interested in only a subset of the data, limited by geography or time, do not have to download and store all of it.
Traditional GIS software is able to treat Cloud Optimized GeoTIFFs just like normal GeoTIFFs, so data providers need only produce one format.
Cloud Optimized GeoTIFF does not scale indefinitely. It is useful for small to medium-sized rasters; larger, national-scale rasters would not be well suited to storage in a COG.
COG skips the need for an API by allowing direct access to the raster data. While this keeps COG file access simple, it is not flexible: data stored in vector or other raster formats has to be retrieved by the client in another manner, which makes the overall solution less interoperable. A solution that includes a server, an API, and a client is better suited to distributing data of various types.