Background: During 2014-2015, the Data Quality Working Group (DQWG) analyzed a number of use cases to highlight issues related to EOSDIS data. Following this analysis, approximately 100 recommendations were made for improvement, of which 12 were deemed to be high priority. Among those 12, four were designated as Low Hanging Fruit (LHF) recommendations, since there appeared to be existing solutions that could address them. During 2015-2016, the DQWG identified the following list of solutions that have the potential to be adopted across EOSDIS to address the LHF recommendations.

This list is intended to be maintained, augmented and continually updated to address all the DQWG recommendations (REF) in due course.

The table below shows only selected attribute columns of each solution record. For the complete, detailed Solutions Master List, please download the Excel spreadsheet attached to this page: NASA_ESDIS_Data_Quality_Solutions_Master_List_v20180327.xlsx.

Please contact NASA Official (Stephen Berrick) if you have any questions, comments, or new solution recommendations.

No.

Solution Name

Solution Summary (used to derive relevance)

Implementation Strategy

Benefits of Proposed Implementation Solutions

Solution Point of Contact

Reference URLs

1

Collaboratory for quAlity Metadata Preservation (CAMP) - ASDC

CAMP is currently being developed and expanded upon for the ASDC metadata reconciliation efforts. As development progresses, the ASDC will leverage this platform as a centralized repository for metadata entry/revisions, new data submission requests, and interoperability with both internal (e.g. OPeNDAP) and external (e.g. CMR REST API) systems to streamline metadata management and increase transparency for the data ingest process. The end goal is to provide a UI for direct metadata entry by ASDC members and data providers. Validate CMR Compliance.

Metadata for current ASDC (BEDI identified) data products has been imported into the CAMP Database and is within weeks of being validated by science teams. Depending on the required CMR fields, additional fields may be added.

Facilitate DAAC - PI Communication;

Support Metadata Creation (dataset-level)


  1. Confidence in metadata accuracy
  2. Quick and easy to provide metadata
  3. Metadata completeness

https://camp.larc.nasa.gov/


2

Metadata Compliance Checker (PO.DAAC)

Provides a tool for both DAACs and data producers to evaluate metadata standards compliance at the granule level. Multiple forms of compliance check: ACDD, CF, ... quality flags, completeness/compliance, ... netCDF/HDF/OPeNDAP. Targets data producers as the major user community. The output report from the checker will contain useful information (and be exposed to end users?). Validates time against ISO 8601.

Standards Compliance Checking and Reporting (granule-level);

Support Metadata Creation (granule-level)


http://podaac-uat.jpl.nasa.gov/mcc/
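To make the granule-level checks described above concrete, here is a minimal Python sketch of two of them: attribute completeness and ISO 8601 time validation. It is not the Metadata Compliance Checker's code. The function names are hypothetical, the attribute list is only a small illustrative subset of ACDD global attribute names, and the time check accepts only the common extended ISO 8601 forms.

```python
from datetime import datetime

# Assumption: a small illustrative subset of ACDD global attribute names.
REQUIRED_ATTRS = ("title", "summary", "time_coverage_start", "time_coverage_end")
TIME_ATTRS = ("time_coverage_start", "time_coverage_end")


def is_iso8601(value):
    """Return True if value parses as a common ISO 8601 date or date-time."""
    if not isinstance(value, str):
        return False
    text = value.strip()
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11,
    # so rewrite it as an explicit UTC offset.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    try:
        datetime.fromisoformat(text)
    except ValueError:
        return False
    return True


def check_granule_attrs(attrs):
    """Return a list of findings for one granule's global attributes (a dict)."""
    findings = []
    for name in REQUIRED_ATTRS:
        if attrs.get(name) in ("", None):
            findings.append(f"missing attribute: {name}")
    for name in TIME_ATTRS:
        value = attrs.get(name)
        if value not in ("", None) and not is_iso8601(value):
            findings.append(f"not ISO 8601: {name}={value!r}")
    return findings
```

A report for one granule could then be produced by passing its global attributes as a dictionary; an empty list means no findings under these simplified rules.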

3

ATRAC (NOAA/NCEI/NCDC)

Provides open web form for metadata entry by data producer which is interfaced with a backend metadata archive database maintained by the data center.

Note: Whether this resource/tool is developed directly by ESDIS or by a DAAC, the important aspect is that the DAAC must have immediate access to the metadata that is input by this tool for the purpose of verifying accuracy and completeness.

Support Metadata Creation (dataset-level);

Standards Compliance Checking and Reporting (dataset-level);


https://www.ncdc.noaa.gov/atrac/index.html

4

ORNL DAAC Ingest Automation System (SAuS)

The tool is developed by the ORNL DAAC and provides a more automated workflow for data submissions, intended to increase the efficiency of DAAC/Producer communications regarding new datasets or new versions of datasets. The tool could be optimized or extended to include additional information exchange for data quality and/or quality flag information. Core functions include: 1) Track data ingest; 2) Automate ingest; 3) Streamline communication; 4) Central management system

Facilitate DAAC - PI Communication;

Support Metadata Creation (dataset-level);


https://git.earthdata.nasa.gov/projects/DAACSUB/repos/daac-ingest-dashboard/browse

ORNL DAAC Ingest Automation Swimlanes

Presentation at 2015 ESIP Summer Meeting

5

Ocean CO2 Metadata Collection Form

Collection-level metadata collection form developed by ORNL for oceanic in situ observation datasets tailored for CO2 collection. Could potentially be extended to include satellite datasets.

Same as the metadata editor in the ORNL DAAC Ingest Automation System

Metadata creation support (dataset-level)
http://mercury.ornl.gov/OceanOME/

6

Data Quality Guide Document

A standardized template document designed to provide users with familiar and comparable data quality guidance for all data sets sharing a common measurement parameter. Data quality templates for MEaSUREs to fill out.

Guidance, Instruction, and Dissemination (for data users)


A few examples - found by Google search for AIRS, CERES and MODIS Data Quality.
  1. http://docserver.gesdisc.eosdis.nasa.gov/repository/Mission/AIRS/3.3_ScienceDataProductDocumentation/3.3.5_ProductQuality/V6_L2_Quality_Control_and_Error_Estimation.pdf
  2. https://eosweb.larc.nasa.gov/project/ceres/quality_summaries/CER_SSF_Terra_Edition3A.pdf
  3. http://ceres.larc.nasa.gov/dqs.php
  4. https://lpdaac.usgs.gov/sites/default/files/public/modis/docs/MODIS_LP_QA_Tutorial-1b.pdf
  5. https://globalmonitoring.sdstate.edu/sites/default/files/QA_paper.pdf
  6. http://modis-atmos.gsfc.nasa.gov/_docs/QA_Plan_C6_Master_2015_05_05.pdf

7

ACT-America Science Data Working Group

A Science Data WG, including participants (funded by the project) from data centers (ORNL DAAC and ASDC) and different research groups, was formed in the ACT-America project to 1) coordinate data management activities with instrument teams, modelers, remote sensing, and external data sources and 2) ensure data, products, and information required to address science questions are available in harmonized forms when needed. Telecons are held periodically to exchange any data-related thoughts between research groups and the data centers.

Currently, the solution is applicable to the modeling (ORNL DAAC) and airborne observations (ASDC) components of ACT-America data management, but it could be applied to others.



Facilitate DAAC - PI Communication

1. Coordinate data management activities with instrument teams, modelers, remote sensing, and external data sources

2. Ensure data, products, and information required to address science questions are available in harmonized forms when needed.

N/A

8

(NASA) Science Advisory Team

A NASA-assigned team to review data for each project/product, such as "NASA SAT MEaSUREs WELD". These scientists would be assigned to a project/product team, are recognized as experts in the specified field(s), and serve to advise on the verification and quality of final distributed products.

Data Quality Information (science perspective);

Guidance, Instruction, and Dissemination

  1. Provides early adopters to data products from NASA Earth Science remote sensing projects.
  2. Provides beta testers for MEaSUREs ESDRs.

9

Data Quality Section in Data Management Plan

Recommend including a section on data quality in the Data Management Plan (DMP) to be created for each project, such as MEaSUREs, after the award, as a living document to be updated as more details about the data are identified. (It is possible that the initial version of the DMP is prepared before details are known, since it is to be delivered early in the project.)

Guidance, Instruction, and Dissemination (for data producers and DAACs);

Facilitate DAAC - PI Communication;

Dissemination on Data Quality Information;



http://science.nasa.gov/media/medialibrary/2012/05/07/Data_Mgmt_Plan_guidelines-20110111.pdf - document has a brief comment in section 3.2.3: "This section should also describe project requirements and plans for assuring and documenting data quality including validation and release of products to the archive system." The HS3 and CARVE DMPs have material on data quality, while the AirMOSS DMP does not.

10

DAACs DMP (or Data Management Guidelines)


Some DAACs (e.g., PO.DAAC, SEDAC, ...) write their own DMPs for specific datasets or a collection of datasets for the purpose of managing datasets throughout their lifecycle. PO.DAAC is currently finalizing a standardized template for the DAAC-specific DMP. The SEDAC Data Nomination template is used internally and contains sections to capture data quality information.

The ORNL DAAC doesn't have DMPs for specific datasets. Instead, it provides general guidance for data providers to conduct data management and prepare for data archival.

Guidance, Instruction, and Dissemination (for data producers and DAACs);

Facilitate DAAC - PI Communication


David Moroni

Yaxing Wei (knowledge authority)

ORNL DAAC

Data Management Guidance for Data Providers: http://daac.ornl.gov/PI/pi_info.shtml

11

Kayako


Several DAACs have integrated Kayako, a customer service software package, into their websites to replace older ways of conducting user support. User questions and feedback for different DAACs are now managed consistently.

User Services (Help Desk);

Knowledgebase (for data users)

Kayako provides an integrated system for ESDIS and individual DAACs to easily track and coordinate user questions and feedback related to data products, websites, tools, etc.

It also allows individual DAACs to easily compile knowledge bases and FAQs by pulling past user support records from the Kayako system.

Tammy Walker

Contact Us on http://daac.ornl.gov/ and http://daymet.ornl.gov/

https://support.earthdata.nasa.gov

12

Daymet Website

The ORNL DAAC developed a project website dedicated to Daymet: http://daymet.ornl.gov . It is different from the landing pages of Daymet data sets. This website provides Daymet data descriptions, documentation, visualizations, data access tools and services, publications using Daymet data, Daymet-related tools contributed by the user community, and news updates.

Data quality information (program-specific collection);

The Daymet website can be considered one way to convey data product information, including quality information, to data users.

Daymet has recently become probably the most popular data product. The Daymet website helps considerably, even though its impact on this popularity is hard to quantify.

Yaxing Wei

http://daymet.ornl.gov/

13

Identify different ways in which DAACs are conveying data quality information

Identify different ways in which data quality information (e.g. quality flags and known issues) is being conveyed by various DAACs. Understand why they need to be different. To the extent possible arrive at common approaches. At least a minimal common set of items should be shown on data quality pages at the DAACs.

Data quality information (dataset-level);

Different approaches for data quality information (dataset-level)

Although most user guides contain some information on data quality, it would be good to provide guidance so that it is as consistent and complete as possible.

Hampapuram Ramapriyan

14

FAQ Development and Analysis (UserVoice)

Populate a set of FAQs for each new data set upon release by anticipating possible questions that users might ask. From FAQ, identify data sets receiving excessive questions as those to be considered for dissemination of additional or enhanced documentation.

  SEDAC example

User Services (Help Desk);

Knowledgebase (for data users)



http://sedac.uservoice.com/knowledgebase

15

NASA GSFC Data Quality Screening Service

A tool developed by Christopher Lynnes & user-1aaa1 for GES-DISC.

"DQSS is designed to screen data using both ontology based criteria and user selections of quality criteria (such as minimal acceptable QualityLevel). Data that do not pass the criteria are replaced with fill values, resulting in a file that has the same structure and is usable in the same ways as the original."

This service can be utilized before data ingest for the distributor. This service can also be utilized by the public - to further screen the product's quality.


Data quality screening (granule-level filtering)

Provides DAACs a tool to understand quality attributes for overall documentation to product validation.

Provides Users a tool to better understand how data decisions regarding quality were established.

http://opensource.gsfc.nasa.gov/projects/DQSS/
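The screening behavior quoted above (values that fail the quality criteria are replaced with fill values so the output keeps the original structure) can be illustrated with a short Python sketch. This is not DQSS code. The function name, the default fill value, and the convention that a higher quality level is better are all assumptions made for illustration.

```python
def screen_by_quality(values, quality_levels, min_quality, fill_value=-9999.0):
    """Replace values whose quality level is below min_quality with fill_value.

    Assumptions for this sketch: values and quality_levels are parallel
    sequences, and a larger quality level means better quality.
    The output has the same length and ordering as the input.
    """
    if len(values) != len(quality_levels):
        raise ValueError("values and quality_levels must have the same length")
    return [
        value if quality >= min_quality else fill_value
        for value, quality in zip(values, quality_levels)
    ]
```

Because rejected values are filled rather than dropped, downstream code that indexes by position continues to work on the screened output.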

16

CF granule metadata

Implementation of CF Conventions for quality variables to require the flag_values, flag_masks, and flag_meanings CF attributes

Guidance and instruction;

Data quality and information


http://cfconventions.org/
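As an illustration of why these attributes matter, the sketch below decodes a single quality value into its meanings using the CF attributes flag_values, flag_masks, and flag_meanings. It follows the interpretation given in the CF Conventions (flag_values alone are mutually exclusive codes, flag_masks alone are independent bits, and both together define bit fields). The function name is hypothetical, and the attributes are passed in as plain Python values rather than read from a file.

```python
def decode_cf_flags(value, flag_meanings, flag_values=None, flag_masks=None):
    """Return the list of flag meanings that apply to one integer quality value.

    flag_meanings is the blank-separated string stored in the CF attribute;
    flag_values and flag_masks are sequences of integers (either or both).
    """
    meanings = flag_meanings.split()
    if flag_values is not None and flag_masks is None:
        # Mutually exclusive status codes: the value equals one of the codes.
        return [m for m, fv in zip(meanings, flag_values) if value == fv]
    if flag_masks is not None and flag_values is None:
        # Independent boolean conditions: any set bit under the mask counts.
        return [m for m, mask in zip(meanings, flag_masks) if value & mask]
    if flag_masks is not None and flag_values is not None:
        # Bit fields: mask the value, then compare with the paired code.
        return [
            m
            for m, mask, fv in zip(meanings, flag_masks, flag_values)
            if (value & mask) == fv
        ]
    raise ValueError("at least one of flag_values or flag_masks is required")
```

Without these attributes a reader has to find the bit layout in a separate user guide; with them, generic code like this can interpret any compliant quality variable.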

17

Document Error Sources/Limitations/Quality Assessment

Provide guidance to DAACs on including detailed information in product user guides that describes the limitations and/or quality of the data

Data quality and information;

User Services

Although most user guides contain some information on data quality, it would be good to provide guidance so that it is as consistent and complete as possible.

http://nsidc.org/data/docs/daac/smap/sp_l2_smp/index.html#errorsource

(would be good to get examples from other DAACs)

18

LP DAAC Project Lifecycle Plan (PLP)

This document is written from the point of view of the LP DAAC, advocates for products as they move through the lifecycle from Inception to Active Archive to Long Term Archive, and advocates of products that adhere to interoperability standards.

Product capture is the first step in providing community-wide access to data and information.

PO.DAAC has a very similar policy that covers a series of project lifecycle planning documents and artifacts known as the "Dataset Lifecycle Policy".

Guidance and instruction

DAAC Scientist is part of the NASA-funded dataset development, with a focus on guidance and communication from the project start.

http://pubs.usgs.gov/of/2014/1139/pdf/ofr2014-1139.pdf

http://podaac.jpl.nasa.gov/PO.DAAC_DataManagementPractices#Dataset%20Lifecycle

19

EUFAR Metadata Creator

Online metadata authoring tool that creates INSPIRE-compliant metadata in XML for the EU Facility for Airborne Research. Quality input, however, is limited to free text.

Metadata creation support

Facilitates entry of metadata and produces output that is standards compliant in content and format.

http://176.31.165.18:8080/emc-eufar/

20

ISO Data Quality elements

A webpage describing elements of the ISO 19157 data quality metadata standard

Guidance and instruction; Metadata creation support

Until there is a NASA profile of the ISO metadata standard, metadata authors need guidance on how to express quality in ISO. This provides a guide.

https://geo-ide.noaa.gov/wiki/index.php?title=ISO_Data_Quality

ECHO Data Quality Metadata in ISO

21

Schema for ISO metadata, including Data Quality

Zip file containing the schema for all 19115 and related metadata ISO standards

Metadata creation support

If authoring metadata conforming to ISO standards (without a tool, or in customizing an existing tool), one needs the schema for the standard.

http://standards.iso.org/iso/19115/19115.zip

22

NCO Utilities for granule level metadata authorship, editing, and standardization

Allows addition/modification of quality attributes in netCDF files

Metadata creation support

Facilitates creation and modification of metadata that complies with CF conventions. Specific to netCDF and HDF. Being expanded under the EarthCube award "Advancing netCDF-CF for the Geoscience Community".

http://nco.sourceforge.net/

23

AADC Metadata XML conversion script

Python script that loops over metadata DIF XML files and converts them to other XML formats using XSL files.

Metadata creation support

This script would be useful for converting existing GCMD DIF records to, e.g., ISO.

https://github.com/AustralianAntarcticDataCentre/metadata_xml_convert

24

PO.DAAC User Forums

The PO.DAAC has established a user forum to service user inquiries on all data issues, including data quality concerns. This forum is URS-compliant and also provides the ability to directly create a Kayako ticket for timely help desk support.

User Services (Help Desk);

Knowledgebase (for data users)

Provides FAQs, data recipes, discussions on data quality issues, and discipline-specific discussion threads.

https://podaac.jpl.nasa.gov/forum/

25

Virtual Quality Screening Service

Provides an interface to screen L2/L3/L4 SMAP and GHRSST physical retrieval observations using quality information (variables) contained within the granules. Provides a data extraction method once the quality screening filters have been defined. Returns only the quality-filtered data.

Data Quality Information Representation;

Guidance, Instruction, & Dissemination

User extracts only the data that meet their quality specifications, set using quality flags, bit flags, or other variables.

http://podaac-access.jpl.nasa.gov

26

MODIS Python Toolbox for ArcGIS

Data values in MODIS quality layers are stored as bit-packed integer values. To get at the information stored in the data values, users must first convert the integer value to its binary representation, then interpret each specified bit combination (bit word) that characterizes a particular quality attribute. The MODIS Python Toolbox contains a tool (DecodeQuality) that decodes MODIS quality layers and returns individual thematic GeoTIFFs for each quality attribute.

Data quality information

Provides thematic GeoTIFFs for each quality attribute contained in the original bit-packed data value.

https://git.earthdata.nasa.gov/projects/LPDUR/repos/arcgis-modis-python-toolbox/browse
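The decoding of a bit-packed quality value can be sketched in a few lines of Python. This is not the DecodeQuality tool's code. The function names and the bit layout below are hypothetical and chosen only for illustration; real MODIS layouts differ by product and are defined in each product's user guide. Bit 0 is taken to be the least significant bit.

```python
def extract_bit_field(value, start_bit, n_bits):
    """Return the integer stored in n_bits bits of value, starting at start_bit.

    Bit 0 is the least significant bit.
    """
    mask = (1 << n_bits) - 1
    return (value >> start_bit) & mask


# Hypothetical layout: attribute name -> (start bit, width in bits).
EXAMPLE_LAYOUT = {
    "overall_quality": (0, 2),
    "cloud_state": (2, 2),
    "aerosol_quantity": (4, 2),
}


def decode_quality(value, layout):
    """Split one bit-packed quality value into a dict of named bit words."""
    return {
        name: extract_bit_field(value, start, width)
        for name, (start, width) in layout.items()
    }
```

For example, the packed value 38 is 00100110 in binary; under the hypothetical layout above it decodes to overall_quality 2, cloud_state 1, and aerosol_quantity 2. Applying the same function to every pixel of a quality layer yields one thematic layer per attribute.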

Attached file: NASA_ESDIS_Data_Quality_Solutions_Master_List_v20180327.xlsx (Microsoft Excel Spreadsheet), modified Mar 27, 2018 by Yaxing Wei


2 Comments

  1. Yaxing Wei , Is this something that should be updated? Significant improvements have been made to CAMP since it was initially added to this list, however, it has not been updated in this list and I just found a document that was created this year that references the content on this page as is even though it's 4 years outdated.

    1. Tiffany Trapasso  Thanks Tiffany. Yes, it's meant to be an active list. Could you please send me some information about the improved CAMP? I will work with Steve to get this list updated.