Use Case: I need to provide information about the quality of my data and how it was measured.

Overview


A principal goal of metadata is to ensure that the data they describe can be independently understood and used effectively. Data quality measures and reports play a critical role in achieving this goal, so connecting them to the metadata record is important.

The approach to including quality information in ISO 19115 metadata records is similar to the approach used in ECHO and improves on it considerably. It includes the capability to describe quality measures that are used and the techniques used to apply them. Understanding how to take advantage of this flexibility and implementing systems that maximize the value that this capability provides will certainly be a challenge for the environmental data community.

The Data Quality Section of the ISO standard supports flexibility at several levels. The DQ_DataQuality object (see Figure) includes two sections: scope and element. A metadata record can have any number of associated DQ_DataQuality objects.

Presentations are available that describe the ISO data quality standard and compare it with the DIF and ECHO approaches.

Recommendations


An important initial goal is to connect users to existing literature that includes quality information. This should be done at the collection level using standaloneQualityReports. A complete report includes a description of the scope of the quality information with spatial and temporal extent information, an abstract, and a complete citation.

Conceptual Model


This Figure shows an overview of the data quality information included in the ISO 19157 standard. The elements are:

Scope: quality information can be provided at many different levels of detail, from an entire series down to specific attributes in particular regions or time periods. The scope attribute tells users what the current information pertains to.

StandaloneQualityReport: quality information exists in papers and reports that are already part of the scientific literature. Users can find this information using citations included in the metadata record.

Report: quality reports hold the bulk of the structured quality information in the metadata record. They describe the quality measure, how it was applied, and the result of the application.

An overview of the complete data quality standard is also available.

 

Scope:


The DQ_Scope object enables a complete description of the scope of a data quality report whether it pertains to a complete data series or to a particular temporal and spatial extent for a particular parameter in a dataset.

level:

The level element of the scope provides a general description of the scope using a codelist of standard terms. The level can be used in an initial search of the quality information to answer questions like "what kinds of quality information are available for this collection or granule".

attribute

dimensionGroup

model

aggregate

attributeType

feature

tile

product

collectionHardware

featureType

metadata

collection

collectionSession

propertyType

initiative

coverage

dataset

fieldSession

sample

application

series

software

document

 

nonGeographicDataset

service

repository
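
The codelist above lends itself to a simple membership check before a level value is written into a record. The following is a minimal sketch in Python, assuming the terms listed above are the full set to accept; the names SCOPE_CODES and is_valid_scope_code are hypothetical and not part of the standard.

```python
# Scope codes copied from the list above. The container and function
# names are illustrative only and do not come from the ISO standard.
SCOPE_CODES = frozenset({
    "attribute", "dimensionGroup", "model", "aggregate", "attributeType",
    "feature", "tile", "product", "collectionHardware", "featureType",
    "metadata", "collection", "collectionSession", "propertyType",
    "initiative", "coverage", "dataset", "fieldSession", "sample",
    "application", "series", "software", "document",
    "nonGeographicDataset", "service", "repository",
})


def is_valid_scope_code(level: str) -> bool:
    """Return True if `level` exactly matches one of the listed codes."""
    return level in SCOPE_CODES
```

The check is case-sensitive on purpose, so that a value such as "Attribute" is flagged rather than silently accepted.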

 

levelDescription:

In some cases, the general description provided by the codelist is not enough. In those cases the levelDescription can be used to provide more detail. For example, the scope code might be "attribute", indicating that the report concerns an attribute of the dataset, and the levelDescription would provide the name(s) of the specific attributes covered, e.g., precipitation rate.

extent:

In some cases, the quality information may apply to a particular spatial and temporal extent of the dataset. This extent can be described at several different levels of detail using the EX_Extent object. The extent should be described quantitatively, if possible, or with a text description if quantitative information is not possible.
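
The three scope elements described above (level, levelDescription, and extent) can be sketched as a small data structure. This is an illustration only: the Extent class is a heavily simplified, hypothetical stand-in for EX_Extent, the field names are assumptions, and the bounding box values are placeholders rather than values from a real record.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Extent:
    """Simplified, hypothetical stand-in for EX_Extent."""
    # (west, east, south, north) in decimal degrees
    bounding_box: Optional[Tuple[float, float, float, float]] = None
    # (begin, end) as ISO 8601 date strings
    temporal: Optional[Tuple[str, str]] = None
    # Text fallback for when no quantitative description is possible
    description: Optional[str] = None

    def is_quantitative(self) -> bool:
        """True if a spatial or temporal extent is given numerically."""
        return self.bounding_box is not None or self.temporal is not None


@dataclass
class Scope:
    """Sketch of the scope of one quality report."""
    level: str                               # term from the scope codelist
    level_description: Optional[str] = None  # e.g. the attribute name(s)
    extent: Optional[Extent] = None


# The "attribute" example from the text; the bounding box is a placeholder.
scope = Scope(
    level="attribute",
    level_description="precipitation rate",
    extent=Extent(bounding_box=(-180.0, 180.0, -30.0, 30.0)),
)
```

Keeping the text description as a separate optional field mirrors the guidance above: prefer a quantitative extent and fall back to text only when necessary.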

StandAlone Report


ISO 19157 recognizes that important data quality information can exist outside of the conceptual framework of the model and that it may be helpful to provide that information as a supplement to the metadata. The standaloneQualityReport was added to enable connections between the metadata and these reports. It includes an abstract (CharacterString) that provides a brief description of the quality report in the metadata record and a citation (CI_Citation) that provides the reference a user needs to find the complete report.

As an example, consider this data quality description from a GCMD metadata record: 

Abstract: The fire training-set may also have been biased against savanna and savanna woodland fires since their detection is more difficult than in humid, forest environments with cool background temperatures [Malingreau, 1990]. There may, therefore, be an under-sampling of fires in these warmer background environments.

Citation: Malingreau J.P., 1990, The contribution of remote sensing to the global monitoring of fires in tropical and subtropical ecosystems. In: Fire in Tropical Biota (J.G. Goldammer, editor), Springer Verlag, Berlin: 337-370.
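
The example above can be held in a small structure that pairs the abstract with its citation. This is a minimal sketch: the Citation class is a simplified, hypothetical stand-in for CI_Citation, its field names are assumptions, and the non-empty abstract check is an illustrative choice rather than a rule taken from the standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    """Simplified, hypothetical stand-in for CI_Citation."""
    author: str
    year: int
    title: str
    source: str  # where the cited work was published


@dataclass(frozen=True)
class StandaloneQualityReport:
    """Abstract plus the citation a user needs to find the full report."""
    abstract: str
    citation: Citation

    def __post_init__(self) -> None:
        # Illustrative check: an empty abstract gives users nothing to read.
        if not self.abstract.strip():
            raise ValueError("abstract must not be empty")


# Values taken from the GCMD example above.
report = StandaloneQualityReport(
    abstract=(
        "The fire training-set may also have been biased against savanna "
        "and savanna woodland fires since their detection is more difficult "
        "than in humid, forest environments with cool background "
        "temperatures [Malingreau, 1990]. There may, therefore, be an "
        "under-sampling of fires in these warmer background environments."
    ),
    citation=Citation(
        author="Malingreau J.P.",
        year=1990,
        title=(
            "The contribution of remote sensing to the global monitoring "
            "of fires in tropical and subtropical ecosystems"
        ),
        source=(
            "In: Fire in Tropical Biota (J.G. Goldammer, editor), "
            "Springer Verlag, Berlin: 337-370"
        ),
    ),
)
```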

Report


Quality reports are encapsulated in the 19115 DQ_Element class, which includes three kinds of information: the measure, the evaluationMethod, and the result. It is expected that an organization uses a standard set of quality measures that are managed centrally and have unique names, descriptions, and identifiers. The nameOfMeasure and measureDescription are included in the metadata record and should help users understand the nature of the measure. The measureIdentification is an identifier that can be used to find a more detailed description of the measure.

The evaluation method provides information on how the quality measure was applied for this report. These methods can also be standardized across an organization. Citations can be included to describe the specific procedure and to point to more complete references.

Finally, the result reports the results of the application of the measure in this specific case. Several types of reports are included in the standard (see below).

 

DQ_MeasureReference [0..1]
+ measureIdentification: MD_Identifier [0..1]
+ nameOfMeasure: CharacterString [0..*]
+ measureDescription: CharacterString [0..1]

DQ_EvaluationMethod [0..1]
+ dateTime: DateTime [0..*]
+ evaluationMethodDescription: CharacterString [0..1]
+ evaluationProcedure: CI_Citation [0..1]
+ referenceDoc: CI_Citation [0..*]
+ evaluationMethodType: DQ_EvaluationMethodTypeCode [0..1]

DQ_Result [1..*]
+ dateTime: DateTime [0..*]
+ resultScope: DQ_ScopeCode [0..1]
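
The multiplicities in the listing above can be mirrored in code: [0..1] becomes an optional field, [0..*] becomes a list that may be empty, and [1..*] becomes a list that must hold at least one entry. The following is a minimal sketch in Python under those assumptions. The class and field names are hypothetical, and the complex types (MD_Identifier, CI_Citation, DateTime, and the code lists) are simplified to plain strings.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MeasureReference:
    measure_identification: Optional[str] = None              # [0..1]
    name_of_measure: List[str] = field(default_factory=list)  # [0..*]
    measure_description: Optional[str] = None                 # [0..1]


@dataclass
class EvaluationMethod:
    date_time: List[str] = field(default_factory=list)        # [0..*]
    evaluation_method_description: Optional[str] = None       # [0..1]
    evaluation_procedure: Optional[str] = None                # [0..1]
    reference_doc: List[str] = field(default_factory=list)    # [0..*]
    evaluation_method_type: Optional[str] = None              # [0..1]


@dataclass
class Result:
    date_time: List[str] = field(default_factory=list)        # [0..*]
    result_scope: Optional[str] = None                        # [0..1]


@dataclass
class Element:
    """Sketch of one quality report: measure, method, and results."""
    results: List[Result]                                     # [1..*]
    measure: Optional[MeasureReference] = None                # [0..1]
    evaluation_method: Optional[EvaluationMethod] = None      # [0..1]

    def __post_init__(self) -> None:
        # Enforce the [1..*] multiplicity on results.
        if not self.results:
            raise ValueError("a report needs at least one result")


# Placeholder values only; these are not real measure names.
element = Element(
    results=[Result(result_scope="attribute")],
    measure=MeasureReference(name_of_measure=["example measure name"]),
)
```

Only the result is mandatory in this sketch, which matches the listing: a report with no measure reference or evaluation method is still accepted, while one with no result is rejected.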

Implementation (XML)


Implementation (NcML)


 

Usage


 

Crosswalks


 

Notes


 
