How can I determine the NativeId of a collection?

6 Comments

  1. You can search for the collection using the collection search API and request the results in UMM JSON by using the .umm-json extension. For example, given the ShortName and VersionId, you could submit the following search:

    curl -H "Cmr-Pretty: true" "https://cmr.sit.earthdata.nasa.gov/search/collections.umm-json?short_name=MOD03&version=5"
    {
      "hits" : 1,
      "took" : 8,
      "items" : [ {
        "meta" : {
          "revision-id" : 11,
          "deleted" : false,
          "format" : "application/echo10+xml",
          "provider-id" : "LAADS",
          "native-id" : "MODIS/Terra Geolocation Fields 5-Min L1A Swath 1km V005",
          "concept-id" : "C24935-LAADS",
          "revision-date" : "2015-02-24T13:25:22Z",
          "concept-type" : "collection"
        },
        "umm" : {
          "entry-title" : "MODIS/Terra Geolocation Fields 5-Min L1A Swath 1km V005",
          "entry-id" : "MOD03_5",
          "short-name" : "MOD03",
          "version-id" : "5"
        }
      } ]
    }
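
    If you want to pull the native-id out of that response in a script, something like the following might work. It is only a sketch: the sample text is trimmed from the response above, the function name native_ids is made up for illustration, and the field layout ("items", "meta", "native-id") is assumed to stay as shown, which may not hold if the UMM JSON format changes.

```python
import json

# Sample response trimmed from the search result shown above.
SAMPLE_RESPONSE = """
{
  "hits": 1,
  "items": [{
    "meta": {
      "provider-id": "LAADS",
      "native-id": "MODIS/Terra Geolocation Fields 5-Min L1A Swath 1km V005",
      "concept-id": "C24935-LAADS"
    },
    "umm": {"short-name": "MOD03", "version-id": "5"}
  }]
}
"""


def native_ids(response_text):
    """Return (concept-id, native-id) pairs for every item in the response.

    Hypothetical helper: assumes the "items"/"meta" layout shown above.
    """
    body = json.loads(response_text)
    return [
        (item["meta"]["concept-id"], item["meta"]["native-id"])
        for item in body.get("items", [])
    ]


for concept_id, native_id in native_ids(SAMPLE_RESPONSE):
    print(concept_id, "->", native_id)
```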

  2. If you are going to use the native-id with the CMR API and the native-id contains spaces or other reserved characters such as "/", don't forget to URL-encode it:

    For example, using the native-id

    MODIS/Terra Geolocation Fields 5-Min L1A Swath 1km V005

    The curl command would be:

    curl -XPOST -H "Content-type: application/echo10+xml" -H "Accept: application/xml" -d @record.xml https://cmr.earthdata.nasa.gov/ingest/providers/SOMEPROVIDER/collections/MODIS%2FTerra%20Geolocation%20Fields%205-Min%20L1A%20Swath%201km%20
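
    If you build the URL in a script rather than by hand, the encoding step could look roughly like this. The host and path layout are copied from the curl command above; the function name collection_ingest_url is hypothetical. The point of the sketch is that "/" has to be encoded as well as the spaces, which Python's quote() only does when safe is set to the empty string.

```python
from urllib.parse import quote

# Host and path layout copied from the curl command above.
INGEST_ROOT = "https://cmr.earthdata.nasa.gov/ingest"


def collection_ingest_url(provider_id, native_id):
    """Build a collection ingest URL with a percent-encoded native-id.

    safe="" makes quote() encode "/" as %2F; by default it leaves "/" alone,
    which would split the native-id across two path segments.
    """
    provider = quote(provider_id, safe="")
    encoded_id = quote(native_id, safe="")
    return f"{INGEST_ROOT}/providers/{provider}/collections/{encoded_id}"


url = collection_ingest_url(
    "SOMEPROVIDER",
    "MODIS/Terra Geolocation Fields 5-Min L1A Swath 1km V005",
)
print(url)
```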

  3. If there is an alternative to using native-id to ingest or delete collections, please let me know.

    Having the API depend on a secret key value that could be lost and is difficult to query for isn't ideal.  I figured mine out yesterday (chosen by persons doing the CMR reconcile) by searching earthdata and finding MODAPS/LAADS Reconciliation Merging Status.

    Is there a way to change native-ids? With ECHO I would always do a query on the ECHO system first to get the DatasetIds before I could use them to reconcile.  And I never cared for the long text format.  So it would be nice to replace those in the future with a simple function of shortname and versionid.  If I delete the old collection, would I then be allowed to ingest the same shortname+version (EntryId) with a new native-id?  Or would the non-uniqueness with a deleted collection still be flagged?  (Not an ideal solution anyway.)

    Finally, I'll note that the CMR search_api_docs states: "The UMM JSON format is only applicable to collection searches. It is an alpha feature and subject to change in the future." That seems to mean I shouldn't rely on it? It would be nice to get native-id back in dif10 and echo10 format, even if it required specifying a query parameter akin to include_granule_counts.

    thanks,

    Neal Devine

    1. One advantage of the native id is that the CMR doesn't care what value you use. It can be any value that is meaningful to you, such as the dataset id, a file name on disk, or a combination of fields like short name and version. Or it could be a generated GUID that you track on your side. Previously ECHO forced you to always use dataset id. The native id should always be retrievable via the API, so if you don't know an id that was previously used you can find it via CMR Search. You've raised a good point about the UMM JSON search response format being in alpha support, and we do plan to change it in the future. Adding native id to more formats is a good idea. I've filed CMR-2432 for this. It should be fairly easy for us to add it to these other response formats.

      Do you see any issues still with being able to keep track of or find the native id if we add native id support to more search response formats?

      Native id cannot currently be changed. The workaround you described of deleting the collection and re-adding it with a new native id will work, though. Ideally you would choose an identifier that never needs to change. That's a downside of using dataset id or another field from the metadata as the native id. Using a GUID that you track on your side or look up through CMR Search would avoid the need to change the native id.
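
      As a rough illustration of the "GUID that you track on your side" idea, a provider-side lookup could be sketched like this. Everything here is hypothetical (the class name, the in-memory dict, keying on short name and version); a real provider would persist the mapping somewhere durable so the ids survive restarts.

```python
import uuid


class NativeIdRegistry:
    """Hypothetical provider-side map from (short name, version) to a GUID native-id."""

    def __init__(self):
        # In-memory only for the sketch; a real registry would be persisted.
        self._ids = {}

    def native_id_for(self, short_name, version):
        """Return the GUID for this collection, creating one on first use."""
        key = (short_name, version)
        if key not in self._ids:
            self._ids[key] = str(uuid.uuid4())
        return self._ids[key]


registry = NativeIdRegistry()
first = registry.native_id_for("MOD03", "5")
again = registry.native_id_for("MOD03", "5")
print(first == again)  # the id is stable once assigned
```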

  4. I didn't choose any of the LANCEMODIS or LAADS Provider native-ids!

    I used your suggestion to change two LANCEMODIS version "6NRT" native-ids so that all 6NRT native-ids have the format ${shortname}_C6_NRT, which isn't as nice as ${shortname}_${version} but is at least formulaic. I've since added granules to the 6NRT collections, so I can't use the delete/reinsert method to change the native-ids again. I think the "5NRT" collections are the same or can be made the same before we use them. At that point I'll delete the old LANCEMODIS version "5" collections after ten days, so their native-ids don't matter much.

    For LAADS Provider the native-ids (from DataSetIds) are LongNames with spaces, and the collections have lots of granules. It would be nice to change the native-ids to something like EntryId, but as I don't expect to delete many collections and haven't seen anything else besides validate/ that requires native-id, I can live with random values if I can retrieve them from CMR. Still, a feature for changing native-id would be nice.

    umm-json format is not available for granules, but I assume you will add some way to retrieve native-id with other metadata in granule search results in the echo10 format, etc. When reconciling against CMR granule metadata, a Provider would currently need to retrieve the granule native-id from CMR in order to delete granules that are in CMR but not in the provider's catalog.
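
    The reconcile step itself is just a set difference once both lists of native-ids are in hand. A minimal sketch, assuming the ids have already been retrieved from CMR and from the provider's catalog (the function name and the sample id values below are made up):

```python
def granules_to_delete(cmr_native_ids, provider_native_ids):
    """Return native-ids that are in CMR but not in the provider's catalog."""
    return sorted(set(cmr_native_ids) - set(provider_native_ids))


# Hypothetical id values, for illustration only.
cmr_ids = ["LAADS:1001", "LAADS:1002", "LAADS:1003"]
catalog_ids = ["LAADS:1001", "LAADS:1003", "LAADS:1004"]

print(granules_to_delete(cmr_ids, catalog_ids))
```

    Ids that appear only in the provider's catalog (LAADS:1004 in the made-up data) are not returned, since those would need an ingest rather than a delete.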

    Granule delete could be an issue. It looks like ECHO REST uses GranuleUR at the end of the delete URL, and I think all my GranuleURs were sent in the format "$Provider:$numericId". I assume that when granules were ingested from ECHO to CMR you used GranuleUR as the native-id - please tell me that is true. If you didn't use GranuleUR as the native-id, then I won't be able to delete those granules. Either way, a delete method that uses GranuleUR as the identifier would be nice (like http.../delete_ur/$GranuleUR).

    A granule metadata uniqueness issue might occur (similar to the collection issue requiring delete/reinsert). When we reprocess, we usually overwrite the existing LAADS numericId (GranuleUR) that has the same granule metadata. But suppose the GranuleUR changes while some unique metadata key (shortname+Version+temporal+spatial+etc.) still exists in CMR under the old GranuleUR, and LAADS somehow fails to delete the old native-id in CMR first. Might the ingest of the new product then fail because, although all the metadata is the same as the existing product in CMR, it can't overwrite a record with a different native-id?

  5. I filed a new feature request for changing the native id: CMR-2472. I made sure that CMR-2432 indicates we should support retrieving the native id of granules.

    I think we should probably meet to discuss your last question. I really want to understand it before I answer. I'll send you an email to discuss times.