CMR Health During Outage?

Will the CMR "health" endpoint function during an outage? For example, we were notified today that CMR will be down a couple of times tomorrow for "Earthdata Login Contingency Failover" testing. Will the command curl -i -XGET https://cmr.earthdata.nasa.gov/search/health return a status other than 200 during these times? Will it always return something, regardless of what problems CMR is having? I want to develop a test that determines whether CMR is operating before doing granule searches, so I can prevent the user from starting a search and having our application lock up because it never gets a response. If "health" isn't the right approach, what do you recommend?

2 Comments

  1. As long as the CMR hosts are reachable, you should be able to request health. The CMR will return a non-successful status code (503) if any of its dependencies aren't available. The CMR doesn't directly depend on Earthdata Login, so it will probably return a 200 even if Earthdata Login is unavailable. Here's more information on the health response: https://cmr.earthdata.nasa.gov/search/site/search_api_docs.html#check-application-health
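To make the status codes above concrete, here is a minimal Python sketch (standard library only) that requests the health URL and reports what came back. The function names, the 5-second timeout, and the `opener` parameter are assumptions made for illustration; they are not part of the CMR API.

```python
import urllib.error
import urllib.request

HEALTH_URL = "https://cmr.earthdata.nasa.gov/search/health"


def cmr_health_status(url=HEALTH_URL, timeout=5.0, opener=urllib.request.urlopen):
    """Return the health endpoint's HTTP status code, or None if no response arrived.

    `opener` is a hypothetical injection point so the function can be exercised offline.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for error statuses such as 503;
        # the code is still a real answer from the CMR.
        return err.code
    except OSError:
        # URLError and socket timeouts are OSError subclasses:
        # the host was unreachable or did not answer in time.
        return None


def describe_health(status):
    """Translate a status code (or None) into a short label."""
    if status == 200:
        return "healthy"
    if status == 503:
        return "dependency unavailable"
    if status is None:
        return "unreachable"
    return "unexpected status %d" % status
```

Calling `describe_health(cmr_health_status())` would then yield one of the labels above. Note that `None` covers the case where the CMR hosts cannot be reached at all, which the health endpoint itself cannot report.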

    I don't think it's necessary to check application health before executing a query. Any time you make a request to an external system from your application, there's a chance that something could go wrong. (Though we strive to make the CMR very reliable.) The problem could be on the CMR side or on the network in between. Preventing your application from waiting for a response that might never arrive is best handled by using asynchronous execution (on another thread, or using a callback in JavaScript) and adding a timeout to the network request. Additionally, you should handle error responses from the CMR. You can check the status code to see if something went wrong with the query. A 400 would indicate there was some bad data in your request. A 500 would indicate the CMR had an internal error. (You should report those as CMR bugs.)
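The three pieces described above (asynchronous execution, a timeout, and status-code handling) could be combined roughly as follows. This is a sketch under assumptions, not a definitive client: the `granules.json` search URL, the 10-second timeout, the outcome labels, and the function names are all illustrative choices, and `opener` is a hypothetical hook for offline testing.

```python
import concurrent.futures
import urllib.error
import urllib.parse
import urllib.request

# Assumed granule search URL; adjust to the CMR search endpoint you actually use.
SEARCH_URL = "https://cmr.earthdata.nasa.gov/search/granules.json"


def classify_status(code):
    """Map an HTTP status code to a coarse outcome label."""
    if 200 <= code < 300:
        return "ok"
    if 400 <= code < 500:
        return "bad_request"   # something was wrong with the request
    return "server_error"      # for example a 500: worth reporting as a CMR bug


def search_granules(params, timeout=10.0, opener=urllib.request.urlopen):
    """Run one search and return (outcome, body) instead of raising.

    The timeout bounds each blocking network operation, so a silent
    server cannot hold the calling thread indefinitely.
    """
    url = SEARCH_URL + "?" + urllib.parse.urlencode(params)
    try:
        with opener(url, timeout=timeout) as response:
            return classify_status(response.status), response.read()
    except urllib.error.HTTPError as err:
        # Error responses still carry a body that may explain the problem.
        return classify_status(err.code), err.read()
    except OSError:
        # Connection failures and timeouts: nothing usable came back.
        return "unreachable", b""


def search_in_background(executor, params, **kwargs):
    """Submit the search to a worker thread and return a Future immediately."""
    return executor.submit(search_granules, params, **kwargs)
```

A user interface would typically create one `concurrent.futures.ThreadPoolExecutor`, call `search_in_background(...)`, and attach `future.add_done_callback(...)` to update the display, so the main thread never waits on the network.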

    If you wanted to be really robust, in addition to what I've described above, you could retrieve CMR health periodically and use it to disable portions of your application. You would probably not want to do that before every query: it would make the user experience too slow, and checking for each user and each query would be overdoing it. It would be better to have a single thread in your application check the CMR every few seconds and set a flag that changes the application's behavior depending on whether the response was successful. I would only do this if you've already added the async execution, timeouts, and error response handling.
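The single-thread-plus-flag idea could look something like the sketch below. The `HealthMonitor` class, its method names, and the 5-second interval are hypothetical; the only detail taken from the discussion above is that a 200 from the health URL counts as healthy and anything else does not.

```python
import threading
import urllib.request

HEALTH_URL = "https://cmr.earthdata.nasa.gov/search/health"


def cmr_is_healthy(url=HEALTH_URL, timeout=5.0):
    """Return True only if the health endpoint answers 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers HTTP error statuses (such as 503), unreachable hosts, and timeouts.
        return False


class HealthMonitor:
    """Poll a health check on one background thread and expose the result as a flag."""

    def __init__(self, check=cmr_is_healthy, interval=5.0):
        self._check = check
        self._interval = interval
        self._healthy = threading.Event()   # the shared flag
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def poll_once(self):
        """Run the check once and update the flag; any exception counts as unhealthy."""
        try:
            ok = bool(self._check())
        except Exception:
            ok = False
        if ok:
            self._healthy.set()
        else:
            self._healthy.clear()

    def _run(self):
        while not self._stop.is_set():
            self.poll_once()
            # wait() doubles as an interruptible sleep between polls.
            self._stop.wait(self._interval)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    @property
    def healthy(self):
        return self._healthy.is_set()
```

The application would start one monitor at launch and read `monitor.healthy` when deciding whether to enable the search controls. Individual searches should still use their own timeouts and error handling, because the flag can be a few seconds stale.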