Info

Work in progress


Earthdata Search Access (https://search.earthdata.nasa.gov) is a modern web application that allows users to search, discover, visualize, refine, and access NASA Earth observation data using ESDIS' wide array of service offerings. Its goal is to ease the technical burden on data users by providing a high-quality application that makes it simple to interact with NASA Earth observation data, freeing them to spend more effort on innovative endeavors. It also provides a way for EOSDIS to showcase its service offerings, including CMR and GIBS. It consists of a serverless application, backed by a database, that communicates with CMR for search resolution and Earthdata Login for user/session management.

Environment

Architecture 

[Lucidchart diagram: EDSC 2.0 Serverless]

Components

  • Earthdata Search Access is built using Node.js, React and the Serverless Framework
  • Amazon RDS database
  • CMR search and ordering (external component dependency)
  • GraphQL, which sits on top of CMR and allows for an improved data retrieval experience
  • Data Provider EGI ordering endpoints (external component dependency)
  • Earthdata Login Single-sign-on (external component dependency)

Deploying

Generally, building and deploying will be done from Bamboo. Bamboo's deployment mechanisms give the option of a script defined within the UI or a script within the repository alongside the code. Earthdata Search Access uses a script defined within the repository, located at `bin/deploy-bamboo.sh`. The script is simply a list of commands, so if Bamboo isn't an option, the commands can be run manually from the command line. The overall process looks like this:

  1. Install necessary libraries
  2. Build static assets
  3. Deploy Infrastructure
    1. Database
    2. Queues
    3. Roles
  4. Deploy Application Resources
    1. Lambdas
    2. CloudWatch Events
    3. API Gateway
    4. CloudWatch Logs
  5. Migrate the Database
  6. Deploy the static assets to S3
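
For orientation, the order above can be sketched as a small shell function. This is only an illustration and not the contents of `bin/deploy-bamboo.sh`: the npm script name, the `migrateDatabase` function name, the `static/dist` path and the bucket name are all hypothetical placeholders, while `serverless-infrastructure.yml` is the infrastructure config referenced elsewhere on this page. By default the sketch only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch of the deployment order. The concrete commands are assumptions for
# illustration; bin/deploy-bamboo.sh remains the source of truth.

# Print each command (DRY_RUN=1, the default) or execute it (DRY_RUN=0).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

deploy_all() {
  local stage="$1"
  # 1. Install necessary libraries
  run npm install
  # 2. Build static assets (npm script name is assumed)
  run npm run build
  # 3. Deploy infrastructure: database, queues, roles
  run serverless deploy --stage "$stage" --config serverless-infrastructure.yml
  # 4. Deploy application resources: Lambdas, events, API Gateway, logs
  run serverless deploy --stage "$stage"
  # 5. Migrate the database (function name is hypothetical)
  run serverless invoke --stage "$stage" --function migrateDatabase
  # 6. Deploy the static assets to S3 (path and bucket are hypothetical)
  run aws s3 sync static/dist "s3://example-static-bucket-$stage"
}
```

Calling `deploy_all sit` prints the six commands in order; setting `DRY_RUN=0` would execute them, which should only be done after replacing the placeholders with the real values.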

Required Environment Variables For Deployments

  • API_HOST The API Gateway endpoint created during deployment. For NGAP deployments this will be a CloudFront endpoint that points to the created API Gateway.
  • AWS_ACCESS_KEY_ID AWS Access Key created in CloudTamer.
  • AWS_SECRET_ACCESS_KEY AWS Secret Access Key created in CloudTamer.
  • CLOUDFRONT_BUCKET_NAME Name of the bucket that NGAP writes CloudFront logs to. Because we don't have permission to access those logs directly, our Lambda, CloudfrontToCloudwatch, reads them from this bucket and sends them to Splunk.
  • COLORMAP_JOB_ENABLED Whether or not to run the job for creating and storing colormaps.
  • DB_INSTANCE_CLASS AWS RDS instance type to deploy for the application. https://aws.amazon.com/rds/instance-types/
  • DEFAULT_PORTAL Earthdata Search manages some UI elements and functionality based on portal configurations; this defines which one to use by default. Note: `edsc` enables features that non-EED instances may not have access to.
  • EDSC_APP_HOST URL of this application once deployed, used for linking back to the application and OAuth.
  • GRAPHQL_HOST URL of the GraphQL application used as the datasource for this application.
  • FEEDBACK_APP ID of the Earthdata Feedback App to supply to TopHat.
  • GEOCODING_INCLUDE_POLYGONS Whether or not to include polygons in search results of geocoding searches.
  • GEOCODING_SERVICE Which geocoding service to use, currently supports `google` and `nominatim`.
  • GIBS_JOB_ENABLED Whether or not to run the job that generates GIBS tags on CMR collections.
  • GTM_ID Google Tag Manager ID.
  • JWT_SIGNING_SECRET_KEY The key to use while signing JWT tokens generated after authenticating to EDL.
  • LAMBDA_TIMEOUT Default timeout applied to each Lambda; a handful of Lambdas have custom timeouts for specific reasons.
  • LOG_DESTINATION_ARN AWS ARN of the log subscription provided by NGAP.
  • OBFUSCATION_SPIN Value to use to obfuscate ids of records created and supplied to users.
  • OBFUSCATION_SPIN_SHAPEFILES Value to use to obfuscate ids of shapefiles created and supplied to users.
  • STAGE_NAME What label to apply to the deployment, helps separate multiple deployments within a single account. Earthdata Search Access uses the standard EED environments `sit`, `uat` and `prod`.
  • SUBNET_ID_A AWS Subnet ID provided by NGAP.
  • SUBNET_ID_B AWS Subnet ID provided by NGAP.
  • SUBSETTING_JOB_ENABLED Whether or not to run the job that creates subsetting tags on CMR collections, applies to Customize and Stage for Delivery access methods.
  • VPC_ID AWS VPC ID provided by NGAP.
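
When deploying from the command line, a pre-flight check can catch a missing variable before anything is created. The following is a minimal sketch, not part of the repository; the function name `check_required_env` is hypothetical, and it treats an empty value the same as an unset one.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report every unset or empty environment variable,
# then return non-zero if any were missing.
check_required_env() {
  local missing=0 name
  for name in "$@"; do
    # printenv only sees exported variables, which is what child
    # processes such as the deployment commands will receive.
    if [ -z "$(printenv "$name")" ]; then
      echo "Missing required environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example (shortened list; extend it with the remaining names above):
#   check_required_env API_HOST AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY \
#     STAGE_NAME SUBNET_ID_A SUBNET_ID_B VPC_ID || exit 1
```

Reporting all missing names in one pass, rather than stopping at the first, avoids repeated failed attempts when several values are absent.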

Destroying

This can be done from within the AWS console by deleting the stack from CloudFormation. If deletion fails because an S3 bucket is not empty, empty the bucket and continue deleting assets. Alternatively, use the commands below; running them requires valid access tokens, see the Serverless Framework documentation for methods of providing these values.

Using AWS Access Tokens (Serverless Framework): https://www.serverless.com/framework/docs/providers/aws/guide/credentials#using-aws-access-keys

# Remove application resources and static content
serverless remove --stage sit

# Remove database and application roles
serverless remove --stage sit --config serverless-infrastructure.yml
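
If the non-empty bucket issue is expected, the bucket can be emptied first from the command line. The sketch below is an assumption-laden illustration: the bucket name is passed in by the caller because this page does not state it, and the Serverless Framework's `remove` command is used for stack removal. It only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Print each command (DRY_RUN=1, the default) or execute it (DRY_RUN=0).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Empty the bucket so CloudFormation can delete it, then remove the
# application stack followed by the infrastructure stack.
teardown() {
  local stage="$1" bucket="$2"
  run aws s3 rm "s3://$bucket" --recursive
  run serverless remove --stage "$stage"
  run serverless remove --stage "$stage" --config serverless-infrastructure.yml
}
```

For example, `teardown sit example-bucket-name` prints the three commands for review. Note that `aws s3 rm --recursive` deletes current objects only; a bucket with versioning enabled would need its object versions removed as well.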

Configuration

Earthdata Search Access is deployed to NGAP 2.0 (AWS) using the Serverless Framework.

Backup

Code Backup

Earthdata Search Access's code base and static assets are backed up in GitHub, which is automatically synced with Bitbucket. Bitbucket is backed up on-premise as part of EED-2 infrastructure management. In the unlikely event that GitHub becomes unavailable, developers can easily resume work and deployments using the backup data from Bitbucket.

GitHub: https://github.com/nasa/earthdata-search

Bitbucket: https://git.earthdata.nasa.gov/projects/EDSC/repos/edsc-cmr-preview/browse

Data Backup

Earthdata Search Access uses Amazon RDS, which takes automated backups daily within AWS and retains them for 1 day. The process for restoring from a backup can be found here: EDSC Database Backup and Restore in NGAP

Restoring (Deployments from Scratch)

Should Earthdata Search Access need to be completely recovered (static assets, Lambdas, and database), the process takes around 1 hour and requires a few tickets that depend on another team (NGAP).

Tickets depending on other teams:

  • AWS CloudFront endpoint for S3 bucket
  • AWS CloudFront endpoint for API Gateway

Once these tickets are completed:

  • The outputs from the tickets (S3 bucket and API Gateway endpoint) need to be set within Bamboo.

Steps:

  1. Create the deployment bucket on AWS (`earthdata-search-[ENV]`)
  2. Deploy full application
  3. Configure CloudFront custom error pages, documented here: ReactJS Static Website Hosting
  4. Create the tickets above; the outputs from the deployment (S3 bucket name and API Gateway ID) should be provided in the tickets.
  5. When the tickets are completed, the resulting values will need to be saved in Bamboo.
  6. Deploy again so that the ENV variables are provided to the Lambdas.
  7. If this is a restoration instead of a new deployment, refer to EDSC Database Backup and Restore in NGAP for restoring an existing database.
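
For step 1, the bucket name follows the `earthdata-search-[ENV]` pattern. A small sketch of deriving it is shown below; the function name is hypothetical, and restricting the stage to `sit`, `uat` and `prod` is an assumption based on the standard EED environments named under STAGE_NAME.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the deployment bucket name for a stage,
# rejecting anything other than the standard EED environments.
deployment_bucket_name() {
  case "$1" in
    sit|uat|prod)
      echo "earthdata-search-$1"
      ;;
    *)
      echo "Unknown stage: $1 (expected sit, uat or prod)" >&2
      return 1
      ;;
  esac
}

# Example (not run here): create the bucket with the AWS CLI.
#   aws s3 mb "s3://$(deployment_bucket_name sit)"
```

Validating the stage up front keeps a typo from creating a stray bucket in the account.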

Ensure that all environment values are provided. When deploying from Bamboo, those values are set in the UI; when deploying from the command line, they need to be provided or set manually.

Release Cycle

Earthdata Search Access follows the SAFe process as implemented by EED. Typically, that means we plan priorities in 3-month increments and release code every 2 weeks. If needed, Earthdata Search Access can release on demand with appropriate notice to stakeholders.

Patching and Remediation

Earthdata Search Access has a blocking step in its deployment process that audits our libraries and dependencies. Once a vulnerability is found, steps are taken immediately to patch, update, and resolve it.

In the case of a vulnerability discovered in a currently deployed resource, a ticket is filed once the issue is identified. Earthdata Search Access DevOps consults with the security team to prioritize remediation of the vulnerability. Once the ticket is created, approved, and prioritized, the EDSC dev team works the issue until all vulnerabilities are resolved and deploys the updated app to all operational environments.

Diagnostics

Earthdata Search Access logs to AWS CloudWatch on a per-Lambda basis; this allows easy access to a specific Lambda's logs when you know which Lambda is responsible for the logs you're looking for. If a wider search is needed, Earthdata Search Access also forwards logs to Splunk, which accommodates a wider array of search abilities.

Splunk: https://logs.earthdata.nasa.gov/
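
For the per-Lambda case, the CloudWatch log group can be derived from the function name. Lambda writes to `/aws/lambda/<function name>`, and the Serverless Framework names functions `<service>-<stage>-<function>` by default. The sketch below assumes those defaults have not been overridden; the service name `earthdata-search` and function name `getCollections` in the example are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the CloudWatch log group name for a Lambda
# deployed with the Serverless Framework's default naming.
lambda_log_group() {
  local service="$1" stage="$2" fn="$3"
  echo "/aws/lambda/${service}-${stage}-${fn}"
}

# Example (AWS CLI v2, not run here): follow the last hour of logs.
#   aws logs tail "$(lambda_log_group earthdata-search sit getCollections)" \
#     --since 1h --follow
```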

Planned Maintenance

Deployments are handled via Bamboo.

Bamboo: https://ci.earthdata.nasa.gov/deploy/viewDeploymentProjectEnvironments.action?id=242712580
