The goal of this meeting is to learn what options are available to handle larger amounts of data (e.g., TerraFusion) and to make transformation & serving faster through clustering on AWS.

Date: 2020.08.26 (Wed)

Attendees

Meeting Agenda

Here are some questions for Esri:

  • Scaling options and licensing
    • Is it allowed to put an AWS Marketplace AMI into an auto-scaling group? Will the same active license work if a new instance is created by the load balancer?
      • Yes. Some customers have run this scenario successfully. The license will be authorized automatically. It's good to set the scaling threshold conservatively (e.g., CPU load = 0.6) because of the boot-up time of a new AWS instance.
    • Will the new server instance federate automatically with the portal?
    • Can Portals be put under an auto-scaling group, too?
      • Doable, but not recommended. It's better to have one robust machine (a 16-core machine can serve 1,000 users).
    • How about Notebook Servers? Can they form a Dask cluster automatically?
      • This is on the to-do list for the Notebook team.
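The conservative-threshold advice above can be sketched with a target-tracking scaling policy. This is a minimal sketch, assuming the Marketplace-AMI server instances already sit in an auto-scaling group; the group name `arcgis-server-asg` is a hypothetical placeholder.

```python
def cpu_target_policy(asg_name: str, target_pct: float = 60.0) -> dict:
    """Build kwargs for an EC2 Auto Scaling target-tracking policy that
    keeps average CPU near target_pct, leaving headroom so a new
    instance has time to boot (and authorize its license) before the
    running servers saturate."""
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"{asg_name}-cpu-{int(target_pct)}",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            # Conservative target (~0.6 load), per Esri's advice above.
            "TargetValue": target_pct,
        },
    }

policy = cpu_target_policy("arcgis-server-asg")  # hypothetical ASG name
# With AWS credentials configured, this would be applied via:
# import boto3
# boto3.client("autoscaling").put_scaling_policy(**policy)
```

The actual `put_scaling_policy` call is left commented out since it needs credentials and a real auto-scaling group.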
  • How to make service respond faster
    • Can the PostgreSQL DB become a bottleneck if multiple instances make requests to a DB that holds mosaic datasets? Do you recommend using a DB cluster?
      • Splitting and archiving large data into a separate geospatial DB (2 TB) is a possibility.
      • Keeping only use-case-based, high-demand data in PostgreSQL is more sensible.
      • A Databricks/Snowflake-style connector & streaming is possible.
    • What's the best way to distribute traffic based on service and region?
      • For example, if a user comes from the east, serve the user from an EC2 instance in the east region.
      • If a user, regardless of the user's region, asks for the "A" image service, serve the user from an EC2 instance in the west region; if the "B" service, from an EC2 instance in the east.
        • Use Route 53 plus CloudFront as the CDN.
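The region-based case above maps to Route 53 latency-based routing: two records share one name, each tagged with a region, and Route 53 answers with whichever is closest to the user. A minimal sketch of the change batch; the hostname and IPs are hypothetical placeholders.

```python
def latency_record(name: str, region: str, ip: str) -> dict:
    """One UPSERT change for a latency-routed A record. Records sharing
    the same Name but different SetIdentifier/Region form one latency
    routing group."""
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": name,
            "Type": "A",
            "SetIdentifier": f"{name}-{region}",
            "Region": region,  # key that drives latency-based routing
            "TTL": 60,
            "ResourceRecords": [{"Value": ip}],
        },
    }

change_batch = {
    "Comment": "Serve users from the lowest-latency region",
    "Changes": [
        latency_record("maps.example.com", "us-east-1", "203.0.113.10"),
        latency_record("maps.example.com", "us-west-2", "203.0.113.20"),
    ],
}
# With credentials and a hosted zone ID, this would be applied via:
# import boto3
# boto3.client("route53").change_resource_record_sets(
#     HostedZoneId="Z...", ChangeBatch=change_batch)
```

The service-based case ("A" always from west, "B" always from east) would instead use a separate hostname per service, each pointing at a single region, with CloudFront in front for caching.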
  • How to make service handle bigger data
    • Can image / feature services benefit from the big data store? Will it be faster than using a DB for creating or serving mosaic datasets?
    • Is it better to store data as Parquet in a cluster environment, or is CRF (Cloud Raster Format) already optimized for clusters?

Action Items for Next Week


Task List