Key terms for Cloud Architects designing resilient, scalable, and reliable systems, with GCP examples

 

| Term | Description / Purpose | Example GCP Services / Features |
|---|---|---|
| High Availability (HA) | Design that keeps a system operational for a very high percentage of time by minimizing downtime from component failures. | Cloud SQL (HA configuration), GKE Regional Clusters, Cloud Spanner |
| Fault Tolerance | Ability of a system to keep operating correctly even when part of it fails. | Compute Engine MIG, Cloud Spanner, Cloud Storage Multi-Regional Buckets |
| Disaster Recovery (DR) | Strategies and mechanisms to recover and restore data and systems after a catastrophic failure. | Cloud Storage Backups, Snapshots, Cross-Region Replication |
| Redundancy | Duplication of critical components so that no single failure takes the system down. | Cloud Load Balancing, Regional Persistent Disks, Multi-Zonal Deployments |
| Failover | Automatic switching to a redundant or standby system when the primary fails. | Cloud SQL automatic failover, Cloud Spanner automatic failover |
| Load Balancing | Distribution of traffic across multiple resources to prevent any one of them from being overloaded. | Cloud Load Balancing (HTTP(S), TCP/UDP), Internal Load Balancer |
| Scalability | Ability to handle increased load by adding resources. | GKE Autoscaling, Compute Engine MIG Autoscaling |
| Elasticity | Dynamic adjustment of resources (scaling in and out) to match current demand. | GKE Cluster Autoscaler, Cloud Functions autoscaling |
| Durability | Likelihood that stored data remains intact and is not lost over time. | Cloud Storage (designed for 11 nines of durability), Persistent Disk Snapshots |
| Replication | Copying data across multiple locations for availability and durability. | Cloud Spanner replication, Cloud SQL Read Replicas, Multi-Regional Storage |
| Auto-Healing | Automatic detection and replacement of unhealthy resources. | Compute Engine Managed Instance Groups, GKE Node Auto-Repair |
| Geo-Redundancy | Distributing resources and data across multiple geographic locations so that a regional outage does not cause data loss or downtime. | Cloud Storage Multi-Regional, Cloud Spanner Multi-Regional |
| Backup & Restore | Regular backups and the ability to restore systems and data to a previous state. | Cloud SQL Backups, Persistent Disk Snapshots, GKE Backups |
| RTO / RPO Compliance | Recovery Time Objective (RTO) is the maximum acceptable time to restore service after a failure; Recovery Point Objective (RPO) is the maximum acceptable amount of data loss, measured in time. Both drive availability and disaster recovery planning. | Cloud Spanner, Cloud SQL, Disaster Recovery Strategies |
| Zero Downtime | Deploying changes and performing maintenance without interrupting service. | GKE Blue/Green Deployments, Rolling Updates |
| Consistency | Ensuring that all readers see the same, up-to-date data across distributed replicas. | Cloud Spanner (Strong Consistency), Cloud SQL |
| Multi-Zonal Deployment | Deploying resources across multiple zones within a region. | GKE Regional Clusters, MIGs |
| Multi-Regional Deployment | Deploying resources across multiple regions to maximize availability. | Cloud Storage Multi-Regional Buckets, Cloud Spanner Multi-Regional |
| SLA (Service Level Agreement) | GCP's published commitment on uptime and reliability levels for a service. | Compute Engine SLA, Cloud Storage SLA, Cloud SQL SLA |
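
Several of the terms above (SLA, Redundancy, RTO / RPO) reduce to simple arithmetic that is handy during design reviews. The following is a minimal Python sketch of that arithmetic, not an official GCP formula or API. All function and class names are hypothetical, the percentages are illustrative rather than actual GCP SLA figures, and the redundancy calculation assumes replicas fail independently, which real deployments only approximate.

```python
from dataclasses import dataclass

MINUTES_PER_30_DAY_MONTH = 30 * 24 * 60  # 43,200 minutes


def downtime_budget_minutes(sla_percent: float,
                            period_minutes: float = MINUTES_PER_30_DAY_MONTH) -> float:
    """Downtime allowed in a period before an uptime target is missed."""
    return period_minutes * (1 - sla_percent / 100)


def redundant_availability(availability: float, replicas: int) -> float:
    """Availability of N redundant replicas where any one can serve traffic.

    Assumption: replica failures are independent.
    """
    return 1 - (1 - availability) ** replicas


def chained_availability(*availabilities: float) -> float:
    """Availability of components that must ALL be up (e.g. LB -> app -> DB)."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


@dataclass
class RecoveryTarget:
    rto_minutes: float  # maximum acceptable time to restore service
    rpo_minutes: float  # maximum acceptable window of lost data


def meets_targets(target: RecoveryTarget,
                  backup_interval_minutes: float,
                  measured_restore_minutes: float) -> bool:
    """Check a backup schedule and a measured restore time against RTO/RPO.

    Assumption: worst-case data loss is roughly one backup interval.
    """
    return (backup_interval_minutes <= target.rpo_minutes
            and measured_restore_minutes <= target.rto_minutes)


if __name__ == "__main__":
    # A 99.9% monthly target leaves about 43.2 minutes of downtime.
    print(round(downtime_budget_minutes(99.9), 1))
    # Two independent 99% replicas give about 99.99% combined availability.
    print(round(redundant_availability(0.99, 2), 6))
    # Backups every 10 min and a 45 min restore fit RPO=15 min, RTO=60 min.
    print(meets_targets(RecoveryTarget(rto_minutes=60, rpo_minutes=15), 10, 45))
```

The chained calculation shows why end-to-end availability is lower than that of any single component, and the redundant calculation shows why multi-zonal or multi-regional duplication raises it.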

 
