System Design: Storage Best Practices

This article provides recommendations and best practices for Storage, as part of the System Design Pillar of the Google Cloud Architecture Framework.

Throughout this article, we often refer to the "Select and implement a storage strategy" documentation. We suggest you review that documentation to learn the basic concepts before working through the following assessment questions and recommendations.

Storage type

How much and what types of storage do you require?

  • Select Cloud Storage when you want to store data at scale for a low cost, and access performance is not a primary concern.

  • Select Persistent Disk or Local SSD when your compute applications need high-performance storage.

  • Select Filestore for high-performance workloads that need read/write access to shared space.

  • For high-performance computing (HPC) or high-throughput computing (HTC) workloads, refer to our documentation on using clusters for large-scale technical computing.
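
The selection guidance above can be read as a small decision rule. The following Python sketch is illustrative only: the `Workload` fields and the `suggest_storage` function are hypothetical names introduced here, and a real decision would also weigh cost, capacity, and the HPC/HTC guidance.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    # Hypothetical fields that mirror the questions in the list above.
    needs_high_performance: bool    # compute workload sensitive to storage performance
    needs_shared_file_access: bool  # several clients read/write the same file space


def suggest_storage(workload: Workload) -> str:
    """Return the product family the bullets above point to (simplified)."""
    if workload.needs_high_performance and workload.needs_shared_file_access:
        return "Filestore"
    if workload.needs_high_performance:
        return "Persistent Disk or Local SSD"
    # Data at scale, low cost, access performance not the main concern.
    return "Cloud Storage"


if __name__ == "__main__":
    print(suggest_storage(Workload(needs_high_performance=True,
                                   needs_shared_file_access=False)))
```

Treat the result as a starting point for discussion rather than a final answer; the sketch deliberately ignores every factor the list does not mention.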

Do you need active or archival storage?

Do you need to host static objects for a website? Are you using Cloud CDN (Content Delivery Network)?

  • Use Cloud CDN to improve static object delivery. Cloud CDN uses Google’s global external HTTP(S) load balancer to provide routing, health checking, and anycast IP support. Refer to our documentation on setting up Cloud CDN with Cloud Storage buckets to learn more.

What location(s) and type of data protection do you require?

  • Regional protection is available by default: data is stored in at least two zones within the selected region.

  • Protection across regions comes in two types: multi-region and dual-region. For multi-region, data is stored in two or more regions within the broader geographical area you choose (e.g., United States). For dual-region, data is stored in two specifically selected regions (e.g., Tokyo and Osaka). Currently, only certain combinations of regions are available, with more customization options planned for the near future.

  • Refer to the Cloud Storage bucket locations documentation to learn more.

Storage access patterns and types of workloads

How do you plan on accessing your data?

  • Your data access patterns directly affect how you design for system performance. Cloud Storage provides scalable storage, but it isn’t an ideal choice when you’re running heavy compute workloads that need fast access to large amounts of data. For high-performance storage access, use Persistent Disk.

What are the object lifecycle operations ramp-up mechanisms?

Storage management

Do you store and process sensitive data? How do you monitor and manage access?

  • Make every bucket name unique across the entire Cloud Storage namespace. Do not include sensitive information in a bucket name, and choose bucket and object names that are difficult to guess. Where possible, adding entropy and randomness to bucket names decreases the chance of hotspotting.

  • Ensure that your Cloud Storage bucket is not anonymously or publicly accessible.
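
One way to follow the naming advice above is to append a random suffix to a non-sensitive prefix. The sketch below is a minimal example under stated assumptions: `make_bucket_name` and the `assets` prefix are hypothetical, and the check covers only a simplified subset of the Cloud Storage naming rules (3 to 63 characters; lowercase letters, digits, dashes, and underscores; starting and ending with a letter or digit).

```python
import re
import secrets

# Simplified subset of the bucket naming rules; dots and reserved
# words are intentionally not handled in this sketch.
_BUCKET_NAME_RE = re.compile(r"[a-z0-9][a-z0-9_-]{1,61}[a-z0-9]")


def make_bucket_name(prefix: str, random_bytes: int = 8) -> str:
    """Build a hard-to-guess bucket name from a non-sensitive prefix."""
    # token_hex returns two lowercase hex characters per random byte.
    suffix = secrets.token_hex(random_bytes)
    name = f"{prefix}-{suffix}"
    if not _BUCKET_NAME_RE.fullmatch(name):
        raise ValueError(f"invalid bucket name: {name!r}")
    return name


if __name__ == "__main__":
    print(make_bucket_name("assets"))
```

The random suffix makes the name hard to guess, but it does not replace access controls: the bucket still needs to be protected from anonymous or public access.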

What are your object naming conventions?

  • Using a random object name gives you the highest level of performance and avoids hotspotting. Use a longer, randomized prefix for your objects wherever possible.
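
As a rough illustration of a randomized prefix, the sketch below prepends a short hash of the original name. This is a deliberate substitution: the prefix is derived from a hash rather than truly random, so the full object name can be recomputed from the original name later. The function name `prefixed_object_name` and the default prefix length are assumptions made for this example.

```python
import hashlib


def prefixed_object_name(original_name: str, prefix_len: int = 6) -> str:
    """Prepend a hash-derived prefix so sequential names spread out."""
    digest = hashlib.sha256(original_name.encode("utf-8")).hexdigest()
    # Sequential names such as log-0001.txt and log-0002.txt now begin
    # with unrelated characters instead of sharing a common prefix.
    return f"{digest[:prefix_len]}_{original_name}"


if __name__ == "__main__":
    for i in range(1, 4):
        print(prefixed_object_name(f"log-{i:04d}.txt"))
```

Because the prefix depends only on the original name, the same input always maps to the same object name, which keeps lookups simple.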

Do you need to prevent data from being accessible to the public?

Do you want the requesting project to pay the access costs?

  • You can use the Requester Pays feature for Cloud Storage, along with appropriately set up billing projects, to charge the requester for operation, network, and data retrieval costs. The owner still needs to pay for any storage or deletion charges.
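
To make the cost split concrete, the following sketch models who pays for which charge category. It is not a billing calculator: the `Charges` fields, the `split_charges` function, and the amounts in the usage example are hypothetical, and only the split described above (the requester pays operation, network, and data retrieval costs; the owner pays storage and deletion charges) comes from this article.

```python
from dataclasses import dataclass


@dataclass
class Charges:
    # Hypothetical amounts in any single currency unit.
    storage: float
    deletion: float
    operations: float
    network: float
    retrieval: float


def split_charges(charges: Charges, requester_pays: bool) -> dict:
    """Return the totals billed to the bucket owner and to the requester."""
    access_costs = charges.operations + charges.network + charges.retrieval
    # The owner always pays storage and deletion charges.
    owner_total = charges.storage + charges.deletion
    requester_total = 0.0
    if requester_pays:
        requester_total = access_costs
    else:
        owner_total += access_costs
    return {"owner": owner_total, "requester": requester_total}


if __name__ == "__main__":
    example = Charges(storage=10.0, deletion=1.0, operations=2.0,
                      network=3.0, retrieval=4.0)
    print(split_charges(example, requester_pays=True))
```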

Key Google Cloud services

  • Cloud Storage: Object storage that’s secure, durable, and scalable

  • Persistent Disks: Reliable, high-performance block storage for virtual machine instances

  • Regional Disks: Durable storage and replication of data between two zones in the same region

  • Local SSD: Ephemeral, locally-attached block storage for virtual machines and containers

  • Filestore: High-performance, fully managed file storage

  • Cloud Storage for Firebase: Object storage for storing and serving user-generated content

  • Actifio GO: Backup, disaster recovery, migration, and test data management software-as-a-service solutions

Resources

Cloud Storage resources

Migration resources

Persistent Disk resources

What's next?

We've just covered Storage as part of the System Design Pillar of the Google Cloud Architecture Framework. There are several other topics within the System Design Pillar that may be of interest to you:

Version history
Last update: 12-13-2021 03:04 PM