System Design: Database Best Practices

In this article, you'll find recommendations and best practices focused on the topic of Databases, as part of the System Design Pillar of the Google Cloud Architecture Framework. 

Throughout this article, we often refer to the "optimize your database" documentation. We suggest you review that documentation to learn the basic concepts before evaluating the following assessment questions and recommendations.

Current estate

What is your current database strategy? Are you ready to migrate and modernize your open source databases?

  • Whether you're designing a cloud-native application from scratch or migrating an existing database to the cloud, it's important to understand your database strategy early on. Google Cloud provides managed database services for open source engines such as MySQL and PostgreSQL, among others. Where possible, use the migration as an opportunity to modernize your database and make it future-ready for your business needs.

What is your current database estate? Are you using Oracle, SQL Server, PostgreSQL, MongoDB, Cassandra, or other engines?

How many commercial-off-the-shelf (COTS) applications do you have?

  • COTS applications typically require a specific database engine and a fixed configuration. For such applications, lift and shift is usually the most appropriate migration approach.

Do you have a development team to support application migration and modernization?

  • Evaluating your existing team's capabilities and skill set is an important part of the database migration strategy, and it should factor into your choice of target database. Google has an extensive network of partners to support you in your migration journey.

High availability, disaster recovery, and compliance

What are your high availability (HA) and disaster recovery (DR) requirements?

  • Like other design decisions, there's a trade-off between reliability and cost. There are various options for designing your databases to meet your HA and DR strategies. All the cloud-native database services create multiple copies of your data by design, within a region or across multiple regions, depending on the database and the selected configuration. If instead of using managed Google Cloud databases you run your own databases on Compute Engine VMs, the same multi-zone and multi-region concepts apply, and you need to ensure that you're running as many copies of your databases as your requirements call for. To learn more, read the section in the Reliability Pillar on how to design for scale and high availability.
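
The value of running multiple copies can be made concrete with a rough, hedged calculation. The model below assumes copies fail independently, which real zones and regions only approximate, so treat the result as an optimistic upper bound rather than a Google Cloud figure:

```python
# Assumed model, not Google Cloud data: with n copies that fail
# independently, combined availability is 1 - (1 - a) ** n.

def combined_availability(single_copy_availability: float, copies: int) -> float:
    return 1 - (1 - single_copy_availability) ** copies

# One copy at 99.9% vs three independent copies of the same database.
print(f"{combined_availability(0.999, 1):.6f}")
print(f"{combined_availability(0.999, 3):.9f}")
```

Even under this idealized model, the point stands: each additional independent copy shrinks the unavailability window multiplicatively.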

Do you have data residency requirements?

  • Data residency describes where your data physically resides at rest. Choose the cloud regions for your databases based on these requirements. If you deploy your databases in multiple regions, data may be replicated between them, depending on the configuration. Some databases, such as Cloud Spanner, offer multi-regional replication out of the box, so select a configuration that keeps your data at rest within the desired regions. You can also enforce data residency by using data location constraints.

What are your recovery time objective (RTO) and recovery point objective (RPO) requirements?

  • You should have a clear understanding of your RTO and RPO to design your DR strategy efficiently. Like other design decisions, there's a trade-off between RTO/RPO and the cost of the DR solution: the smaller the RTO/RPO numbers, the higher the cost. In other words, if you want your system to recover faster from a disruption, it will cost more to run. It's therefore important to understand the consequences of your application being unavailable during a disruption and how much data loss can be tolerated. For some applications, the offered service-level agreements (SLAs) and service-level objectives (SLOs) effectively define the RTO and RPO. Google Cloud provides various ways to meet your DR objectives by designing your databases across multiple zones and/or regions. See the disaster recovery planning guide for details.
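
As a back-of-the-envelope check, RTO is bounded by your failover time and RPO by your replication lag (the worst-case data loss window). The sketch below captures that relationship; the function name and numbers are illustrative, not a Google Cloud API:

```python
# Hypothetical sketch: does a candidate DR configuration meet the
# stated RTO/RPO targets? All values are in minutes and made up.

def meets_dr_targets(rto_target_min, rpo_target_min,
                     failover_time_min, replication_lag_min):
    """RTO is bounded by failover time; RPO by replication lag."""
    return (failover_time_min <= rto_target_min and
            replication_lag_min <= rpo_target_min)

# A cross-region replica with ~5 min lag and ~20 min failover easily
# meets a 60 min RTO / 15 min RPO target, but not a 10 min / 1 min one.
print(meets_dr_targets(rto_target_min=60, rpo_target_min=15,
                       failover_time_min=20, replication_lag_min=5))
print(meets_dr_targets(rto_target_min=10, rpo_target_min=1,
                       failover_time_min=20, replication_lag_min=5))
```

Tightening the second target would require a more expensive design, such as synchronous replication and automated failover, which is exactly the cost trade-off described above.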

What are your encryption requirements?

  • All data stored in Google Cloud is encrypted at rest by default using AES-256, with no action required from you. Your encryption requirements are determined by many factors, such as your company's security policies and compliance requirements. A key question is whether you want to manage your encryption keys yourself or use a managed service. Once you determine your encryption requirements, you have various options for implementing them on Google Cloud. Cloud Key Management Service (KMS) is a fully managed service for managing your encryption keys on Google Cloud. If you'd like to manage your encryption keys yourself and have more control over their lifecycle, KMS also supports customer-managed encryption keys (CMEK).

  • If you’d like to create and manage your encryption keys outside of Google Cloud, then you have two options:

    • If you’re using a partner solution to manage your keys, use Cloud External Key Manager (EKM).

    • If you’re managing your keys on-premises and would like to use those keys to encrypt the data on Google Cloud, then you can import those keys into Cloud KMS as either KMS keys or Hardware Security Module (HSM) keys and use them to encrypt your data on Google Cloud.

Database design and scaling

Do you currently store blob types in your databases?

  • If yes, consider storing them in Cloud Storage rather than in the database, keeping only a reference to the object in the database row.

Do you know the size of your databases, the peak concurrent connections, and the peak queries/second load?

  • Use metrics from your existing monitoring tools and environments to establish a baseline understanding of your database's size and scaling requirements. These metrics are useful for rightsizing and for designing a scaling strategy for your database instances.

  • If it's a new database design, determine these numbers based on the expected load and traffic patterns of the serving application, and then monitor them over time.

Networking and access

What are your networking requirements?

What are your database access requirements?

  • Identity and Access Management (IAM) controls access to Google Cloud services, including database services. You should apply the principle of least privilege to your users to minimize the risk of unauthorized access. For application-level access to your databases, use service accounts with the least privileges.

Automation and rightsizing

How do you automate the creation of databases?

  • One of the benefits of migrating to Google Cloud is the ability to automate your infrastructure and other pieces of your workloads, such as the compute and database layers. Cloud Deployment Manager and third-party tools like Terraform let you define your database instances as code, which enables a consistent, repeatable approach to creating and updating your databases.
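
Terraform also accepts JSON-syntax configuration (`*.tf.json`), so instance definitions can be generated programmatically. The sketch below emits a minimal Cloud SQL instance definition; `google_sql_database_instance` is the Terraform Google provider's resource name, while the instance name, region, and tier are placeholders for your own values:

```python
# Sketch: generate a Terraform JSON definition for a Cloud SQL
# instance. Treat the name, region, and tier below as placeholders.
import json

instance = {
    "resource": {
        "google_sql_database_instance": {
            "orders_db": {
                "name": "orders-db",
                "database_version": "POSTGRES_14",
                "region": "us-central1",
                "settings": {"tier": "db-custom-2-7680"},
            }
        }
    }
}

# Write this output to main.tf.json and run `terraform plan` as usual.
print(json.dumps(instance, indent=2))
```

Generating configuration like this can be useful when many similar instances must be created; for hand-maintained configuration, plain HCL is usually more readable.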

How do you version control your database changes?

  • Some database services, like Cloud SQL and Cloud Spanner, support Liquibase, an open source version control tool for databases. Liquibase lets you track your database schema changes, roll them back, and perform repeatable migrations.

How do you rightsize your databases?

  • Perform load tests on your database instance and fine-tune it based on the results to meet your application's requirements. Initial scaling of your database should be driven either by KPIs from the load-test results or by existing monitoring KPIs from your current database.

  • While creating the database instances, start with the initial size based upon the testing results or historical monitoring metrics and test your database instances with the expected load in the cloud. Then, fine tune the instances until you get the desired results for the expected load on your database instances.

How do you scale your databases?

  • Scaling databases is different from scaling your compute layer components. Databases have state, and when one instance of your database can’t handle the load, you should think about the appropriate strategy to scale your database instances. Your scaling strategy will vary depending on the type of your database.

  • Cloud Spanner is a cloud-native relational database that scales horizontally: you add nodes to your instance when you need more serving capacity and storage. Similarly, Cloud Bigtable lets you add nodes to scale your database.

  • Firestore is a serverless database that handles scaling automatically without any action required from the user.

  • To serve more queries, you can scale Cloud SQL database instances vertically, giving them more compute and memory capacity. In Cloud SQL, the storage layer is decoupled from the database instance, and you can optionally have your storage scale automatically whenever it nears capacity.
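
A toy sizing calculation for horizontally scaled databases such as Cloud Spanner or Bigtable might look like the following; the per-node throughput and headroom figures are placeholders, not published limits:

```python
# Toy sketch: add nodes until aggregate capacity covers peak load,
# keeping each node below a target utilization so there is room for
# spikes. The 10,000 QPS per-node figure is a made-up placeholder.
from math import ceil

def nodes_needed(peak_qps, qps_per_node, headroom=0.65):
    # Size each node to run at roughly 65% of its capacity at peak.
    return ceil(peak_qps / (qps_per_node * headroom))

print(nodes_needed(peak_qps=30_000, qps_per_node=10_000))  # → 5
```

In practice you would calibrate the per-node figure with your own load tests and the service's documented performance guidance.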

Operations

What are your operational requirements for databases?

  • Monitor your database instances with Cloud Monitoring and set up alerts to notify the appropriate teams. All the cloud-native databases provide logging and monitoring metrics; each service offers a visual dashboard for these metrics and integrates with Google Cloud's operations suite. Cloud Spanner provides additional query introspection tools for debugging and root cause analysis (RCA), as well as Key Visualizer, which surfaces key usage patterns in your Spanner database to help you spot problems and fine-tune your instance. Cloud Bigtable provides a similar Key Visualizer tool out of the box for analyzing usage patterns in your Bigtable instances.

Licensing

What are the licensing requirements for your databases?

  • If you're using the SQL Server engine of Cloud SQL, bring-your-own-license (BYOL) is not a supported option at this time, and your licensing cost is based on usage per core hour. If you have existing SQL Server licenses you'd like to use, running SQL Server on Compute Engine VMs is a possible option. Google's flowchart on Microsoft licensing can help you understand your options on Compute Engine VMs.

  • If you’re using Oracle and migrating to the Bare Metal Solution on Google Cloud, you can BYOL.

Migration timeline, methodology, and toolset

What is your timeline for migrating or modernizing applications?

  • Data migrations usually involve multiple steps, and it's important to test and validate each step before moving on to the next. The migration process itself is driven by factors such as whether the migration is homogeneous or heterogeneous, the migration tools available, your team's skill set, and your experience with the target database engine (in the case of heterogeneous migrations).

How many applications are ready for modernization?

  • An important design consideration for your migration is whether you're ready to modernize your databases and start using cloud-native databases to take full advantage of them. Database modernization will most likely affect the application side as well, so account for that in your overall timeline for the workload migration.

Do you have a technical team to handle migration?

  • Database migration involves setting up the target databases, converting schemas, replicating data between the source and target databases, debugging issues as they arise, establishing network connectivity between the application layer and the database, securing the target database, and making sure the applications switch to the target databases seamlessly. These tasks require different skill sets and are usually accomplished through collaboration between multiple teams in your organization, including app developers, database administrators (DBAs), and infrastructure and security teams. It's important to include all of these stakeholders in the migration planning. If your team lacks the skills needed to support your migration, Google's extensive network of partners can help.

Have you identified the toolset required for a homogeneous or heterogeneous migration?

  • If you're migrating to the same database engine as your current database, it's a homogeneous migration; if the target database engine differs from the source, it's a heterogeneous migration. Heterogeneous migrations usually involve the additional step of converting the schema from the source database to the target database engine type. Your database teams will have to determine the difficulty of the schema conversion, as it largely depends on the complexity of the source database schema.

What are your initial data load and continuous data replication requirements?

  • An important aspect of a database migration is the strategy for the initial data load, followed by continuous replication of data from the source to the target database until the target is stabilized and the application switches over completely. These requirements help determine the potential downtime during the switch to the new database, so plan accordingly.

  • If you're migrating MySQL or PostgreSQL databases to Cloud SQL, Google's Database Migration Service can automate this process in a fully managed way. For other types of migrations, there are many third-party tools you can use, such as those in the Cloud Marketplace.
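
The cutover logic described above can be sketched as follows (illustrative only, not a real replication tool): after the initial bulk load, keep applying source changes until the remaining replication lag fits within your acceptable downtime window, then switch the application over.

```python
# Toy model of the migration cutover decision. In a real migration,
# pending changes and apply rate would come from your replication
# tooling's metrics.

def ready_to_cut_over(pending_changes, apply_rate_per_s, max_downtime_s):
    """Remaining catch-up time determines the switchover outage."""
    catch_up_s = pending_changes / apply_rate_per_s
    return catch_up_s <= max_downtime_s

# 50,000 pending changes at 1,000 changes/s means ~50 s of lag:
# too much for a 30 s window, so keep replicating.
print(ready_to_cut_over(pending_changes=50_000, apply_rate_per_s=1_000,
                        max_downtime_s=30))
# Once lag drops to ~20 s, the cutover fits the window.
print(ready_to_cut_over(pending_changes=20_000, apply_rate_per_s=1_000,
                        max_downtime_s=30))
```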

Multi-tenancy

What are the multi-tenancy requirements for your database?

  • Multi-tenancy, from the database perspective, means storing data from multiple customers on a shared piece of infrastructure, in this case a database. If you provide a software-as-a-service (SaaS) offering to your customers, it's important to understand how to logically isolate the datasets that belong to different customers and how to secure them from an access standpoint. You also need to understand at which level the separation is required. For relational databases like Cloud Spanner and Cloud SQL, there are multiple approaches, such as isolating tenants' data at the database-instance, database, schema, or database-table level. Like other design decisions, there's a trade-off between the degree of isolation and other factors such as cost and performance. IAM policies control access to your database instances.
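
One of the isolation options mentioned above, schema-level separation, can be sketched like this (purely illustrative; in a real system this routing would live in your data access layer, and the naming convention is an assumption):

```python
# Hypothetical sketch of schema-level tenant isolation: each tenant's
# tables live in a schema derived from the tenant ID.
import re

def tenant_schema(tenant_id: str) -> str:
    # Validate the tenant ID so it is safe to use as a schema name.
    if not re.fullmatch(r"[a-z0-9_]{1,30}", tenant_id):
        raise ValueError(f"invalid tenant id: {tenant_id!r}")
    return f"tenant_{tenant_id}"

def qualified_table(tenant_id: str, table: str) -> str:
    # Route a query to the tenant's own schema.
    return f"{tenant_schema(tenant_id)}.{table}"

print(qualified_table("acme_corp", "orders"))  # → tenant_acme_corp.orders
```

Stricter isolation levels (per-database or per-instance) follow the same routing idea but trade higher cost for stronger separation.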

Key Google Cloud services

  • Cloud SQL: Fully managed relational database service for MySQL, PostgreSQL, and SQL Server. Run the same relational databases you know, with their rich extension collections, configuration flags, and developer ecosystems, but without the hassle of self-management.

  • Cloud Bigtable: Fully managed, scalable NoSQL database service for large analytical and operational workloads

  • Cloud Spanner: Fully managed relational database with unlimited scale, strong consistency, and up to 99.999% availability

  • Memorystore: Reduce latency with scalable, secure, and highly available in-memory service for Redis and Memcached.

  • Firestore: Easily develop rich applications using a fully managed, scalable, and serverless document database.

  • Firebase Realtime Database: Cloud-hosted NoSQL database that lets you store and sync data between your users in real time.

  • Open source databases: Fully managed open source databases promote innovation without vendor lock-in or high licensing fees. Google Cloud and our partners help you deploy secure open source databases at scale without managing infrastructure.

  • Bare Metal Solution for Oracle: Infrastructure to run Oracle workloads on Google Cloud

What's next?

We've just covered Databases as part of the System Design Pillar of the Google Cloud Architecture Framework. There are several other topics within the System Design Pillar that may be of interest to you.

Version history
Last update: 12-13-2021 03:04 PM