In the first session of our Architecture Framework Ask Me Anything series, we focused on how to apply the best practices and guidance outlined in the System Design Pillar of the Google Cloud Architecture Framework. @omkarsuram_, Program Lead of the Architecture Framework, led the session by outlining core system design principles, explaining how to apply them using a business example, and then answering questions live at the end.
In this blog, we share the session recording, written questions and answers, as well as supporting documentation and resources, so you can refer back to them at any time. If you have any further questions, please add a comment below and we’d be happy to help.
With this series, it's our goal to provide a trusted space where you can receive support and guidance along your cloud journey. So if you have any feedback or topic requests for our next sessions, please let us know in the comments, or by submitting the feedback form. You can keep an eye on upcoming sessions from the Cloud Events page in the Community. Thank you!
Watch the recording: https://youtu.be/_jzHhXF-TG0
Most relational databases are ACID compliant, which means they guarantee that transactions are processed reliably: each transaction is atomic, consistent, isolated, and durable. A relational database is a perfect candidate for use cases like recording payment transactions.
NoSQL has a flexible schema, so it can quickly adapt to add new values. A NoSQL database is a perfect candidate for storing user profiles, chats and replies, forums, etc.
The schema flexibility of NoSQL comes in handy when you’re developing new features and want to store additional values. In comparison, with a SQL database, this would require careful planning and involve a significant amount of overhead.
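As a minimal sketch of that difference (the instance, database, table, and field names below are all hypothetical): adding a field to a relational table requires an explicit schema migration, while a NoSQL document can simply carry the new field the next time it is written.

```shell
# Relational (Cloud SQL): a schema migration must run before the app
# can store the new value.
gcloud sql connect my-sql-instance --user=appuser --database=appdb <<'SQL'
ALTER TABLE user_profiles ADD COLUMN nickname VARCHAR(64);
SQL

# NoSQL (Firestore): no migration step -- a client simply writes a
# document that includes the new field, alongside older documents
# that don't have it, e.g. {"name": "alice", "nickname": "Al"}.
```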
In Google Cloud, you have the option to use relational databases, such as Cloud SQL or Cloud Spanner, and NoSQL databases, such as Firestore and Cloud Bigtable. Learn more about which database is right for you by exploring the resources below.
The answer depends on how you’ve architected your application. With horizontal scaling (scaling out or in), you add or remove instances/nodes, and with vertical scaling (scaling up or down), you add compute power to, or remove it from, an existing instance/node.
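As a rough sketch (the group, instance, zone, and machine type names are hypothetical), the two approaches map to different gcloud commands:

```shell
# Horizontal scaling: change the number of instances in a managed
# instance group.
gcloud compute instance-groups managed resize my-mig \
  --size=5 --zone=us-central1-a

# Vertical scaling: change the machine type of an existing instance
# (the VM must be stopped first).
gcloud compute instances stop my-vm --zone=us-central1-a
gcloud compute instances set-machine-type my-vm \
  --machine-type=e2-standard-8 --zone=us-central1-a
gcloud compute instances start my-vm --zone=us-central1-a
```

Note that vertical scaling of a single VM involves downtime while the instance is stopped, which is one reason horizontally scaled architectures are often preferred for serving traffic.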
When you’re running a workload, either in Compute Engine or Google Kubernetes Engine, it’s important to monitor resource utilization and performance on an ongoing basis, considering factors such as response time, CPU load, memory usage, etc.
Using Google Cloud’s operations suite, you can understand how your application behaves under varying traffic and workload demands. You’ll use this information as a baseline for determining the right size and number of instances/nodes required for your application, so as to avoid your application hitting resource limits.
Google Cloud provides products and features to help you scale horizontally and vertically:
Compute Engine virtual machines and Google Kubernetes Engine (GKE) clusters integrate with autoscalers that let you grow or shrink resource consumption based on metrics that you define. Additionally, GKE features, such as Vertical Pod autoscaling, will help you optimize how your pod consumes resources dynamically.
Google Cloud's serverless platform provides managed compute, database, and other services that scale quickly from zero to high request volumes, and you pay only for what you use.
Database products like BigQuery, Cloud Spanner, and Cloud Bigtable can deliver consistent performance across massive data sizes.
Cloud Monitoring provides metrics across your apps and infrastructure, helping you make data-driven scaling decisions.
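To make the GKE options above concrete, here is a hedged sketch (cluster, node pool, and region names are hypothetical) of enabling node autoscaling and vertical Pod autoscaling on an existing cluster:

```shell
# Enable the cluster autoscaler on an existing node pool, so GKE adds
# or removes nodes based on pending Pods.
gcloud container clusters update my-cluster \
  --enable-autoscaling --min-nodes=1 --max-nodes=5 \
  --node-pool=default-pool --region=us-central1

# Enable vertical Pod autoscaling at the cluster level, so Pod CPU and
# memory requests can be adjusted based on observed usage.
gcloud container clusters update my-cluster \
  --enable-vertical-pod-autoscaling --region=us-central1
```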
A managed instance group (MIG) is a collection of VM instances that you can manage as a single entity. You can make your workloads scalable and highly available by taking advantage of automated MIG services, including autoscaling, autohealing, regional (multiple zone) deployment, and automatic updating.
Managing these tasks on your own gives you control over how you deploy your instances, but comes with additional operational overhead. Updating and maintaining standalone VMs attached to a load balancer requires a significant time investment, and that investment grows as your deployment scales. Refer to the resources below for more information about MIGs, their benefits, and example scenarios for creating them, to determine if they’re a good fit for your needs.
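As an illustrative sketch of the automated services mentioned above (the group, template, health check, and region names are hypothetical), you might create a regional MIG and then attach autoscaling and autohealing:

```shell
# Create a regional MIG from an existing instance template; instances
# are spread across the region's zones.
gcloud compute instance-groups managed create my-mig \
  --template=my-template --size=3 --region=us-central1

# Autoscaling: grow or shrink the group based on average CPU usage.
gcloud compute instance-groups managed set-autoscaling my-mig \
  --region=us-central1 --min-num-replicas=3 --max-num-replicas=10 \
  --target-cpu-utilization=0.6

# Autohealing: recreate instances that fail an application health check,
# after an initial delay to let new instances finish starting up.
gcloud compute instance-groups managed update my-mig \
  --region=us-central1 --health-check=my-health-check \
  --initial-delay=300
```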
Google Cloud provides out-of-the-box options, including Storage Transfer Service and BigQuery Data Transfer Service. If you’re creating backup files from an on-premises database, you can quickly transfer them to Cloud Storage buckets using gsutil, and schedule ingest jobs using Cloud Scheduler, Cloud Functions, and Pub/Sub. Learn more with our System Design: Analytics Best Practices.
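A minimal sketch of that pattern (bucket, topic, and job names are hypothetical): upload the backups with gsutil, then publish a scheduled Pub/Sub message that a subscribed Cloud Function can use to kick off the ingest.

```shell
# Copy local database backups to a bucket, in parallel (-m).
gsutil -m cp /backups/db-*.sql.gz gs://my-backup-bucket/nightly/

# Publish a trigger message every night at 02:00; a Cloud Function
# subscribed to the topic starts the ingest job when it arrives.
gcloud scheduler jobs create pubsub nightly-ingest \
  --schedule="0 2 * * *" --topic=ingest-trigger \
  --message-body='{"job":"nightly"}'
```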
Whenever a user accesses a Cloud Storage resource, such as a bucket or object, there are charges associated with making and executing the request. Such charges include operation charges, network charges for reading the data, and data retrieval if the data is stored as Nearline, Coldline, or Archive Storage.
If you have data that you want to make available to users, but you don’t want to be charged for their access to that data, then the Requester Pays feature for Cloud Storage is a great option.
With Requester Pays enabled, you can require requesters to include a billing project in their requests, so that the requester’s project is billed. However, keep in mind that as the owner of the data, you’re still responsible for paying storage costs.
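As a brief sketch (the bucket, object, and billing project names are hypothetical):

```shell
# Enable Requester Pays on a bucket you own.
gsutil requesterpays set on gs://my-shared-data

# A requester must now name a billing project with -u; that project,
# not the bucket owner's, is charged for the request and data transfer.
gsutil -u requesters-billing-project cp gs://my-shared-data/file.csv .
```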
Please refer to the following resources for more details:
When you’re using Google Kubernetes Engine (GKE), it’s recommended that you use regional deployments, so that as you scale your pods, the underlying components remain highly available across zones.
If you’re using Container Registry or a third-party solution, you can use the container image in GKE to configure an automated deployment, so that your application is rebuilt and deployed in another region, enabling you to scale quickly on demand. Having a reliable data foundation should also make compute migrations easier.
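As a minimal sketch of the regional-deployment recommendation (cluster and region names are hypothetical):

```shell
# Create a regional cluster: the control plane is replicated across the
# region's zones, and --num-nodes applies per zone, so this yields one
# node in each zone.
gcloud container clusters create my-cluster \
  --region=us-central1 --num-nodes=1
```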