Designing Serverless Data Architecture on Google Cloud using the Architecture Framework

jose_c · 01-07-2022 12:27 PM

Google pioneered the notion of serverless architectures back in 2008 with the introduction of App Engine, and has continued to innovate with serverless containers such as Cloud Run. The immense benefits of going serverless back then are perhaps even more relevant today, when speed, reliability, and scalability are paramount for any business trying to innovate at the current pace.

Google is well known for its expertise in handling massive amounts of data, fueling some of the most advanced and powerful analytics platforms in the market. A recent Gartner report named Google a leader in Cloud Database Management Systems, due to Google Cloud’s vision and strategy. With Google Cloud, any business can lay a robust foundation to achieve that same level of success. Google Cloud provides its users with a wealth of options that allow them to go fully serverless when implementing their data architectures.

Companies big and small are digitally transforming their data architecture by going serverless because of the following business benefits:

Scalability: Worrying about how to meet high traffic events like Black Friday is a thing of the past. Cloud Load Balancing and the ability to automatically spin up resources as required mean that capacity grows with demand, even during sharp spikes.
Resource efficiency: Going serverless means you’re charged only for the resources you actually consume at any point. On-premises server rooms, by comparison, often lead either to paying for capacity that sits idle or to losing revenue because there is not enough capacity to adequately serve customers.
No capital investment, lower operational costs: No need to provision new server rooms and worry about projecting demand. Maintenance and upkeep costs are greatly reduced with serverless options, resulting in lower operational costs.
Global reach, lower latency: Going serverless enables businesses to run their code closer to their customers, wherever they are. They can effortlessly expand to a global scale within minutes.
Better disaster mitigation, higher security: Multi-region deployment of your data infrastructure keeps your business running when a disaster strikes, avoiding potential loss of data and productivity. Security industry experts from Google are always looking out for cyber threats, keeping Google Cloud products secure with automatic security updates.
Faster time-to-market: Developers are able to rapidly prototype, test, and deploy their solutions. Time-to-market can be cut down from months to days, with the ability to set up A/B testing, blue/green, or canary deployments.
Greener solution: Google has been carbon neutral since 2007 and aims to run completely on carbon-free energy by 2030.

Let’s take a look at an example architecture that showcases the ease, robustness, and scalability that Google Cloud tools provide any business to set up their data architecture.

With Google Cloud, businesses can handle both streaming and batch data. Pub/Sub ingests and distributes data for streaming analytics and data integration pipelines. It enables services to communicate asynchronously, with latencies typically on the order of 100 milliseconds. Batch data can be ingested into Cloud Storage, an enterprise-grade, highly available, and durable object storage service. Google Cloud serverless analytics covers data warehousing, data pipelines (ETL), and machine learning. The backbone of these products is Pub/Sub and Cloud Storage, which together allow the transfer of both streaming and batch data of any size and velocity.
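To make the asynchronous, fan-out style of messaging described above more concrete, here is a minimal sketch in plain Python. It is not the Pub/Sub client library: `InMemoryTopic`, the subscription names, and the order messages are all hypothetical stand-ins, and the sketch only illustrates how a publisher can hand off messages without waiting for consumers, while each subscription receives its own copy.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class InMemoryTopic:
    """Toy stand-in for a topic (hypothetical, not the Pub/Sub API)."""

    subscriptions: dict = field(default_factory=dict)

    def subscribe(self, name: str) -> None:
        # Each subscription gets its own queue, so consumers do not compete.
        self.subscriptions.setdefault(name, deque())

    def publish(self, message: dict) -> None:
        # The publisher returns immediately; it never waits for a consumer.
        for queue in self.subscriptions.values():
            queue.append(message)

    def pull(self, name: str, max_messages: int = 10) -> list:
        # A consumer drains its own queue at its own pace.
        queue = self.subscriptions[name]
        count = min(max_messages, len(queue))
        return [queue.popleft() for _ in range(count)]


topic = InMemoryTopic()
topic.subscribe("analytics")  # hypothetical streaming consumer
topic.subscribe("archive")    # hypothetical batch/archive consumer

for order_id in range(3):
    topic.publish({"order_id": order_id, "amount": 10 * (order_id + 1)})

analytics_batch = topic.pull("analytics", max_messages=2)
archive_batch = topic.pull("archive")
```

Because the two subscriptions are independent, the analytics consumer can fall behind without affecting the archive consumer. A managed service adds durability, acknowledgements, and scaling on top of this basic shape.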

From there, data flows into Dataflow, a fully managed, horizontally scalable, unified data processing platform. Dataflow was named a Leader in The Forrester Wave™: Streaming Analytics, Q2 2021 report. Using the widely used Apache Beam SDK, Dataflow enables developers to conveniently build robust data pipelines.
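As a rough illustration of the kind of transform such a pipeline might express, the sketch below groups timestamped events into fixed 60-second windows and sums a value per key. It deliberately uses only the Python standard library rather than the Apache Beam SDK; the function names, the window size, and the sample events are assumptions made for this example.

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # hypothetical window size


def window_start(timestamp: int, size: int = WINDOW_SECONDS) -> int:
    """Return the start of the fixed window that contains the timestamp."""
    return timestamp - (timestamp % size)


def sum_per_window(events, size: int = WINDOW_SECONDS) -> dict:
    """Sum values per (window start, key) pair.

    Each event is a (timestamp_seconds, key, value) tuple.
    """
    totals = defaultdict(int)
    for timestamp, key, value in events:
        totals[(window_start(timestamp, size), key)] += value
    return dict(totals)


# Hypothetical events: (event time in seconds, event type, count)
events = [
    (0, "checkout", 5),
    (30, "checkout", 7),
    (61, "checkout", 1),
    (75, "search", 2),
    (119, "checkout", 4),
]

totals = sum_per_window(events)
```

The same grouping logic applies whether the events arrive as a bounded batch or as an unbounded stream, which is the idea behind a unified processing model. A real pipeline would also need to deal with late data and with distributing the work across machines.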

Cloud Storage serves the purpose of a data lake, the heart of any data architecture. It brings together all of the data a business needs and integrates seamlessly with other Google Cloud products. In this example, data flows into BigQuery, Google Cloud’s serverless, highly scalable, and cost-effective data warehouse. It was named a Leader in The Forrester Wave™: Cloud Data Warehouse, Q1 2021 report. BigQuery is central to Google Cloud’s range of analytics solutions. It scales to petabytes of data with no infrastructure to manage and is queried with standard SQL, so businesses can generate game-changing insights from the wealth of data they produce. For more information about the best practices regarding data analytics and a full view of the entire analytics platform, visit the Architecture Center’s documentation on analyzing data.
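Since the warehouse is queried with SQL, the analysis step often comes down to an aggregate query. The sketch below shows the general shape of such a query, using Python's built-in `sqlite3` module as a local stand-in. It does not use BigQuery or its client library, and the `orders` table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database as a local stand-in for a data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount INTEGER)")

# Hypothetical sample data.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("emea", 120),
        ("emea", 80),
        ("apac", 50),
        ("amer", 300),
        ("apac", 25),
    ],
)

# Revenue per region, highest first; region breaks ties deterministically.
rows = conn.execute(
    """
    SELECT region, SUM(amount) AS revenue
    FROM orders
    GROUP BY region
    ORDER BY revenue DESC, region
    """
).fetchall()
conn.close()
```

The query itself is plain `GROUP BY` and `ORDER BY`, so the same pattern carries over to a warehouse. What changes is the scale of the data and the fact that there is no database server to manage.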

As a final step, data can be integrated into Looker, an enterprise platform for business intelligence. Looker provides a powerful way of visualizing data insights, helping the democratization of information across an entire organization.

The speed and adaptability of serverless architectures are some of the reasons why many companies are turning to Google Cloud for their projects. Applications with rapid time-to-market and unpredictable scale requirements benefit the most from serverless compute and databases. Serverless architectures have proven to be a strong fit for many businesses and ventures, from startups to cutting-edge innovation projects at big enterprises.

This was just one example of the multitude of ways in which the range of data solutions found in Google Cloud can enable a business to create a robust, scalable, and efficient serverless data architecture. Find out more about system design and best practices in the Cloud Architecture Center.

You can also find common design questions and recommendations in the Architecture Framework Guidance section of the Community, or feel free to ask a new question in the comments below!