
Reserved VM availability for Dataproc

Is the use of reserved VMs applicable to the following use cases in Dataproc?

1. Dedicated cluster: here we can and will use a reservation.

2. On-demand cluster: whenever a job runs, it triggers the creation of a new cluster, and the cluster is deleted once the jobs are done. (Can a reservation be used with this kind of setup?)

3. Serverless: the same as the on-demand cluster case, but using GCP-managed serverless clusters. (Can a reservation be used here as well?)

Please provide your guidance on the optimal approach for utilizing Reserved VMs for these use cases.


Using Reserved VMs in Google Cloud can be a cost-effective strategy for certain use cases, particularly when the reservation is paired with a committed use discount, but it's important to understand how they align with different operational models in Dataproc, Google Cloud's managed Hadoop and Spark service. Here is a summary of the optimal approach for utilizing Reserved VMs for Dataproc use cases:

  1. Dedicated Cluster: Reserved VMs are an excellent choice for dedicated clusters in Dataproc. They offer a cost-effective solution for scenarios where you have consistent and predictable resource needs, ensuring that you have the necessary resources available while optimizing costs.

  2. On-Demand Cluster: For on-demand clusters, the suitability of Reserved VMs is less clear-cut. A reservation means paying for a fixed amount of capacity whether or not it is used, which does not fit naturally with clusters that exist only while a job runs. It can still be beneficial if the jobs follow a predictable pattern and keep the reserved capacity busy most of the time. For highly variable or unpredictable workloads, you might end up paying for resources that go unused, making on-demand instances a more flexible option.

  3. Serverless: For Dataproc Serverless, Reserved VMs are not applicable. Google Cloud provisions, scales, and manages the underlying infrastructure for each workload, so there are no user-managed VMs and therefore no need or ability to reserve resources in advance.

In general, Reserved VMs are most advantageous for workloads with predictable and steady usage patterns, requiring consistent performance. For workloads that are unpredictable or have variable performance requirements, on-demand or serverless clusters are likely more suitable options.
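For the dedicated and on-demand cluster cases, a cluster can be pointed at a specific reservation when it is created. The following is only a minimal sketch of how a job scheduler might assemble that command: it builds a `gcloud dataproc clusters create` argument list using the `--reservation-affinity` and `--reservation` flags. The cluster name, region, zone, and reservation name are hypothetical placeholders, and the sketch assumes the reservation already exists in the same zone with VM properties (such as machine type) that match the cluster's nodes.

```python
import shlex


def dataproc_create_command(cluster, region, zone, reservation=None):
    """Build a `gcloud dataproc clusters create` argument list.

    If `reservation` is given, the cluster VMs are asked to consume that
    specific Compute Engine reservation. Otherwise no reservation flags
    are added and gcloud's default behaviour applies.
    """
    args = [
        "gcloud", "dataproc", "clusters", "create", cluster,
        f"--region={region}",
        f"--zone={zone}",
    ]
    if reservation is not None:
        args += [
            "--reservation-affinity=specific",
            f"--reservation={reservation}",
        ]
    return args


# Hypothetical names, for illustration only.
cmd = dataproc_create_command(
    "nightly-etl", "us-central1", "us-central1-a",
    reservation="etl-reservation",
)
print(shlex.join(cmd))
```

Because the reservation exists independently of any one cluster, the same command can be reused each time an on-demand cluster is recreated. The returned list could be passed to `subprocess.run` on a machine where the gcloud CLI is installed and authenticated.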

Thanks for your reply

  1. Is there any discount applicable for the reserved VMs for Dataproc? 
  2. Can you share some Google document links to create a serverless Dataproc cluster?

1. Yes. The VMs in a Dataproc cluster are Compute Engine VMs, so Compute Engine committed use discounts apply to them. You can get up to 70% off on memory-optimized machine types and up to 57% off on most other resources, such as machine types or GPUs. See "Committed use discounts for Compute Engine".

2. For more details, see the Dataproc Serverless overview.
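To judge whether a commitment is worthwhile for clusters that do not run around the clock, it can help to compute the break-even utilization implied by the discount figures quoted above. The sketch below is a simplified illustration, not a pricing calculator: the hourly rate and the hours used are made-up numbers, the 57% figure is the maximum quoted above rather than a guaranteed rate, and any charges other than the discounted VM price are ignored.

```python
def break_even_utilization(discount):
    """Fraction of the committed hours that must actually be used before
    the commitment is cheaper than paying on demand."""
    return 1.0 - discount


def monthly_costs(on_demand_rate, hours_in_month, hours_used, discount):
    """Return (on_demand_cost, committed_cost) for one month.

    On-demand cost is billed only for the hours used. Committed cost is
    billed for every hour of the month at the discounted rate.
    """
    on_demand = on_demand_rate * hours_used
    committed = on_demand_rate * (1.0 - discount) * hours_in_month
    return on_demand, committed


# Hypothetical inputs: 1.00 per hour, 730 hours in the month, 57% discount.
print(round(break_even_utilization(0.57), 2))

# A cluster that runs 200 hours per month stays below break-even.
light = monthly_costs(1.00, 730, 200, 0.57)

# A cluster that runs 600 hours per month is above break-even.
heavy = monthly_costs(1.00, 730, 600, 0.57)

print([round(c, 2) for c in light], [round(c, 2) for c in heavy])
```

Under these assumptions, a commitment at a 57% discount only pays off when the committed resources are in use for more than about 43% of the time, which is why the usage pattern of on-demand clusters matters so much.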