
Dataflow provisioning time and optimization

We are using Dataflow for small batch workloads, and on our future roadmap we want to enable streaming workloads as well. We trigger the jobs from Python-based microservices. Below are a few queries where we need assistance.

1) Dataflow is taking a minimum of 4 minutes to provision workers, even with the following configuration: a minimal machine type, a custom SDK container image, and all resources confined to a single region within the same project.

Can you please suggest how this provisioning time can be further reduced?
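For reference, the options we pass when launching a job look roughly like this (a minimal sketch; the flag names are standard Apache Beam DataflowRunner options, but the project, bucket, and image values below are placeholders, not our real ones):

```python
# Sketch of the pipeline options our Python microservice passes when
# triggering a batch Dataflow job. Project, bucket, and image names are
# placeholders for illustration only.
def build_dataflow_args(job_name: str) -> list[str]:
    return [
        f"--job_name={job_name}",
        "--runner=DataflowRunner",
        "--project=my-project",                    # placeholder project id
        "--region=us-central1",                    # single region, same as the data
        "--machine_type=n1-standard-1",            # minimal machine type
        "--sdk_container_image=gcr.io/my-project/beam-sdk:latest",  # custom SDK image
        "--num_workers=1",
        "--staging_location=gs://my-bucket/staging",
        "--temp_location=gs://my-bucket/temp",
    ]

args = build_dataflow_args("batch-job-001")
print(args[1])  # --runner=DataflowRunner
```

Even with these settings the startup time stays around 4 minutes, so we suspect the bottleneck is the worker VM boot and container pull rather than our configuration.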

2) Can we configure Dataflow so that workers are not terminated after processing but instead stay idle like listeners and run workloads based on an event? Alternatively, can a single Dataflow job process multiple workloads sequentially on the same provisioned instances?

3) Can we pre-provision the resources periodically (warm start) and process the data once it becomes available?
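To make the warm-start idea concrete, this is roughly the scheduling logic we have in mind (a hedged sketch only: `should_prelaunch` is a hypothetical helper, and the 5-minute buffer is an assumption based on the ~4-minute provisioning time we observe):

```python
import datetime

# Warm-start sketch: pre-launch the job a few minutes before the expected
# data-arrival window so workers are already provisioned when data lands.
# The actual launch call would be our existing microservice trigger.
PROVISIONING_BUFFER = datetime.timedelta(minutes=5)  # covers ~4 min observed startup

def should_prelaunch(now: datetime.datetime,
                     expected_arrival: datetime.datetime,
                     already_running: bool) -> bool:
    """Return True once we are inside the buffer window and no job is running."""
    return not already_running and expected_arrival - now <= PROVISIONING_BUFFER

# Example: data expected at 10:00, checked at 09:56 -> inside the buffer.
now = datetime.datetime(2024, 1, 1, 9, 56)
arrival = datetime.datetime(2024, 1, 1, 10, 0)
print(should_prelaunch(now, arrival, already_running=False))  # True
```

Is something along these lines supported natively, or would we have to build this scheduling loop ourselves around the Dataflow API?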

Thanks in advance for the help.
