
Scheduled queries - Dependency feature

In our project, we have two scripts, A and B, that run against BigQuery tables. Script B depends on script A, because B uses the output of A.

We need to set up a dependency so that script B starts running only after script A has completed.

Does GCP have a feature for declaring a direct dependency between the two scripts, so that we can use both tables? If not, how can we set dependencies when scheduling jobs in GCP?

Please help

4 Replies

There are a few ways to set up dependencies between your scripts A and B. Here are some options you can consider based on your project needs:

1. Cloud Composer 

Cloud Composer is Google Cloud's managed Apache Airflow service, and it is well suited to complex workflows. You define a directed acyclic graph (DAG) in Python that declares the order of your scripts, and the Airflow UI displays it as a graph, so it is clear which task depends on which. If you have other intricate scheduling needs or might want to add more steps later, this is the most flexible solution.

2. Cloud Functions

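This is a simpler, event-driven approach. If your main goal is just to have script B start right after script A finishes, a Cloud Function can be triggered by that event. For example, a scheduled query can publish a Pub/Sub notification when its run completes, and a function subscribed to that topic can then start script B.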
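As a rough sketch of the event-driven approach, the function below has the (event, context) shape of a Pub/Sub-triggered Cloud Function and starts script B only when the message reports a successful run of script A. The resource name in SCRIPT_A_CONFIG, the start_script_b placeholder, and the assumed message fields ("name" and "state") are illustrative assumptions, so please check them against the notification payload you actually receive.

```python
import base64
import json

# Hypothetical resource name of script A's scheduled query (an assumption).
SCRIPT_A_CONFIG = "projects/my-project/locations/us/transferConfigs/script-a"


def run_succeeded(event, expected_config=SCRIPT_A_CONFIG):
    """Return True if the Pub/Sub event reports a successful run of script A."""
    # Pub/Sub delivers the message body base64-encoded in event["data"].
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    # Assumed payload shape: "name" looks like "<config>/runs/<run_id>" and
    # "state" is a string such as "SUCCEEDED" or "FAILED".
    belongs_to_a = payload.get("name", "").startswith(expected_config + "/runs/")
    return belongs_to_a and payload.get("state") == "SUCCEEDED"


def start_script_b():
    # Placeholder: replace with the call that submits script B's query.
    print("starting script B")


def on_script_a_finished(event, context=None, start=start_script_b):
    """Entry point: start script B only after a successful run of script A."""
    if not run_succeeded(event):
        return "skipped"
    start()
    return "started"
```

Keeping the check in run_succeeded separate from the entry point makes it easy to test locally with a hand-built event before deploying anything.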

3. Cloud Scheduler and Pub/Sub

If you're already using Cloud Scheduler for scheduling, you can combine it with Pub/Sub (a messaging service) to create a link between your scripts. Because the scripts are connected only through messages, you can add other actions between them later if needed, which gives you a bit more flexibility than a single Cloud Function.

4. Chaining Jobs Within a Script

This is the most straightforward option if your scripts are very closely tied together. You'd essentially combine them into one big script where the second part (script B) only runs if the first part (script A) succeeds.
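A client-side variant of the same idea, as a minimal sketch: run script A, wait for it, and only then run script B. The table names and SQL text are hypothetical, and bigquery_runner assumes the google-cloud-bigquery package is installed and credentials are configured. It is not called here, so the chaining logic can be tried with any stand-in runner.

```python
# Hypothetical SQL for the two scripts (table names are assumptions).
SQL_A = "CREATE OR REPLACE TABLE my_dataset.table_a AS SELECT 1 AS id"
SQL_B = "CREATE OR REPLACE TABLE my_dataset.table_b AS SELECT * FROM my_dataset.table_a"


def run_a_then_b(run_query, sql_a=SQL_A, sql_b=SQL_B):
    """Run sql_a, then sql_b; sql_b is never reached if sql_a raises."""
    run_query(sql_a)  # must block until script A is done and raise on failure
    run_query(sql_b)


def bigquery_runner():
    """Build a runner backed by BigQuery (needs google-cloud-bigquery)."""
    from google.cloud import bigquery  # imported lazily so the sketch loads without it

    client = bigquery.Client()
    # result() blocks until the job finishes and raises if the job failed.
    return lambda sql: client.query(sql).result()
```

With real credentials, run_a_then_b(bigquery_runner()) would execute both statements in order. Any exception from script A propagates, so script B is skipped.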

Choosing the best approach depends on the following criteria:

  • Complexity: How many steps are in your overall process? The more complex, the more you'll lean towards Cloud Composer.

  • Flexibility: Do you need to easily change things around later? Cloud Composer and Pub/Sub are good choices here.

  • Scalability: If you expect this process to grow in the future, Cloud Composer might be a better investment.

  • Simplicity: If it's really just these two scripts, chaining them together might be the easiest way to start.

Is there any documentation for the options above in the Google Cloud docs? If so, please point me to it.

  1. Cloud Composer:
  2. Cloud Functions:
  3. Cloud Scheduler:
  4. Pub/Sub:
  5. Chaining Jobs Within a Script (BigQuery Scripting):
  6. Cloud Workflows:

Sorry, but some of the URLs are showing as "invalid URL removed" above.