Cloud Scheduler Job doesn't have enough permissions to run Dataflow Pipeline

Hi there,

I created a Dataflow pipeline that works fine when I run it manually, but I see this error when Cloud Scheduler tries to invoke it:

{
  "insertId": "ldo6vcg1s777l3",
  "jsonPayload": {
    "status": "PERMISSION_DENIED",
    "url": "https://datapipelines.googleapis.com/v1/projects/analytics-400818/locations/us-east1/pipelines/imaging-usage-daily-service-account:run",
    "@type": "type.googleapis.com/google.cloud.scheduler.logging.AttemptFinished",
    "targetType": "HTTP",
    "jobName": "projects/analytics-400818/locations/us-east1/jobs/datapipelines-imaging-usage-daily-service-account"
  },
  "httpRequest": {
    "status": 403
  },
  "resource": {
    "type": "cloud_scheduler_job",
    "labels": {
      "job_id": "datapipelines-imaging-usage-daily-service-account",
      "project_id": "analytics-400818",
      "location": "us-east1"
    }
  },
  "timestamp": "2024-01-04T09:00:00.215741857Z",
  "severity": "ERROR",
  "logName": "projects/analytics-400818/logs/cloudscheduler.googleapis.com%2Fexecutions",
  "receiveTimestamp": "2024-01-04T09:00:00.215741857Z"
}

Which roles do I need to grant to the service account that runs my Dataflow pipeline from Cloud Scheduler?

Thank you so much for your attention.

From the documentation, it would appear that the service account that tries to start a Dataflow job needs the Dataflow Developer role (roles/dataflow.developer). Its summary states:

Provides the permissions necessary to execute and manipulate Dataflow jobs.

It appears that the underlying permission needed is dataflow.jobs.create, and that permission is included in the role described above.

This service account already has the Dataflow Admin role.

Could the problem be related to a Cloud Scheduler role instead, for example a Cloud Scheduler API role?

When a request is made to start a Dataflow pipeline, the requester must be authorized to start the pipeline, and the request must carry the requester's credentials. I am assuming that you have configured your Cloud Scheduler job to use a specific service account to make the call to Dataflow, and I think you are saying that this service account has the Dataflow Admin role. Since the request is still being denied, we need to ask ourselves what the issue may be:

1. How many projects are you working with? Is it possible that the service account is granted Dataflow Admin in project A but your Dataflow pipeline belongs to project B?

2. How exactly have you configured the Cloud Scheduler job? Are you passing authentication information with the request?
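
As a quick way to narrow down question 1, the resource names in the posted log entry can be compared directly. The sketch below is only an illustration: `project_of` is a hypothetical helper, and the two strings are copied from the log in the question. It only tells you whether the Scheduler job and the pipeline live in the same project; it says nothing about where the service account's role binding was granted, which still has to be checked in IAM.

```python
import re


def project_of(resource: str) -> str:
    """Return the project ID from a 'projects/<id>/...' resource name or URL."""
    match = re.search(r"projects/([^/]+)/", resource)
    if match is None:
        raise ValueError(f"no project segment in {resource!r}")
    return match.group(1)


# Values copied from the "url" and "jobName" fields of the log entry above.
target_url = (
    "https://datapipelines.googleapis.com/v1/projects/analytics-400818/"
    "locations/us-east1/pipelines/imaging-usage-daily-service-account:run"
)
job_name = (
    "projects/analytics-400818/locations/us-east1/jobs/"
    "datapipelines-imaging-usage-daily-service-account"
)

# True when the Scheduler job and the pipeline it calls share a project.
same_project = project_of(target_url) == project_of(job_name)
print(same_project)
```

For the log entry in the question both names point at the same project, so if there is a mismatch it would have to be in where the role was granted to the service account.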

I'd suggest posting the exact configuration of your Cloud Scheduler job so we can look for clues there.
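
If it helps while gathering that configuration, here is a rough sketch of the kind of check I have in mind for question 2. It assumes the job has been exported as JSON (for example from the job's describe output) and loaded into a dict, and that the field names follow the Cloud Scheduler Job resource (`httpTarget`, `uri`, `oauthToken`, `oidcToken`, `serviceAccountEmail`). The function `auth_problems` and the service account email are hypothetical, made up for illustration. The rule it encodes is that calls to Google APIs on `*.googleapis.com` are expected to use an OAuth token rather than an OIDC token.

```python
def auth_problems(job: dict) -> list:
    """List likely authentication problems in a Cloud Scheduler job dict."""
    target = job.get("httpTarget")
    if target is None:
        return ["job has no httpTarget"]

    problems = []
    uri = target.get("uri", "")
    oauth = target.get("oauthToken")
    oidc = target.get("oidcToken")

    if oauth is None and oidc is None:
        # Without a token block the request carries no credentials at all.
        problems.append("no oauthToken or oidcToken configured")
    elif oauth is None and ".googleapis.com/" in uri:
        # Google APIs expect an OAuth access token, not an OIDC identity token.
        problems.append("Google API target uses oidcToken instead of oauthToken")

    token = oauth if oauth is not None else oidc
    if token is not None and not token.get("serviceAccountEmail"):
        problems.append("token block has no serviceAccountEmail")
    return problems


# Hypothetical job configuration; the email address is an invented example.
example_job = {
    "httpTarget": {
        "uri": "https://datapipelines.googleapis.com/v1/projects/analytics-400818/"
               "locations/us-east1/pipelines/imaging-usage-daily-service-account:run",
        "httpMethod": "POST",
        "oauthToken": {
            "serviceAccountEmail": "scheduler-sa@analytics-400818.iam.gserviceaccount.com"
        },
    }
}

print(auth_problems(example_job))
```

An empty list only means the job is set up to send credentials; the service account named there still needs permission to run the pipeline in the target project.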