
BigQuery Data Transfer - Permission denied when transferring public dataset

I am trying to transfer a public dataset into our project via a scheduled query that runs as a service account. In the destination project, I have given this service account the BigQuery Admin role. I have also made sure the BigQuery Data Transfer service agent has the Service Account Token Creator role and the bigquerydatatransfer.serviceAgent role.
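
For reference, the relevant IAM bindings look roughly like this in my Terraform (a sketch: the account ID is a placeholder, and I am assuming the project module exposes a project_number output):

resource "google_service_account" "bigquery_data_transfer_sa" {
  project      = module.gcp_project.project_id
  account_id   = "bq-data-transfer" # placeholder
  display_name = "BigQuery Data Transfer SA"
}

# BigQuery Admin for the transfer's service account in the destination project.
resource "google_project_iam_member" "bq_data_transfer_permissions" {
  project = module.gcp_project.project_id
  role    = "roles/bigquery.admin"
  member  = "serviceAccount:${google_service_account.bigquery_data_transfer_sa.email}"
}

# Allow the BigQuery Data Transfer service agent to mint tokens for that
# service account.
resource "google_service_account_iam_member" "agent_token_creator" {
  service_account_id = google_service_account.bigquery_data_transfer_sa.name
  role               = "roles/iam.serviceAccountTokenCreator"
  member             = "serviceAccount:service-${module.gcp_project.project_number}@gcp-sa-bigquerydatatransfer.iam.gserviceaccount.com"
}

# Service agent role on the project for the same agent.
resource "google_project_iam_member" "agent_service_agent_role" {
  project = module.gcp_project.project_id
  role    = "roles/bigquerydatatransfer.serviceAgent"
  member  = "serviceAccount:service-${module.gcp_project.project_number}@gcp-sa-bigquerydatatransfer.iam.gserviceaccount.com"
}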

Using Terraform, I have specified the following data transfer config:

resource "google_bigquery_data_transfer_config" "indications_query_config" {
  depends_on = [google_project_iam_member.bq_data_transfer_permissions]

  project                = module.gcp_project.project_id
  display_name           = "Open Targets Platform Data Transfer - Indications"
  location               = "us-west1"
  data_source_id         = "scheduled_query"
  schedule               = "14 of month 02:00"
  destination_dataset_id = module.bigquery.bigquery_dataset.dataset_id
  service_account_name   = google_service_account.bigquery_data_transfer_sa.email
  params = {
    destination_table_name_template = "indication"
    write_disposition               = "WRITE_APPEND"
    query                           = "SELECT * FROM `bigquery-public-data.open_targets_platform.indication`"
  }
}

When this transfer runs, I get the following error message:

Error code 7 : Failed to start BigQuery job. Error: Access Denied: Table bigquery-public-data:open_targets_platform.indication: User does not have permission to query table bigquery-public-data:open_targets_platform.indication, or perhaps it does not exist. User: SA_NAME@PROJECT_ID.iam.gserviceaccount.com.

I went ahead and created a service account key for this service account, set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of that key, and ran the following Python code:

from google.cloud import bigquery
import google.auth
SCOPES = [
    'https://www.googleapis.com/auth/cloud-platform'
]
CREDENTIALS, GCP_PROJECT = google.auth.default(scopes=SCOPES)
client = bigquery.Client(credentials=CREDENTIALS, project=GCP_PROJECT)

# Perform a query.
QUERY = (
    'SELECT * FROM `bigquery-public-data.open_targets_platform.indication` '
    'LIMIT 100')
query_job = client.query(QUERY)  # API request
rows = query_job.result()  # Waits for query to finish

for row in rows:
    print(row.id)

When I do this, I am able to query the public dataset. I can also use the debugger to confirm that the BigQuery client is indeed using the service account's credentials.

Any idea what's going wrong here?

Solved
ACCEPTED SOLUTION

TL;DR: Give the Data Transfer service agent the BigQuery Data Viewer role in the project. Subscribe to the public dataset to create a linked dataset in your project. Create (or confirm) a destination dataset in the same location as the linked dataset (regional vs. multi-regional matters here). Then create the scheduled query. A Terraform sketch of the IAM and dataset pieces follows.
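
In Terraform terms, that looks roughly like this (a sketch; resource and dataset names are illustrative, and I am assuming the public dataset resolves to the US multi-region):

# Read access for the Data Transfer service agent in the project.
resource "google_project_iam_member" "transfer_agent_data_viewer" {
  project = module.gcp_project.project_id
  role    = "roles/bigquery.dataViewer"
  member  = "serviceAccount:service-${module.gcp_project.project_number}@gcp-sa-bigquerydatatransfer.iam.gserviceaccount.com"
}

# Destination dataset placed in the same location as the linked dataset.
resource "google_bigquery_dataset" "destination" {
  project    = module.gcp_project.project_id
  dataset_id = "open_targets_indications" # illustrative
  location   = "US"                       # must match the linked dataset
}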

