Ingest Data from GCS to BigQuery. Error - google.api_core.exceptions.Forbidden: 403 POST

I am trying to ingest data from GCS into BigQuery. It works when I run it locally, but when I put my function inside a Vertex AI component and run it in a pipeline, I get the error above. Moreover, the project named in the error is not even my project ID. The service account I am using has the bigquery.jobs.create permission.

ERROR - google.api_core.exceptions.Forbidden: 403 POST https://bigquery.googleapis.com/bigquery/v2/projects/xc526656e7cecfc24p-tp/jobs?prettyPrint=false: Access Denied: Project xc526656e7cecfc24p-tp: User does not have bigquery.jobs.create permission in project xc526656e7cecfc24p-tp.

I think we have to look carefully at where the job that is attempting to interact with BigQuery is running. If it is running in a serverless or managed environment, the chances are high that it is running as a Google Cloud service account, and that this service account has not been granted permissions on your target BigQuery dataset or table. Maybe post back the specific details of where you are running your work. You said "... put my function inside a Vertex AI component ...". Which specific component and what pipeline? Assuming that you are using Vertex AI Pipelines, have a look at this article:

https://cloud.google.com/vertex-ai/docs/pipelines/configure-project#service-account

and specifically, bullet 4 which reads:

Grant your service account access to any Google Cloud resources that you use in your pipelines.

Hey kolban, the problem got resolved just by using bigquery.Client(project=project).

Earlier I didn't specify project=project.

But it was working when not using Vertex AI. So I guess we need to specify the project for BigQuery when using Vertex AI, or something like that. Maybe you can explain it better.
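For anyone landing here with the same 403, here is a minimal sketch of what the fix might look like in a GCS to BigQuery load. The function names (`full_table_id`, `load_csv_from_gcs`) are made up for illustration, and the CSV format, single header row, and schema autodetection are assumptions, since the original code was not posted. The only part taken from this thread is passing `project=` to `bigquery.Client`.

```python
def full_table_id(project: str, dataset: str, table: str) -> str:
    """Build a fully qualified 'project.dataset.table' ID so nothing relies on defaults."""
    for name, value in (("project", project), ("dataset", dataset), ("table", table)):
        if not value:
            raise ValueError(f"{name} must be a non-empty string")
    return f"{project}.{dataset}.{table}"


def load_csv_from_gcs(project: str, gcs_uri: str, dataset: str, table: str) -> int:
    """Load one CSV file from GCS into BigQuery and return the table's row count."""
    table_id = full_table_id(project, dataset, table)

    # Imported here so the helper above is usable without the library installed.
    from google.cloud import bigquery

    # Explicit project: the load job is created in this project,
    # not in whatever default the runtime environment resolves.
    client = bigquery.Client(project=project)

    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,  # assumption: the file has a single header row
        autodetect=True,      # assumption: schema inference is acceptable
    )
    load_job = client.load_table_from_uri(gcs_uri, table_id, job_config=job_config)
    load_job.result()  # blocks until the job finishes; raises if it failed

    return client.get_table(table_id).num_rows
```

Inside a pipeline component, the project ID would typically be passed in as a component parameter rather than left for the client to infer.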

Can you please explain why it works when I specify bigquery.Client(project=project)?

When you specify a project, you are overriding the default.  This makes me think that the default isn't what we thought it was.  By naming a specific project, your request to run a query was then run within the context of that named project ... and in that named project, your identity had the "BigQuery User" IAM role and was allowed to submit jobs to BigQuery.

But when I was running it outside of a Vertex AI Pipelines component, it was running even without specifying the project context.

That's what made it difficult to pin down. The way I see it, it shouldn't have run in either case.

To get to the bottom of it, you'll have to post the details of the environments which worked and failed. You said that when you ran it locally, it worked. I am imagining a laptop with the Google Cloud SDK installed, where you have run gcloud init or gcloud config set project and set your default project. When you run your app there, the default context is correct. Now, when you run in a Vertex pipeline, the chances are high that you are running in a managed environment. That's where we would need to dig deeper. I don't fully understand the pipeline environment in which you are running. What we'd have to do is identify that and then go and grunge through the manuals/docs in detail.
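One way to compare the two environments is to log what each one resolves as its default project and identity, then run the same snippet on the laptop and inside the pipeline component. The sketch below is a rough diagnostic, not a definitive one. The helper names are made up for illustration; `google.auth.default()` is the real google-auth call that returns a credentials object and a project ID, and `GOOGLE_CLOUD_PROJECT` and `GCLOUD_PROJECT` are the environment variables it consults before falling back to other sources. The env helper only approximates that lookup.

```python
import os
from typing import Mapping, Optional


def project_from_env(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the project named by the environment variables google-auth checks first."""
    return env.get("GOOGLE_CLOUD_PROJECT") or env.get("GCLOUD_PROJECT")


def describe_default_context() -> dict:
    """Collect what Application Default Credentials resolve to in this environment."""
    import google.auth  # imported lazily so project_from_env works without it

    credentials, project = google.auth.default()
    return {
        "env_project": project_from_env(),
        "adc_project": project,
        # Not every credential type exposes an email, hence the getattr default.
        "service_account": getattr(credentials, "service_account_email", None),
    }
```

Printing `describe_default_context()` in both places should show whether the default project in the pipeline differs from the one on the laptop, which would be consistent with the error naming a project you don't recognize.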