Get hands-on experience with 20+ free Google Cloud products and $300 in free credit for new customers.

Sharing External BigQuery Tables as Authorized datasets(views) without Granting Bucket permisions

Hello,

I am seeking guidance on an issue I've encountered while working with Google BigQuery and Google Cloud Storage (GCS).

Overview:

  • I've set up an external table in BigQuery, which sources data from a Parquet file stored in GCS. The specific GCS path.
  • To maintain data security and control access, I've implemented an authorized view in BigQuery.
  • However, when third-party users attempt to access the data through this authorized view, they encounter an error suggesting a lack of storage.objects.get permissions for the GCS bucket.

Current Challenge:

  • It appears that despite using authorized views in BigQuery, third-party users require direct storage.objects.get permission on the GCS bucket where the file is stored.
  • This requirement seems to counter the purpose of using authorized views, as I prefer not to grant direct storage access.

Inquiries:

  1. Is it a standard requirement for users to have direct storage.objects.get permissions on the GCS bucket, even when accessing data through authorized BigQuery views?
  2. Are there alternative methods to share external tables (linked to GCS files) using authorized views or datasets, without granting direct permissions on the GCS bucket with out materializing the table? maybe Analytic HUB?

Any insights, best practices, or suggestions regarding this would be greatly appreciated.

Thank you for your time and assistance.

Kind regards,

Ricardo Novas

Solved Solved
0 1 1,997
1 ACCEPTED SOLUTION

Hi Ricardo,

Yes, for external tables in BigQuery that reference data stored in GCS, such as Parquet files, it is typically required for users to have storage.objects.get permissions on the GCS bucket. This is because BQ external tables do not store the data themselves; instead, they read it directly from the storage source, in this case, GCS. When a user queries an authorized view linked to an external table, BQ accesses the data from GCS, necessitating the appropriate GCS permissions.

However, there are several methods and best practices to manage and control access to your data in BigQuery and GCS:

  1. IAM Table-Level or View-Level Access Control: This is effective for native BQ tables but may not fully address the need for GCS permissions for external tables. It offers granular control over data access but can be complex to manage.

  2. IAM Roles: Assigning predefined IAM roles is a convenient way to manage permissions. While not as granular as table-level or view-level access control, it simplifies permission management.

  3. Materializing Data in BigQuery: To avoid direct GCS access, consider periodically importing data from GCS into a native BigQuery table. This allows you to leverage authorized views effectively.

  4. BigQuery Data Transfer Service: Automate the transfer of data from GCS to BigQuery. This approach is useful if real-time data access is not critical.

  5. VPC Service Controls: For enhanced security, VPC Service Controls create a secure perimeter around your data resources, adding an additional layer of protection.

  6. GCS ACLs: While ACLs offer another layer of access control, they are generally less flexible and granular compared to IAM and might not be suitable for all use cases.

View solution in original post

1 REPLY 1

Hi Ricardo,

Yes, for external tables in BigQuery that reference data stored in GCS, such as Parquet files, it is typically required for users to have storage.objects.get permissions on the GCS bucket. This is because BQ external tables do not store the data themselves; instead, they read it directly from the storage source, in this case, GCS. When a user queries an authorized view linked to an external table, BQ accesses the data from GCS, necessitating the appropriate GCS permissions.

However, there are several methods and best practices to manage and control access to your data in BigQuery and GCS:

  1. IAM Table-Level or View-Level Access Control: This is effective for native BQ tables but may not fully address the need for GCS permissions for external tables. It offers granular control over data access but can be complex to manage.

  2. IAM Roles: Assigning predefined IAM roles is a convenient way to manage permissions. While not as granular as table-level or view-level access control, it simplifies permission management.

  3. Materializing Data in BigQuery: To avoid direct GCS access, consider periodically importing data from GCS into a native BigQuery table. This allows you to leverage authorized views effectively.

  4. BigQuery Data Transfer Service: Automate the transfer of data from GCS to BigQuery. This approach is useful if real-time data access is not critical.

  5. VPC Service Controls: For enhanced security, VPC Service Controls create a secure perimeter around your data resources, adding an additional layer of protection.

  6. GCS ACLs: While ACLs offer another layer of access control, they are generally less flexible and granular compared to IAM and might not be suitable for all use cases.