Get hands-on experience with 20+ free Google Cloud products and $300 in free credit for new customers.

Policy Tags

Hi,

Is there any way to add Policy Tags on external tables in BigQuery?

0 6 809
6 REPLIES 6

@ms4446 is this one you can help with? 🤞

Hi @dellaC & @Roderick ,

Unfortunately, you cannot directly add Policy Tags to standard external tables in BigQuery. While you still cannot directly add Policy Tags to standard external tables in BigQuery, there are ways to achieve control over the data using the methods below. The primary reason is that BigQuery doesn't directly manage the data and the underlying storage system.

  • Workarounds and Best Practices:

    1. Authorized Views: The most common and recommended approach is to leverage authorized views. Here's how:

      • Create an authorized view over the external table, selecting the columns you need.
      • Apply policy tags to the columns within the authorized view.
      • Grant users permissions on the authorized view to enforce column-level access control.
    2. BigLake Tables: For more seamless integration and advanced security features, consider using BigLake tables. BigLake:

      • Allows you to treat external data sources almost as native BigQuery tables.
      • Supports direct application of policy tags for column-level security on external data.
      • Provides the potential for row-level access control for an extra layer of granularity.
    3. Source-Level Permissions: Always remember that a fundamental layer of security lies within the storage system itself (e.g., Google Cloud Storage). Use IAM roles and permissions to control access at the data source.

Important Considerations:

  • Evolving Features: Keep an eye on BigQuery and Data Catalog release notes, as features related to external data management and access control are continuously evolving.
  • Use Case: Your choice of strategy might depend on:
    • The level of security granularity you need.
    • Whether you're working with a variety of data sources.
    • Your overall data governance framework.

Hi @ms4446 and @Roderick,

Even if my external table is from Cloud Storage?

If this is not possible, is there a possibility of performing columnar control of tables in Cloud Storage, like in BigQuery?

Hi @ms4446 and @Roderick,

Even if my external table is from Cloud Storage?

If this is not possible, is there a possibility of performing columnar control of tables in Cloud Storage, like in BigQuery?

@dellaC ,

Even with external tables pointing to Cloud Storage, directly applying policy tags to the columns of these tables in BigQuery is still not supported. BigQuery's policy tags are designed primarily for data fully managed within the BigQuery environment.

Unfortunately, Google Cloud Storage doesn't offer native columnar control in the same way that BigQuery does. Here's why:

  • Object Storage vs. Columnar Storage: Cloud Storage is an object storage service, optimized for storing and retrieving complete files (objects). BigQuery, on the other hand, is designed for analytics and uses a columnar storage format to optimize queries that focus on specific columns within tables.
  • Data Formats: If you need columnar control, consider these approaches:
    • Structured Formats: Store your data in formats that support some degree of columnar access, such as Parquet or ORC. These formats often have metadata and compression schemes that allow tools to skip unnecessary data while reading.
    • Querying Tools: Use tools capable of working with data in Cloud Storage in a more analytical fashion. Some tools might be able to provide a level of columnar-like access even when the underlying data isn't stored strictly in columns.

Again, below are alternative approaches:

  1. BigQuery: If possible, loading your Cloud Storage data into BigQuery (even temporarily for analysis) would allow you to leverage all the benefits of columnar storage and policy tags for fine-grained access control.

  2. BigLake: For a more unified data lake approach, BigLake allows you to work with external data sources (including Cloud Storage) almost like native BigQuery tables and apply column-level policy tags directly.

 

@ms4446 We are using Apache Iceberg BQ tables with BQMS. Data is getting stored to GCS buckets. Do we have an option to use Policy tag, description and partition enforce on these external tables?

Please provide guidance on it.