Get hands-on experience with 20+ free Google Cloud products and $300 in free credit for new customers.

Following BQ Clustering recommendations does not work - UPDATE SET CLUSTER_FIELD does not exist

Following the https://cloud.google.com/bigquery/docs/apply-partition-cluster-recommendations#apply_cluster_recomme... documentation leads to the 2nd step where you'll need to run this SQL command:

`UPDATE DATASET.TABLE SET CLUSTER_COLUMN=CLUSTER_FIELD WHERE true`

Where it returns with error that the cluster_field doesn't exist.

0 1 517
1 REPLY 1

The error "UPDATE DATASET.TABLE SET CLUSTER_COLUMN=CLUSTER_FIELD WHERE true" suggests a syntax or logical error in your SQL command. It appears that CLUSTER_FIELD is being used as a placeholder and should be replaced with an existing column name in your table. This error does not necessarily indicate that the clustering recommendation by BigQuery is not applicable to your table.

Here are some considerations and steps you can take:

  1. Check if Table is Already Clustered: To determine if your table is already clustered, you can use the query:

SHOW CREATE TABLE DATASET.TABLE;

If the output includes a CLUSTER BY clause, your table is already clustered.

  1. Consider Table Size: While clustering is more effective for larger tables, it can still be applied to smaller tables. The impact on performance varies depending on the specific use case.

  2. Frequency of Table Access: Clustering can benefit both frequently and infrequently accessed tables by improving performance and reducing costs when they are accessed.

  3. Partitioning and Clustering: Clustering can be applied to both partitioned and non-partitioned tables. However, combining partitioning with clustering often yields better performance improvements.

  4. Identifying Clustering Keys: To find potential clustering keys, focus on columns frequently used in WHERE clauses and with high cardinality (many unique values). You might use a query like:

SELECT column_name, COUNT(*) AS num_rows FROM DATASET.TABLE WHERE column_name IS NOTNULL GROUP BY column_name ORDER BY num_rows DESC;
  1. Adding Clustering to a Table:Currently, BigQuery does not support adding clustering to an existing table directly via an ALTER TABLE statement. To cluster an existing table, you would typically create a new table with the desired clustering and copy the data over. This can also be done through the BigQuery UI or command-line tools.