
BigQuery BigLake Metastore Linking Fails

Hi Team,

We have recently started working with the BigLake metastore and Iceberg tables. I am using Apache Spark stored procedures to create catalogs and manage databases and tables.

Metastore, database, and table creation all work fine, and the data and metadata are written to GCS, but linking the table to BigQuery fails with internal server errors.

I was following this document, and my query looks exactly like the example there. It fails when I run the CREATE EXTERNAL TABLE command to link the Iceberg table to a BigQuery table.

Spark Table Creation & Linking: 

 

sparkSession.sql("""
  CREATE TABLE IF NOT EXISTS spark_iceberg_catalog.catalog_db.products (
    name STRING,
    category STRING,
    brand STRING
  )
  USING iceberg
  TBLPROPERTIES (
    bq_table='biglake.products',
    bq_connection='projects/my-project/locations/us-east1/connections/my-connection'
  );
""")
 
BigQuery External Table Query:

CREATE EXTERNAL TABLE `biglake.products`
WITH CONNECTION `projects/my-project/locations/us-east1/connections/my-connection`
OPTIONS (
  format = 'ICEBERG',
  uris = ['blms://projects/my-project/locations/us-east1/catalogs/spark_iceberg_catalog/databases/catalog_db/tables/products']
);

Can you please assist?


Error message:

An internal error occurred and the request could not be completed. This is usually caused by a transient issue. Retrying the job with back-off as described in the BigQuery SLA should solve the problem: https://cloud.google.com/bigquery/sla. If the error continues to occur please contact support at https://cloud.google.com/support. Error: 80324028

{
  "code" : 400,
  "errors" : [ {
    "domain" : "global",
    "message" : "The job encountered an internal error during execution and was unable to complete successfully.",
    "reason" : "jobInternalError"
  } ],
  "message" : "The job encountered an internal error during execution and was unable to complete successfully.",
  "status" : "INVALID_ARGUMENT"
}

The error message you're receiving indicates that an internal error occurred while processing your BigQuery job. This error is typically caused by a transient issue, such as a temporary network outage or server overload. The recommended course of action is to retry the job with back-off, as described in the BigQuery SLA. Back-off means increasing the delay between retries to reduce the likelihood of overloading the system.
 
To resolve this issue, follow these steps:
  • Retry the job with back-off: Implement an exponential back-off strategy by starting with a short delay (e.g., 5 seconds) and doubling it with each retry until the job succeeds or reaches a maximum delay threshold.
  • Verify the job configuration: Double-check that all configurations, including dataset and table names, are correct.
  • Review the query syntax: Use the BigQuery query validator, accessible from the BigQuery web UI, to check for syntax errors.
  • Inspect the input data: Ensure the data is correctly formatted, though this is less likely to be the cause given the nature of the error.
  • Gather detailed information: If you need to contact Google Cloud support, prepare by collecting error messages, job IDs, and a description of the troubleshooting steps you've already taken.
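The retry-with-back-off step above can be sketched in Python. This is a minimal, generic sketch: `submit_job` is a hypothetical stand-in for whatever call actually submits the CREATE EXTERNAL TABLE job (for example, via the BigQuery client library), not a real API.

```python
import random
import time


def retry_with_backoff(submit_job, max_retries=5, base_delay=5.0, max_delay=60.0):
    """Retry submit_job with exponential back-off and jitter.

    submit_job is a hypothetical callable standing in for whatever
    submits the BigQuery job; it is assumed to raise an exception on
    failure and return a result on success. The delay doubles on each
    attempt, capped at max_delay seconds.
    """
    for attempt in range(max_retries):
        try:
            return submit_job()
        except Exception:
            if attempt == max_retries - 1:
                # Out of retries: surface the error to the caller.
                raise
            delay = min(base_delay * 2 ** attempt, max_delay)
            # Sleep between delay and 2*delay so that concurrent
            # clients don't all retry in lockstep.
            time.sleep(delay * (1 + random.random()))
```

Note that back-off only helps with genuinely transient failures; if the same job fails identically on every retry, the cause is more likely one of the configuration issues listed above.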

Transient errors are not uncommon in distributed systems, and a back-off strategy is often effective. However, if the problem continues, Google Cloud support will be able to assist further.