
Issue with properly setting up a Spark Session to my Apache Iceberg BigLake tables

Hello all,

I've successfully set up an Iceberg table with a BigLake Metastore as per this documentation:

https://cloud.google.com/bigquery/docs/iceberg-tables

Everything works, I can see the Iceberg table in BigQuery and the metadata being stored in the specified GCS bucket.

What I now want to do is use my Dataproc cluster (with Jupyter installed) to establish a Spark session against the BigLake Metastore and access the Iceberg tables I created.

Unfortunately, nothing works, and I'm pretty lost as to where my setup fails or what I'm missing.

Here's my setup in the Notebook:

from pyspark.sql import SparkSession

# Initialize Spark session against the BigLake Metastore
spark = (
    SparkSession.builder
    .appName("IcebergExample")
    .config("spark.jars", "gs://spark-lib/biglake/biglake-catalog-iceberg1.2.0-0.1.1-with-dependencies.jar")
    .config("spark.jars.packages", "org.apache.iceberg:iceberg-spark-runtime-3.3_2.12:1.2.0")
    .config("spark.sql.catalog.biglakecatalog", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.biglakecatalog.catalog-impl", "org.apache.iceberg.gcp.biglake.BigLakeCatalog")
    .config("spark.sql.catalog.biglakecatalog.gcp_project", "playground")
    .config("spark.sql.catalog.biglakecatalog.gcp_location", "europe-west1")
    .config("spark.sql.catalog.biglakecatalog.blms_catalog", "biglakecatalog")
    .config("spark.sql.catalog.biglakecatalog.warehouse", "gs://path/iceberg")
    .getOrCreate()
)

But when I run spark.sql("SHOW DATABASES").show(), I only get the following output:

+---------+
|namespace|
+---------+
|  default|
+---------+

I also can't create a new namespace or database; I'm met with an error that suggests the Iceberg catalog I configured isn't being recognized:

AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Unable to create database path file:/spark-warehouse/test.db, failed to create database test)
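The file:/spark-warehouse path in the error looks like Spark's built-in default catalog rather than BigLake, so maybe my statements aren't resolving against biglakecatalog at all. Here's a quick check one could run in the same notebook (a sketch, assuming the session above; biglakecatalog is the catalog name from my config):

```python
# Show which catalog and namespace unqualified SQL currently resolves to
spark.sql("SHOW CURRENT NAMESPACE").show()

# Address the BigLake catalog explicitly instead of the session default
spark.sql("SHOW NAMESPACES IN biglakecatalog").show()
```

If the first statement reports spark_catalog, that would explain why SHOW DATABASES only returns "default".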

Does anybody know where I went wrong?
