Hi All,
I am trying to create a SparkSession in a JupyterLab notebook on Vertex AI Workbench, but I get the error below. On my local machine, JAVA_HOME and PATH are already set, and the same code works when I run Jupyter locally. The error only occurs in the Vertex AI Workbench JupyterLab.
Code:
from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .appName('Jupyter BigQuery Storage') \
    .config('spark.jars', 'gs://spark-lib/bigquery/spark-bigquery-latest_2.12.jar') \
    .getOrCreate()
Full Error:
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/tmp/ipykernel_3404/1949393828.py in <module>
      9 spark = SparkSession.builder \
     10 .appName('Jupyter BigQuery Storage')\
---> 11 .config('spark.jars', 'gs://spark-lib/bigquery/spark-bigquery-latest_2.12.jar') \
     12 .getOrCreate()
     13

/opt/conda/lib/python3.7/site-packages/pyspark/sql/session.py in getOrCreate(self)
    226                 sparkConf.set(key, value)
    227             # This SparkContext may be an existing one.
--> 228             sc = SparkContext.getOrCreate(sparkConf)
    229             # Do not update `SparkConf` for existing `SparkContext`, as it's shared
    230             # by all sessions.

/opt/conda/lib/python3.7/site-packages/pyspark/context.py in getOrCreate(cls, conf)
    390         with SparkContext._lock:
    391             if SparkContext._active_spark_context is None:
--> 392                 SparkContext(conf=conf or SparkConf())
    393             return SparkContext._active_spark_context
    394

/opt/conda/lib/python3.7/site-packages/pyspark/context.py in __init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc, profiler_cls)
    142                 " is not allowed as it is a security risk.")
    143
--> 144         SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
    145         try:
    146             self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,

/opt/conda/lib/python3.7/site-packages/pyspark/context.py in _ensure_initialized(cls, instance, gateway, conf)
    337         with SparkContext._lock:
    338             if not SparkContext._gateway:
--> 339                 SparkContext._gateway = gateway or launch_gateway(conf)
    340                 SparkContext._jvm = SparkContext._gateway.jvm
    341

/opt/conda/lib/python3.7/site-packages/pyspark/java_gateway.py in launch_gateway(conf, popen_kwargs)
    106
    107             if not os.path.isfile(conn_info_file):
--> 108                 raise RuntimeError("Java gateway process exited before sending its port number")
    109
    110             with open(conn_info_file, "rb") as info:

RuntimeError: Java gateway process exited before sending its port number
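One check that may help narrow this down: the traceback ends in launch_gateway, which is where PySpark starts the JVM, so it seems worth confirming what Java setup the notebook kernel on Workbench actually sees. The snippet below is only a diagnostic sketch using the Python standard library. The helper name java_diagnostics is made up for illustration, and it assumes (without confirming) that the cause is a missing or misconfigured Java runtime in the kernel's environment.

```python
import os
import shutil
import subprocess


def java_diagnostics():
    """Collect basic facts about the Java runtime visible to this kernel.

    Hypothetical helper for illustration; not part of PySpark.
    """
    java_home = os.environ.get("JAVA_HOME")
    java_on_path = shutil.which("java")  # None if no `java` executable is on PATH
    info = {
        "JAVA_HOME": java_home,
        "JAVA_HOME_exists": bool(java_home) and os.path.isdir(java_home),
        "java_on_path": java_on_path,
        "java_version": None,
    }
    if java_on_path:
        # `java -version` normally writes its output to stderr, not stdout.
        result = subprocess.run(
            [java_on_path, "-version"], capture_output=True, text=True
        )
        info["java_version"] = (result.stderr or result.stdout).strip()
    return info


for key, value in java_diagnostics().items():
    print(f"{key}: {value}")
```

If java_on_path comes back as None, or JAVA_HOME is unset or points to a directory that does not exist, that would be consistent with the gateway process exiting before it reports its port, though other causes are possible.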
Please let me know if you have any advice. Thank you!