We have just created a Dataproc cluster on GCE (image 2.2.39-debian12) with the Zeppelin component. We are trying to create a table in Spark SQL from data stored in Avro format by running the following paragraph in Zeppelin:
%spark.sql
CREATE TABLE daily_stats USING AVRO OPTIONS (path "gs://bucket/path/to/data/*.avro");
After a couple of interpreter failures, I tried the spark-sql CLI directly on the VM:
spark-sql --packages org.apache.spark:spark-avro_2.12:3.5.0
It seems that spark-avro_2.12 had never been downloaded: the download only started at this point, even though I had made sure to add it to spark.jar.packages in the Spark interpreter config through Zeppelin's UI.
Once the package had been downloaded in my CLI session, I managed to create the table and run some queries without any issues, both in the CLI and in the Zeppelin notebook. However, I noticed that running a query that returns a couple of thousand rows results in the following loop of errors, and Zeppelin gets stuck:
INFO [2024-11-11 18:18:53,498] ({JobStatusPoller-paragraph_1731338412819_1362046418} NotebookServer.java[onStatusChange]:1987) - Job paragraph_1731338412819_1362046418 starts to RUNNING
INFO [2024-11-11 18:19:27,168] ({qtp524223214-19} NotebookServer.java[onClose]:472) - Closed connection to 127.0.0.1:39186 (1009) Text message size [1048533] exceeds maximum size [1024000]
INFO [2024-11-11 18:19:33,866] ({qtp524223214-19} NotebookServer.java[onOpen]:244) - New connection from 127.0.0.1:51090
INFO [2024-11-11 18:19:35,467] ({qtp524223214-13} NotebookServer.java[onClose]:472) - Closed connection to 127.0.0.1:51090 (1011) EofException
INFO [2024-11-11 18:19:41,757] ({qtp524223214-14} NotebookServer.java[onOpen]:244) - New connection from 127.0.0.1:51102
INFO [2024-11-11 18:19:42,689] ({qtp524223214-17} NotebookServer.java[onClose]:472) - Closed connection to 127.0.0.1:51102 (1011) EofException
INFO [2024-11-11 18:19:48,680] ({qtp524223214-17} NotebookServer.java[onOpen]:244) - New connection from 127.0.0.1:56664
INFO [2024-11-11 18:19:49,621] ({qtp524223214-19} NotebookServer.java[onClose]:472) - Closed connection to 127.0.0.1:56664 (1011) EofException
INFO [2024-11-11 18:19:55,783] ({qtp524223214-14} NotebookServer.java[onOpen]:244) - New connection from 127.0.0.1:52348
INFO [2024-11-11 18:19:56,773] ({qtp524223214-16} NotebookServer.java[onClose]:472) - Closed connection to 127.0.0.1:52348 (1009) Text message size [1061535] exceeds maximum size [1024000]
INFO [2024-11-11 18:20:02,772] ({qtp524223214-13} NotebookServer.java[onOpen]:244) - New connection from 127.0.0.1:39742
INFO [2024-11-11 18:20:04,113] ({qtp524223214-19} NotebookServer.java[onClose]:472) - Closed connection to 127.0.0.1:39742 (1009) Text message size [1061535] exceeds maximum size [1024000]
I tried to increase ZEPPELIN_WEBSOCKET_MAX_TEXT_MESSAGE_SIZE, because the 1024000 in the message is the default value of this environment variable, but the change is never reflected and I still receive the same message. I tried setting it in both zeppelin-env.sh and zeppelin-site.xml, but neither helped.
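To put a number on how far over the limit the payloads are, here is a minimal sketch that pulls the sizes out of the (1009) close reasons in the log above. The function name `oversize_messages` and the trimmed `sample` string are only illustrative; the regular expression assumes the message wording stays exactly as logged here.

```python
import re

# Matches the close reason seen in the log, e.g.
# "Text message size [1048533] exceeds maximum size [1024000]"
PATTERN = re.compile(r"Text message size \[(\d+)\] exceeds maximum size \[(\d+)\]")


def oversize_messages(log_text):
    """Return (message_size, max_size) pairs for each oversize close in the log."""
    return [(int(size), int(limit)) for size, limit in PATTERN.findall(log_text)]


# Three lines copied from the log above (two 1009 closes and one 1011 close).
sample = """\
INFO [2024-11-11 18:19:27,168] ({qtp524223214-19} NotebookServer.java[onClose]:472) - Closed connection to 127.0.0.1:39186 (1009) Text message size [1048533] exceeds maximum size [1024000]
INFO [2024-11-11 18:19:35,467] ({qtp524223214-13} NotebookServer.java[onClose]:472) - Closed connection to 127.0.0.1:51090 (1011) EofException
INFO [2024-11-11 18:19:56,773] ({qtp524223214-16} NotebookServer.java[onClose]:472) - Closed connection to 127.0.0.1:52348 (1009) Text message size [1061535] exceeds maximum size [1024000]
"""

pairs = oversize_messages(sample)
largest = max(size for size, _ in pairs)
limit = pairs[0][1]
print(f"largest message: {largest}, limit: {limit}, over by {largest - limit}")
```

In this excerpt the messages exceed the limit by only a few percent, and the reported maximum is still 1024000 in every line, which is consistent with the new setting never being picked up by the server.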