
Vertex AI - running PySpark in a Jupyter notebook

Hi,

I am currently converting some Alteryx workflows into PySpark and want to test the code for debugging and results validation.

For testing, I've created a Jupyter notebook instance in Workbench, but I am not able to install PySpark in the Vertex AI Jupyter notebook.

I have searched a lot but haven't found a good resource yet.

It would be really helpful if you could guide me.

Thanks

2 REPLIES

This looks like a walkthrough of how to run a PySpark job against Dataproc from a Jupyter notebook. If it doesn't match what you are looking for, can you post the exact steps / recipes you have followed so far and where you are blocked?

You can also use the Jupyter notebooks under Vertex AI Workbench instances with the Dataproc plug-in (it's enabled by default). They are a little more feature-rich than the notebooks that you can find under Dataproc. Instructions here
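
If the goal is only to validate converted logic on small samples, a local Spark session inside the notebook kernel may be enough, with no Dataproc cluster involved. The following is a minimal sketch, not a confirmed recipe for Workbench: it assumes PySpark has been installed into the kernel (for example with `%pip install pyspark` in a cell, followed by a kernel restart) and that a Java runtime is available on the instance. The helper names `pyspark_available`, `build_local_session`, and `smoke_test`, and the app name `alteryx-validation`, are made up for illustration.

```python
import importlib.util


def pyspark_available() -> bool:
    """Return True if the pyspark package can be found by this kernel."""
    return importlib.util.find_spec("pyspark") is not None


def build_local_session(app_name: str = "alteryx-validation"):
    """Start a single-machine Spark session (assumption: Java is installed)."""
    # Imported lazily so the helpers can be defined even before pyspark is installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .master("local[*]")  # run on all cores of the notebook VM, no cluster
        .appName(app_name)
        .getOrCreate()
    )


def smoke_test() -> None:
    """Build a tiny DataFrame and print it to confirm Spark works end to end."""
    if not pyspark_available():
        print("pyspark not found; install it into this kernel and restart")
        return
    spark = build_local_session()
    df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "label"])
    df.show()
    spark.stop()
```

Calling `smoke_test()` in a cell is a quick way to check the setup. If it fails when the session starts, a missing Java runtime is a likely cause. For larger data, the Dataproc routes described in the replies above are the better fit.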