
Submit a PySpark job on Dataproc Serverless

How do I submit a PySpark job on Dataproc Serverless?

I need to submit not just a single Python file, but an entire Python project. In addition to main.py, I need to include other files like config.json, requirements.txt, and additional Python files that main.py references and imports.

For example, I have the project structure below, where main.py imports helper, logger, and other modules, and reads config.json for its initial configuration. I also need the packages listed in requirements.txt to be installed on the cluster.

In short, I need the job to run main.py, but the entire project must be available to it when it executes on Dataproc Serverless.

project/
├── main.py
├── file1.py
├── file2.py
├── config.json
├── requirements.txt
├── utils/
│   ├── helper.py
│   └── logger.py
└── services/
    ├── service1.py
    └── service2.py
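For context, here is the kind of submission I have been trying. This is a sketch, not a working setup: the bucket name and region are placeholders, and I am not sure this is the right way to handle the dependencies.

```shell
# Run from inside project/. Bundle every module main.py imports
# into one archive (file names are from the tree above).
zip -r project.zip utils/ services/ file1.py file2.py

# Submit the batch: main.py is the entry point, project.zip supplies
# the importable modules, config.json is shipped as a plain file.
# Bucket and region below are placeholders.
gcloud dataproc batches submit pyspark main.py \
    --region=us-central1 \
    --deps-bucket=gs://my-staging-bucket \
    --py-files=project.zip \
    --files=config.json
```

This still leaves requirements.txt unhandled; my understanding is that pip packages need either a custom container image (--container-image) or a prepackaged environment, but I am not sure what the recommended approach is.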
