How do I submit a PySpark job on Dataproc Serverless?
I need to submit not just a single Python file, but an entire Python project. In addition to main.py, I need to include other files like config.json, requirements.txt, and additional Python files that main.py references and imports.
For example, I have the project structure below, where main.py imports helper, logger, etc., and reads config.json for its initial configuration. I also need the packages listed in requirements.txt to be installed.
In short, the job should run main.py, but the entire project must be available to it when it executes on Dataproc Serverless.
project/
├── main.py
├── file1.py
├── file2.py
├── config.json
├── requirements.txt
│
├── utils/
│   ├── helper.py
│   └── logger.py
│
└── services/
    ├── service1.py
    └── service2.py
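For reference, this is roughly the kind of submission I have in mind, as a minimal sketch using the google-cloud-dataproc Python client. The project ID, region, bucket, batch ID, and deps.zip name are placeholders, and it assumes the project files have already been uploaded to GCS:

```python
# Minimal sketch (assumptions: project files already uploaded to a GCS bucket,
# and placeholder project/region/bucket/batch names).
from google.cloud import dataproc_v1

PROJECT_ID = "my-project"          # placeholder
REGION = "us-central1"             # placeholder
BUCKET = "gs://my-bucket/project"  # placeholder; where the project files live

# The batch controller client must point at the regional Dataproc endpoint.
client = dataproc_v1.BatchControllerClient(
    client_options={"api_endpoint": f"{REGION}-dataproc.googleapis.com:443"}
)

batch = dataproc_v1.Batch(
    pyspark_batch=dataproc_v1.PySparkBatch(
        main_python_file_uri=f"{BUCKET}/main.py",
        # Extra modules main.py imports (e.g. a zip containing utils/ and services/).
        python_file_uris=[f"{BUCKET}/deps.zip"],   # placeholder archive name
        # Plain files made available in the working directory.
        file_uris=[f"{BUCKET}/config.json"],
    )
)

operation = client.create_batch(
    parent=f"projects/{PROJECT_ID}/locations/{REGION}",
    batch=batch,
    batch_id="my-pyspark-batch",   # placeholder
)
result = operation.result()        # blocks until the batch finishes
print(result.state)
```

However, I am not sure whether python_file_uris / file_uris is the intended way to ship the rest of the project, or how the packages in requirements.txt should be installed in the Serverless environment.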