
Spark log files are missing

aed
Bronze 1

newbie

I am unable to debug my Spark jobs. They fail with an OOM exception. The only tools I have for debugging are the log files, and I am having a heck of a time with Dataproc Spark logging. When my job completes I get a message on the job details tab. Unfortunately, I cannot debug with my driver and worker logs; the files are missing.

Google Cloud Dataproc Agent reports job failure. If logs are available, they can be found at:
https://console.cloud.google.com/dataproc/jobs/a222e057148b4878b2f516f07ca368b9?project=dataprocspar...ion=us-central1
gcloud dataproc jobs wait 'a222e057148b4878b2f516f07ca368b9' --region 'us-central1' --project 'dataprocspark-328421'
https://console.cloud.google.com/storage/browser/anvil-gtex-v8-hg38-edu-ucsc-kim-lab-spark/google-cl...
gs://anvil-gtex-v8-hg38-edu-ucsc-kim-lab-spark/google-cloud-dataproc-metainfo/2aada858-aa7d-4727-8547-2b119e5bcef4/jobs/a222e057148b4878b2f516f07ca368b9/driveroutput
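For reference, this is how I have been trying to pull the driver output down from that gs:// path (the trailing wildcard is a guess at how the numbered parts are named):

gsutil cat 'gs://anvil-gtex-v8-hg38-edu-ucsc-kim-lab-spark/google-cloud-dataproc-metainfo/2aada858-aa7d-4727-8547-2b119e5bcef4/jobs/a222e057148b4878b2f516f07ca368b9/driveroutput*' > driveroutput.log
grep -iE 'error|outofmemory' driveroutput.log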

Today I started following https://cloud.google.com/dataproc/docs/guides/logging?authuser=1#job_logs_in

When I create my cluster, I set
--properties dataproc:dataproc.logging.stackdriver.job.driver.enable=true,dataproc:dataproc.logging.stackdriver.job.yarn.container.enable=true

I also set
--max-idle 10m
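Putting the two together, the create command looks roughly like this (the cluster name is a placeholder, and I have trimmed unrelated flags):

gcloud dataproc clusters create my-cluster \
    --region=us-central1 \
    --max-idle=10m \
    --properties=dataproc:dataproc.logging.stackdriver.job.driver.enable=true,dataproc:dataproc.logging.stackdriver.job.yarn.container.enable=true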

While the cluster is up, I can get to the Dataproc job driver and YARN container logs, which are listed under the Cloud Dataproc Job resource.

Once the cluster terminates, this no longer works. For the life of me, I cannot figure out how to use the Logs Explorer: https://cloud.google.com/logging/docs/view/building-queries?_ga=2.93830367.-351085001.1615423034
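From the logging guide, a query along these lines should pull up the driver log for a single job after the cluster is gone; the resource labels and log name here are my reading of the docs, not something I have gotten to work yet:

resource.type="cloud_dataproc_job"
resource.labels.region="us-central1"
resource.labels.job_id="a222e057148b4878b2f516f07ca368b9"
log_name="projects/dataprocspark-328421/logs/dataproc.job.driver"

Swapping dataproc.job.driver for dataproc.job.yarn.container should list the YARN container logs instead.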

I create a cluster for each job, so there are only a handful of logs.

- It is so slow as to be unusable; my job fails in about 30 minutes.
- I want to look at my driver and YARN/worker log files individually.
- I want to select warnings and errors in chronological order.
- At this point, I think it would be faster and easier to just download the logs and use grep (see the sketch after this list).
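On that last point, the gcloud CLI can apparently do the download, which would at least make grep possible. A sketch, assuming the same job ID and project as above (--freshness matters because gcloud only reads the last day of logs by default):

gcloud logging read \
    'resource.type="cloud_dataproc_job" AND resource.labels.job_id="a222e057148b4878b2f516f07ca368b9" AND severity>=WARNING' \
    --project=dataprocspark-328421 \
    --freshness=7d \
    --order=asc \
    --format='value(timestamp, severity, textPayload)' > job-logs.txt

grep -i outofmemory job-logs.txt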

I really wish there were an easy way to get to the log files from the Dataproc job details page.

Any suggestions would be greatly appreciated

 

Kind regards

Andy

2 REPLIES

aed
Bronze 1

I think I also need to set

dataproc:dataproc.logging.stackdriver.enable=true

dataproc:jobs.file-backed-output.enable=true

will try again tomorrow
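If that is right, the combined flag on the create command would look something like this (untested on my side):

--properties=dataproc:dataproc.logging.stackdriver.enable=true,dataproc:dataproc.logging.stackdriver.job.driver.enable=true,dataproc:dataproc.logging.stackdriver.job.yarn.container.enable=true,dataproc:jobs.file-backed-output.enable=true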

You can check the /var/log/google-fluentd/buffers directory; this is where Stackdriver logs are staged before being sent to the server. The files in this directory come and go quickly. If you have more than a couple of files there, it may be a connectivity issue with the Stackdriver server.
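A quick way to check while the cluster is still up (the cluster name and zone below are placeholders for your own):

gcloud compute ssh my-cluster-m --zone=us-central1-a

# on the node: a large or growing count suggests the agent cannot flush to the server
ls /var/log/google-fluentd/buffers | wc -l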

You may find the following helpful:

[1] https://cloud.google.com/dataproc/docs/guides/logging

[2] https://coderedirect.com/questions/426368/output-from-dataproc-spark-job-in-google-cloud-logging