
GCP Batch no cloud logging when task is running

Hi, when I run GCP Batch jobs with Cloud Logging, logs only become available after a task has stopped running (either failed or succeeded). How can I see the logs while the task is running? This would be very helpful since:

1. it allows us to monitor tasks as they run;

2. if a task fails with exit status 137 (a memory usage issue), all intermediate logging is lost. If we could see the intermediate logs while the task is running, this could be prevented.

Thanks!

Solved

8 REPLIES

Moreover, the CLOUD_LOGGING output seems to be out of order and cannot be used to interpret the timestamps between log messages.

Hi gradientopt,

1. For logging while a task is running: in your runnable, you can change your code to print the logs you want to show in Cloud Logging to stdout/stderr. Batch forwards all stdout/stderr to Cloud Logging immediately. For example, if you are running a shell script, a simple echo "the log you want to print" will do the trick.

2. For this issue, you can print out more logs in your running code, though this does require changing your code; see the Python sketch below.
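
For instance, a minimal Python sketch (illustrative only, not from the original thread) that forces each progress line out to stdout immediately:

import sys
import time

for i in range(5):
    time.sleep(1)  # stand-in for a unit of real work
    # flush=True pushes the line past Python's stdout buffer right away,
    # so Batch can forward it to Cloud Logging while the task is running.
    print(f"finished step {i}", flush=True)

# On modern Python (3.9+), stderr is line-buffered, so messages sent
# there also tend to show up promptly without an explicit flush.
print("all steps done", file=sys.stderr)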

For the CLOUD_LOGGING out of order issue, if you are running multiple tasks in parallel, different tasks logging could be present intertwined. If that's not the case, could you give me your jobUID and region, I can do some check for you.

 

Thanks for the reply!

1. The task I am running is a Python executable, and in my Python script a statement such as print('hello world') still only appears after the task completes. Is this expected? I thought print('hello world') already prints to stdout by default.

2. I mean that, for the same task, the logs are out of order.

Okay, could I know what type of runnable you are using, Container or Script? If you can share the jobUID and region, or even a mock job spec, that will be really helpful for root-causing the issue.

It is a Docker container job that launches a Python script. I just sent you the jobUID and region via DM!

Looks like this is Python's buffering mechanism. By default, Python buffers its stdout; you only see the output when the buffer fills up or when the program finishes (and the buffer is flushed).
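
To make that concrete, here is a minimal stand-in for a task script like /execute.py (illustrative only, not the poster's actual code). Inside a container, stdout is a pipe rather than a terminal, so Python block-buffers it:

#!/usr/bin/env python3
import time

# stdout here is block-buffered (typically a few KiB), so these short
# lines accumulate in the buffer and only reach Cloud Logging when the
# buffer fills or the process exits.
for i in range(10):
    print(f"step {i}: still running...")  # buffered, not yet visible
    time.sleep(60)                        # stand-in for long-running work

print("task finished")  # everything appears at once when the process exits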

To work around it, I can think of the following options.

Option 1: add the environment variable "PYTHONUNBUFFERED=1" to the Batch job spec, like the following example:

{
  "taskGroups": [
    {
      "taskSpec": {
        "environment": {
          "variables": {
            "PYTHONUNBUFFERED": "1"
          }
        },
        "runnables": [
          {
            "container": {
              "imageUri": "YOUR_IMAGE_URL",
              "commands": ["/bin/sh", "-c", "/execute.py"]
            }
          }
        ]
      }
    }
  ],
  "logsPolicy": {
    "destination": "CLOUD_LOGGING"
  }
}
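
(If it helps, a spec like this can be submitted with the gcloud CLI, e.g. `gcloud batch jobs submit my-job --location=us-central1 --config=job.json`, where the job name, region, and file name are placeholders.)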

Option 2: run the Python script in unbuffered mode using the `-u` option. E.g., if you are running with python3 you can do the following (note that `/bin/sh -c` takes the whole command as a single string argument, not as separate list items):

"runnables": [
     {
        "container": {
           "imageUri": "YOUR_IMAGE_URL",
           "commands": ["/bin/sh", "-c", "python3", "-u", "/execute.py"]
         }
      }
]

Option 3: set the environment variable in your Dockerfile, but this takes more effort since you may need to rebuild the image.

ENV PYTHONUNBUFFERED=1
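
One more code-level alternative, if neither the job spec nor the image is easy to change (a sketch assuming your script runs on Python 3.7+, where reconfigure() is available):

import sys

# Switch stdout to line buffering from inside the script itself
# (io.TextIOWrapper.reconfigure, Python 3.7+), so every print() is
# flushed on newline without needing -u or PYTHONUNBUFFERED.
sys.stdout.reconfigure(line_buffering=True)

print("this line should reach Cloud Logging right away")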

 

Thanks a lot for the reply! I am very glad the solution works, really appreciate it! Just to make sure I understand the cause: is this a GCP Batch-specific issue? If I run the Python program locally, or run the Docker container locally, the prints appear immediately without setting any extra flag.

Sorry, I missed your question. When I use `docker run docker_image` I can reproduce the buffered behavior (I'm running on Linux; a different kernel may also behave differently). Did you run Docker on a different OS? Currently all the VMs supported by Batch run Linux.

Also, if I use `docker run -it docker_image`, I get the output in real time, because `-it` runs the container in interactive mode with a pseudo-TTY attached; when stdout is a terminal, Python line-buffers it and flushes on every newline. You can check which mode you are in with the snippet below.
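
A quick sanity check (a minimal sketch; run it once with `-it` and once without to compare):

import sys

# Under `docker run -it`, stdout is a pseudo-TTY and Python line-buffers
# it; under a plain `docker run`, stdout is a pipe and is block-buffered,
# which is why the output lags.
print("stdout is a TTY:", sys.stdout.isatty())
print("line buffering on:", sys.stdout.line_buffering)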