Re: Expected startup times with container "enableImageStreaming": true

jacksonwb · 05-23-2024 02:45 PM

When running a job with a fairly large image (a few GB), the transition from SCHEDULED to RUNNING takes about 6 minutes.

When enabling container runnable image streaming with `enableImageStreaming` I don't really observe any difference in startup time. What is the expected improvement?

When describing the job, the only change I observe is that an extra label is added:

```
labels {
key: "goog-batch-managed-container"
value: "enabled"
}
```

Also, `:false` is appended to each path string in the container runnable's volumes.
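For readers following along, here is a minimal sketch of a job config with the setting discussed above. The field names follow the Batch job JSON format as I understand it; the helper name `build_job_config` and the sample commands are hypothetical, and the image URI is just the one quoted later in this thread.

```python
import json
from typing import List


def build_job_config(image_uri: str, commands: List[str]) -> dict:
    """Build a Batch job config dict with image streaming requested.

    Hypothetical helper for illustration; adjust fields to your own job.
    """
    return {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [
                        {
                            "container": {
                                "imageUri": image_uri,
                                "commands": commands,
                                # The setting this thread is about.
                                "enableImageStreaming": True,
                            }
                        }
                    ]
                },
                "taskCount": 1,
            }
        ],
        # Send task logs to Cloud Logging so pull messages are visible.
        "logsPolicy": {"destination": "CLOUD_LOGGING"},
    }


if __name__ == "__main__":
    config = build_job_config(
        "us-central1-docker.pkg.dev/batch-project/test/image:test",
        ["echo", "hello"],
    )
    # Write this output to a file and pass it as the job config on submit.
    print(json.dumps(config, indent=2))
```

The printed JSON could then be saved to a file and passed as the config when submitting the job.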

bolianyin

The improvement from `enableImageStreaming` comes from accessing files in the container image without waiting for the whole image to be pulled. Therefore, it depends on the access pattern. Does your task use most of the content in the container within a relatively short time? If so, it will see less benefit.

nomi

To add to that, image streaming has some limitations. Common ones are:

- The Artifact Registry (AR) repository must be in the same region as the Cloud Batch VMs, or in a multi-region corresponding to the region where the Batch VMs are running.
- A private AR repository must be accessible to the service account running the Batch job.
- Images that use the V2 Image Manifest, schema version 1, are not supported.
- Images with empty layers or duplicate layers are not supported.
- You might not notice the benefits of image streaming during the first pull of an eligible image. However, after image streaming caches the image, future pulls of that image by any job benefit from it.
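As a rough illustration of the region requirement above, here is a small sketch that compares the location in an Artifact Registry image URI with the VM region. The helper names are hypothetical, and the prefix-based multi-region matching (for example, `us` covering regions named `us-*`) is my own simplifying assumption, not an official rule, so treat the result as a hint only.

```python
import re
from typing import Optional

# Assumption: only these multi-region names are considered.
MULTI_REGIONS = {"us", "europe", "asia"}


def repo_location(image_uri: str) -> Optional[str]:
    """Extract the location from a '<location>-docker.pkg.dev/...' image URI."""
    host = image_uri.split("/", 1)[0]
    match = re.fullmatch(r"(.+)-docker\.pkg\.dev", host)
    return match.group(1) if match else None


def location_compatible(image_uri: str, vm_region: str) -> bool:
    """Heuristic check that the repository location suits the VM region."""
    location = repo_location(image_uri)
    if location is None:
        return False  # Not an Artifact Registry host.
    if location == vm_region:
        return True  # Same region.
    # Assumed heuristic: a multi-region matches regions sharing its prefix.
    return location in MULTI_REGIONS and vm_region.startswith(location + "-")


if __name__ == "__main__":
    uri = "us-central1-docker.pkg.dev/batch-project/test/image:test"
    print(location_compatible(uri, "us-central1"))
```

This does not cover the other limitations (manifest schema, empty or duplicate layers, service account access), which need to be checked against the image and IAM settings themselves.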

In addition, Cloud Logging contains an image-pulling log entry that is specific to image streaming. An example of its format:

Pulling images us-central1-docker.pkg.dev/batch-project/test/image:test...

You can roughly get an idea of the image streaming time from it.

jacksonwb

If the image streaming requirements are not met, will the submit return an error, or will the job proceed without image streaming enabled?

Can one tell based on the log format?

nomi

If the image streaming requirements are not met, we fall back to pulling the image without image streaming. Currently, the most straightforward way to check whether the image was streamed is the VM system log:

journalctl -u snapshotter

The above command shows log entries if your image is streamed, something like:

image xxx is backed by image streaming

We have a logging improvement in our backlog to make all image streaming information more accessible through Cloud Logging.
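Until then, the check above can be scripted. The following is a minimal sketch, assuming the log line looks like the example given ("image xxx is backed by image streaming"); the exact wording may differ, so the regular expression is an assumption, and the function names are hypothetical. It has to run on the VM itself, where `journalctl` is available.

```python
import re
import subprocess
from typing import List

# Assumed pattern, based only on the example line quoted in this thread.
STREAMED_RE = re.compile(r"image (\S+) is backed by image streaming")


def streamed_images(log_text: str) -> List[str]:
    """Return the image names reported as streamed in the given log text."""
    return STREAMED_RE.findall(log_text)


def read_snapshotter_log() -> str:
    """Read the snapshotter unit's journal on the current VM."""
    result = subprocess.run(
        ["journalctl", "-u", "snapshotter", "--no-pager"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


if __name__ == "__main__":
    images = streamed_images(read_snapshotter_log())
    if images:
        print("Streamed images:", ", ".join(images))
    else:
        print("No streamed images found; the pull may have fallen back.")
```

An empty result only suggests a fallback to a regular pull; it could also mean the log wording does not match the assumed pattern.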
