
FailedScheduling with ephemeral PVC using GKE Autopilot

I've got an application that uses a read-only root filesystem but requires some scratch disk space for file processing. Ephemeral storage with Autopilot didn't work the same as it does with normal k8s. I wish I could remember the exact issue, but the solution I found was to use an ephemeral volume claim. This is what I've ended up with:


containers:
  - name: app
    volumeMounts:
      - name: tmp
        mountPath: /tmp
volumes:
  - name: tmp
    ephemeral:
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 1Gi
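
For context, the reason I need the scratch mount at all is that the container runs with a read-only root filesystem, roughly like this:

containers:
  - name: app
    securityContext:
      # read-only root fs is why /tmp needs its own writable mount
      readOnlyRootFilesystem: true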


This works, but in the past couple of weeks it has made my deployments so slow that they're timing out (we have krane configured with a 5 min timeout). The deploys do eventually work, but they take so long that krane thinks they failed and shows the latest error messages:

FailedScheduling: 0/3 nodes are available: waiting for ephemeral volume controller to create the persistentvolumeclaim "core-backend-jobs-77856c9968-g5tz4-tmp"
preemption: 0/3 nodes are available: 3 No preemption victims found for incoming pod.. (1 events)
FailedScheduling: running PreFilter plugin "VolumeBinding": error getting PVC "edge/core-backend-jobs-77856c9968-g5tz4-tmp":
could not find v1.PersistentVolumeClaim "edge/core-backend-jobs-77856c9968-g5tz4-tmp" (1 events)

Question is: why? Why is it taking so long to create an ephemeral volume? And why now? This was fine for months and just started happening recently, and we haven't changed anything. Now it seems to take up to 10 minutes to provision the volumes for each node.

I'm not confident that the ephemeral volumes are the direct cause here. It may be that those are just the most recent events. In general I've found deployments with Autopilot to be significantly slower than normal k8s, but not this slow.
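
In the meantime I'm considering bumping krane's timeout for just this deployment so the rollouts at least report success. If I've got the annotation right (I still need to double-check the name and value format against the krane docs), it would look something like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: core-backend-jobs   # the deployment whose rollout keeps hitting the 5 min limit
  annotations:
    # assumed annotation name/format; krane reads this as a per-resource hard timeout
    krane.shopify.io/timeout-override: "10m"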

My cluster is running with GKE 1.27.3-gke.100 on Autopilot.


4 REPLIES

Hi @mroach,

Welcome to the Google Cloud Community!


@mroach wrote:

why? Why is it taking so long to create an ephemeral volume? And why now? This was fine for months and just started happening recently, and we haven't changed anything.


To clarify, were you using a Standard cluster before? When did this issue begin? It's possible that the problem arose when you upgraded your Autopilot cluster version.

For the most accurate diagnosis, I recommend contacting Google Cloud Support directly. They can investigate any potential problems in the backend and provide further assistance. Thank you.


I switched to using Autopilot back in about September. It had been mostly fine, although deployments were noticeably slower than on Standard. This problem of deployments timing out only started happening a week or so ago, though, and it now appears to have cleared up despite us not changing anything. Perhaps it was a temporary bug in Autopilot?

Are you seeing this every time you try to dynamically provision volumes? You'll probably need to file a support request to get to the bottom of that.

On the other hand, have you tried using `emptyDir` with Autopilot recently to create ephemeral storage?

spec:
  containers:
  - name: app
    image: app
    resources:
      requests:
        ephemeral-storage: "1Gi"
    volumeMounts:
    - name: ephemeral
      mountPath: "/tmp"
  volumes:
    - name: ephemeral
      emptyDir:
        sizeLimit: "1Gi"

I can't use emptyDir, unfortunately. Ruby considers the directory insecure because it's world-writable and the sticky bit isn't set. AFAIK there's no way to set a mode on the dir, which is why I was using an ephemeral volume instead.

https://github.com/ruby/ruby/blob/4095e7d2be0ef6426e0cb75a53472f6dc1e5a0af/lib/tmpdir.rb#L26-L44
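
In theory the only way I can see to make emptyDir pass that check would be to create a private subdirectory at startup and point TMPDIR at it. A rough, untested sketch (the entrypoint path here is made up), though the ephemeral volume still feels cleaner to me:

containers:
  - name: app
    # hypothetical entrypoint: create a non-world-writable tmp dir inside the emptyDir
    command: ["/bin/sh", "-c", "mkdir -p -m 0700 /scratch/tmp && exec /app/start"]
    env:
      - name: TMPDIR        # Ruby's Dir.tmpdir honours TMPDIR
        value: /scratch/tmp
    volumeMounts:
      - name: scratch
        mountPath: /scratch
volumes:
  - name: scratch
    emptyDir:
      sizeLimit: 1Gi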
