Hi,
I am trying to run a pod with GPU support, but I am getting "Insufficient nvidia.com/gpu". Can you help me understand what I am doing wrong?
This is the pod definition:
```
apiVersion: v1
kind: Pod
metadata:
  name: cuda-vector-add
spec:
  nodeSelector:
    cloud.google.com/gke-accelerator: nvidia-tesla-t4
  restartPolicy: OnFailure
  containers:
    - name: cuda-vector-add
      # https://github.com/kubernetes/kubernetes/blob/v1.7.11/test/images/nvidia-cuda/Dockerfile
      image: "registry.k8s.io/cuda-vector-add:v0.1"
      resources:
        limits:
          nvidia.com/gpu: 1
```
And this is the error I get when I run `kubectl describe pods`:
```
Warning FailedScheduling 56s (x3 over 11m) gke.io/optimize-utilization-scheduler 0/2 nodes are available: 2 Insufficient cpu, 2 Insufficient memory, 2 Insufficient nvidia.com/gpu, 2 node(s) didn't match Pod's node affinity/selector. preemption: 0/2 nodes are available: 2 Preemption is not helpful for scheduling.
```
Can someone give me a hand?
Thanks
I believe it should fail at first, since there are no GPU nodes deployed/available yet. After a little while, you should see a message about autoscaling being triggered. Did that not happen?
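If it helps, a quick way to watch for that event is something like the following (just a sketch, assuming the pod name from your manifest; the event reason to look for is typically a scale-up entry from the cluster autoscaler):
```
# Show the pod's recent events; look for a TriggeredScaleUp
# (or a scale-up error) entry from the cluster autoscaler.
kubectl describe pod cuda-vector-add

# Or list the events directly, filtered to this pod.
kubectl get events --field-selector involvedObject.name=cuda-vector-add --sort-by=.lastTimestamp
```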
Hi,
yeah, I saw the message about autoscaling, and that gave me the clue 🙂. What I had not understood is that, because of the autoscaling behaviour on Autopilot, I need a quota of 2 GPUs, while I only had a quota of 1.
I was able to fix it by requesting a quota increase.
Thanks for your help
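In case anyone else lands here: one way to check the current GPU quota for a region before requesting an increase is something along these lines (a sketch, assuming T4 GPUs in us-central1 and the NVIDIA_T4_GPUS quota metric; adjust the region and metric to your setup):
```
# List the region's quotas and pull out the T4 GPU entry;
# comparing limit vs. usage shows how many more GPUs can be provisioned.
gcloud compute regions describe us-central1 | grep -B 1 -A 1 NVIDIA_T4_GPUS
```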