GKE force upgrading/Nodes recreating

I am encountering two issues with my Kubernetes cluster:

1. Frequent Node Recreation:

Over the past month, I’ve noticed that nodes are being recreated randomly at least once daily. Initially, I identified the nodes as preemptible and disabled this setting, but the issue persists. What could be causing these recreations, and how can I prevent this from happening in the future?

2. Unexpected Node Upgrades:

Despite disabling auto-upgrade and switching the release channel to “No Channel,” my nodes were still upgraded from version 1.30.3-x (approximate) to 1.30.5-gke.1014003.

What could have overridden the auto-upgrade settings, and how can I ensure such upgrades are avoided in the future?

Thank you for your assistance.


For #1, do your logs say anything? Is it still one or more recreates a day? 

For #2, disabling auto-upgrades stops direct upgrade operations, but anything that causes a node recreation will create the new nodes at the current control plane version if that version differs from the node version. You can't turn off control plane upgrades. Disabling node auto-upgrades only stops GKE from upgrading your node versions, which puts the responsibility on you to avoid too large a version skew.

Probably what happened was that some other operation caused your nodes to get recreated (the issue you described in #1?), and they got recreated at the current control plane patch version.

Auto-upgrades: https://cloud.google.com/kubernetes-engine/docs/how-to/node-auto-upgrades#considerations
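To make the point about recreated nodes concrete, here is a minimal Python sketch of the comparison described above. It is not an official GKE tool, and the function names are hypothetical. It assumes version strings look like `1.30.5-gke.1014003`, compares only the `major.minor.patch` part, and ignores the `-gke.` build suffix. The node version is written as `1.30.3` without a suffix because the original post gives it only approximately.

```python
import re

# Matches "1.30.5-gke.1014003" as well as a bare "1.30.3".
_VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)(?:-gke\.\d+)?")


def patch_version(version):
    """Return (major, minor, patch) as integers for a GKE-style version string."""
    match = _VERSION_RE.fullmatch(version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def recreate_would_change_version(control_plane_version, node_version):
    """True if nodes recreated at the control plane version would end up
    on a different patch version than they run today."""
    return patch_version(control_plane_version) != patch_version(node_version)


# Control plane version from the thread; node version as roughly reported.
print(recreate_would_change_version("1.30.5-gke.1014003", "1.30.3"))
```

If this prints `True`, any event that recreates the nodes would bring them up at the newer version, even with node auto-upgrade disabled.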

Hi,
Thank you for your reply!
It's exactly one event every day where all nodes get recreated.

I'm pretty sure that you cannot switch a node pool from being preemptible.

I'd create a new node pool and delete the existing one.

Yes, that's true, but this is happening on a cluster/node pool that's not set as preemptible.

Hi @dominykasz,

Did you check the auto-upgrade and auto-repair settings at the node pool level?
Make sure to disable them if you don't need them.
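If you would rather check this from a script than from the console, a rough Python sketch follows. It assumes you already have the node pool description as JSON (for example from `gcloud container node-pools describe` with `--format=json`) and that it carries a `management` object with `autoUpgrade` and `autoRepair` fields, as in the GKE NodePool API. The helper name and the sample data are made up for illustration.

```python
import json


def management_flags(node_pool):
    """Return the auto-upgrade and auto-repair flags of a node pool description.

    A missing field is treated as False (an assumption of this sketch).
    """
    management = node_pool.get("management", {})
    return {
        "autoUpgrade": bool(management.get("autoUpgrade", False)),
        "autoRepair": bool(management.get("autoRepair", False)),
    }


# Hypothetical sample, not taken from the poster's cluster.
sample = json.loads('{"name": "example-pool", "management": {"autoRepair": true}}')
print(management_flags(sample))
```

For this sample, auto-repair is still enabled, so unhealthy nodes could be recreated even though auto-upgrade is off.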

Most recent update – 3 days ago
December 27, 2024 at 3:09:16 PM UTC+5:30

Some Google Kubernetes Engine customers using GKE Image Streaming may experience workloads restarting on clusters running 1.28 or later.

Mitigation work is currently underway by our engineering team.

The mitigation is expected to complete by Tuesday, 2024-12-31 17:00 US/Pacific.

We will provide more information by Tuesday, 2024-12-31 18:00 US/Pacific.
