
How does GCP flush static routes pointing to unresponsive VMs?

Hello Community Members, 

I have the following simple scenario in my architecture and am trying to fine-tune it.

Topology: I have a Shared VPC with two NGFW VMs. Each has its own IPsec tunnel, with BGP running on top of it.

Northbound: the Corp DC device receives the same routes from both VMs (primary and backup). However, when advertising the routes over BGP, I add a higher MED value to the routes advertised via the backup VM so that this VM is not preferred.

Southbound: I have a compute VM. In the Shared VPC static route configuration, the route with priority 10 points to the primary VM instance and the route with priority 15 points to the backup VM, so that the primary always serves the traffic and the backup VM becomes active only if the primary fails.
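To make the southbound setup concrete, here is a toy model of priority-based route selection. It is only a sketch, not GCP's implementation or API: the route names, the `10.0.0.0/8` destination range, and the VM names are hypothetical, and the model assumes that the lowest priority value wins among routes for the same destination and that nothing inspects the health of the next hop's guest OS.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class StaticRoute:
    name: str        # hypothetical route name
    dest_range: str  # hypothetical destination CIDR
    next_hop: str    # hypothetical next-hop VM name
    priority: int    # lower value is preferred


def pick_route(routes: List[StaticRoute], dest_range: str) -> Optional[StaticRoute]:
    """Return the route with the lowest priority value for dest_range.

    The model deliberately ignores whether the next hop is responsive,
    which is the behaviour this thread is asking about.
    """
    candidates = [r for r in routes if r.dest_range == dest_range]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.priority)


# Two routes to the same destination, mirroring the priorities 10 and 15 above.
routes = [
    StaticRoute("to-corp-primary", "10.0.0.0/8", "ngfw-primary", 10),
    StaticRoute("to-corp-backup", "10.0.0.0/8", "ngfw-backup", 15),
]

chosen = pick_route(routes, "10.0.0.0/8")
print(chosen.next_hop)
```

In this model the backup route is selected only once the primary route is no longer in the list, so a primary VM that is rebooting but still has its route present keeps attracting the traffic.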

Problem: When I reboot the primary VM, the Corp DC immediately prefers the BGP routes from the backup VM. On the GCP side, however, the route still points to the primary VM, at least on the control plane (I am not sure how to check the data plane). Because the GCP static route is not flushed, routing becomes asymmetric and my compute VM traffic is impacted for about 10 minutes (basically the time the primary VM takes to come back up).

 

Question: Does GCP routing not flush a static route that points to an unresponsive VM?

1) If so, how can I check the routes on the control plane and the data plane?

2) What is an alternative way to fix this? Does an active-passive ILB help? I can't use Act

 

1 REPLY 1

No. By default, GCP will not flush or delete a route, regardless of how you have configured it. You can check this public documentation for more details.

Also, if you want to isolate the issue, there seems to be a problem with the VM itself. Normally a restart/reboot takes only 3-5 minutes to complete. You may need to set up a health check on your VM to determine what causes the delay. You may also consider upgrading your machine type to resolve this issue.
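As a rough way to measure how long the VM is actually unreachable during a reboot, a simple TCP probe can be run from another machine in the VPC. This is a minimal sketch and not a Google Cloud health check: it assumes the firewall VM exposes some TCP port that answers once it is up, and the IP address and port in the usage comment are placeholders.

```python
import socket
import time
from typing import Optional


def tcp_alive(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def wait_until_alive(host: str, port: int, interval: float = 5.0,
                     max_wait: float = 900.0) -> Optional[float]:
    """Poll until host:port answers; return elapsed seconds, or None on give-up."""
    start = time.monotonic()
    while time.monotonic() - start < max_wait:
        if tcp_alive(host, port):
            return time.monotonic() - start
        time.sleep(interval)
    return None


# Example usage (placeholder address and port, start it right after the reboot):
# downtime = wait_until_alive("10.10.0.5", 22)
# print(f"VM answered after {downtime:.0f} s" if downtime is not None else "gave up")
```

Comparing the measured time with the expected 3-5 minutes gives a first indication of whether the delay comes from the VM boot itself or from the services starting on it.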