Hi, got a networking question (I think)!
Our project setup is as follows: there's a custom VPC with a Cloud NAT, Cloud Router and one exposed IPv4 address, acting as the one main public interaction point of all my Compute Engines and Cloud Run Jobs.
I'm running into an issue where outgoing requests, both in the VM and in the Cloud Run Jobs, start hanging. Specifically, the behavior I've noticed is that the hanging starts after 100-150 requests. After a minute, the hanging request resolves as a failed request, after which the process continues successfully for another 100-150 requests.
To me (admittedly a networking amateur), this seems like I'm hitting some kind of quota for requests per minute. However, according to the Quota's page, I've not come close. I'm running an E2 standard VM, which the docs tell me should have 4Gbps egress bandwidth, which at a hundred <500byte JSON requests per minute, I think I'm not even close to hitting.
A seemingly identical same setup in GCP is working in another (paid) project (with higher egress demands, by far) without issue. I have done a curl spam test on an identical VM in that project, and there is no hanging. It just happily spams away requests forever, without hanging (well, much longer and consistently than in this one, at least). Locally, I don't experience this issue, either.
So, I'm puzzled. Could this be caused by being on the free tier still on this project, or something? Where or how can I find out what's causing this? Is this networking setup of routing everything through a VPC okay, or am I making errors?
Any insight or help on this would be greatly appreciated!