Hello,
I am receiving the errors below when trying to read a gzip file from Google Cloud Storage through Dataflow. I am receiving the same error for multiple Dataflow jobs at the same step, and I don't understand what I need to change.
google.auth.exceptions.TransportError: ('Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/dataflow-runner@*****.i... from the Google Compute Engine metadata service. Status: 429 Response:\nb\'"Too many requests."\'', <google.auth.transport.requests._Response object at 0x7f8a8c5287c0>)
google.auth.exceptions.RefreshError: ('Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/dataflow-runner@****.ia... from the Google Compute Engine metadata service. Status: 429 Response:\nb\'"Too many requests."\'', <google.auth.transport.requests._Response object at 0x7f8a8c5287c0>)
Hello @rish,
Thank you for contacting the Google Cloud Community.
Please note that user traffic receiving 429s from GCS can indicate any of the following:
Exceeding the object/bucket mutation rate limit
Resource contention
DoS limits (QPS and egress)
Traffic that does not conform to the Request rate and access distribution guidelines
Exceeding the ingress bandwidth limit
Exceeding the project QPS quota
Exceeding the egress bandwidth quotas
Solution: You are hitting a limit on the number of requests Cloud Storage allows for a given resource. See the Cloud Storage quotas page for a discussion of the limits that apply. Transient 429s can usually be absorbed by retrying with exponential backoff, as in the sketch below.
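Here is a minimal sketch, assuming the Python google-cloud-storage client and hypothetical bucket/object names, of a retry policy that backs off exponentially on 429 responses (recent versions of the client already retry 429s by default via DEFAULT_RETRY; this just shows how to tune the policy):

```python
from google.api_core import exceptions
from google.api_core.retry import Retry
from google.cloud import storage

# Hypothetical bucket and object names for illustration.
client = storage.Client()
blob = client.bucket("my-bucket").blob("path/to/file.gz")

# Retry only 429s: start at 1s, double each attempt, cap at 60s,
# and give up after 5 minutes overall.
retry_429 = Retry(
    predicate=lambda exc: isinstance(exc, exceptions.TooManyRequests),
    initial=1.0,
    multiplier=2.0,
    maximum=60.0,
    deadline=300.0,
)

data = blob.download_as_bytes(retry=retry_429)
```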
If your workload consists of thousands of requests per second to a bucket, see the Request rate and access distribution guidelines for a discussion of best practices, including ramping up your workload gradually and avoiding sequential filenames (a sketch of the naming idea follows below).
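To illustrate the naming guideline (a sketch only; the helper name is hypothetical), prepending a short hash spreads object names across the keyspace instead of clustering them on one sequential prefix:

```python
import hashlib

def distributed_name(original_name: str) -> str:
    # Prepend a short hash so object names are spread across the key
    # range rather than all sharing one sequential prefix.
    prefix = hashlib.md5(original_name.encode("utf-8")).hexdigest()[:6]
    return f"{prefix}_{original_name}"

# Sequential names like log_000001.gz, log_000002.gz end up under
# effectively random prefixes, e.g. '3fa91c_log_000001.gz'.
print(distributed_name("log_000001.gz"))
```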
If your workload is potentially using 50 Gbps or more of network egress to specific locations, check your bandwidth usage to ensure you're not encountering a bandwidth quota.
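One rough way to check (a sketch assuming the google-cloud-monitoring client and a placeholder project ID) is to query the storage.googleapis.com/network/sent_bytes_count metric, which reports per-bucket bytes sent:

```python
import time
from google.cloud import monitoring_v3

client = monitoring_v3.MetricServiceClient()
project_name = "projects/my-project"  # placeholder project ID

# Look at the last hour of data.
seconds = int(time.time())
interval = monitoring_v3.TimeInterval(
    {
        "end_time": {"seconds": seconds},
        "start_time": {"seconds": seconds - 3600},
    }
)

results = client.list_time_series(
    request={
        "name": project_name,
        "filter": 'metric.type = "storage.googleapis.com/network/sent_bytes_count"',
        "interval": interval,
        "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
    }
)
for series in results:
    bucket = series.resource.labels.get("bucket_name", "<unknown>")
    total = sum(p.value.int64_value for p in series.points)
    print(f"{bucket}: {total} bytes sent in the last hour")
```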
I hope the above information is helpful 🙂
Thanks & Regards,
Manish Bavireddy.