Dear Community,
We have been facing downtime in our application and have been going though all logs coming from the Google Cloud Ops agent to make sure we understand all logs and if they may be causing issues due to misconfiguration or other problems.
These logs in specific have been appearing so much since first we installed the ops agent on our servers.
I would love if someone could shed light on these error logs and if there's a way to disable them if it will not impact our systems.
1. Workload Certificate Refresh Issue
Log Message:
Finished GCE Workload Certificate refresh Error getting config status, workload certificates may not be configured: failed to GET "instance/gce-workload-certificates/config-status" from MDS with error: error connecting to metadata server, status code: 404"
Cause:
This error occurs because the instance metadata server (MDS) is unable to find the workload certificate configuration. This feature is primarily used for securing workloads in GCE instances.
If your setup does not use GCE Workload Identity or workload certificates, these logs are harmless.
Solution:
Disable Workload Certificate Monitoring:
If you don't require workload certificates, you can disable it by adding the following in your Ops Agent config file (/etc/google-cloud-ops-agent/config.yaml):
yaml
metrics: receivers: [] processors: []
Ensure Metadata Server Connectivity:
Run the following command to check if the metadata server is reachable:
curl "http://metadata.google.internal/computeMetadata/v1/instance/" -H "Metadata-Flavor: Google"
If the request fails, verify firewall rules or VPC restrictions blocking metadata access.