We have built two VMs, one using Terraform and the second via the console. Both are running Debian 10. The manually built VM reports Ops Agent version 2.22, as shown in the output below.
FROM CONSOLE BUILT VM
dpkg-query --show --showformat '${Package} ${Version} ${Architecture} ${Status}\n' google-cloud-ops-agent
google-cloud-ops-agent 2.22.0~debian10 amd64 install ok installed
It correctly reports memory usage to the console.
The second VM, built via Terraform, did include the Ops Agent, but it is reported as version 2.23, a newer version. This VM does not report memory, and the console insists that we upgrade the agent to the newer version. We have done that through the console. The version on the VM remains at 2.23 and it still does not report memory.
Is there an incompatibility here?
Thank you
FROM TERRAFORM BUILT VM
dpkg-query --show --showformat '${Package} ${Version} ${Architecture} ${Status}\n' google-cloud-ops-agent
google-cloud-ops-agent 2.23.0~debian10 amd64 install ok installed
Hello MickBisignani,
I recommend checking this guide: Troubleshoot the Ops Agent. Please perform the steps and report back.
Thank you, Hector.
The VM that was built using Terraform reports the following:
dpkg-query --list google-fluentd
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-==============-============-============-=================================
un google-fluentd <none> <none> (no description available)
However, when I run
sudo systemctl status google-cloud-ops-agent"*"
I see output telling me that the agent is installed and running.
● google-cloud-ops-agent-opentelemetry-collector.service - Google Cloud Ops Agent - Metrics Agent
Loaded: loaded (/lib/systemd/system/google-cloud-ops-agent-opentelemetry-collector.service; static; vendor preset: enabled)
Active: active (running) since Tue 2022-11-22 20:24:57 UTC; 22min ago
Process: 419 ExecStartPre=/opt/google-cloud-ops-agent/libexec/google_cloud_ops_agent_engine -service=otel -in /etc/google-cloud-ops-agent/config.yaml -logs ${LOGS_DIR
Main PID: 1005 (otelopscol)
Tasks: 8 (limit: 4669)
Memory: 126.4M
CGroup: /system.slice/google-cloud-ops-agent-opentelemetry-collector.service
└─1005 /opt/google-cloud-ops-agent/subagents/opentelemetry-collector/otelopscol --config=/run/google-cloud-ops-agent-opentelemetry-collector/otel.yaml --feat
Nov 22 20:47:02 sta-test-2 otelopscol[1005]: /root/go/pkg/mod/go.opentelemetry.io/collector@v0.61.0/exporter/exporterhelper/internal/bounded_memory_queue.go:61
Nov 22 20:47:21 sta-test-2 otelopscol[1005]: 2022-11-22T20:47:21.922Z error exporterhelper/queued_retry.go:361 Exporting failed. Try enabling retry
Nov 22 20:47:21 sta-test-2 otelopscol[1005]: go.opentelemetry.io/collector/exporter/exporterhelper.(*retrySender).send
Nov 22 20:47:21 sta-test-2 otelopscol[1005]: /root/go/pkg/mod/go.opentelemetry.io/collector@v0.61.0/exporter/exporterhelper/queued_retry.go:361
Nov 22 20:47:21 sta-test-2 otelopscol[1005]: go.opentelemetry.io/collector/exporter/exporterhelper.(*metricsSenderWithObservability).send
Nov 22 20:47:21 sta-test-2 otelopscol[1005]: /root/go/pkg/mod/go.opentelemetry.io/collector@v0.61.0/exporter/exporterhelper/metrics.go:133
Nov 22 20:47:21 sta-test-2 otelopscol[1005]: go.opentelemetry.io/collector/exporter/exporterhelper.(*queuedRetrySender).start.func1
Nov 22 20:47:21 sta-test-2 otelopscol[1005]: /root/go/pkg/mod/go.opentelemetry.io/collector@v0.61.0/exporter/exporterhelper/queued_retry.go:206
Nov 22 20:47:21 sta-test-2 otelopscol[1005]: go.opentelemetry.io/collector/exporter/exporterhelper/internal.(*boundedMemoryQueue).StartConsumers.func1
Nov 22 20:47:21 sta-test-2 otelopscol[1005]: /root/go/pkg/mod/go.opentelemetry.io/collector@v0.61.0/exporter/exporterhelper/internal/bounded_memory_queue.go:61
● google-cloud-ops-agent.service - Google Cloud Ops Agent
Loaded: loaded (/lib/systemd/system/google-cloud-ops-agent.service; enabled; vendor preset: enabled)
Active: active (exited) since Tue 2022-11-22 20:24:57 UTC; 22min ago
Process: 385 ExecStartPre=/opt/google-cloud-ops-agent/libexec/google_cloud_ops_agent_engine -in /etc/google-cloud-ops-agent/config.yaml (code=exited, status=0/SUCCESS
Process: 1001 ExecStart=/bin/true (code=exited, status=0/SUCCESS)
Main PID: 1001 (code=exited, status=0/SUCCESS)
Nov 22 20:24:57 sta-test-2 google_cloud_ops_agent_engine[385]: processors:
Nov 22 20:24:57 sta-test-2 google_cloud_ops_agent_engine[385]: metrics_filter:
Nov 22 20:24:57 sta-test-2 google_cloud_ops_agent_engine[385]: type: exclude_metrics
Nov 22 20:24:57 sta-test-2 google_cloud_ops_agent_engine[385]: metrics_pattern: []
Nov 22 20:24:57 sta-test-2 google_cloud_ops_agent_engine[385]: service:
Nov 22 20:24:57 sta-test-2 google_cloud_ops_agent_engine[385]: pipelines:
Nov 22 20:24:57 sta-test-2 google_cloud_ops_agent_engine[385]: default_pipeline:
Nov 22 20:24:57 sta-test-2 google_cloud_ops_agent_engine[385]: receivers: [hostmetrics]
Nov 22 20:24:57 sta-test-2 google_cloud_ops_agent_engine[385]: processors: [metrics_filter]
Nov 22 20:24:57 sta-test-2 systemd[1]: Started Google Cloud Ops Agent.
● google-cloud-ops-agent-diagnostics.service - Google Cloud Ops Agent - Diagnostics
Loaded: loaded (/lib/systemd/system/google-cloud-ops-agent-diagnostics.service; disabled; vendor preset: enabled)
Active: active (running) since Tue 2022-11-22 20:24:46 UTC; 22min ago
Main PID: 421 (google_cloud_op)
Tasks: 8 (limit: 4669)
Memory: 74.5M
CGroup: /system.slice/google-cloud-ops-agent-diagnostics.service
└─421 /opt/google-cloud-ops-agent/libexec/google_cloud_ops_agent_diagnostics -config /etc/google-cloud-ops-agent/config.yaml
Nov 22 20:37:58 sta-test-2 google_cloud_ops_agent_diagnostics[421]: 2022/11/22 20:37:58 rpc error: code = PermissionDenied desc = Permission monitoring.metricDescriptor
Nov 22 20:38:58 sta-test-2 google_cloud_ops_agent_diagnostics[421]: 2022/11/22 20:38:58 rpc error: code = PermissionDenied desc = Permission monitoring.metricDescriptor
Nov 22 20:39:58 sta-test-2 google_cloud_ops_agent_diagnostics[421]: 2022/11/22 20:39:58 rpc error: code = PermissionDenied desc = Permission monitoring.metricDescriptor
Nov 22 20:40:58 sta-test-2 google_cloud_ops_agent_diagnostics[421]: 2022/11/22 20:40:58 rpc error: code = PermissionDenied desc = Permission monitoring.metricDescriptor
Nov 22 20:41:58 sta-test-2 google_cloud_ops_agent_diagnostics[421]: 2022/11/22 20:41:58 rpc error: code = PermissionDenied desc = Permission monitoring.metricDescriptor
Nov 22 20:42:58 sta-test-2 google_cloud_ops_agent_diagnostics[421]: 2022/11/22 20:42:58 rpc error: code = PermissionDenied desc = Permission monitoring.metricDescriptor
Nov 22 20:43:58 sta-test-2 google_cloud_ops_agent_diagnostics[421]: 2022/11/22 20:43:58 rpc error: code = PermissionDenied desc = Permission monitoring.metricDescriptor
Nov 22 20:44:58 sta-test-2 google_cloud_ops_agent_diagnostics[421]: 2022/11/22 20:44:58 rpc error: code = PermissionDenied desc = Permission monitoring.metricDescriptor
Nov 22 20:45:58 sta-test-2 google_cloud_ops_agent_diagnostics[421]: 2022/11/22 20:45:58 rpc error: code = PermissionDenied desc = Permission monitoring.metricDescriptor
Nov 22 20:46:58 sta-test-2 google_cloud_ops_agent_diagnostics[421]: 2022/11/22 20:46:58 rpc error: code = PermissionDenied desc = Permission monitoring.metricDescriptor
● google-cloud-ops-agent-fluent-bit.service - Google Cloud Ops Agent - Logging Agent
Loaded: loaded (/lib/systemd/system/google-cloud-ops-agent-fluent-bit.service; static; vendor preset: enabled)
Active: active (running) since Tue 2022-11-22 20:24:57 UTC; 22min ago
Process: 420 ExecStartPre=/opt/google-cloud-ops-agent/libexec/google_cloud_ops_agent_engine -service=fluentbit -in /etc/google-cloud-ops-agent/config.yaml -logs ${LOG
Main PID: 997 (fluent-bit)
Tasks: 22 (limit: 4669)
Memory: 29.0M
CGroup: /system.slice/google-cloud-ops-agent-fluent-bit.service
└─997 /opt/google-cloud-ops-agent/subagents/fluent-bit/bin/fluent-bit --config /run/google-cloud-ops-agent-fluent-bit/fluent_bit_main.conf --parser /run/goog
Nov 22 20:24:57 sta-test-2 google_cloud_ops_agent_engine[420]: service:
Nov 22 20:24:57 sta-test-2 google_cloud_ops_agent_engine[420]: pipelines:
Nov 22 20:24:57 sta-test-2 google_cloud_ops_agent_engine[420]: default_pipeline:
Nov 22 20:24:57 sta-test-2 google_cloud_ops_agent_engine[420]: receivers: [hostmetrics]
Nov 22 20:24:57 sta-test-2 google_cloud_ops_agent_engine[420]: processors: [metrics_filter]
Nov 22 20:24:57 sta-test-2 systemd[1]: Started Google Cloud Ops Agent - Logging Agent.
Nov 22 20:24:57 sta-test-2 fluent-bit[997]: Fluent Bit v1.9.8
Nov 22 20:24:57 sta-test-2 fluent-bit[997]: * Copyright (C) 2015-2022 The Fluent Bit Authors
Nov 22 20:24:57 sta-test-2 fluent-bit[997]: * Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
Nov 22 20:24:57 sta-test-2 fluent-bit[997]: * https://fluentbit.io
Do you see anything that looks wrong?
Thanks
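One way to narrow down the status output above is to count which service is logging `PermissionDenied`. The following is a minimal sketch, not an official tool: the function name `count_permission_denied` is hypothetical, and it assumes the default syslog-style journal format shown above (month, day, time, host, `process[pid]:`).

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads syslog-style journal lines on stdin and prints
# "<count> <process>" for every process that logged "PermissionDenied".
count_permission_denied() {
  grep 'PermissionDenied' \
    | awk '{ sub(/\[[0-9]+\]:$/, "", $5); counts[$5]++ }
           END { for (p in counts) print counts[p], p }' \
    | sort -rn
}

# Possible usage on the VM (assumes journald holds the agent logs):
# sudo journalctl -u 'google-cloud-ops-agent*' --no-pager | count_permission_denied
```

In the output pasted above, only the diagnostics service would be counted, since the other services show different messages. The `PermissionDenied` description is truncated in the paste, so the full message in the journal is needed to see which permission is actually being refused.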
Given the information available about your project and the nature of the issue, my recommendation is to create an issue in the Public Issue Tracker (PIT), or to engage GCP Support if you have, or are interested in purchasing, a support package. Please be aware that, of these two options, the second is the faster one.
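Before escalating, one check that can be scripted on the VM is whether its service account has an access scope that allows writing metrics. This is only a sketch under assumptions: the truncated `PermissionDenied` message in the thread does not confirm that scopes are the cause (a missing IAM role on the service account is another possibility), and the function name `has_monitoring_write_scope` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the scope list on stdin (one URL per line)
# contains a scope that permits writing to Cloud Monitoring.
has_monitoring_write_scope() {
  grep -Eq '^https://www\.googleapis\.com/auth/(cloud-platform|monitoring|monitoring\.write)$'
}

# Possible usage on the VM, reading the default service account's scopes
# from the metadata server:
# curl -s -H 'Metadata-Flavor: Google' \
#   'http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/scopes' \
#   | has_monitoring_write_scope && echo "monitoring write scope present" \
#   || echo "monitoring write scope missing"
```

Running the same check on both the console-built VM and the Terraform-built VM would show whether the two were created with different scopes.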