Hi,
I am new to this community, but I've been working with Apigee Hybrid for some time. I have a question about Apigee Hybrid 1.6 and its Diagnostic Collector feature. I wanted to capture diagnostics for the runtime pods, but this feature doesn't seem to be working.
Here's what I've done, following https://cloud.google.com/apigee/docs/hybrid/v1.6/diagnostic-collector:
1. Created a Google Cloud Storage bucket "rad-apigee" in the EU region.
2. Created a service account with the Storage Admin role in our project and downloaded its JSON key file.
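For completeness, steps 1 and 2 can be sketched as follows. The project ID and names are illustrative guesses taken from the overrides below and may not match yours; the commands are wrapped in a function so nothing runs until you call it:

```shell
# Illustrative names only; adjust to your own project.
PROJECT_ID="apigee-poc-XYZ"
BUCKET="rad-apigee"
SA_NAME="apigee-diagnostic"
SA_EMAIL="${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"

create_diagnostic_prereqs() {
  # 1. Storage bucket in the EU multi-region.
  gsutil mb -p "${PROJECT_ID}" -l EU "gs://${BUCKET}"

  # 2. Service account with Storage Admin, plus a JSON key file
  #    at the path referenced by diagnostic.serviceAccountPath.
  gcloud iam service-accounts create "${SA_NAME}" --project "${PROJECT_ID}"
  gcloud projects add-iam-policy-binding "${PROJECT_ID}" \
    --member "serviceAccount:${SA_EMAIL}" \
    --role roles/storage.admin
  gcloud iam service-accounts keys create \
    "./service-accounts/${PROJECT_ID}-${SA_NAME}.json" \
    --iam-account "${SA_EMAIL}"
}
```

Call `create_diagnostic_prereqs` once gcloud/gsutil are installed and authenticated.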
3. Configured overrides.yaml for the Diagnostic Collector as follows:
diagnostic:
  # required properties:
  serviceAccountPath: "./service-accounts/apigee-poc-XYZ-apigee-diagnostic.json"
  operation: "LOGGING"
  bucket: "rad-apigee"
  container: "apigee-runtime"
  namespace: "apigee"
  podNames:
  - apigee-runtime-apigee-poc-XYZ-mvp-dev-0c034c1-160-uhaof-94kg9
  - apigee-runtime-apigee-poc-XYZ-mvp-dev-0c034c1-160-uhaof-gq2kz
  # optional properties:
  tcpDumpDetails:
    maxMsgs: 10
    timeoutInSeconds: 100
  threadDumpDetails:
    iterations: 5
    delayInSeconds: 2
  loggingDetails:
    loggerNames:
    - ALL
    logLevel: FINE
    logDuration: 60000
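One thing worth noting: the names in podNames go stale whenever the runtime pods are recreated. A small helper for pulling the current names before each run (a sketch assuming the apigee namespace as in the overrides above):

```shell
# List the current runtime pod names to paste into diagnostic.podNames.
# Hypothetical helper; call list_runtime_pods against your cluster.
list_runtime_pods() {
  kubectl get pods -n apigee -o name \
    | grep '^pod/apigee-runtime-' \
    | sed 's|^pod/||'
}
```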
4. Ran the Diagnostic Collector:
$APIGEECTL_HOME/apigeectl diagnostic -f overrides/overrides.yaml
Parsing file: config/values.yaml
Parsing file: overrides/overrides.yaml
Invoking "kubectl apply" with diagnostic YAML config...
namespace/apigee-diagnostic created
peerauthentication.security.istio.io/apigee-diagnostic created
clusterrole.rbac.authorization.k8s.io/apigee-diagnostic created
serviceaccount/apigee-diagnostic created
clusterrolebinding.rbac.authorization.k8s.io/apigee-diagnostic created
secret/apigee-diagnostic-config created
secret/apigee-diagnostic-svc-account created
job.batch/apigee-diagnostic created
5. Get the pods in the apigee-diagnostic namespace.
kubectl get pods -n apigee-diagnostic
NAME                      READY   STATUS      RESTARTS   AGE
apigee-diagnostic-x7585   0/1     Completed   0          77s
6. Make note of the pod with the name containing diagnostic-collector.
There is no such pod! (The only pod is apigee-diagnostic-x7585.)
7. There are no logs in the Google Cloud Storage bucket; the console shows only "No rows to display".
8. Delete the Diagnostic collector.
$APIGEECTL_HOME/apigeectl diagnostic delete -f overrides/overrides.yaml
Parsing file: config/values.yaml
Parsing file: overrides/overrides.yaml
Invoking "kubectl delete" with diagnostic YAML config...
namespace "apigee-diagnostic" deleted
peerauthentication.security.istio.io "apigee-diagnostic" deleted
clusterrole.rbac.authorization.k8s.io "apigee-diagnostic" deleted
serviceaccount "apigee-diagnostic" deleted
clusterrolebinding.rbac.authorization.k8s.io "apigee-diagnostic" deleted
secret "apigee-diagnostic-config" deleted
secret "apigee-diagnostic-svc-account" deleted
job.batch "apigee-diagnostic" deleted
The Google Cloud Storage bucket is still empty!
I am including output from
kubectl logs -n apigee-diagnostic apigee-diagnostic-x7585
There are some warnings present.
https://drive.google.com/file/d/164n0nnAPbUtVks4RhzLLtFEzJ8FeArGv/view?usp=sharing
I am also including output from
kubectl describe pod -n apigee-diagnostic apigee-diagnostic-x7585
https://drive.google.com/file/d/1xvk7Xx0XcnBV43Z5w6ozpNNBOxL9k8-W/view?usp=sharing
Could you please advise on the possible root cause of why this is not working for me? Thank you very much in advance for any kind of help!
Let me see if I can find someone to help....
Hi rad,
I see that the operation is defined like below.
operation: "LOGGING"
When the operation is LOGGING, nothing is uploaded to GCS. Only the log levels are changed, and the logs go directly to the Logs Viewer in the GCP control plane (or you can check them locally if the logger is disabled).
If you want to see what data is getting uploaded, please change the operation to ALL. This will collect all the required data and upload it to GCS.
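In overrides.yaml that would be just:

```yaml
diagnostic:
  operation: "ALL"   # instead of "LOGGING"
```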
Hi pbhagwat,
Thank you for your advice!
I've tried it exactly as you recommended, but there is still no data in the Google Cloud Storage bucket. After activating the Diagnostic Collector, I checked all the pods and noticed the following:
NAMESPACE           NAME                                           READY   STATUS         RESTARTS   AGE
apigee-diagnostic   apigee-diagnostic-6sgrk                        1/1     Running        0          11s
apigee              diagnose-aks-apigeertime-21752489-vmss000001   0/1     ErrImagePull   0          5s
apigee              diagnose-aks-apigeertime-21752489-vmss000002   0/1     ErrImagePull   0          5s
The status of the diagnose-aks-apigeertime pods is "ErrImagePull". I then ran kubectl describe pod -n apigee diagnose-aks-apigeertime-21752489-vmss000001; here is the Events section:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal BackOff 26s kubelet Back-off pulling image "us.gcr.io/google.com/edge-ci/base/edge-hybrid:diagnostics-datacollector_871f8e4"
Warning Failed 26s kubelet Error: ImagePullBackOff
Normal Pulling 14s (x2 over 29s) kubelet Pulling image "us.gcr.io/google.com/edge-ci/base/edge-hybrid:diagnostics-datacollector_871f8e4"
Warning Failed 12s (x2 over 26s) kubelet Failed to pull image "us.gcr.io/google.com/edge-ci/base/edge-hybrid:diagnostics-datacollector_871f8e4": rpc error: code = Unknown desc = failed to pull and unpack image "us.gcr.io/google.com/edge-ci/base/edge-hybrid:diagnostics-datacollector_871f8e4": failed to resolve reference "us.gcr.io/google.com/edge-ci/base/edge-hybrid:diagnostics-datacollector_871f8e4": unexpected status code [manifests diagnostics-datacollector_871f8e4]: 401 Unauthorized
Warning Failed 12s (x2 over 26s) kubelet Error: ErrImagePull
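My guess is that the 401 means the nodes have no credentials for us.gcr.io. A sketch of what could be checked (pod name taken from the events above; wrapped in a function so it doesn't run by itself):

```shell
check_pull_auth() {
  # Which imagePullSecrets, if any, does the failing pod reference?
  kubectl get pod -n apigee diagnose-aks-apigeertime-21752489-vmss000001 \
    -o jsonpath='{.spec.imagePullSecrets[*].name}{"\n"}'

  # Registry-credential secrets present in the namespace.
  kubectl get secrets -n apigee \
    --field-selector type=kubernetes.io/dockerconfigjson
}
```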
Any idea about this? Thank you!
Hi Folks,
Any update on this?
Thank you!