Get hands-on experience with 20+ free Google Cloud products and $300 in free credit for new customers.

CrashLoopBackOff Issue with runtime Pod While Installing Apigee Organization

Dear Team,

I am encountering an issue while installing the Apigee environment operator. One of the pods,  apigee-runtime-hale-mode-43890-evaluation-1042d85-1132-5btbmvj6, is stuck in a CrashLoopBackOff state. Below are the logs 

{"level":"SEVERE","thread":"NIOThread@4","mdc":{},"className":"com.apigee.probe.ProbeServiceImpl","method":"lambda$runProbesForStatus$1","severity":"SEVERE","message":"probe \"READINESS:VersionLoadProbe\" execution failed due to Version not loaded by Message Processor","formattedDate":"2024-11-27T05:13:30.708Z","logger":"ProbeServiceImpl"}
{"level":"SEVERE","thread":"NIOThread@4","mdc":{},"className":"com.apigee.probe.ProbeAPI","method":"getResponse","severity":"SEVERE","message":"probe failed with details ProbeStatusResponse{isProbeSuccessful=false, failureMessages=[Probe VersionLoadProbe failed due to Version not loaded by Message Processor]}","formattedDate":"2024-11-27T05:13:30.709Z","logger":"ProbeAPI"}

 

I would appreciate guidance on resolving this issue. Thank you in advance for your support! 

0 4 268
4 REPLIES 4

Hello @hemanth_ch ,

When synchronizer pods have connection issues to Cassandra, they will fail their health probe leading newer Message Processor pods not able to start due to lack of contracts, you can review the logs on Cloud Logging

resource.type="k8s_container"
resource.labels.container_name="apigee-synchronizer"
(jsonPayload.className="com.apigee.probe.ProbeAPI" OR jsonPayload.className="com.apigee.probe.ProbeServiceImpl")

You can check the status of the Cassandra nodes using nodetool status, is very important verify that Cassandra cluster was able to accept new connections.

I recommend create a support ticket to have more assistance.

 

 

Hi @AldoMeneses86 

Thank you for your kind assistance. Below are the logs

From Cassandra Container:

sh-5.1$ /opt/apigee/apigee-cassandra/bin/nodetool -u admin_user -pw iloveapis123 status
Datacenter: dc-1
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load      Tokens  Owns (effective)  Host ID                               Rack
UN  10.130.1.1  4.14 MiB  256     100.0%            6005722a-97fb-4ab6-93bc-26e1989196b0  ra-1

From Synchronizer Pod

sh-5.0$ telnet 10.130.1.1 9042
Connection closed by foreign host

and json response from query

{
insertId: "usroawf26thoj"
jsonPayload: {
_p: "F"
className: "com.apigee.probe.ProbeServiceImpl"
formattedDate: "2024-11-27T03:23:51.764Z"
level: "SEVERE"
logger: "ProbeServiceImpl"
mdc: {0}
message: "probe "READINESS:VersionLoadProbe" execution failed due to Version not loaded by Message Processor"
method: "lambda$runProbesForStatus$1"
thread: "NIOThread@4"
time: "2024-11-27T03:23:51.764356625+00:00"
}
logName: "projects/hale-mode-438906-h7/logs/stderr"
receiveTimestamp: "2024-11-27T03:23:51.958712379Z"
resource: {2}
severity: "ERROR"
timestamp: "2024-11-27T03:23:51.764356625Z"
}

Kindly provide next steps to troubleshoot the issue

Thanks

Also, Please find below logs from synchronizer pod

{"level":"SEVERE","thread":"Apigee-Timer-9","mdc":{},"className":"com.apigee.hybrid.runtime.signals.trace.sync.pubsub.PubSubContext","method":"createSubscription","severity":"SEVERE","message":"Upstream subscription creation request failed due to error with status code : 404 reason : Not Found response : <!DOCTYPE html>\n<html lang=en>\n <meta charset=utf-8>\n <meta name=viewport content=\"initial-scale=1, minimum-scale=1, width=device-width\">\n <title>Error 404 (Not Found)!!1</title>\n <style>\n *{margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px}* > body{background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px}p{margin:11px 0 22px;overflow:hidden}ins{color:#777;text-decoration:none}a img{border:0}@media screen and (max-width:772px){body{background:none;margin-top:0;max-width:none;padding-right:0}}#logo{background:url(//www.google.com/images/branding/googlelogo/1x/googlelogo_color_150x54dp.png) no-repeat;margin-left:-5px}@media only screen and (min-resolution:192dpi){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat 0% 0%/100% 100%;-moz-border-image:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) 0}}@media only screen and (-webkit-min-device-pixel-ratio:2){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat;-webkit-background-size:100% 100%}}#logo{display:inline-block;height:54px;width:150px}\n </style>\n <a href=//www.google.com/><span id=logo aria-label=Google></span></a>\n <p><b>404.</b> <ins>That’s an error.</ins>\n <p>The requested URL <code>/v1/organizations/hale-mode-438906-h7/environments/evaluation:subscribe</code> was not found on this server. <ins>That’s all we know.</ins>\n","formattedDate":"2024-11-29T08:54:22.697Z","logger":"CONFIG-CHANGE"}
{"level":"SEVERE","thread":"Apigee-Timer-9","mdc":{},"className":"com.apigee.hybrid.runtime.signals.context.SignalsContextImpl","method":"initializePubSubContext","severity":"SEVERE","message":"ControlPlaneCommunicationFailure exception occurred. Backing off signal polling","formattedDate":"2024-11-29T08:54:22.698Z","logger":"CONFIG-CHANGE","exceptionStackTrace":"com.apigee.hybrid.runtime.contract.sync.ControlPlaneCommunicationFailure{ code = runtime.contract.sync.SubscriptionRequestError, message = Signals subscription request with control plane failed, error msg: Not Found, associated contexts = []}\n"}

@hemanth_ch Which version is of Apigee Hybrid?.

Looks like that the synchronizer to control plane connection issues, primarily due to wrong configurations. most of the cases configuring wrong service account for the wrong control plane url.