Dear Google Cloud Community and Support Team,
12 hours of TransactionalTaskException when Adding Tasks (App Engine Java) - Self-Resolved - Cause Unknown
We had 12 hours of TransactionalTaskException errors when adding tasks to legacy Task Queues that Google had previously (maybe 2 years ago?) converted automatically to "Cloud Task" queues. Then the exceptions stopped as mysteriously as they had started.
I spoke to Gemini a lot about it while I was trying to figure out the cause, and once it had stopped, I asked Gemini to summarise the situation:
You are seeking diagnostic assistance regarding a critical issue in your App Engine Java application experienced over a ~12-hour period. The problem has spontaneously resolved, but due to its severity and impact across all your deployed versions, you cannot simply disregard it.
Problem Summary:
Our App Engine Java application began throwing com.google.appengine.api.taskqueue.TransactionalTaskException errors when attempting to add tasks to a queue. The Datastore transaction context appeared to be valid at the point of the task enqueue attempt. No tasks were observed to be added to the queue, neither in a pending nor a failed state.
Key Observations & Debugging Steps:
The Incident's Resolution:
The TransactionalTaskException errors simply stopped appearing after approximately 12 hours, and our application's task enqueuing functionality returned to normal without any intervention on our part. By no intervention, I mean no intervention that made any obvious difference. We did deploy various test versions with adjusted code in the hope that the problem could be fixed by code changes. However, when the errors finally stopped, all previous versions started working again, as did the new test deployment. No configuration changes were made.
Request for help:
We have the diagnostic logging for transactional task counting in place, which we plan to keep deployed.
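As a rough illustration of what transactional task counting can look like (a sketch, not necessarily the code that was deployed), the helper below counts enqueue attempts within a single Datastore transaction and logs each one. The class name TransactionalTaskCounter and its methods are hypothetical and are not part of the App Engine SDK. The limit of five transactional tasks per transaction is the figure given in the App Engine documentation and is treated here as an assumption passed in by the caller.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.logging.Logger;

/**
 * Hypothetical diagnostic helper (not part of the App Engine SDK).
 * Counts transactional task enqueue attempts made within one Datastore
 * transaction and logs each attempt.
 */
final class TransactionalTaskCounter {
    private static final Logger LOG =
            Logger.getLogger(TransactionalTaskCounter.class.getName());

    // Assumed per-transaction limit, supplied by the caller (e.g. 5).
    private final int limit;
    private final Map<String, Integer> perQueue = new LinkedHashMap<>();
    private int total;

    TransactionalTaskCounter(int limit) {
        if (limit < 1) {
            throw new IllegalArgumentException("limit must be >= 1");
        }
        this.limit = limit;
    }

    /**
     * Call immediately before each transactional enqueue.
     * Returns true while the running total stays within the limit.
     */
    boolean recordAttempt(String queueName) {
        total++;
        perQueue.merge(queueName, 1, Integer::sum);
        boolean withinLimit = total <= limit;
        String msg = "transactional enqueue attempt " + total + "/" + limit
                + " on queue '" + queueName + "'";
        if (withinLimit) {
            LOG.info(msg);
        } else {
            LOG.warning(msg + " (exceeds assumed limit)");
        }
        return withinLimit;
    }

    int total() {
        return total;
    }

    int countFor(String queueName) {
        return perQueue.getOrDefault(queueName, 0);
    }

    /** Call once the transaction has committed or rolled back. */
    void reset() {
        total = 0;
        perQueue.clear();
    }
}
```

One counter instance would be created per transaction, with recordAttempt called just before each transactional add and reset called after commit or rollback. If the logged totals stay well under the limit while the exception is still thrown, that points away from the application's own task counts as the cause.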
Any insights, debugging strategies, or recommendations for engaging with Google Cloud Support effectively (beyond this forum post) would be greatly appreciated.
Thank you for your time and assistance.
So I guess no one has responded because no one knows, any more than I do, how this could happen?
In summary, a critical system running on Google's Cloud service mysteriously stops working for 12 hours without any explanation, and no one knows why it could happen or what I should do about it.
Hi @Bindon,
When exactly did you encounter the TransactionalTaskException issue? I just want to confirm the timeline to make sure we have everything aligned.
If it was around mid-June, it could be related to some updates that Google Cloud was pushing at the time. They were testing a new default behavior through an A/B experiment, which started the week of June 17. The experiment involved two changes: one on the Java Runtime and one on the AppServer side. However, due to an unrelated memory issue, the Google Cloud team had to roll back the AppServer to an older version on June 19, which caused the necessary backend change to be missing. This led to the TransactionalTaskException error you saw.
The Google Cloud team quickly rolled back the Java runtime experiment and restored everything to the correct state by June 20. They are now closely monitoring the situation and planning to push the changes again once they’re confident everything is stable.
If you’re still facing issues, you may reach out to Google Cloud Support for further assistance. They can also, if needed, exclude your application from any upcoming experiments for a period of time to help avoid further disruptions—though please note this depends on the specific context and is handled on a case-by-case basis.
Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.