A Cloud Run job launched by a scheduler failed because it reached its memory limit. A log entry of severity ERROR can be found in the Logs Explorer:
{
"protoPayload": {
"@type": "type.googleapis.com/google.cloud.audit.AuditLog",
"status": {
"code": 8,
"message": "Execution jobname-fxwpp has failed to complete, 0/1 tasks were a success."
},
"serviceName": "run.googleapis.com",
"methodName": "v1",
"resourceName": "namespaces/project-id/executions/jobname-fxwpp",
"response": {
"metadata": {
"name": "jobname-fxwpp",
"namespace": "namespacenumber",
"selfLink": "/apis/run.googleapis.com/v1/namespaces/namespacenumber/executions/jobname-fxwpp",
"uid": "9ede51e6-ae18-4718-95ac-4bc799410525",
"resourceVersion": "AAYAyQCpMtU",
"generation": 1,
"creationTimestamp": "2023-07-18T20:17:00.314898Z",
"labels": {
"run.googleapis.com/jobGeneration": "5",
"client.knative.dev/nonce": "maw_mav_puz",
"run.googleapis.com/jobResourceVersion": "1689688190219110",
"run.googleapis.com/jobUid": "05d448c5-a441-46c1-bc68-a749916faeb4",
"run.googleapis.com/job": "jobname",
"cloud.googleapis.com/location": "europe-west1"
},
"annotations": {
"run.googleapis.com/client-name": "gcloud",
"run.googleapis.com/lastModifier": "development@project-id.iam.gserviceaccount.com",
"run.googleapis.com/client-version": "437.0.1",
"run.googleapis.com/creator": "development@project-id.iam.gserviceaccount.com",
"run.googleapis.com/cloudsql-instances": "project-id:europe-west1:sqlinstancename",
"run.googleapis.com/execution-environment": "gen2",
"run.googleapis.com/launch-stage": "BETA",
"run.googleapis.com/operation-id": "71a996ca-f9ae-4611-9831-c1316895cfdf"
},
"ownerReferences": [
{
"kind": "Job",
"name": "jobname",
"uid": "05d448c5-a441-46c1-bc68-a749916faeb4",
"apiVersion": "serving.knative.dev/v1",
"controller": true,
"blockOwnerDeletion": true
}
]
},
"apiVersion": "run.googleapis.com/v1",
"kind": "Execution",
"spec": {
"parallelism": 1,
"taskCount": 1,
"template": {
"spec": {
"containers": [
{
"image": "gcr.io/project-id/imagename@sha256:70993975c636e20ab666bc54183e44461f2b04769f8dfd3b46bb8f0449a96b25",
"command": [
"python"
],
"args": [
"-m",
"scripts.refresh_all"
],
"env": [
{
"name": "CONFIG_NAME",
"value": "prod"
}
],
"resources": {
"limits": {
"cpu": "1000m",
"memory": "4Gi"
}
}
}
],
"maxRetries": 0,
"timeoutSeconds": "43200",
"serviceAccountName": "development@project-id.iam.gserviceaccount.com"
}
}
},
"status": {
"observedGeneration": 1,
"conditions": [
{
"type": "Completed",
"status": "False",
"message": "Task jobname-fxwpp-task0 failed with message: The configured memory limit was reached.",
"lastTransitionTime": "2023-07-18T20:45:59.779029Z"
},
{
"type": "ResourcesAvailable",
"status": "True",
"lastTransitionTime": "2023-07-18T20:17:09.179373Z"
},
{
"type": "Started",
"status": "True",
"lastTransitionTime": "2023-07-18T20:17:09.309868Z"
},
{
"type": "Retry",
"status": "True",
"reason": "ImmediateRetry",
"message": "System will retry after 0:00:00 from lastTransitionTime for attempt 0.",
"lastTransitionTime": "2023-07-18T20:45:59.779029Z",
"severity": "Info"
}
],
"startTime": "2023-07-18T20:17:03.912315Z",
"completionTime": "2023-07-18T20:45:58.663707Z",
"failedCount": 1,
"logUri": "https://console.cloud.google.com/logs/viewer?project=project-id&advancedFilter=resource.type%3D%22cloud_run_job%22%0Aresource.labels.job_name%3D%22jobname%22%0Aresource.labels.location%3D%22europe-west1%22%0Alabels.%22run.googleapis.com/execution_name%22%3D%22jobname-fxwpp%22"
},
"@type": "type.googleapis.com/google.cloud.run.v1.Execution"
}
},
"insertId": "tff7dyd1em4",
"resource": {
"type": "cloud_run_job",
"labels": {
"job_name": "jobname",
"project_id": "project-id",
"location": "europe-west1"
}
},
"timestamp": "2023-07-18T20:45:59.834112Z",
"severity": "ERROR",
"labels": {
"run.googleapis.com/execution_name": "jobname-fxwpp"
},
"logName": "projects/project-id/logs/cloudaudit.googleapis.com%2Fsystem_event",
"receiveTimestamp": "2023-07-18T20:46:00.146733913Z"
}
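For reference, the actual failure reason is nested under protoPayload.response.status.conditions rather than in the top-level status message. Below is a minimal Python sketch of pulling it out of an exported entry. The function name failed_condition_messages is hypothetical, and the sample is a trimmed copy of the entry above with only two of its conditions kept.

```python
def failed_condition_messages(entry: dict) -> list:
    """Return the messages of all execution conditions whose status is "False"."""
    response = entry.get("protoPayload", {}).get("response", {})
    conditions = response.get("status", {}).get("conditions", [])
    return [c.get("message", "") for c in conditions if c.get("status") == "False"]


# Trimmed copy of the audit log entry above (most fields omitted for brevity).
sample_entry = {
    "protoPayload": {
        "response": {
            "status": {
                "conditions": [
                    {
                        "type": "Completed",
                        "status": "False",
                        "message": "Task jobname-fxwpp-task0 failed with message: "
                                   "The configured memory limit was reached.",
                    },
                    {"type": "Started", "status": "True"},
                ]
            }
        }
    },
    "severity": "ERROR",
}

for message in failed_condition_messages(sample_entry):
    print(message)
```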
However, I cannot find this error in Error Reporting, even when searching for Resolved or Muted errors.
Do you know of any way I could have misconfigured Error Reporting, or any other reason that would explain why this error is not showing in Error Reporting? Thanks!
This log entry is unusual in that it comes from Cloud Audit Logs, whereas most of our errors come from Cloud Run logs.
Following the page https://cloud.google.com/error-reporting/docs/troubleshooting, I checked the configuration of our sinks and log buckets.
gcloud logging sinks list
NAME DESTINATION FILTER
_Required logging.googleapis.com/projects/project-id/locations/global/buckets/_Required LOG_ID("cloudaudit.googleapis.com/activity") OR LOG_ID("externalaudit.googleapis.com/activity") OR LOG_ID("cloudaudit.googleapis.com/system_event") OR LOG_ID("externalaudit.googleapis.com/system_event") OR LOG_ID("cloudaudit.googleapis.com/access_transparency") OR LOG_ID("externalaudit.googleapis.com/access_transparency")
_Default logging.googleapis.com/projects/project-id/locations/global/buckets/_Default NOT LOG_ID("cloudaudit.googleapis.com/activity") AND NOT LOG_ID("externalaudit.googleapis.com/activity") AND NOT LOG_ID("cloudaudit.googleapis.com/system_event") AND NOT LOG_ID("externalaudit.googleapis.com/system_event") AND NOT LOG_ID("cloudaudit.googleapis.com/access_transparency") AND NOT LOG_ID("externalaudit.googleapis.com/access_transparency")
gcloud logging buckets describe _Required --location=global
description: Audit bucket
lifecycleState: ACTIVE
locked: true
name: projects/project-id/locations/global/buckets/_Required
retentionDays: 400
But nothing caught my attention, and I don't understand why Cloud Audit Logs entries would not be visible in Error Reporting.
In fact, even runtime errors are not visible in Error Reporting, not just errors from audit logs:
{
"textPayload": "Runtime exception\nTraceback (most recent call last):\n <... redacted...> TypeError: reduce() of empty iterable with no initial value",
"insertId": "64b848e5000a1838ab40cd93",
"resource": {
"type": "cloud_run_job",
"labels": {
"project_id": "project-id",
"location": "europe-west1",
"job_name": "job-name"
}
},
"timestamp": "2023-07-19T20:34:45.656094551Z",
"severity": "ERROR",
"labels": {
"run.googleapis.com/task_index": "0",
"instanceId": "001bdd422e8d7fc377985619f19c61d6138e40a810da519cb7c34e109fb13b1c4612031cad4eeba82fb75feca381f9eb4c568cf64766bbc9e4d4a7865dae7560e0",
"run.googleapis.com/execution_name": "job-name-gchl2",
"run.googleapis.com/task_attempt": "0"
},
"logName": "projects/project-id/logs/run.googleapis.com%2Fstderr",
"sourceLocation": {
"file": "/dscore/src/dscore/cloud_function.py",
"line": "22",
"function": "wrapper"
},
"receiveTimestamp": "2023-07-19T20:34:45.845658263Z"
}
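One thing that may be worth trying for entries like this one: as far as I understand the Error Reporting documentation on formatting errors in logs, an entry can be written as structured JSON with the stack trace in the message field and a ReportedErrorEvent @type marker, and Cloud Run parses single-line JSON written to stderr into jsonPayload. The sketch below is only an illustration of that idea. The names format_error_entry and run are hypothetical, and whether this makes the errors of this particular job appear in Error Reporting is an assumption I have not verified.

```python
import json
import sys
import traceback

# Marker described in the Error Reporting docs for entries that should be
# treated as error events.
REPORTED_ERROR_TYPE = (
    "type.googleapis.com/google.devtools.clouderrorreporting.v1beta1.ReportedErrorEvent"
)


def format_error_entry(exc: BaseException) -> str:
    """Serialize an exception as a single-line JSON log entry (hypothetical helper)."""
    stack = "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))
    return json.dumps(
        {
            "severity": "ERROR",
            "message": stack,  # full traceback, newlines escaped by json.dumps
            "@type": REPORTED_ERROR_TYPE,
        }
    )


def run(task):
    """Run a callable, log any exception as structured JSON, then re-raise it."""
    try:
        task()
    except Exception as exc:
        print(format_error_entry(exc), file=sys.stderr, flush=True)
        raise  # keep the non-zero exit so the execution is still marked as failed
```

Wrapping the job entry point as run(main) would keep the failure behaviour unchanged while emitting one JSON line per uncaught exception.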
Something similar is happening to me: I have a dbt job that fails (with exit(1)) and is marked as 'failed' in the Cloud Run Jobs console, but the error is not reported in Error Reporting.