
Cloud Function error: Cannot read properties of undefined (reading 'jobCompletedEvent')

Hi there,

I got an error when the Cloud Function executed:

run_scheduled_query_ga4_daily_master TypeError: Cannot read properties of undefined (reading 'jobCompletedEvent')
    at exports.runScheduledQuery

 

 

Here's the cloud function I'm using (from Simo's blog https://www.teamsimmer.com/2022/12/07/how-do-i-trigger-a-scheduled-query-when-the-ga4-daily-export-h...):

const bigqueryDataTransfer = require('@google-cloud/bigquery-data-transfer');

exports.runScheduledQuery = async (event, context) => {
  // Update configuration options
  const projectId = 'my_project_id';
  const configId = 'id from scheduled query';
  const region = 'eu';

  // Load the log data from the buffer
  const eventData = JSON.parse(Buffer.from(event.data, 'base64').toString());
  const destinationTableId = eventData.protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.load.destinationTable.tableId;

  // Grab the table date and turn it into the run time for the scheduled query
  const tableTime = destinationTableId.replace('events_', '');
  const year = tableTime.substring(0, 4),
        month = tableTime.substring(4, 6),
        day = tableTime.substring(6, 8);
  // Set the run time for the day after the table date so that the scheduled query works with "yesterday's" data
  const runTime = new Date(Date.UTC(year, month - 1, parseInt(day) + 1, 12));
  // Create a proto-buffer Timestamp object from this
  const requestedRunTime = bigqueryDataTransfer.protos.google.protobuf.Timestamp.fromObject({
    seconds: runTime / 1000,
    nanos: (runTime % 1000) * 1e6
  });

  const client = new bigqueryDataTransfer.v1.DataTransferServiceClient();
  const parent = client.projectLocationTransferConfigPath(projectId, region, configId);

  const request = {
    parent,
    requestedRunTime
  };

  const response = await client.startManualTransferRuns(request);
  return response;
};

 

 

I'm using this log query in my log sink:

resource.type="bigquery_dataset"
resource.labels.dataset_id="analytics_123456789"
resource.labels.project_id="my_project_name"
protoPayload.metadata.tableCreation.reason="JOB"
protoPayload.serviceName="bigquery.googleapis.com"
protoPayload.methodName="google.cloud.bigquery.v2.JobService.InsertJob"
protoPayload.authenticationInfo.principalEmail="firebase-measurement@system.gserviceaccount.com" 
protoPayload.resourceName:"tables/events_"
NOT
protoPayload.resourceName:"tables/events_intraday"

 

 

Simo's blog is using this filter:

protoPayload.methodName="jobservice.jobcompleted" 
protoPayload.authenticationInfo.principalEmail="firebase-measurement@system.gserviceaccount.com" 
protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.load.destinationTable.datasetId="my_project_id.dataset_id" 
protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.load.destinationTable.tableId=~"^events_\d+"

 

 

I'm not an expert in this area, so I asked Duet AI for help, and it suggested adding:

const protoPayload = {
  methodName: "google.cloud.bigquery.v2.JobService.InsertJob",
  jobCompletedEvent: {
    job: {
      jobConfiguration: {
        load: {
          destinationTable: {
            datasetId: "my_project_id.dataset_id",
            tableId: "~^events_\d+"
          }
        }
      }
    }
  }
};

 

 

but I'm not sure where to use it in the cloud function, or how.

1 ACCEPTED SOLUTION

The error you're encountering, TypeError: Cannot read properties of undefined (reading 'jobCompletedEvent'), arises because the object eventData.protoPayload.serviceData.jobCompletedEvent is undefined at the time the Cloud Function is trying to access its properties. This could be due to differences in the log entry structure between your setup and Simo's setup.
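
To make the cause concrete, here is a minimal reproduction (illustrative only, with a made-up log entry): when the decoded entry has no protoPayload.serviceData, reading jobCompletedEvent on it throws exactly the error above.

// Illustrative only: a made-up Pub/Sub event whose decoded log entry has no
// serviceData (as with the newer InsertJob-style audit logs). Reading
// .jobCompletedEvent on undefined reproduces the TypeError from this thread.
const fakeLogEntry = { protoPayload: { methodName: 'google.cloud.bigquery.v2.JobService.InsertJob' } };
const fakeEvent = { data: Buffer.from(JSON.stringify(fakeLogEntry)).toString('base64') };
const eventData = JSON.parse(Buffer.from(fakeEvent.data, 'base64').toString());
// Throws: TypeError: Cannot read properties of undefined (reading 'jobCompletedEvent')
const tableId = eventData.protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.load.destinationTable.tableId;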

You could follow Duet AI's suggestion by creating a protoPayload object. This would involve declaring the protoPayload object at the start of your Cloud Function and then modifying the line:

 

const destinationTableId = eventData.protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.load.destinationTable.tableId;

to:

const destinationTableId = protoPayload.jobCompletedEvent.job.jobConfiguration.load.destinationTable.tableId;

In addition to the above, you may also want to consider modifying your log query to ensure that it is matching the logs that are being generated by your BigQuery job. The log query that you are currently using is:

 

resource.type="bigquery_dataset"
resource.labels.dataset_id="analytics_123456789"
resource.labels.project_id="my_project_name"
protoPayload.metadata.tableCreation.reason="JOB"
protoPayload.serviceName="bigquery.googleapis.com"
protoPayload.methodName="google.cloud.bigquery.v2.JobService.InsertJob"
protoPayload.authenticationInfo.principalEmail="firebase-measurement@system.gserviceaccount.com" 
protoPayload.resourceName:"tables/events_"
NOT
protoPayload.resourceName:"tables/events_intraday"

This log query will match any log entry that has all of the following properties:

  • The resource type is bigquery_dataset.
  • The dataset ID is analytics_123456789.
  • The project ID is my_project_name.
  • The table-creation reason is JOB.
  • The service name is bigquery.googleapis.com.
  • The method name is google.cloud.bigquery.v2.JobService.InsertJob.
  • The authentication information principal email is firebase-measurement@system.gserviceaccount.com.
  • The resource name contains tables/events_ but not tables/events_intraday.

If your sink's filter does not match the logs generated when BigQuery creates the daily table, the function will never be triggered at all. If it matches the newer InsertJob-style entries (as yours does), the table information lives under protoPayload.metadata.tableCreation rather than protoPayload.serviceData.jobCompletedEvent, which is why that lookup comes back undefined.

To troubleshoot this, inspect the matching log entries in Cloud Logging (Logs Explorer) for the job that creates the daily events_ table. Once you can see an actual entry, its structure tells you which fields to reference in your log sink filter and in the Cloud Function.
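
For illustration only (the field paths are taken from the filters and code quoted in this thread, not from a guaranteed schema), a small sketch of a function that logs the decoded entry and tries both payload shapes before giving up:

// Illustrative sketch: decode the Pub/Sub message, log it, and look for the table
// name under both the older jobCompletedEvent payload and the newer
// InsertJob/tableCreation metadata (field paths as quoted earlier in this thread).
exports.inspectEvent = async (event, context) => {
  const eventData = JSON.parse(Buffer.from(event.data, 'base64').toString());
  console.log(JSON.stringify(eventData, null, 2)); // see what the sink actually delivers

  const fromJobCompleted = eventData.protoPayload?.serviceData?.jobCompletedEvent
    ?.job?.jobConfiguration?.load?.destinationTable?.tableId;
  const fromTableCreation = eventData.protoPayload?.metadata?.tableCreation
    ?.table?.tableName; // full resource path ending in /tables/events_YYYYMMDD

  const tableId = fromJobCompleted || (fromTableCreation && fromTableCreation.split('/').pop());
  console.log('Resolved table id:', tableId);
};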


8 REPLIES


Hi @ms4446, thank you for the suggestion. How about the scheduled query settings? Do I use overwrite with the new dataset's table events_{run_time-24h|"%Y%m%d"}, or just a fixed table name? My scheduled query just flattens the GA4 export data.

delete
`my_project_id.ga4_master.a_table_name`
where
event_date = FORMAT_DATE('%Y%m%d', DATE_SUB(@run_date, INTERVAL 1 DAY)); --date_sub(@run_date, interval 1 day);

insert into
`my_project_id.ga4_master.a_table_name`

SELECT
  event_date,
  event_timestamp,
  event_name,
  event_bundle_sequence_id, 
  user_id,
  user_pseudo_id,
  CONCAT((SELECT ep.value.int_value AS ga_session_id FROM UNNEST(event_params) ep WHERE ep.key = 'ga_session_id'), user_pseudo_id) AS session_id,
  (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS ga_session_id,
  (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_number') AS ga_session_number,
  (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'source') AS session_source,
  (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'medium') AS session_medium,
  split((select value.string_value from unnest(event_params) where key='page_location'),'?')[safe_offset(0)] AS page_location,
  split((select value.string_value from unnest(event_params) where key='page_title'),'?')[safe_offset(0)] AS page_title,
  split((select value.string_value from unnest(event_params) where key='page_referrer'),'?')[safe_offset(0)] AS page_referrer,
  IFNULL(privacy_info.analytics_storage,'not set') AS analytics_storage,
  privacy_info.uses_transient_token AS uses_transient_token,
  device.category AS device_category,
  geo.country AS country,
  geo.city AS city,
  traffic_source.name AS ts_name,
  traffic_source.source AS acquisition_source,
  traffic_source.medium AS acquisition_medium,
  _TABLE_SUFFIX AS raw_table_suffix
FROM
  `my_project_id.analytics_123456789.events_*`
WHERE
  _table_suffix = FORMAT_DATE('%Y%m%d', DATE_SUB(@run_date, INTERVAL 1 DAY))
  AND COALESCE(user_id, user_pseudo_id) IS NOT null

 

You should use the overwrite option in the scheduled query settings. This ensures the destination table always reflects the latest run.

The reason is that your scheduled query deletes the previous day's rows from the destination table before inserting the new data. With the append option (and without the delete statement), each run would simply add rows on top of the existing ones, so the destination table could end up containing a mix of old and new data. With the overwrite option, the table contents are replaced by the query result on every run.

In your case, the destination table is my_project_id.ga4_master.a_table_name: the query deletes the rows whose event_date is the previous day and then inserts new rows for all events that occurred on that day. Using the overwrite option ensures a_table_name only ever contains the latest day's events.

Thank you so much again, @ms4446. Does this mean the delete and insert statements are not required in the scheduled query?

 

-- delete
-- `my_project_id.ga4_master.a_table_name`
-- where
-- event_date = FORMAT_DATE('%Y%m%d', DATE_SUB(@run_date, INTERVAL 1 DAY)); --date_sub(@run_date, interval 1 day);

-- insert into
-- `my_project_id.ga4_master.a_table_name`

SELECT
  event_date,
  event_timestamp,
  event_name,
  event_bundle_sequence_id, 
  user_id,
  user_pseudo_id,
...
FROM ...
WHERE...

 

and the destination table ID is set to events_{run_time-24h|"%Y%m%d"}?

sysph_0-1697833926273.png

 

 

Yes, if you use the overwrite option with the destination table ID set to events_{run_time-24h|"%Y%m%d"}, you won't need the delete and insert statements. The overwrite option will ensure that the data in the specified table is replaced with the new data from the query each time it runs. This simplifies your query to just the SELECT statement along with its FROM and WHERE clauses, making your scheduled query more straightforward.
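
To see why the -24h in the template lines up with the function's requested run time, here is a quick illustrative check (using 20231020 as an example table date):

// Illustrative check: the function requests a run at table date + 1 day, 12:00 UTC,
// and the destination-table template events_{run_time-24h|"%Y%m%d"} subtracts 24 hours,
// which lands back on the original table date.
const tableTime = '20231020'; // example suffix from events_20231020
const year = tableTime.substring(0, 4),
      month = tableTime.substring(4, 6),
      day = tableTime.substring(6, 8);
const runTime = new Date(Date.UTC(year, month - 1, parseInt(day) + 1, 12));
const minus24h = new Date(runTime.getTime() - 24 * 60 * 60 * 1000);
console.log(runTime.toISOString());                                 // 2023-10-21T12:00:00.000Z
console.log(minus24h.toISOString().slice(0, 10).replace(/-/g, '')); // 20231020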

I'm not sure what these two posts in https://www.teamsimmer.com/2022/12/07/how-do-i-trigger-a-scheduled-query-when-the-ga4-daily-export-h... are trying to achieve:

(screenshots: sysph_0-1697834878661.png, sysph_1-1697834890727.png)

My current log sink query for when the GA4 raw data is ready:

resource.type="bigquery_dataset"
resource.labels.dataset_id="analytics_123456789"
resource.labels.project_id="my_project_id"
protoPayload.metadata.tableCreation.reason="JOB"
protoPayload.authenticationInfo.principalEmail="firebase-measurement@system.gserviceaccount.com"
protoPayload.methodName="google.cloud.bigquery.v2.JobService.InsertJob"
protoPayload.resourceName:"tables/events_"
NOT
protoPayload.resourceName:"tables/events_intraday"

Scheduled query, with the delete and insert statements commented out:

-- delete
-- `my_project_id.ga4_master.a_table_name`
-- where
-- event_date = FORMAT_DATE('%Y%m%d', DATE_SUB(@run_date, INTERVAL 1 DAY)); --date_sub(@run_date, interval 1 day);

-- insert into
-- `my_project_id.ga4_master.a_table_name`

SELECT
  event_date,
  event_timestamp,
  event_name,
  event_bundle_sequence_id, 
  user_id,
  user_pseudo_id,
  CONCAT((SELECT ep.value.int_value AS ga_session_id FROM UNNEST(event_params) ep WHERE ep.key = 'ga_session_id'), user_pseudo_id) AS session_id,

and my cloud function is now 

const bigqueryDataTransfer = require('@google-cloud/bigquery-data-transfer');

exports.runScheduledQuery = async (event, context) => {
  // Update configuration options
  const projectId = 'my_project_id';
  const configId = 'the id from scheduled query';
  const region = 'eu';
  console.log('Received event:', JSON.stringify(event));
  // Modify the log data structure if needed
  const protoPayload = {
    methodName: "google.cloud.bigquery.v2.JobService.InsertJob",
    jobCompletedEvent: {
      job: {
        jobConfiguration: {
          load: {
            destinationTable: {
              datasetId: "my_project_id.ga4_master",
              tableId: "~^events_\\d+"
            }
          }
        }
      }
    }
  };

const destinationTableId = protoPayload.jobCompletedEvent.job.jobConfiguration.load.destinationTable.tableId;


  // Grab the table date and turn it into the run time for the scheduled query
  const tableTime = destinationTableId.replace('events_', '');
  const year = tableTime.substring(0, 4),
        month = tableTime.substring(4, 6),
        day = tableTime.substring(6, 8);
  // Set the run time for the day after the table date so that the scheduled query works with "yesterday's" data
  const runTime = new Date(Date.UTC(year, month - 1, parseInt(day) + 1, 12));
  // Create a proto-buffer Timestamp object from this
  const requestedRunTime = bigqueryDataTransfer.protos.google.protobuf.Timestamp.fromObject({
    seconds: runTime / 1000,
    nanos: (runTime % 1000) * 1e6
  });

  const client = new bigqueryDataTransfer.v1.DataTransferServiceClient();
  const parent = client.projectLocationTransferConfigPath(projectId, region, configId);

  const request = {
    parent,
    requestedRunTime
  };

  const response = await client.startManualTransferRuns(request);
  return response;
};

I'll have to wait until the next GA4 raw event table is created to see if the function works; that will be tomorrow in my time zone. Thanks for your time, ms4446.

It looks like the cloud function is working. However, I got an error in the scheduled query history:

 

sysph_3-1697873646475.png

sysph_0-1698765536472.png

 

I removed the partitioning field in the scheduled query settings and retried the query. The table events_19691231 was created. I think I need to set the partitioning field to a column of type TIMESTAMP, DATETIME, or DATE, not STRING.

sysph_6-1697873928401.png

 

sysph_5-1697873913248.png

Here are the GA4 raw data tables:

sysph_1-1698765620785.png

 

 

The destination table in the ga4_master dataset should be events_20231020. How do I fix it? Thanks.

Finally, the issue has been fixed. Here's the code I used in the function.

 

const bigqueryDataTransfer = require('@google-cloud/bigquery-data-transfer');

exports.runScheduledQuery = async (event, context) => {
  // Update configuration options
  const projectId = 'my_project_id';
  const configId = 'the_scheduled_query_id';
  const region = 'eu';
  console.log('Received event:', JSON.stringify(event));
  // Load the log data from the Pub/Sub message buffer
  const eventData = JSON.parse(Buffer.from(event.data, 'base64').toString());
  // Pull the full table path from the tableCreation metadata and slice out the
  // "events_YYYYMMDD" part (the slice indices depend on the length of the
  // project and dataset names in the path)
  const tableName = eventData.protoPayload.metadata.tableCreation.table.tableName;
  const event_tableName = tableName.slice(63, 78);
  console.log(event_tableName);
  // Grab the table date and turn it into the run time for the scheduled query
  const tableTime = event_tableName.replace('events_', '');
  const year = tableTime.substring(0, 4),
        month = tableTime.substring(4, 6),
        day = tableTime.substring(6, 8);
  // Set the run time for the day after the table date so that the scheduled query works with "yesterday's" data
  const runTime = new Date(Date.UTC(year, month - 1, parseInt(day) + 1, 12));
  // Create a proto-buffer Timestamp object from this
  const requestedRunTime = bigqueryDataTransfer.protos.google.protobuf.Timestamp.fromObject({
    seconds: runTime / 1000,
    nanos: (runTime % 1000) * 1e6
  });

  const client = new bigqueryDataTransfer.v1.DataTransferServiceClient();
  const parent = client.projectLocationTransferConfigPath(projectId, region, configId);

  const request = {
    parent,
    requestedRunTime
  };

  const response = await client.startManualTransferRuns(request);
  return response;
};
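
As a side note, the hard-coded slice(63, 78) only works while the project and dataset names keep their current lengths; a regular expression would pull out the same events_YYYYMMDD suffix regardless. A small sketch (the resource path below is a made-up example):

// Sketch: extract "events_YYYYMMDD" from the table resource path with a regex
// instead of fixed slice indices. The path here is a made-up example.
const tableName = 'projects/my_project_id/datasets/analytics_123456789/tables/events_20231022';
const match = tableName.match(/events_\d{8}$/);
const event_tableName = match ? match[0] : null;
console.log(event_tableName); // -> events_20231022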

 

The result in the ga4_master dataset shows the correct table name, events_20231022, and the contents of the GA4 raw data and the new scheduled query results match.

 

 

(screenshots: sysph_0-1698046506560.png, sysph_1-1698046522453.png)