Regarding the data export from GA4 to BigQuery: in Logs Explorer I see an InsertJob for the events_yyyymmdd table and, 15-20 minutes later, a DeleteTable for the events_intraday_yyyymmdd table.
Is there any reason for that?
If I am building a pipeline to process this data once it is complete, is it safe to trigger it on the creation of the events_yyyymmdd table or on the deletion of the events_intraday_yyyymmdd table?
Thank you in advance!
Hi @AElashry,
Welcome to Google Cloud Community!
Data export from GA4 to BigQuery creates the events_intraday_YYYYMMDD table if the streaming option is enabled, and the events_YYYYMMDD table if the daily export option is selected. The events_intraday_YYYYMMDD table is an internal staging table that is populated throughout the day as events are exported in near real time. The events_YYYYMMDD table, on the other hand, is created once daily and contains the full daily export of events.
The behavior you observed in Logs Explorer is expected: the events_intraday_YYYYMMDD table is deleted at the end of each day, once the events_YYYYMMDD table is complete, to free up space. For a pipeline that processes the data once it is complete, it is safe to work with the events_YYYYMMDD table rather than the events_intraday_YYYYMMDD table, since it is the full and final daily export.
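As a rough illustration of telling the two tables apart inside a pipeline, here is a minimal Python sketch that classifies a table ID by its name alone. The helper name classify_export_table is hypothetical, and how the table ID is pulled out of a log entry is deliberately left out, since that depends on the log format you receive.

```python
import re
from datetime import date, datetime

# Matches the two table name patterns mentioned in this thread:
# events_YYYYMMDD and events_intraday_YYYYMMDD.
_TABLE_RE = re.compile(r"events_(intraday_)?(\d{8})")


def classify_export_table(table_id):
    """Return (kind, event_date) for a GA4 export table ID, or None.

    kind is "daily" or "intraday". Hypothetical helper for illustration;
    it only inspects the table name.
    """
    match = _TABLE_RE.fullmatch(table_id)
    if match is None:
        return None
    kind = "intraday" if match.group(1) else "daily"
    try:
        event_date = datetime.strptime(match.group(2), "%Y%m%d").date()
    except ValueError:
        # Eight digits that do not form a valid calendar date.
        return None
    return kind, event_date
```

A pipeline could then act only on entries where the kind is "daily" and ignore everything that refers to the intraday table.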
For reference, see this documentation.
I hope the above information is helpful.
Hello @marckevin
Thank you for your response. I would like to ask which would be the safer trigger for the pipeline: the log entry showing that the events_yyyymmdd table was created, or the one showing that the events_intraday_yyyymmdd table was deleted?
The reason I am asking is that I read in the docs (here) that Google continues updating the events_yyyymmdd table for up to 3 days after its creation. Are these updates written to the intraday table first and then to the daily table, or do they go directly to the daily table once it has been created?
From the docs:
"Not all devices on which events are triggered send their data to Analytics on the same day the events are triggered. To account for this latency, Analytics will update the daily tables (events_YYYYMMDD) with events for those dates for up to three days after the dates of the events."
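To make the implication of that quote concrete, here is a small Python sketch that lists the daily tables which could still receive late events on a given run date. It assumes the three-day window is counted inclusively from the event date and that the table for the run date itself is not yet available; both are my assumptions, not something the docs state. The function name is hypothetical.

```python
from datetime import date, timedelta


def tables_possibly_still_updating(run_date, window_days=3):
    """List daily table names that may still change on run_date.

    Assumption: a table for event date D can be updated through
    D + window_days, so on run_date the candidates are the tables for
    run_date - window_days up to run_date - 1.
    """
    names = []
    for days_back in range(window_days, 0, -1):
        event_date = run_date - timedelta(days=days_back)
        names.append("events_" + event_date.strftime("%Y%m%d"))
    return names
```

Under these assumptions, a pipeline that reprocesses the returned tables on every run would pick up late-arriving events regardless of which log entry triggered it.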