Hi Google,
I'm creating a data fusion pipeline where my source is GCS bucket and I'm processing a CSV file from GCS to Bigquery. I'm have 10 fields in source file and Im creating 1 extra field called IngestionTimestamp in bigquery output schema. so whenever files are inserted to bigquery. this ingestion timestamp default collects current timestamp at the time of ingestion. but the pipeline is getting failed.
can anyone suggest me how i achieve this scenairo
Refer Below images for refernce
Solved! Go to Solution.
The core issue seems to be related to how Data Fusion is interpreting and passing the timestamp value you're generating. The pipeline might be failing due to:
Type Mismatch: BigQuery expects timestamps in a specific format (ISO 8601 format: YYYY-MM-DDTHH:MM:SS[.SSSSSS]Z
). If your IngestionTimestamp
isn’t formatted correctly, it can cause errors during insertion.
Timezone Conflicts: There might be a discrepancy between the timezone settings in the Data Fusion environment and the timezone of the generated timestamp.
Here's how you might address this issue:
Review the Wrangler Recipe:
current-datetime
to generate the timestamp. Ensure this aligns with the format BigQuery expects.Timezone Handling:
set-column IngestionTimestamp datetime_format(current_datetime(), 'yyyy-MM-dd\'T\'HH:mm:ss.SSSSSS\'Z\'')
'UTC-08:00'
with your desired timezone, but ensure it follows the ISO 8601 format for compatibility.If the issue persists, use Wrangler’s transformation capabilities to explicitly convert the generated timestamp into the correct format. Here’s an example:
set-column IngestionTimestamp datetime_format(current_datetime(), 'yyyy-MM-dd\'T\'HH:mm:ss.SSSSSS\'Z\'')
'yyyy-MM-dd\'T\'HH:mm:ss.SSSSSS\'Z\'
), which is what BigQuery expects for TIMESTAMP fields.Schema Alignment
IngestionTimestamp
field’s data type is set to TIMESTAMP
in the output schema.Additional Tips
Check Logs:
Test Incrementally:
Community Resources:
Wrangler Recipe
Here’s a revised Wrangler recipe example that incorporates explicit timestamp formatting:
set-column IngestionTimestamp datetime_format(current_datetime(), 'yyyy-MM-dd\'T\'HH:mm:ss.SSSSSS\'Z\'')
Note: Replace 'UTC-08:00'
with your actual timezone if different, but ensure the format aligns with ISO 8601 standards.
Hello,
Thank you for contacting the Google Cloud Community.
To ensure expert assistance and dedicated support for your issue, I kindly request you to file a support ticket by clicking here[1]. Our support team will prioritize your request and provide you with the assistance you need.
For individual support issues, it is best to utilize the support ticketing system. We appreciate your cooperation!
[1]: https://cloud.google.com/support/docs/manage-cases#creating_cases
Hi Team,
can anyone previously tried this requirement?
The core issue seems to be related to how Data Fusion is interpreting and passing the timestamp value you're generating. The pipeline might be failing due to:
Type Mismatch: BigQuery expects timestamps in a specific format (ISO 8601 format: YYYY-MM-DDTHH:MM:SS[.SSSSSS]Z
). If your IngestionTimestamp
isn’t formatted correctly, it can cause errors during insertion.
Timezone Conflicts: There might be a discrepancy between the timezone settings in the Data Fusion environment and the timezone of the generated timestamp.
Here's how you might address this issue:
Review the Wrangler Recipe:
current-datetime
to generate the timestamp. Ensure this aligns with the format BigQuery expects.Timezone Handling:
set-column IngestionTimestamp datetime_format(current_datetime(), 'yyyy-MM-dd\'T\'HH:mm:ss.SSSSSS\'Z\'')
'UTC-08:00'
with your desired timezone, but ensure it follows the ISO 8601 format for compatibility.If the issue persists, use Wrangler’s transformation capabilities to explicitly convert the generated timestamp into the correct format. Here’s an example:
set-column IngestionTimestamp datetime_format(current_datetime(), 'yyyy-MM-dd\'T\'HH:mm:ss.SSSSSS\'Z\'')
'yyyy-MM-dd\'T\'HH:mm:ss.SSSSSS\'Z\'
), which is what BigQuery expects for TIMESTAMP fields.Schema Alignment
IngestionTimestamp
field’s data type is set to TIMESTAMP
in the output schema.Additional Tips
Check Logs:
Test Incrementally:
Community Resources:
Wrangler Recipe
Here’s a revised Wrangler recipe example that incorporates explicit timestamp formatting:
set-column IngestionTimestamp datetime_format(current_datetime(), 'yyyy-MM-dd\'T\'HH:mm:ss.SSSSSS\'Z\'')
Note: Replace 'UTC-08:00'
with your actual timezone if different, but ensure the format aligns with ISO 8601 standards.
Hi MS4446,
I tried it but i unable to solve, can I have quick 10 mins call with you. so I can complete this task easily. if possible please share mail I can setup a short call.
If you are still running into issues you might to reachout to Google Cloud support for expert assistance.
Hello,
Thank you for contacting the Google Cloud Community.
To ensure expert assistance and dedicated support for your issue, I kindly request you to file a support ticket by clicking here[1]. Our support team will prioritize your request and provide you with the assistance you need.
For individual support issues, it is best to utilize the support ticketing system. We appreciate your cooperation!
[1]: https://cloud.google.com/support/docs/manage-cases#creating_cases