Hello,
I am using GCP Composer (Airflow). What I want is for Airflow, in all its components, to reflect my time zone, "Europe/Lisbon". I know that, by default, Composer uses dates and times in UTC, so I have already taken some steps to change that, but without being able to change it in all components.
What I already did was:
1 Changed the Composer properties (Airflow Configuration Overrides) with the values:
webserver - default_ui_timezone: "Europe/Lisbon"
core - default_timezone: "Europe/Lisbon"
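For reference, the same overrides can also be applied from the command line; a sketch assuming the gcloud CLI, with ENVIRONMENT_NAME and LOCATION as placeholders for your environment:

```shell
# Hypothetical environment name and location; the keys mirror the console
# overrides above, in the form section-property=value.
gcloud composer environments update ENVIRONMENT_NAME \
    --location LOCATION \
    --update-airflow-configs=core-default_timezone=Europe/Lisbon,webserver-default_ui_timezone=Europe/Lisbon
```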
2 Created timezone-aware DAGs:
I am using the pendulum library and specifying the timezone. The schedule is working according to my timezone.
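For context, what "timezone-aware" means here is simply that the DAG's start_date carries timezone information, so Airflow interprets the cron schedule in that zone. A minimal sketch of the equivalent conversion, using the standard-library zoneinfo instead of pendulum (assuming Python 3.9+):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# A timezone-aware datetime, equivalent to
# pendulum.datetime(2023, 4, 14, 12, 58, tz="Europe/Lisbon").
# Passing a value like this as a DAG start_date is what makes
# the cron schedule be interpreted in Europe/Lisbon rather than UTC.
start = datetime(2023, 4, 14, 12, 58, tzinfo=ZoneInfo("Europe/Lisbon"))

# In April, Europe/Lisbon observes WEST (UTC+1), so 12:58 local is 11:58 UTC.
print(start.isoformat())                              # 2023-04-14T12:58:00+01:00
print(start.astimezone(ZoneInfo("UTC")).isoformat())  # 2023-04-14T11:58:00+00:00
```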
What is working as expected?
The webserver UI is presented in my timezone - WebUI: OK
The DAG is executed according to the cron in my timezone - Scheduling: OK
What is the issue?
It seems that internally Composer is not using my timezone as the default. For example, looking at a task log, AIRFLOW_CTX_EXECUTION_DATE is still in UTC:
(...) AIRFLOW_CTX_EXECUTION_DATE=2023-04-14T11:58:00+00:00 (...)
[2023-04-14, 12:59:06 WEST] {taskinstance.py:1416} INFO - Marking task as SUCCESS. dag_id=timezone_aware_dag3, task_id=task_one, execution_date=20230414T115800, start_date=20230414T115904, end_date=20230414T115906
So my log messages are in WEST (12:59:06 WEST), but the internal date metadata is still in UTC (execution_date=20230414T115800).
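This matches how Airflow stores datetimes: internally everything is kept in UTC and only converted for display. If a task needs the execution date in local time, it can convert the UTC value itself; a minimal sketch assuming Python 3.9+ zoneinfo, using the ISO string from the log above:

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# AIRFLOW_CTX_EXECUTION_DATE exactly as it appears in the task log (UTC):
execution_date = datetime.fromisoformat("2023-04-14T11:58:00+00:00")

# Convert it to the environment's local timezone when needed inside the task:
local = execution_date.astimezone(ZoneInfo("Europe/Lisbon"))
print(local.isoformat())  # 2023-04-14T12:58:00+01:00, i.e. 12:58 WEST
```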
Another issue appears when comparing the scheduled time with the executed time: Airflow shows the logs in UTC, but the scheduler uses my local time. See the image at this link: https://drive.google.com/file/d/1vT5C_6Q2xNLTzcRV1kgxijRigiXLypWM/view?usp=share_link
Expected behaviour: once I changed the Airflow core timezone, I expected all times to be handled in my timezone.
Complete Log of Task Execution:
[2023-04-14, 12:59:03 WEST] {taskinstance.py:1180} INFO - Dependencies all met for <TaskInstance: timezone_aware_dag3.task_one scheduled__2023-04-14T11:58:00+00:00 [queued]>
[2023-04-14, 12:59:04 WEST] {taskinstance.py:1180} INFO - Dependencies all met for <TaskInstance: timezone_aware_dag3.task_one scheduled__2023-04-14T11:58:00+00:00 [queued]>
[2023-04-14, 12:59:04 WEST] {taskinstance.py:1377} INFO - --------------------------------------------------------------------------------
[2023-04-14, 12:59:04 WEST] {taskinstance.py:1378} INFO - Starting attempt 1 of 3
[2023-04-14, 12:59:04 WEST] {taskinstance.py:1379} INFO - --------------------------------------------------------------------------------
[2023-04-14, 12:59:04 WEST] {taskinstance.py:1398} INFO - Executing <Task(PythonOperator): task_one> on 2023-04-14 11:58:00+00:00
[2023-04-14, 12:59:04 WEST] {standard_task_runner.py:52} INFO - Started process 6068 to run task
[2023-04-14, 12:59:04 WEST] {standard_task_runner.py:79} INFO - Running: ['airflow', 'tasks', 'run', 'timezone_aware_dag3', 'task_one', 'scheduled__2023-04-14T11:58:00+00:00', '--job-id', '340', '--raw', '--subdir', 'DAGS_FOLDER/5-dag_timezone_aware3.py', '--cfg-path', '/tmp/tmpy4s2bwl1', '--error-file', '/tmp/tmpessqsl6w']
[2023-04-14, 12:59:04 WEST] {standard_task_runner.py:80} INFO - Job 340: Subtask task_one
[2023-04-14, 12:59:05 WEST] {task_command.py:375} INFO - Running <TaskInstance: timezone_aware_dag3.task_one scheduled__2023-04-14T11:58:00+00:00 [running]> on host airflow-worker-ch98z
[2023-04-14, 12:59:06 WEST] {taskinstance.py:1591} INFO - Exporting the following env vars: AIRFLOW_CTX_DAG_OWNER=vsilva AIRFLOW_CTX_DAG_ID=timezone_aware_dag3 AIRFLOW_CTX_TASK_ID=task_one AIRFLOW_CTX_EXECUTION_DATE=2023-04-14T11:58:00+00:00 AIRFLOW_CTX_TRY_NUMBER=1 AIRFLOW_CTX_DAG_RUN_ID=scheduled__2023-04-14T11:58:00+00:00
[2023-04-14, 12:59:06 WEST] {logging_mixin.py:115} INFO - Function One
[2023-04-14, 12:59:06 WEST] {logging_mixin.py:115} INFO - DAG Timezone:
[2023-04-14, 12:59:06 WEST] {logging_mixin.py:115} INFO - Timezone('Europe/Lisbon')
[2023-04-14, 12:59:06 WEST] {python.py:173} INFO - Done. Returned value was: None
[2023-04-14, 12:59:06 WEST] {taskinstance.py:1416} INFO - Marking task as SUCCESS. dag_id=timezone_aware_dag3, task_id=task_one, execution_date=20230414T115800, start_date=20230414T115904, end_date=20230414T115906
[2023-04-14, 12:59:06 WEST] {local_task_job.py:156} INFO - Task exited with return code 0
[2023-04-14, 12:59:07 WEST] {local_task_job.py:273} INFO - 1 downstream tasks scheduled from follow-on schedule check
Thank you
Hi @vmasilva,
Welcome back to Google Cloud Community.
It appears that even after changing the timezone in the Composer properties and making your DAG timezone-aware, the execution dates in your task logs are still recorded in UTC.
One contributing factor may be that the Airflow logs are produced by the scheduler and workers, which are independent processes running on machines separate from the Composer environment's configuration. Changing the Composer settings alone might not be enough to change the timezone for those components.
To address this, you may also try setting the timezone environment variable for the Airflow scheduler and workers to "Europe/Lisbon". This can be done by adding the following configuration in the Airflow UI:
In this configuration, the "timezone" connection and the "AIRFLOW__CORE__TIMEZONE" variable are set to "Europe/Lisbon", which the Airflow scheduler and workers should pick up.
After adding this configuration, you can run your DAG again to check whether the logs reflect the correct timezone.
Here is some documentation that might help you:
https://cloud.google.com/composer/docs/run-apache-airflow-dag?_ga=2.184955241.-1392753435.1676655686
Hi Aris,
Thank you for your reply.
If I understood correctly, I have to set a new Variable and a new Connection in Airflow. The new Variable is fine, but I didn't understand exactly what values I should set in the new Connection:
Connection Id : "timezone"
Connection Type - What should I put in this field? By default the Connection Type is "Email". Shouldn't I change that?
The remaining fields (Description, Host, Schema, Login, Password...): should I leave them empty?
Thank you
Hi @vmasilva,
You must include the following information when creating a new timezone connection in Airflow: