You're encountering a situation where tasks are marked as successful in the backend (Task Instances view) yet are missing from the DAG graph (no state displayed). This inconsistency prevents external sensors that depend on these tasks from completing their checks, thus blocking your workflows. The likely cause is the recent Composer environment upgrade. A worker pod trying to access an outdated Airflow database suggests potential issues with database synchronization or task metadata consistency.
Troubleshooting and Resolution
1. Verify Database Consistency
Access the Airflow Database: Despite the sql_alchemy_conn being hidden in the UI, follow these steps to connect to the database:
Obtain Credentials: Navigate to the Google Cloud Console to find the connection details for your Composer environment's PostgreSQL database.
Set Up Cloud SQL Proxy: Install and configure the Cloud SQL Proxy to create a secure connection to your database from your local environment.
Connect using psql: Use the psql client and the credentials obtained to connect to the database.
Inspect Task Metadata: Execute SQL queries to examine the task_instance table for discrepancies between database records and what's shown in the Airflow UI.
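For example, once connected with psql, a read-only query along these lines shows what the scheduler has actually recorded (a sketch assuming the standard Airflow 2.2+ schema, where task_instance has a run_id column; the DAG ID is a placeholder):
# Read-only look at the recorded states for one DAG's task instances
psql -h 127.0.0.1 -p 5432 -U <user> -d airflow -c "
SELECT dag_id, task_id, run_id, state, start_date, end_date
FROM task_instance
WHERE dag_id = '<your-dag-id>'
ORDER BY start_date DESC
LIMIT 50;"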
2. Clear Task State and Re-run DAGs
Clear Task Instances: Use the Airflow UI or CLI to clear the states of the affected tasks (see the sketch after this list). Be cautious, as this could affect other tasks.
Re-trigger DAGs: Manually trigger the DAGs containing these tasks to attempt to normalize the state.
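If you do go the CLI route, a minimal sketch with the Airflow 2.x CLI (the DAG ID and date range are placeholders; on Composer you would wrap this in gcloud composer environments run):
# Clear task instances for one DAG over a date range (asks for confirmation)
airflow tasks clear <your-dag-id> \
  --start-date 2024-01-01 \
  --end-date 2024-01-02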
3. Address the Worker Pod Issue
Investigate the Pod: Use Kubernetes commands like kubectl logs on the worker pod to explore error messages or other clues pointing to the root cause (see the sketch after this list).
Potential Fixes:
Restart Composer: Consider restarting your Composer environment, which involves not just a simple reboot but a comprehensive refresh of the environment settings.
Upgrade Airflow: If you suspect a bug, update your Airflow version within Composer while ensuring compatibility with existing configurations.
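A minimal sketch of the pod inspection, assuming kubectl is already pointed at the environment's GKE cluster (pod and namespace names are placeholders):
# Find the Airflow worker pods in the Composer cluster
kubectl get pods --all-namespaces | grep airflow-worker
# Tail the suspect pod's logs; --previous shows the last run of a crashed container
kubectl logs <worker-pod-name> -n <composer-namespace> --previous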
4. Preventive Measures
Monitor Worker Logs: Keep an eye on the logs for any anomalies.
Implement Health Checks: Use monitoring tools to continuously track the health of your Airflow workflows and databases.
Testing and Staging: Always test environment upgrades in a staging setup before rolling them out to production.
Important Considerations
Data Loss Risk: Evaluate the implications of clearing task states. If tasks perform critical data operations, their states should be managed carefully to avoid data loss.
Database Backups: Maintain regular backups of your Airflow database to safeguard against unforeseen issues.
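As a sketch of the backup step, once you have any working connection to the database (for example through a proxy, as discussed below), a plain pg_dump gives you a point-in-time copy (host, port, and user are placeholders for whatever your connection exposes):
# Dump the Airflow metadata database to a local SQL file
pg_dump -h 127.0.0.1 -p 5432 -U <user> -d airflow -f airflow_backup.sql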
Hello and thanks for your answer ms4446,
1. Verify Database Consistency
It's precisely because the sql_alchemy_conn is hidden that I am not able to find the credentials to the airflow database.
I have no idea where to look in the Google Cloud console in order to get this information (the airflow database doesn't seem to appear as a Cloud SQL instance). Do you have some hints?
2. Clear Task State and Re-run DAGs
I'd prefer to avoid this, as the tasks that disappeared (they were definitely considered successful, even in the UI, at some point) are not idempotent; they ran once and I'm happy with that.
3. Address the Worker Pod Issue
It looks like an issue with the last upgrade, which didn't go well. I think it's acceptable, as this feature is documented as "in preview", so I'm fine with running into this issue.
Direct access to the Airflow database can be complex due to the hidden sql_alchemy_conn. Here are two practical methods to find the necessary information:
A. Cloud SQL Proxy
Mechanism: This method uses a local proxy to securely connect to the Cloud SQL instance that backs your Composer Airflow database.
Steps:
Enable Cloud SQL Admin API: Make sure the Cloud SQL Admin API is enabled in your Google Cloud Project.
Install Cloud SQL Proxy: Download and install the Cloud SQL Proxy for your operating system.
Locate Instance Connection Name:
Navigate to the Composer Environments page in the Cloud Console.
Click on your environment's name to view details.
In the Configuration section, find the Database entry, which should display the Cloud SQL instance name (e.g., projects/<your-project>/instances/<instance-name>).
Start Proxy: Run the cloud_sql_proxy command with the instance connection name and specify a local port.
Connect with psql: Use psql to connect to the database using the local port opened by the proxy. The typical database name is airflow.
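Putting those steps together, a rough sketch (this assumes the legacy v1 proxy binary named cloud_sql_proxy; newer releases ship as cloud-sql-proxy with a different flag syntax, and all names here are placeholders):
# Start the v1 proxy listening on local port 5432
./cloud_sql_proxy -instances=<project>:<region>:<instance>=tcp:5432 &
# Connect through the local listener; the database is typically named airflow
psql -h 127.0.0.1 -p 5432 -U <user> -d airflow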
B. Kubernetes Port Forwarding (Advanced)
Mechanism: This method forwards a port from a Kubernetes pod in your Composer environment directly to your local machine.
Steps:
Locate Airflow Pod: Use kubectl get pods to find a running Airflow pod in your Composer environment's GKE cluster.
Port Forward: Execute kubectl port-forward <pod-name> <local-port>:5432 to forward the PostgreSQL port (5432) to a local port.
Connect with psql: Connect to the forwarded local port using psql.
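A sketch of that flow (pod name and namespace are placeholders, and this assumes the chosen pod actually exposes the database port; note that the local port comes first in the pair):
# Find a candidate pod
kubectl get pods --all-namespaces | grep airflow
# Listen on local port 5432 and forward traffic to port 5432 inside the pod
kubectl port-forward <pod-name> 5432:5432 -n <namespace>
# In a second terminal, connect through the forwarded port
psql -h 127.0.0.1 -p 5432 -U <user> -d airflow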
Important Notes:
Credentials: For both methods, you will need to authenticate using a service account with appropriate IAM permissions to access the Cloud SQL instance.
Security: Prioritize secure connections and ensure database credentials are not exposed.
2. Alternatives to Clearing Task States
If clearing task states is not a suitable option due to the non-idempotent nature of the tasks, consider the following approaches:
Database Manipulation (Careful): If you have expertise in SQL, you could directly update the task state records in the Airflow database (see the sketch after this list). This approach should be undertaken with caution, as it requires a solid understanding of the Airflow database schema.
Wait and See: Sometimes, Airflow's internal synchronization mechanisms may resolve such inconsistencies over time. It's advisable to monitor the situation before taking drastic measures.
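If you do go the direct-SQL route, a cautious sketch (assuming the standard schema; all identifiers are placeholders, and the transaction lets you inspect the result before it becomes permanent):
# Manually mark one task instance as successful; review the SELECT output,
# then change COMMIT to ROLLBACK in the script if anything looks wrong
psql -h 127.0.0.1 -p 5432 -U <user> -d airflow <<'SQL'
BEGIN;
UPDATE task_instance
SET state = 'success'
WHERE dag_id = '<dag-id>' AND task_id = '<task-id>' AND run_id = '<run-id>';
SELECT dag_id, task_id, run_id, state
FROM task_instance
WHERE dag_id = '<dag-id>' AND task_id = '<task-id>';
COMMIT;
SQL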
Hello,
@ms4446 wrote:In the Configuration section, find the Database entry, which should display the Cloud SQL instance name (e.g., projects/<your-project>/instances/<instance-name>).
Unfortunately this is not accurate. As in the documentation, and as you can see on the environment page, there is no such thing as a "Database entry", nor any clue regarding the SQL connection, hence my post here.
@ms4446 wrote:Port Forward: Execute kubectl port-forward <pod-name> <local-port>:5432 to forward the PostgreSQL port (5432) to a local port.
I believe this is wrong: it listens on the local port and forwards to the PostgreSQL port, not the other way around. Besides, I'm not even sure forwarding port 5432 could help me in any way to connect to the airflow db 😕
Anyway, I have no clue about how to use a Google service account to connect to a db using psql.
I am surprised to see that there doesn't seem to be an easy way to connect to the airflow db. Composer is meant to be a service managed by Google, so it's "ok-ish" not to give the end user access to some parts of the product, but if the "managed service" doesn't work, it would make sense to give the user access to the parts that are not properly managed.
Is there really no way to connect to the airflow db to fix the metadata table content after the failure of upgrading to a newer composer version?
Cloud Composer indeed manages a lot of components for you, including the underlying database that powers Apache Airflow. However, direct access to the database can be restricted and not straightforward, especially since Google Cloud manages the environment to ensure security and stability. Here are a few steps and tips that could help you address the problem of connecting to the Airflow database and potentially fixing metadata issues:
Access via Cloud SQL Proxy: Google Cloud Composer uses Google Cloud SQL to manage the Airflow metadata database. While direct access isn't provided in the Composer interface, you can use the Cloud SQL Proxy to securely connect to your database. To do this:
Use the gcloud command-line tool to obtain the instance connection name (you can find this in the Composer environment's details under the Cloud SQL section).
Start the proxy with ./cloud_sql_proxy -instances=<INSTANCE_CONNECTION_NAME>=tcp:5432, which opens a local listener on port 5432.
Service Account Configuration: To use the proxy and connect to the database, you'll need a service account with appropriate permissions. Ensure the service account has roles like Cloud SQL Client and Cloud SQL Editor. You can set this up in the IAM & Admin section of the Google Cloud Console.
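For the service-account step, a sketch with gcloud (project ID and service-account address are placeholders; roles/cloudsql.client corresponds to the Cloud SQL Client role mentioned above):
# Grant the service account the Cloud SQL Client role on the project
gcloud projects add-iam-policy-binding <project-id> \
  --member="serviceAccount:<sa-name>@<project-id>.iam.gserviceaccount.com" \
  --role="roles/cloudsql.client"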
Connecting using psql: Once the proxy is running, you can connect using psql or any other PostgreSQL-compatible client. For instance:
psql -h 127.0.0.1 -p 5432 -U <USERNAME> -d <DATABASE_NAME>
The username and database name can usually be found in your environment's configuration; the default user for Airflow databases in Composer is often airflow.
Port Forwarding Misunderstanding: The statement about port forwarding you mentioned is indeed about mapping a local port to the service's port in the pod. The command kubectl port-forward <pod-name> <local-port>:5432 correctly maps a local port to port 5432 of a PostgreSQL pod. This could be useful in Kubernetes environments where you might want to temporarily expose a pod's service for debugging. However, for Composer, using the Cloud SQL Proxy as described above is the recommended method.
Troubleshooting Failed Upgrades: If an upgrade fails and you suspect metadata corruption or issues, connecting to the database to manually intervene should be done with caution. Ensure you have backups and understand the schema changes that might be involved with version upgrades.
Contact Support: If you're facing persistent issues or the upgrade has corrupted data in a way that isn't manageable, contacting Google Cloud Support is advisable. They can provide more direct assistance and possibly access more tools to help resolve your issue.
Hello,
Using the airflow CLI, I was able to export all connections, including the airflow_db connection, hence retrieving the user (root), the password, and the database name (I believe this should be documented, because none of the methods you suggested worked or corresponded to what's displayed in the UI).
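For reference, this is roughly what I ran (environment name, region, and output path are placeholders; the /home/airflow/gcs/data folder on the workers is synced to the environment's GCS bucket):
# Export every Airflow connection, including airflow_db, to the GCS-synced data folder
gcloud composer environments run <environment-name> --location <region> \
  connections export -- /home/airflow/gcs/data/connections.json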
Now, I'm struggling to get this cloud SQL proxy to work.
First of all, I'd like to confirm with you that "Cloud SQL Proxy" simply does not exist, and that you were referring to the Cloud SQL Auth Proxy?
If that's the case, the syntax you are showing is wrong, right? ./cloud_sql_proxy is actually ./cloud-sql-proxy, and it does not take -instances as an argument.
Last but not least, the instance name is obviously still unknown, I was thinking you could maybe share a bit of information regarding how to find this instance name?
The ./cloud-sql-proxy seems to be looking for names like project:region:instance, and as you know, I have no clue about the project name, since the db actually lives in one of your Google-managed projects, not in my project.
You're right; Let me clarify a few things about using the Cloud SQL Auth proxy and how it connects to Cloud Composer's underlying Cloud SQL instance.
Cloud SQL Auth Proxy vs. Cloud SQL Proxy: The correct term is indeed "Cloud SQL Auth Proxy," often referred to just as "Cloud SQL Proxy." The executable's name can vary depending on how it was downloaded or installed. Typically, it is downloaded as cloud_sql_proxy on Linux and macOS. If your executable is named differently, you should use that name in your commands.
Correct Command Syntax: The correct command to start the Cloud SQL Auth proxy, assuming the executable name is cloud_sql_proxy and the placeholder <INSTANCE_CONNECTION_NAME> represents your actual instance connection name, is:
./cloud_sql_proxy -instances=<INSTANCE_CONNECTION_NAME>=tcp:5432
Here, instances is indeed the correct argument. The executable may not run if there's a typo or the instance connection name is not properly formatted.
Finding the Instance Name: For Cloud Composer, the Cloud SQL instance is indeed managed by Google and not directly exposed in your Google Cloud project's resources. To find the instance connection name:
If Instance Name is Not Available: If you cannot locate the instance name through the Cloud Composer UI, it may be because Google does not expose these details for managed services to prevent unintended changes that could destabilize your environment. In such cases, the standard practice would be to handle database-related operations through the Airflow Web UI or the Airflow CLI, which it seems you have already been using effectively.
Alternative Approaches: Since you've been able to extract the database credentials using the Airflow CLI, you might consider the following if direct access to the database is crucial:
Final Steps: If the Cloud SQL Proxy method is essential for your workflow, ensure that:
Hey, thanks for the answer.
@ms4446 wrote:The executable's name can vary depending on how it was downloaded or installed. Typically, it is downloaded as cloud_sql_proxy on Linux and macOS. If your executable is named differently, you should use that name in your commands.
==> This is not true: on both Linux and macOS, it's definitely cloud-sql-proxy. I am still a bit afraid that we aren't talking about the same thing, especially since you keep mistyping the name of the command.
@ms4446 wrote:Here, instances is indeed the correct argument.
==> No, there is definitely NO "--instances" parameter. I am asked to type the name of the instance right away, so I disagree with what you wrote. Can you confirm we are talking about the same tool? I really doubt it, since nothing of what you say matches what I see in the link I shared above.
./cloud-sql-proxy my-project:us-central1:my-db-server
Then, you ask me again to go to the environment details page. Can you send me a screenshot of where to find the Database entry on this page? As I told you, I believe it simply does not exist, but you keep pointing me there, so I'm not sure what to do.
"If Instance Name is Not Available" Well I don't know, you tell me!
Composer is a product managed by Google, not by me. Also, none of the solutions you are referring to could possibly work, as you can't connect to the airflow db through the UI or using the CLI.
Not mentioning the remaining answers, as they look like bot-generated responses.
OK, if someone is trying to access the airflow db from the "managed" version of apache airflow by google, here is how I did it.
First of all, you need to find out what the host / user / password / port / name are.
In order to do that, you can export the list of connections using the airflow CLI.
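To make that concrete, here is a sketch of the sequence (names are placeholders; the data/ folder of the environment's bucket is where the workers' /home/airflow/gcs/data directory is synced):
# 1. Export every connection, including airflow_db, to the GCS-synced data folder
gcloud composer environments run <environment-name> --location <region> \
  connections export -- /home/airflow/gcs/data/connections.json
# 2. Pull the export locally and read the airflow_db entry
gsutil cat gs://<environment-bucket>/data/connections.json
# The airflow_db entry contains the host, port, user, password, and database name
# you need in order to connect with psql.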