
Dataform repository SSH connection to GitLab failing

For our Dataform repository, I'm attempting to add an SSH connection to our remote GitLab repository (cloud hosted). I've added a Deploy Key to GitLab and associated the key there. I've also tested the ssh key pair to make sure it can connect successfully to GitLab. I've even cloned the repo with that key.

However, when I configure Dataform to use the private key (stored in Secret Manager) and set up all the other settings, it fails to connect.

I've also ensured the default service account has access to the secret version. Yet we still get this error:

"We are unable to connect to your Git provider with the configured credentials. If you use HTTPS authentication check that the token is valid and has not expired. If you use SSH authentication check that the private user key and public host key are valid. View docs on connecting a remote git repository."

Anything else I can try or logs I could look at?

ACCEPTED SOLUTION

Thanks @ms4446 for your thoughts on this. That resource type didn't yield any logs.

I figured out the problem via a different path, though. When I looked at the compilation results of my Release Config, they were failing and had a better error message:

"The ssh host public key for remote repository 'git@gitlab.com:*****/****.git' didn't match"

This clued me in that we were using the wrong public key. For the host public key, we had entered the public half of the SSH key pair we generated for the deploy key, not GitLab's own host public key, which is the one saved to the known_hosts file when you first connect to GitLab over SSH.
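For anyone hitting the same mismatch, here is a minimal sketch of pulling the host key for gitlab.com out of the contents of a known_hosts file. The function name `host_keys` and the sample keys are made up for illustration, and the exact format Dataform expects for the host key value is an assumption to check against the Dataform docs. Hashed known_hosts entries (those starting with `|1|`) cannot be matched by hostname this way; in that case `ssh-keyscan gitlab.com` prints the host keys directly.

```python
def host_keys(known_hosts_text: str, host: str) -> list[str]:
    """Return 'ALGORITHM BASE64_KEY' strings recorded for `host`.

    Illustrative helper (not part of any library). It reads plain-text
    known_hosts entries of the form: hostnames algorithm base64-key [comment]
    """
    keys = []
    for line in known_hosts_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if fields[0].startswith("@"):
            continue  # skip marker lines (@cert-authority, @revoked) in this sketch
        if len(fields) < 3:
            continue  # not a complete entry
        hostnames, algorithm, key = fields[0], fields[1], fields[2]
        # Hashed hostnames ("|1|...") never equal a plain hostname, so they are ignored.
        if host in hostnames.split(","):
            keys.append(f"{algorithm} {key}")
    return keys


# Placeholder keys only; real entries come from ~/.ssh/known_hosts.
sample = (
    "# example known_hosts content\n"
    "gitlab.com,198.51.100.7 ssh-ed25519 AAAAC3PLACEHOLDERKEY\n"
    "|1|abc=|def= ssh-rsa AAAAB3HASHEDENTRY\n"
    "example.org ssh-rsa AAAAB3OTHERHOST\n"
)
print(host_keys(sample, "gitlab.com"))
```

The value this returns is the server's identity, which is a different thing from the deploy key's public half that gets registered in GitLab.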

I wish this error was visible from the SSH connection configuration screen as opposed to the Release Configuration details screen.

It appears that a search expression that will yield Dataform repository logs is:

protoPayload.serviceName="dataform.googleapis.com"


REPLIES

The error message indicates that Dataform cannot connect to your GitLab repository using the SSH credentials provided. Here are the steps to troubleshoot this issue:

  1. SSH Key Validity: Ensure that the SSH key is correctly configured and has not been revoked or replaced in GitLab. SSH keys do not typically expire unless explicitly set with an expiration date.

  2. Public SSH Key Registration: Register the public SSH key with GitLab, either in your profile's SSH Keys section or as a deploy key on the project. This is essential for SSH authentication.

  3. Service Account Access: Confirm that the Dataform service account has the roles/secretmanager.secretAccessor role so it can read the private SSH key in Secret Manager.

  4. Firewall Settings: Check for any IP restrictions or changes in the network configuration that could be blocking the connection, even if you have successfully cloned the repository manually.

  5. GitLab Server Status: While unlikely to be the issue if manual cloning works, verify GitLab's operational status to rule out any platform-wide problems.

  6. Network Access and Firewall Settings: If you've cloned the repository manually from the same environment, the network settings should be correct, but it's still worth a review.

  7. SSH Client and Network: If you've successfully cloned the repo, the current setup should be adequate, but trying a different SSH client or network can help isolate any potential issues.

  8. SSH Debugging: Use verbose logging during the SSH connection attempt (ssh -vvv) to gain detailed insights into any authentication issues.

  9. Dataform Logs: Examine Dataform logs in the Google Cloud console to identify the specific step where the connection is failing.

Thank you for your response, @ms4446!

I actually did all of steps 1-8 before I got to this point to verify everything was working, and it all seems correct. The one step I'm not sure of is Step 9: which logs to look at, and where the SSH connection results would be output. Any direction on that?

To examine the Dataform logs for more information about the SSH connection error, you can follow these steps:

Accessing and Analyzing Dataform Logs in Google Cloud Console:

  1. Locate the Logs Explorer: Sign in to the Google Cloud console with your credentials. Within the console, navigate to the "Operations" suite and select "Logging" to access the Logs Explorer.

  2. Filter by Dataform logs: Use the Logs Explorer search bar to filter logs for Dataform. If the filter resource.type="dataform_job" does not yield results, consult Dataform's documentation or Google Cloud support to determine the correct filter syntax for Dataform resources.

  3. Narrow down results: Apply additional filters to manage the volume of logs. For example, use severity=ERROR to isolate error-level logs. Adjust the severity level filter as needed to capture the appropriate level of log detail.

  4. Identify relevant log entries: Search for log entries that match the timeframe of your SSH connection attempt to GitLab. These logs should provide insights into the connection attempt and any errors that occurred.

  5. Analyze error messages: Examine the error messages to understand the cause of the connection failure. Look for details regarding authentication issues, authorization problems, network connectivity, or other errors.

  6. Check Secret Manager logs: Since your SSH key is stored in Secret Manager, review its logs to confirm that the Dataform service account is accessing the SSH key without issues.

  7. Obtain verbose SSH output: If possible, trigger a manual SSH connection attempt to get a verbose log output (ssh -vvv). This step may require access to the underlying infrastructure, such as a Compute Engine instance, and may not be directly available through Dataform.
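The narrowing steps above can be combined into a single Logs Explorer query. The sketch below builds one using the `protoPayload.serviceName="dataform.googleapis.com"` expression reported in this thread to return Dataform repository logs, plus a severity floor and a time window. The function name `dataform_log_filter` is made up for illustration, and whether a failed SSH connection is actually logged at ERROR severity is an assumption; lower the severity if nothing shows up.

```python
from datetime import datetime, timedelta, timezone


def dataform_log_filter(start: datetime, end: datetime, min_severity: str = "ERROR") -> str:
    """Build a Logs Explorer filter for Dataform logs in a time window.

    Illustrative helper. `start` and `end` must be timezone-aware datetimes.
    """
    def ts(t: datetime) -> str:
        # Logging filters accept RFC 3339 timestamps; emit them in UTC.
        return t.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

    clauses = [
        'protoPayload.serviceName="dataform.googleapis.com"',
        f"severity>={min_severity}",
        f'timestamp>="{ts(start)}"',
        f'timestamp<="{ts(end)}"',
    ]
    return " AND ".join(clauses)


# Example: a 30-minute window around a connection attempt (times are placeholders).
attempt = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(dataform_log_filter(attempt, attempt + timedelta(minutes=30)))
```

The printed string can be pasted into the Logs Explorer query box as-is.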
