Hi. My database migration project is to migrate from an AWS RDS PostgreSQL cluster to AlloyDB. I've connected the AWS and GCP VPCs using a VPN well enough that I can reach both the AWS and the GCP instances using psql (or telnet hostname 5432, etc.) from Compute Engine VM instances in any subnet of the two VPCs.
One more detail on the VPN: it is the HA type with dynamic routes, the same as documented at
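For anyone reproducing this setup, the reachability checks I ran from the VMs look roughly like the following sketch (the hostname and addresses are placeholders for your own endpoints):

```shell
# Placeholders: substitute your own endpoints.
RDS_HOST=my-rds.xxxxxx.us-east-1.rds.amazonaws.com   # hypothetical AWS RDS endpoint
ALLOYDB_IP=10.123.0.2                                # hypothetical AlloyDB private IP

# Bare TCP checks (nc is easier to script than telnet)
nc -vz -w 5 "$RDS_HOST" 5432
nc -vz -w 5 "$ALLOYDB_IP" 5432

# Full protocol check with the psql client
psql "host=$RDS_HOST port=5432 user=postgres dbname=postgres" -c 'SELECT 1'
```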
I understand you're encountering a connection timeout error during your AWS RDS PostgreSQL to AlloyDB migration using the Database Migration Service (DMS) with a VPC peering setup. This can indeed be challenging. Here are some steps for troubleshooting a connection timeout during an AWS RDS PostgreSQL to AlloyDB migration with VPC peering:

1. Network Accessibility:
- Use telnet or traceroute to test connectivity from a GCP VM to the RDS instance. Remember, ping might be disabled on AWS RDS.

2. DMS Specific Checks:
- Make sure the DMS job was created with a recommended version.
- Confirm the service account used by DMS has the necessary permissions for accessing AlloyDB instances, e.g. cloudsql.client.connect and cloudsql.instances.get for Cloud SQL.
- Utilize the DMS connectivity test feature to verify if DMS can connect to your source database. This will help discover if the issue is network-related or database-specific.

3. Additional Considerations:
- Stick with dynamic routing for the VPN.
- Review any VPC peering limitations or known issues mentioned in GCP documentation.
- Double-check the database user credentials provided to DMS for accuracy and necessary privileges.
- Consider network latency or bandwidth limitations that might impair the connection.
Hello. Thanks for the help.
To cover the parts in "1. Network Accessibility": I could run telnet, traceroute, etc. without issue. That was already implied by my being able to use the psql client, though. DNS is also working well; even the hostnames assigned within AWS resolve first time.
But, as a general point, this is all from a VM in the normal subnet of the VPC, not the "Private Services Access" IP range the AlloyDB instances are in.
Q. Where does the database migration service run from? Is it a part of the db daemon like it would be in normal Postgres? Or is it in its own VM or K8s container? If so where is that, and can I do these network tests from that?
For "2 DMS Specific Checks":
Regarding the DMS version: I've only just created the DMS job, so presumably it is a recommended version.
> Confirm the service account used by DMS has the necessary permissions for accessing AlloyDB instances.
No extra service account has been configured. Is this applicable?
A DMS connection profile with correct credentials (username & password) was created to connect to the AWS RDS Postgres source DB, of course.
Update: I remembered a point here: the encryption mode of "None" has been used in this connection profile. I don't have to choose a network encryption mode when using the psql client, so it didn't seem necessary, but it's a coin toss whether SSL is on or off by default with each separate client these days.
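(If you want to check what a given psql session actually negotiated, rather than guessing the client's default, something like this works; host and user below are placeholders:)

```shell
# \conninfo reports the SSL protocol and cipher when encryption is on.
psql "host=$RDS_HOST user=postgres dbname=postgres" -c '\conninfo'

# Or ask the server directly; pg_stat_ssl is available on PostgreSQL 9.5+.
psql "host=$RDS_HOST user=postgres dbname=postgres" \
  -c "SELECT ssl, version, cipher FROM pg_stat_ssl WHERE pid = pg_backend_pid();"
```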
It's a pity we can't just test connectivity from the DMS connection profile page. "Test" -> "Select Destination AlloyDB or Cloud SQL instance for network context" -> "Go".
> Utilize the DMS connectivity test feature to verify if DMS can connect to your source database. This will help discover if the issue is network-related or database-specific.
This was a new thing for me to try 👍. And it gave me a way to check with the IP range of the AlloyDB instance rather than the normal subnet 👍. Unfortunately nothing interesting has been discovered. The test result is "Reachable". In the detail view it shows steps from "Non-google Network" (actually the private service range for the google AlloyDB instance), to "Dynamic route", to VPN tunnel.
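For reference, what I believe is the CLI equivalent of that console test is roughly the following sketch (project, network, and all addresses are placeholders; flag names as I understand the gcloud network-management surface):

```shell
# Sketch of the Connectivity Test above, run from the CLI.
# All IDs and addresses are placeholders.
gcloud network-management connectivity-tests create dms-to-rds \
  --source-ip-address=10.123.0.2 \
  --source-network=projects/my-project/global/networks/my-vpc \
  --destination-ip-address=172.31.10.20 \
  --destination-port=5432 \
  --protocol=TCP

# Inspect the result, including the per-hop trace.
gcloud network-management connectivity-tests describe dms-to-rds
```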
For the "3. Additional Considerations" section:
I have, and will continue to follow, the advice to stick with the dynamic routing.
> Review any VPC peering limitations or known issues mentioned in GCP documentation.
I haven't found any in the reading so far.
> Double-check the database user credentials provided to DMS for accuracy and necessary privileges.
The credentials are the same as the successful test by command line on VMs.
> Consider network latency or bandwidth limitations that might impair the connection.
There's been none observed so far.
It's good to hear that you've successfully tested network accessibility and DNS resolution from a VM in the normal subnet of your VPC, and that you're using the latest version of the Database Migration Service (DMS). Addressing your specific questions and concerns:
- DMS Execution Context / Network Testing from DMS Context: DMS runs as a managed Google Cloud service rather than inside the database daemon, so you cannot run tools (such as telnet or traceroute) directly from its execution context. However, the "Reachable" status from the DMS connectivity test indicates that DMS can communicate with your AWS RDS instance, which is a positive sign.
- Service Account Configuration:
- Encryption Mode:
- Testing Connectivity from DMS:
- Stick with Dynamic Routing:
- VPC Peering Limitations:
- Database User Credentials:
- Network Latency or Bandwidth:
Given your thorough testing and the results you've observed, here are some additional steps you might consider:
- SSL Encryption:
- Detailed Logs and Monitoring:
- GCP Support:
- Incremental Troubleshooting:
- Review RDS Instance Settings:
I'd like to double-check something just as one step by itself.
The last two replies mentioned specifically a "DMS Connectivity Test". I used the Connectivity Test which exists in the Network Intelligence service. Is there a specific Database Migration Service connectivity test that is different?
Sorry for the confusion. To clarify, DMS in Google Cloud does not have a separate, standalone connectivity test feature specific to DMS itself. The Connectivity Test you used from the Network Intelligence service in Google Cloud is the appropriate tool for testing network connectivity and configurations, including those relevant to DMS.
The Network Intelligence Center's Connectivity Test is designed to diagnose network issues across Google Cloud VPCs and on-premises networks. It helps you understand network configurations and connectivity for services running in the cloud, which is relevant for your use case with DMS.
In the context of DMS, when I referred to a "DMS Connectivity Test," it was meant to suggest using available tools within GCP (like the Network Intelligence Center's Connectivity Test) to ensure that the network path from your DMS setup to the source (AWS RDS) and destination (AlloyDB) databases is correctly configured and not encountering any blockages.
Since you've already used the Connectivity Test from the Network Intelligence service and it shows "Reachable," it indicates that the network path is correctly set up for the DMS to communicate with your AWS RDS instance. The issue with the DMS job might lie elsewhere, possibly in the configuration of the DMS job itself, database settings, or permissions.
If you continue to face issues with the DMS job, I recommend reviewing the configuration of the DMS job itself, the database settings, and the permissions, since the network path appears correct.
I've resolved the issue. The root cause was an AWS Security Group filter: it needs to be expanded to also allow the IP range(s) of the Google "Private Services Access" in your GCP VPC. The route tables in AWS and GCP were being updated automatically and correctly by my VPN configuration, but the security group is a separate matter.
For interested readers I'll describe how I diagnosed this.
I looked in the AWS RDS Postgres instance's log and found that there were no log messages at all when I attempted to start the DMS job. I could make a connection with a deliberately bad password (successful connections are silent at the default log verbosity) at, say, 5:42, then attempt the DMS job start, wait until it failed at, say, 5:45, then make another bad connection attempt from a normal VM at, say, 5:48, and the only things logged were the 5:42 and 5:48 failures. So I was certain enough at this point that the TCP traffic from GCP's DMS was being blocked.
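In script form, the marker technique is just the following ("baduser" is a deliberately non-existent account; the only point is to leave known timestamps in the RDS log):

```shell
# First marker: a failed login the RDS log is guaranteed to record.
date
PGPASSWORD=wrong psql "host=$RDS_HOST user=baduser dbname=postgres" -c 'SELECT 1'

# ... start the DMS job from the console here and wait for it to fail ...

# Second marker.
date
PGPASSWORD=wrong psql "host=$RDS_HOST user=baduser dbname=postgres" -c 'SELECT 1'

# If the RDS log contains only the two marker failures and nothing in
# between them, the DMS traffic never reached the database.
```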
Then I realized that an AWS Reachability Analyzer analysis was the other half of the network route investigation that needed to be done. The GCP Connectivity Test only checks the route as far as the VPN tunnel on the GCP end. The AWS Reachability analysis is likewise partial: it only checks what happens within the AWS side of the VPN.
What it found, but only once I had included destination port = 5432 and source IP = the private IP address of the AlloyDB instance as optional packet headers, was that the security group was blocking that traffic. It already permitted port 5432 traffic from the normal GCP subnet ranges, but the GCP Private Services Access CIDR, the one for "servicenetworking-googleapis-com" which the AlloyDB services are in, is a different range.
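The CLI form of that analysis is roughly the sketch below (all resource IDs and addresses are placeholders; the source is the virtual private gateway the VPN terminates on and the destination is the RDS instance's network interface):

```shell
# Create a path description: VPN gateway -> RDS ENI, TCP to port 5432,
# with the AlloyDB PSA address as the packet's source IP.
PATH_ID=$(aws ec2 create-network-insights-path \
  --source vgw-0123456789abcdef0 \
  --destination eni-0123456789abcdef0 \
  --source-ip 10.123.0.2 \
  --destination-port 5432 \
  --protocol tcp \
  --query 'NetworkInsightsPath.NetworkInsightsPathId' --output text)

# Run the analysis; the result identifies the blocking component
# (in my case, the security group).
aws ec2 start-network-insights-analysis --network-insights-path-id "$PATH_ID"
```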
Once I added an extra ingress filter rule to the AWS security group, the connections started getting through.
The DMS connection in the AWS RDS Postgres log shows its IP address was in the CIDR of the "Private Services Access" range that the AlloyDB instances are in. A new IP address, but close to them: xx.xx.xx.9 instead of xx.xx.xx.2 in my case.
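The fix itself is one extra rule. A sketch (the security group ID and CIDR are placeholders; the PSA range can be read off the allocated VPC_PEERING address on the GCP side):

```shell
# On GCP: list the allocated Private Services Access range(s), i.e. the
# address block peered to servicenetworking.googleapis.com.
gcloud compute addresses list --global --filter='purpose=VPC_PEERING'

# On AWS: allow that CIDR into the RDS security group on the Postgres port.
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp --port 5432 \
  --cidr 10.123.0.0/24
```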
With some further testing, just out of curiosity, I found the DMS connections get through TCP-wise whether or not the DMS connection profile uses encryption. With AWS RDS Postgres you won't be totally successful with "None" as the encryption option; it will reject it, but you'll see the rejection message in the AWS RDS Postgres log.
Using SSL in a DMS connection profile requires uploading or pasting the CA cert. I found that the PEM file for my region, supplied at https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.SSL.html#UsingWithRDS.SSL.Region..., works for this.
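For example, fetching and checking the regional bundle before pasting it into the DMS connection profile (the region shown is just an example; the truststore URL pattern is from that AWS doc page, and host/user are placeholders):

```shell
# Download the regional CA bundle (us-east-1 shown as an example region).
curl -fsSLO https://truststore.pki.rds.amazonaws.com/us-east-1/us-east-1-bundle.pem

# Verify the chain end-to-end with psql; \conninfo shows the negotiated
# SSL protocol and cipher on success.
psql "host=$RDS_HOST sslmode=verify-full sslrootcert=us-east-1-bundle.pem user=postgres dbname=postgres" -c '\conninfo'
```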
Thank you for sharing your solution and insights!