Get hands-on experience with 20+ free Google Cloud products and $300 in free credit for new customers.

Unable to connect to Google MySQL using JDBC Connector from Amazon EMR with SSL certificates

Hi There,

    We are provided with three pem SSL files to connect to Google MySQL and we are using MySQL Java Connector (mysql-connector-j-8.1.0.jar) to connect using JDBC connections. We're using PySPARK code on an Amazon EMR. The below is code we are using to connect to Google MySQL:

 

# Configure JDBC connection properties
jdbc_url = "jdbc:mysql://11.11.111.111:3306/db_name?useSSL=true"
connection_properties = {
"user": "user",
"password": "pwd",
"driver": "com.mysql.cj.jdbc.Driver",
"sslCa": "server-ca.pem", # Path to the SSL CA certificate
"sslCert": "client-cert.pem", # Path to the client certificate
"sslKey": "client-key.pem", # Path to the client private key
"useSSL": "true",
"sslMode": "VERIFY_CA",
"verifyServerCertificate": "true"
}

# Read data from MySQL table into a DataFrame
jdbc_df = spark.read.jdbc(url=jdbc_url, table="Session", properties=connection_properties)

 

But we are getting the below error when running the above command:

 

>>> jdbc_df = spark.read.jdbc(url=jdbc_url, table="Session", properties=connection_properties)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/spark/python/pyspark/sql/readwriter.py", line 716, in jdbc
return self._df(self._jreader.jdbc(url, table, jprop))
File "/usr/lib/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1322, in __call__
File "/usr/lib/spark/python/pyspark/sql/utils.py", line 190, in deco
return f(*a, **kw)
File "/usr/lib/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o76.jdbc.
: com.mysql.cj.jdbc.exceptions.CommunicationsException: Communications link failure

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
at com.mysql.cj.jdbc.exceptions.SQLError.createCommunicationsException(SQLError.java:175)
at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:64)
at com.mysql.cj.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:819)
at com.mysql.cj.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:440)
at com.mysql.cj.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:239)
at com.mysql.cj.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:188)
at org.apache.spark.sql.execution.datasources.jdbc.connection.BasicConnectionProvider.getConnection(BasicConnectionProvider.scala:49)
at org.apache.spark.sql.execution.datasources.jdbc.connection.ConnectionProviderBase.create(ConnectionProvider.scala:102)
at org.apache.spark.sql.jdbc.JdbcDialect.$anonfun$createConnectionFactory$1(JdbcDialects.scala:122)
at org.apache.spark.sql.jdbc.JdbcDialect.$anonfun$createConnectionFactory$1$adapted(JdbcDialects.scala:118)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.getQueryOutputSchema(JDBCRDD.scala:63)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:58)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.getSchema(JDBCRelation.scala:242)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:37)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:350)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:228)
at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:210)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:210)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:171)
at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:248)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
at java.lang.Thread.run(Thread.java:750)
Caused by: com.mysql.cj.exceptions.CJCommunicationsException: Communications link failure

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:62)
at com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:105)
at com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:150)
at com.mysql.cj.exceptions.ExceptionFactory.createCommunicationsException(ExceptionFactory.java:166)
at com.mysql.cj.protocol.a.NativeProtocol.negotiateSSLConnection(NativeProtocol.java:379)
at com.mysql.cj.protocol.a.NativeAuthenticationProvider.connect(NativeAuthenticationProvider.java:206)
at com.mysql.cj.protocol.a.NativeProtocol.connect(NativeProtocol.java:1430)
at com.mysql.cj.NativeSession.connect(NativeSession.java:134)
at com.mysql.cj.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:939)
at com.mysql.cj.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:809)
... 30 more
Caused by: javax.net.ssl.SSLHandshakeException: java.security.cert.CertPathValidatorException: Path does not chain with any of the trust anchors
at sun.security.ssl.Alert.createSSLException(Alert.java:131)
at sun.security.ssl.TransportContext.fatal(TransportContext.java:331)
at sun.security.ssl.TransportContext.fatal(TransportContext.java:274)
at sun.security.ssl.TransportContext.fatal(TransportContext.java:269)
at sun.security.ssl.CertificateMessage$T12CertificateConsumer.checkServerCerts(CertificateMessage.java:654)
at sun.security.ssl.CertificateMessage$T12CertificateConsumer.onCertificate(CertificateMessage.java:473)
at sun.security.ssl.CertificateMessage$T12CertificateConsumer.consume(CertificateMessage.java:369)
at sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:377)
at sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:444)
at sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:422)
at sun.security.ssl.TransportContext.dispatch(TransportContext.java:182)
at sun.security.ssl.SSLTransport.decode(SSLTransport.java:152)
at sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1397)
at sun.security.ssl.SSLSocketImpl.readHandshakeRecord(SSLSocketImpl.java:1305)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:440)
at com.mysql.cj.protocol.ExportControlled.performTlsHandshake(ExportControlled.java:214)
at com.mysql.cj.protocol.StandardSocketFactory.performTlsHandshake(StandardSocketFactory.java:191)
at com.mysql.cj.protocol.a.NativeSocketConnection.performTlsHandshake(NativeSocketConnection.java:101)
at com.mysql.cj.protocol.a.NativeProtocol.negotiateSSLConnection(NativeProtocol.java:370)
... 35 more
Caused by: java.security.cert.CertificateException: java.security.cert.CertPathValidatorException: Path does not chain with any of the trust anchors
at com.mysql.cj.protocol.ExportControlled$X509TrustManagerWrapper.checkServerTrusted(ExportControlled.java:697)
at sun.security.ssl.AbstractTrustManagerWrapper.checkServerTrusted(SSLContextImpl.java:1258)
at sun.security.ssl.CertificateMessage$T12CertificateConsumer.checkServerCerts(CertificateMessage.java:638)
... 49 more
Caused by: java.security.cert.CertPathValidatorException: Path does not chain with any of the trust anchors
at sun.security.provider.certpath.PKIXCertPathValidator.validate(PKIXCertPathValidator.java:153)
at sun.security.provider.certpath.PKIXCertPathValidator.engineValidate(PKIXCertPathValidator.java:79)
at java.security.cert.CertPathValidator.validate(CertPathValidator.java:292)
at com.mysql.cj.protocol.ExportControlled$X509TrustManagerWrapper.checkServerTrusted(ExportControlled.java:691)

 


We have even verified the below documentation but couldn't get our problem resolved:

MySQL documentation: https://dev.mysql.com/doc/connector-j/8.1/en/connector-j-client-authentication.html

Google MySQL Connector: https://github.com/GoogleCloudPlatform/cloud-sql-jdbc-socket-factory/blob/main/docs/jdbc-mysql.md. In this Google documentation it only explains IAM Authentication, but it does not mention anything about SSL certificate verification process.

Any help provided is appreciated!

Thanks,

Avinash

 

Solved Solved
0 2 3,552
2 ACCEPTED SOLUTIONS

Here are some specific steps you can follow to try each of these solutions:

If the connection issue is due to an untrusted certificate, you can manually add it to the JRE's truststore.

To add the SSL certificate of the Google Cloud SQL instance to the JRE's truststore, follow these steps:

  1. Download the SSL certificate of the Google Cloud SQL instance. You can do this by running the following command on the Amazon EMR cluster:

openssl s_client -showcerts -servername <google_cloud_sql_instance_hostname> -connect <google_cloud_sql_instance_hostname>:<google_cloud_sql_instance_port> | openssl x509 -out server.crt

  1. Import the SSL certificate into the JRE's truststore. You can do this by running the following command on the Amazon EMR cluster:

keytool -import -trustcacerts -file server.crt -alias google_cloud_sql -keystore <jre_truststore_path>

Note: The keytool command may require additional flags or changes based on the specific environment. For example, you may need to ensure that you have the appropriate permissions to modify the JRE's truststore, or you may need to specify the location of the truststore using the -keystore option. The <jre_truststore_path> is typically $JAVA_HOME/jre/lib/security/cacerts. You may also need to provide a password, which is typically changeit for the default Java truststore.

SSL Modes

The VERIFY_IDENTITY mode checks both the server's CA and if the hostname matches the name in the server's certificate. The VERIFY_CA mode checks only the server's CA.

Switching between SSL modes should be done with an understanding of these implications. For example, if you switch to the VERIFY_IDENTITY mode and the hostname in the server's certificate does not match the hostname of the Google Cloud SQL instance, you will not be able to connect.

Disabling SSL Verification

Disabling SSL verification can make the connection insecure. This option should be used only for debugging or in a controlled, non-production environment where security risks have been considered.

Potential Additional Debugging Steps

If you are still unable to connect to the Google Cloud SQL instance after trying the above solutions, you can try the following:

  • Check the networking between the Amazon EMR and the Google Cloud SQL instance, such as firewall rules or VPC settings.
  • Ensure that the Google Cloud SQL instance allows connections from the IP address of the Amazon EMR instance.
  • Consider using a tool like telnet or nc to test raw connectivity to the Google Cloud SQL instance port. This can help identify if the issue is at the network level or at the SSL/Google Cloud SQL level.
Documentation

Always ensure that you are looking at the most up-to-date documentation for your specific versions of Google Cloud SQL, the JDBC driver, and Java, as features and behavior can change over time.

 

View solution in original post

I already had the server-ca.pem, client-cert.pem and client-key.pem files. But I was having trouble making jdbc calls with the three pem files. 

So I followed the MysSQL documentation and created the truststore file from server-ca.pem file for server authentication, the below is the documentation:

MySQL :: MySQL Connector/J 8.1 Developer Guide :: 6.9.1 Setting up Server Authentication

Then I created keystore file from client-cert.pem and client-key.pem files as per the MySQL documentation:

MySQL :: MySQL Connector/J 8.1 Developer Guide :: 6.9.2 Setting up Client Authentication

Finally once the truststore and keystore files are ready we need copy the files to both driver and executors because we are working on distributed systems like EMR. So to copy files to both driver and executors, I had copied the truststore and keystore files to a S3 bucket and launched another EMR with a bootstrap script below to copy to a same path in both executors and driver. The bootstrap script is as below:

#!/bin/bash

aws s3 cp s3://test-bucket-us-east-2/truststore /home/hadoop
sudo chmod +x /home/hadoop/truststore


aws s3 cp s3://test-bucket-us-east-2/keystore /home/hadoop
sudo chmod +x /home/hadoop/keystore

Finally once the files are ready in the EMR I started a pyspark shell with the below command:

pyspark --jars mysql-connector-j-8.1.0.jar ## You can download the MySQL Connector/J jar file from this location https://downloads.mysql.com/archives/c-j/

I used this MySQL Connector/J - mysql-connector-j-8.1.0.jar

Finally in the pyspark shell I ran the below spark JDBC connection command to extract data from a MySQL table hosted on GCP:

df = spark.read \
.format("jdbc") \
.option("driver","com.mysql.cj.jdbc.Driver") \
.option("url", "jdbc:mysql://<MySQL hostname or IP>:<Port>/<db_name>") \
.option("dbtable", "table_name") \
.option("user", "user_id") \
.option("password", "pwd") \
.option("sslMode", "VERIFY_CA") \ # This needs to be changed basing on your requirement. This determines what type of SSL authentication happens, so this is important
.option("trustCertificateKeyStoreUrl", "file:/path/to/truststore") \
.option("trustCertificateKeyStorePassword", "truststore_pwd") \
.option("clientCertificateKeyStoreUrl", "file:/path/to/keystore") \
.option("clientCertificateKeyStorePassword", "keystore_pwd") \
.load()

View solution in original post

2 REPLIES 2

Here are some specific steps you can follow to try each of these solutions:

If the connection issue is due to an untrusted certificate, you can manually add it to the JRE's truststore.

To add the SSL certificate of the Google Cloud SQL instance to the JRE's truststore, follow these steps:

  1. Download the SSL certificate of the Google Cloud SQL instance. You can do this by running the following command on the Amazon EMR cluster:

openssl s_client -showcerts -servername <google_cloud_sql_instance_hostname> -connect <google_cloud_sql_instance_hostname>:<google_cloud_sql_instance_port> | openssl x509 -out server.crt

  1. Import the SSL certificate into the JRE's truststore. You can do this by running the following command on the Amazon EMR cluster:

keytool -import -trustcacerts -file server.crt -alias google_cloud_sql -keystore <jre_truststore_path>

Note: The keytool command may require additional flags or changes based on the specific environment. For example, you may need to ensure that you have the appropriate permissions to modify the JRE's truststore, or you may need to specify the location of the truststore using the -keystore option. The <jre_truststore_path> is typically $JAVA_HOME/jre/lib/security/cacerts. You may also need to provide a password, which is typically changeit for the default Java truststore.

SSL Modes

The VERIFY_IDENTITY mode checks both the server's CA and if the hostname matches the name in the server's certificate. The VERIFY_CA mode checks only the server's CA.

Switching between SSL modes should be done with an understanding of these implications. For example, if you switch to the VERIFY_IDENTITY mode and the hostname in the server's certificate does not match the hostname of the Google Cloud SQL instance, you will not be able to connect.

Disabling SSL Verification

Disabling SSL verification can make the connection insecure. This option should be used only for debugging or in a controlled, non-production environment where security risks have been considered.

Potential Additional Debugging Steps

If you are still unable to connect to the Google Cloud SQL instance after trying the above solutions, you can try the following:

  • Check the networking between the Amazon EMR and the Google Cloud SQL instance, such as firewall rules or VPC settings.
  • Ensure that the Google Cloud SQL instance allows connections from the IP address of the Amazon EMR instance.
  • Consider using a tool like telnet or nc to test raw connectivity to the Google Cloud SQL instance port. This can help identify if the issue is at the network level or at the SSL/Google Cloud SQL level.
Documentation

Always ensure that you are looking at the most up-to-date documentation for your specific versions of Google Cloud SQL, the JDBC driver, and Java, as features and behavior can change over time.

 

I already had the server-ca.pem, client-cert.pem and client-key.pem files. But I was having trouble making jdbc calls with the three pem files. 

So I followed the MysSQL documentation and created the truststore file from server-ca.pem file for server authentication, the below is the documentation:

MySQL :: MySQL Connector/J 8.1 Developer Guide :: 6.9.1 Setting up Server Authentication

Then I created keystore file from client-cert.pem and client-key.pem files as per the MySQL documentation:

MySQL :: MySQL Connector/J 8.1 Developer Guide :: 6.9.2 Setting up Client Authentication

Finally once the truststore and keystore files are ready we need copy the files to both driver and executors because we are working on distributed systems like EMR. So to copy files to both driver and executors, I had copied the truststore and keystore files to a S3 bucket and launched another EMR with a bootstrap script below to copy to a same path in both executors and driver. The bootstrap script is as below:

#!/bin/bash

aws s3 cp s3://test-bucket-us-east-2/truststore /home/hadoop
sudo chmod +x /home/hadoop/truststore


aws s3 cp s3://test-bucket-us-east-2/keystore /home/hadoop
sudo chmod +x /home/hadoop/keystore

Finally once the files are ready in the EMR I started a pyspark shell with the below command:

pyspark --jars mysql-connector-j-8.1.0.jar ## You can download the MySQL Connector/J jar file from this location https://downloads.mysql.com/archives/c-j/

I used this MySQL Connector/J - mysql-connector-j-8.1.0.jar

Finally in the pyspark shell I ran the below spark JDBC connection command to extract data from a MySQL table hosted on GCP:

df = spark.read \
.format("jdbc") \
.option("driver","com.mysql.cj.jdbc.Driver") \
.option("url", "jdbc:mysql://<MySQL hostname or IP>:<Port>/<db_name>") \
.option("dbtable", "table_name") \
.option("user", "user_id") \
.option("password", "pwd") \
.option("sslMode", "VERIFY_CA") \ # This needs to be changed basing on your requirement. This determines what type of SSL authentication happens, so this is important
.option("trustCertificateKeyStoreUrl", "file:/path/to/truststore") \
.option("trustCertificateKeyStorePassword", "truststore_pwd") \
.option("clientCertificateKeyStoreUrl", "file:/path/to/keystore") \
.option("clientCertificateKeyStorePassword", "keystore_pwd") \
.load()