Hi There,
We are provided with three pem SSL files to connect to Google MySQL and we are using MySQL Java Connector (mysql-connector-j-8.1.0.jar) to connect using JDBC connections. We're using PySPARK code on an Amazon EMR. The below is code we are using to connect to Google MySQL:
# Configure JDBC connection properties
jdbc_url = "jdbc:mysql://11.11.111.111:3306/db_name?useSSL=true"
connection_properties = {
"user": "user",
"password": "pwd",
"driver": "com.mysql.cj.jdbc.Driver",
"sslCa": "server-ca.pem", # Path to the SSL CA certificate
"sslCert": "client-cert.pem", # Path to the client certificate
"sslKey": "client-key.pem", # Path to the client private key
"useSSL": "true",
"sslMode": "VERIFY_CA",
"verifyServerCertificate": "true"
}
# Read data from MySQL table into a DataFrame
jdbc_df = spark.read.jdbc(url=jdbc_url, table="Session", properties=connection_properties)
But we are getting the below error when running the above command:
>>> jdbc_df = spark.read.jdbc(url=jdbc_url, table="Session", properties=connection_properties)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/spark/python/pyspark/sql/readwriter.py", line 716, in jdbc
return self._df(self._jreader.jdbc(url, table, jprop))
File "/usr/lib/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1322, in __call__
File "/usr/lib/spark/python/pyspark/sql/utils.py", line 190, in deco
return f(*a, **kw)
File "/usr/lib/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o76.jdbc.
: com.mysql.cj.jdbc.exceptions.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
at com.mysql.cj.jdbc.exceptions.SQLError.createCommunicationsException(SQLError.java:175)
at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:64)
at com.mysql.cj.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:819)
at com.mysql.cj.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:440)
at com.mysql.cj.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:239)
at com.mysql.cj.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:188)
at org.apache.spark.sql.execution.datasources.jdbc.connection.BasicConnectionProvider.getConnection(BasicConnectionProvider.scala:49)
at org.apache.spark.sql.execution.datasources.jdbc.connection.ConnectionProviderBase.create(ConnectionProvider.scala:102)
at org.apache.spark.sql.jdbc.JdbcDialect.$anonfun$createConnectionFactory$1(JdbcDialects.scala:122)
at org.apache.spark.sql.jdbc.JdbcDialect.$anonfun$createConnectionFactory$1$adapted(JdbcDialects.scala:118)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.getQueryOutputSchema(JDBCRDD.scala:63)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:58)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.getSchema(JDBCRelation.scala:242)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:37)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:350)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:228)
at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:210)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:210)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:171)
at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:248)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
at java.lang.Thread.run(Thread.java:750)
Caused by: com.mysql.cj.exceptions.CJCommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:62)
at com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:105)
at com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:150)
at com.mysql.cj.exceptions.ExceptionFactory.createCommunicationsException(ExceptionFactory.java:166)
at com.mysql.cj.protocol.a.NativeProtocol.negotiateSSLConnection(NativeProtocol.java:379)
at com.mysql.cj.protocol.a.NativeAuthenticationProvider.connect(NativeAuthenticationProvider.java:206)
at com.mysql.cj.protocol.a.NativeProtocol.connect(NativeProtocol.java:1430)
at com.mysql.cj.NativeSession.connect(NativeSession.java:134)
at com.mysql.cj.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:939)
at com.mysql.cj.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:809)
... 30 more
Caused by: javax.net.ssl.SSLHandshakeException: java.security.cert.CertPathValidatorException: Path does not chain with any of the trust anchors
at sun.security.ssl.Alert.createSSLException(Alert.java:131)
at sun.security.ssl.TransportContext.fatal(TransportContext.java:331)
at sun.security.ssl.TransportContext.fatal(TransportContext.java:274)
at sun.security.ssl.TransportContext.fatal(TransportContext.java:269)
at sun.security.ssl.CertificateMessage$T12CertificateConsumer.checkServerCerts(CertificateMessage.java:654)
at sun.security.ssl.CertificateMessage$T12CertificateConsumer.onCertificate(CertificateMessage.java:473)
at sun.security.ssl.CertificateMessage$T12CertificateConsumer.consume(CertificateMessage.java:369)
at sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:377)
at sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:444)
at sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:422)
at sun.security.ssl.TransportContext.dispatch(TransportContext.java:182)
at sun.security.ssl.SSLTransport.decode(SSLTransport.java:152)
at sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1397)
at sun.security.ssl.SSLSocketImpl.readHandshakeRecord(SSLSocketImpl.java:1305)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:440)
at com.mysql.cj.protocol.ExportControlled.performTlsHandshake(ExportControlled.java:214)
at com.mysql.cj.protocol.StandardSocketFactory.performTlsHandshake(StandardSocketFactory.java:191)
at com.mysql.cj.protocol.a.NativeSocketConnection.performTlsHandshake(NativeSocketConnection.java:101)
at com.mysql.cj.protocol.a.NativeProtocol.negotiateSSLConnection(NativeProtocol.java:370)
... 35 more
Caused by: java.security.cert.CertificateException: java.security.cert.CertPathValidatorException: Path does not chain with any of the trust anchors
at com.mysql.cj.protocol.ExportControlled$X509TrustManagerWrapper.checkServerTrusted(ExportControlled.java:697)
at sun.security.ssl.AbstractTrustManagerWrapper.checkServerTrusted(SSLContextImpl.java:1258)
at sun.security.ssl.CertificateMessage$T12CertificateConsumer.checkServerCerts(CertificateMessage.java:638)
... 49 more
Caused by: java.security.cert.CertPathValidatorException: Path does not chain with any of the trust anchors
at sun.security.provider.certpath.PKIXCertPathValidator.validate(PKIXCertPathValidator.java:153)
at sun.security.provider.certpath.PKIXCertPathValidator.engineValidate(PKIXCertPathValidator.java:79)
at java.security.cert.CertPathValidator.validate(CertPathValidator.java:292)
at com.mysql.cj.protocol.ExportControlled$X509TrustManagerWrapper.checkServerTrusted(ExportControlled.java:691)
We have even verified the below documentation but couldn't get our problem resolved:
MySQL documentation: https://dev.mysql.com/doc/connector-j/8.1/en/connector-j-client-authentication.html
Google MySQL Connector: https://github.com/GoogleCloudPlatform/cloud-sql-jdbc-socket-factory/blob/main/docs/jdbc-mysql.md. In this Google documentation it only explains IAM Authentication, but it does not mention anything about SSL certificate verification process.
Any help provided is appreciated!
Thanks,
Avinash
Solved! Go to Solution.
Here are some specific steps you can follow to try each of these solutions:
If the connection issue is due to an untrusted certificate, you can manually add it to the JRE's truststore.
To add the SSL certificate of the Google Cloud SQL instance to the JRE's truststore, follow these steps:
Download the SSL certificate of the Google Cloud SQL instance. You can do this by running the following command on the Amazon EMR cluster:
openssl s_client -showcerts -servername <google_cloud_sql_instance_hostname> -connect <google_cloud_sql_instance_hostname>:<google_cloud_sql_instance_port> | openssl x509 -out server.crt
Import the SSL certificate into the JRE's truststore. You can do this by running the following command on the Amazon EMR cluster:
keytool -import -trustcacerts -file server.crt -alias google_cloud_sql -keystore <jre_truststore_path>
Note: The keytool
command may require additional flags or changes based on the specific environment. For example, you may need to ensure that you have the appropriate permissions to modify the JRE's truststore, or you may need to specify the location of the truststore using the -keystore
option. The <jre_truststore_path>
is typically $JAVA_HOME/jre/lib/security/cacerts
. You may also need to provide a password, which is typically changeit
for the default Java truststore.
The VERIFY_IDENTITY
mode checks both the server's CA and if the hostname matches the name in the server's certificate. The VERIFY_CA
mode checks only the server's CA.
Switching between SSL modes should be done with an understanding of these implications. For example, if you switch to the VERIFY_IDENTITY
mode and the hostname in the server's certificate does not match the hostname of the Google Cloud SQL instance, you will not be able to connect.
Disabling SSL verification can make the connection insecure. This option should be used only for debugging or in a controlled, non-production environment where security risks have been considered.
If you are still unable to connect to the Google Cloud SQL instance after trying the above solutions, you can try the following:
telnet
or nc
to test raw connectivity to the Google Cloud SQL instance port. This can help identify if the issue is at the network level or at the SSL/Google Cloud SQL level.Always ensure that you are looking at the most up-to-date documentation for your specific versions of Google Cloud SQL, the JDBC driver, and Java, as features and behavior can change over time.
I already had the server-ca.pem, client-cert.pem and client-key.pem files. But I was having trouble making jdbc calls with the three pem files.
So I followed the MysSQL documentation and created the truststore file from server-ca.pem file for server authentication, the below is the documentation:
MySQL :: MySQL Connector/J 8.1 Developer Guide :: 6.9.1 Setting up Server Authentication
Then I created keystore file from client-cert.pem and client-key.pem files as per the MySQL documentation:
MySQL :: MySQL Connector/J 8.1 Developer Guide :: 6.9.2 Setting up Client Authentication
Finally once the truststore and keystore files are ready we need copy the files to both driver and executors because we are working on distributed systems like EMR. So to copy files to both driver and executors, I had copied the truststore and keystore files to a S3 bucket and launched another EMR with a bootstrap script below to copy to a same path in both executors and driver. The bootstrap script is as below:
#!/bin/bash
aws s3 cp s3://test-bucket-us-east-2/truststore /home/hadoop
sudo chmod +x /home/hadoop/truststore
aws s3 cp s3://test-bucket-us-east-2/keystore /home/hadoop
sudo chmod +x /home/hadoop/keystore
Finally once the files are ready in the EMR I started a pyspark shell with the below command:
pyspark --jars mysql-connector-j-8.1.0.jar ## You can download the MySQL Connector/J jar file from this location https://downloads.mysql.com/archives/c-j/
I used this MySQL Connector/J - mysql-connector-j-8.1.0.jar
Finally in the pyspark shell I ran the below spark JDBC connection command to extract data from a MySQL table hosted on GCP:
df = spark.read \
.format("jdbc") \
.option("driver","com.mysql.cj.jdbc.Driver") \
.option("url", "jdbc:mysql://<MySQL hostname or IP>:<Port>/<db_name>") \
.option("dbtable", "table_name") \
.option("user", "user_id") \
.option("password", "pwd") \
.option("sslMode", "VERIFY_CA") \ # This needs to be changed basing on your requirement. This determines what type of SSL authentication happens, so this is important
.option("trustCertificateKeyStoreUrl", "file:/path/to/truststore") \
.option("trustCertificateKeyStorePassword", "truststore_pwd") \
.option("clientCertificateKeyStoreUrl", "file:/path/to/keystore") \
.option("clientCertificateKeyStorePassword", "keystore_pwd") \
.load()
Here are some specific steps you can follow to try each of these solutions:
If the connection issue is due to an untrusted certificate, you can manually add it to the JRE's truststore.
To add the SSL certificate of the Google Cloud SQL instance to the JRE's truststore, follow these steps:
Download the SSL certificate of the Google Cloud SQL instance. You can do this by running the following command on the Amazon EMR cluster:
openssl s_client -showcerts -servername <google_cloud_sql_instance_hostname> -connect <google_cloud_sql_instance_hostname>:<google_cloud_sql_instance_port> | openssl x509 -out server.crt
Import the SSL certificate into the JRE's truststore. You can do this by running the following command on the Amazon EMR cluster:
keytool -import -trustcacerts -file server.crt -alias google_cloud_sql -keystore <jre_truststore_path>
Note: The keytool
command may require additional flags or changes based on the specific environment. For example, you may need to ensure that you have the appropriate permissions to modify the JRE's truststore, or you may need to specify the location of the truststore using the -keystore
option. The <jre_truststore_path>
is typically $JAVA_HOME/jre/lib/security/cacerts
. You may also need to provide a password, which is typically changeit
for the default Java truststore.
The VERIFY_IDENTITY
mode checks both the server's CA and if the hostname matches the name in the server's certificate. The VERIFY_CA
mode checks only the server's CA.
Switching between SSL modes should be done with an understanding of these implications. For example, if you switch to the VERIFY_IDENTITY
mode and the hostname in the server's certificate does not match the hostname of the Google Cloud SQL instance, you will not be able to connect.
Disabling SSL verification can make the connection insecure. This option should be used only for debugging or in a controlled, non-production environment where security risks have been considered.
If you are still unable to connect to the Google Cloud SQL instance after trying the above solutions, you can try the following:
telnet
or nc
to test raw connectivity to the Google Cloud SQL instance port. This can help identify if the issue is at the network level or at the SSL/Google Cloud SQL level.Always ensure that you are looking at the most up-to-date documentation for your specific versions of Google Cloud SQL, the JDBC driver, and Java, as features and behavior can change over time.
I already had the server-ca.pem, client-cert.pem and client-key.pem files. But I was having trouble making jdbc calls with the three pem files.
So I followed the MysSQL documentation and created the truststore file from server-ca.pem file for server authentication, the below is the documentation:
MySQL :: MySQL Connector/J 8.1 Developer Guide :: 6.9.1 Setting up Server Authentication
Then I created keystore file from client-cert.pem and client-key.pem files as per the MySQL documentation:
MySQL :: MySQL Connector/J 8.1 Developer Guide :: 6.9.2 Setting up Client Authentication
Finally once the truststore and keystore files are ready we need copy the files to both driver and executors because we are working on distributed systems like EMR. So to copy files to both driver and executors, I had copied the truststore and keystore files to a S3 bucket and launched another EMR with a bootstrap script below to copy to a same path in both executors and driver. The bootstrap script is as below:
#!/bin/bash
aws s3 cp s3://test-bucket-us-east-2/truststore /home/hadoop
sudo chmod +x /home/hadoop/truststore
aws s3 cp s3://test-bucket-us-east-2/keystore /home/hadoop
sudo chmod +x /home/hadoop/keystore
Finally once the files are ready in the EMR I started a pyspark shell with the below command:
pyspark --jars mysql-connector-j-8.1.0.jar ## You can download the MySQL Connector/J jar file from this location https://downloads.mysql.com/archives/c-j/
I used this MySQL Connector/J - mysql-connector-j-8.1.0.jar
Finally in the pyspark shell I ran the below spark JDBC connection command to extract data from a MySQL table hosted on GCP:
df = spark.read \
.format("jdbc") \
.option("driver","com.mysql.cj.jdbc.Driver") \
.option("url", "jdbc:mysql://<MySQL hostname or IP>:<Port>/<db_name>") \
.option("dbtable", "table_name") \
.option("user", "user_id") \
.option("password", "pwd") \
.option("sslMode", "VERIFY_CA") \ # This needs to be changed basing on your requirement. This determines what type of SSL authentication happens, so this is important
.option("trustCertificateKeyStoreUrl", "file:/path/to/truststore") \
.option("trustCertificateKeyStorePassword", "truststore_pwd") \
.option("clientCertificateKeyStoreUrl", "file:/path/to/keystore") \
.option("clientCertificateKeyStorePassword", "keystore_pwd") \
.load()