When I run the same PySpark job that reads Cloud SQL data over JDBC in PyCharm on my local machine, it fetches the data. However, when I run the same job on Dataproc, I get this error:
: com.microsoft.sqlserver.jdbc.SQLServerException: The driver could not establish a secure connection to SQL Server by using Secure Sockets Layer (SSL) encryption. Error: "SQL Server returned an incomplete response. The connection has been closed. ClientConnectionId:431e9cba-6f40-4239-b537-5822e2d7b127".
We have tried newer versions of the JAR files but still face the issue.
We are using a Cloud SQL for SQL Server 2022 Standard instance with the lowest configuration: 1 vCPU, 3.75 GB RAM, 20 GB SSD. We have opened the authorized networks to all addresses (0.0.0.0/0).
We are using the Dataproc cluster configs given below.
Based on the error message and details provided, it appears the issue lies in establishing a secure SSL connection between Dataproc and Cloud SQL. To address this, consider the following steps:
Verify Driver and Connectivity:
Validate SSL Configuration:
Check the SSL-related connection properties, such as encrypt=true or trustServerCertificate=true, and adjust them as needed.
Additional Tips:
Use a tool such as nc to test connectivity and SSL handshake between Dataproc and Cloud SQL.
Dataproc Cluster Configuration:
Logging and Monitoring:
Security and Best Practices:
Thanks for replying to my post. I had already tested most of these things after seeing them in your replies to earlier posts from others in the community about similar issues.
I have retried, and below I am writing my output for each of the suggestions you gave. The issue is still not resolved. I am not sure if this is possible, but could I request a call with you or someone from your team, where I can share my screen and try to fix this issue?
Verify Driver and Connectivity:
Validate SSL Configuration:
Additional Tips:
Based on your latest findings, here are some additional suggestions:
JDBC Driver and Connection String:
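As a minimal sketch of how the SSL properties mentioned in this thread (encrypt and trustServerCertificate) could be set explicitly in the connection string, the Python below builds a SQL Server JDBC URL and passes it to Spark's JDBC reader. The helper names build_sqlserver_jdbc_url and read_table, and the host, database, and table values, are hypothetical and only for illustration. Note that trustServerCertificate=true skips certificate validation, so it is best treated as a diagnostic setting rather than a permanent one.

```python
def build_sqlserver_jdbc_url(host, database, port=1433,
                             encrypt=True, trust_server_certificate=False):
    """Build a SQL Server JDBC URL with explicit SSL properties.

    Hypothetical helper: the function name and defaults are assumptions.
    """
    props = {
        "databaseName": database,
        "encrypt": str(encrypt).lower(),
        "trustServerCertificate": str(trust_server_certificate).lower(),
    }
    # Microsoft JDBC URLs take properties as ;key=value pairs after host:port.
    suffix = "".join(f";{key}={value}" for key, value in props.items())
    return f"jdbc:sqlserver://{host}:{port}{suffix}"


def read_table(spark, url, table, user, password):
    """Read one table through Spark's JDBC data source (needs an active SparkSession)."""
    return (
        spark.read.format("jdbc")
        .option("url", url)
        .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
        .option("dbtable", table)
        .option("user", user)
        .option("password", password)
        .load()
    )


# Example usage (placeholder values, not taken from the original post):
# url = build_sqlserver_jdbc_url("10.0.0.5", "mydb", trust_server_certificate=True)
# df = read_table(spark, url, "dbo.my_table", "sqlserver", "<password>")
```

Setting both properties explicitly makes the local and Dataproc runs use the same SSL settings, which helps rule out differing driver defaults as the cause.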
SSL Certificate and Truststore Configuration:
Use the gcloud dataproc clusters create command with the --initialization-actions flag to specify your script when creating your Dataproc cluster.
Network Connectivity:
Since nc didn't provide output, it's worth ensuring that there's no network issue between Dataproc and Cloud SQL. The fact that curl could connect suggests that the basic network path is open, but it doesn't confirm whether the SSL handshake can be completed successfully.
Driver Compatibility with Dataproc Environment:
Reviewing Dataproc Cluster Logs:
Testing with a Simplified Setup:
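One possible simplified check, sketched here under assumptions, is a small Python script run on a Dataproc node that only tests whether a TCP connection to the SQL Server port can be opened, similar to what nc would report. The function name can_reach and the default port 1433 are assumptions for illustration. This does not exercise the SSL handshake itself, because SQL Server normally negotiates TLS inside its own protocol after the TCP connection is established; it only separates "network blocked" from "network fine, SSL failing".

```python
import socket


def can_reach(host, port=1433, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper: checks basic reachability only, not SSL/TLS.
    """
    try:
        # create_connection resolves the host and performs the TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


# Example usage (placeholder IP, replace with the Cloud SQL instance address):
# print(can_reach("10.0.0.5"))
```

If this returns True from the Dataproc node while the Spark job still fails, the problem is more likely in the SSL negotiation or driver settings than in firewall or routing rules.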