Hello
I am a newbie Data Fusion user. The flows worked great for me for a few weeks. Unfortunately, after upgrading the Data Fusion version yesterday, most of the flows stopped working.
Since upgrading Data Fusion to the latest version I have issues with almost all of my pipelines, from SQL Server and PostgreSQL to Google BigQuery. There are errors in the logs, including "NoClassDefFoundError: javax/xml/bind/DatatypeConverter". I am sending some of my flow configuration:
{
"name": "PROD_TEST_SQL-ALERTS_dbo",
"description": "PROD_TEST_SQL-ALERTS_dbo",
"artifact": {
"name": "cdap-data-pipeline",
"version": "6.10.1",
"scope": "SYSTEM"
},
"config": {
"resources": {
"memoryMB": 2048,
"virtualCores": 1
},
"driverResources": {
"memoryMB": 2048,
"virtualCores": 1
},
"connections": [
{
"from": "Multiple Database Tables v2",
"to": "BigQuery Multi Table ERP TEST v2"
}
],
"comments": [],
"postActions": [
{
"name": "EmailBySendgrid-1",
"id": "EmailBySendgrid0969aab8-1be8-4088-89b6-7ccf8402c6a8",
"plugin": {
"name": "EmailBySendgrid",
"artifact": {
"name": "sendgrid",
"version": "1.3.0",
"scope": "USER"
},
"type": "postaction",
"properties": {
"runCondition": "success",
"includeWorkflowToken": "false",
"apiKey": "my_apki_key",
"from": "my_mail",
"to": "my_group_mail",
"subject": "Success - PROD_TEST_SQL-ALERTS_dbo-ALERTS - StartTime pipeline at S{logicalStartTime(yyyy-MM-dd'T'HH:mm,0d-2h+0m)} nedrivers",
"content": "Success PROD_TEST_SQL-ALERTS_dbo-ALERTS :)\nStartTime: S{logicalStartTime(yyyy-MM-dd'T'HH:mm,0d-2h+0m)}"
}
},
"description": "Sends an email at the end of a pipeline run. You can configure it to send an email when the pipeline completes, succeeds or fails. Uses the Sendgrid service to send emails, and requires you to sign up for a Sendgrid account."
},
{
"name": "EmailBySendgrid-2",
"id": "EmailBySendgrid298a6a68-8cea-4bef-af0e-7bade4d5f6ea",
"plugin": {
"name": "EmailBySendgrid",
"artifact": {
"name": "sendgrid",
"version": "1.3.0",
"scope": "USER"
},
"type": "postaction",
"properties": {
"runCondition": "failure",
"includeWorkflowToken": "false",
"apiKey": "my_apki_key",
"from": "my_mail",
"to": "my_group_mail",
"subject": "Failure - PROD_TEST_SQL-ALERTS_dbo - StartTime pipeline at StartTime: S{logicalStartTime(yyyy-MM-dd'T'HH:mm,0d-2h+0m)} - newdriver",
"content": "Failure - PROD_TEST_SQL-ALERTS_dbo-ALERTS\nStartTime: S{logicalStartTime(yyyy-MM-dd'T'HH:mm,0d-2h+0m)}"
}
},
"description": "Sends an email at the end of a pipeline run. You can configure it to send an email when the pipeline completes, succeeds or fails. Uses the Sendgrid service to send emails, and requires you to sign up for a Sendgrid account."
}
],
"properties": {},
"processTimingEnabled": true,
"stageLoggingEnabled": false,
"stages": [
{
"name": "Multiple Database Tables v2",
"plugin": {
"name": "MultiTableDatabase",
"type": "batchsource",
"label": "Multiple Database Tables v2",
"artifact": {
"name": "multi-table-plugins",
"version": "1.4.0",
"scope": "USER"
},
"properties": {
"referenceName": "multitable-database-erp",
"connectionString": "jdbc:sqlserver://;serverName={my_ip};databaseName={my_database}",
"jdbcPluginName": "sqlserver42",
"user": "gcp_elt",
"password": "my_password",
"dataSelectionMode": "allow-list",
"schemaNamePattern": "dbo",
"whiteList": "My_table_name",
"enableAutoCommit": "false",
"splitsPerTable": "1",
"fetchSize": "2000",
"transactionIsolationLevel": "TRANSACTION_NONE",
"errorHandlingMode": "fail-pipeline"
}
},
"outputSchema": [
{
"name": "etlSchemaBody",
"schema": ""
}
],
"id": "Multiple-Database-Tables-v2",
"type": "batchsource",
"label": "Multiple Database Tables v2",
"icon": "fa-plug",
"ShashKey": "object:318",
"isPluginAvailable": true,
"_uiPosition": {
"left": "710px",
"top": "336.5px"
}
},
{
"name": "BigQuery Multi Table ERP",
"plugin": {
"name": "BigQueryMultiTable",
"type": "batchsink",
"label": "BigQuery Multi Table ERP",
"artifact": {
"name": "google-cloud",
"version": "0.23.1",
"scope": "SYSTEM"
},
"properties": {
"useConnection": "true",
"connection": "S{conn(BigQuery Default)}",
"referenceName": "ref-bq-erp",
"dataset": "ERP_My_table",
"truncateTable": "true",
"allowFlexibleSchema": "true",
"allowSchemaRelaxation": "true",
"location": "europe-west1"
}
},
"outputSchema": [
{
"name": "etlSchemaBody",
"schema": ""
}
],
"inputSchema": [
{
"name": "Multiple Database Tables v2",
"schema": ""
}
],
"id": "BigQuery-Multi-Table-ERP",
"type": "batchsink",
"label": "BigQuery Multi Table ERP",
"icon": "fa-plug",
"ShashKey": "object:319",
"isPluginAvailable": true,
"_uiPosition": {
"left": "1010px",
"top": "336.5px"
}
}
],
"schedule": "0 1 */1 * *",
"engine": "spark",
"numOfRecordsPreview": 100,
"rangeRecordsPreview": {
"min": 1,
"max": "5000"
},
"description": "PROD_FLOW_TEST",
"maxConcurrentRuns": 1,
"pushdownEnabled": false,
"transformationPushdown": {}
},
"version": "c98d4925-1cb6-11ef-b4b7-76c0559116d9"
}
and the logs:
2024-05-28 06:02:58,168 - DEBUG [provisioning-task-3:i.c.c.i.p.t.ProvisioningTask@128] - Executing PROVISION subtask REQUESTING_CREATE for program run program_run:default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.e40cc637-1cb7-11ef-89a3-76c0559116d9.workflow.DataPipelineWorkflow.ede05698-1cb7-11ef-b388-76c0559116d9.
2024-05-28 06:02:58,183 - DEBUG [provisioning-task-3:i.c.c.r.s.p.d.DataprocProvisioner@266] - Not checking cluster reuse, enabled: true, skip delete: false, idle ttl: 30, reuse threshold: 15
2024-05-28 06:02:58,274 - INFO [provisioning-task-3:i.c.c.r.s.p.d.DataprocProvisioner@212] - Creating Dataproc cluster cdap-prodoptim-ede05698-1cb7-11ef-b388-76c0559116d9 in project ds-dev-prj-1, in region europe-west1, with image 2.1, with labels {goog-datafusion-instance=df-instance-gaja3-priv, goog-datafusion-version=6_10, cdap-version=6_10_1-1715685199823, goog-datafusion-edition=developer, goog-datafusion-project=ds-dev-prj-1}, endpoint dataproc.googleapis.com:443
2024-05-28 06:03:00,774 - WARN [provisioning-task-3:i.c.c.r.s.p.d.DataprocProvisioner@223] - Encountered 3 warnings while creating Dataproc cluster:
For PD-Standard without local SSDs, we strongly recommend provisioning 1TB or larger to ensure consistently high I/O performance. See https://cloud.google.com/compute/docs/disks/performance for information on disk I/O performance.
The firewall rules for specified network or subnetwork would allow ingress traffic from 0.0.0.0/0, which could be a security risk.
The specified custom staging bucket 'dataproc-staging-europe-west1-27292267370-rnxtsl25' is not using uniform bucket level access IAM configuration. It is recommended to update bucket to enable the same. See https://cloud.google.com/storage/docs/uniform-bucket-level-access.
2024-05-28 06:03:00,775 - DEBUG [provisioning-task-3:i.c.c.i.p.t.ProvisioningTask@133] - Completed PROVISION subtask REQUESTING_CREATE for program run program_run:default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.e40cc637-1cb7-11ef-89a3-76c0559116d9.workflow.DataPipelineWorkflow.ede05698-1cb7-11ef-b388-76c0559116d9.
2024-05-28 06:04:06,887 - DEBUG [provisioning-task-3:i.c.c.i.p.t.ProvisioningTask@128] - Executing PROVISION subtask POLLING_CREATE for program run program_run:default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.e40cc637-1cb7-11ef-89a3-76c0559116d9.workflow.DataPipelineWorkflow.ede05698-1cb7-11ef-b388-76c0559116d9.
2024-05-28 06:04:07,000 - DEBUG [provisioning-task-3:i.c.c.i.p.t.ProvisioningTask@133] - Completed PROVISION subtask POLLING_CREATE for program run program_run:default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.e40cc637-1cb7-11ef-89a3-76c0559116d9.workflow.DataPipelineWorkflow.ede05698-1cb7-11ef-b388-76c0559116d9.
2024-05-28 06:04:38,041 - DEBUG [provisioning-task-3:i.c.c.i.p.t.ProvisioningTask@128] - Executing PROVISION subtask POLLING_CREATE for program run program_run:default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.e40cc637-1cb7-11ef-89a3-76c0559116d9.workflow.DataPipelineWorkflow.ede05698-1cb7-11ef-b388-76c0559116d9.
2024-05-28 06:04:38,120 - DEBUG [provisioning-task-3:i.c.c.i.p.t.ProvisioningTask@133] - Completed PROVISION subtask POLLING_CREATE for program run program_run:default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.e40cc637-1cb7-11ef-89a3-76c0559116d9.workflow.DataPipelineWorkflow.ede05698-1cb7-11ef-b388-76c0559116d9.
2024-05-28 06:05:08,974 - DEBUG [provisioning-task-3:i.c.c.i.p.t.ProvisioningTask@128] - Executing PROVISION subtask POLLING_CREATE for program run
(....)
Caused by: java.lang.ClassNotFoundException: javax.xml.bind.DatatypeConverter
at java.net.URLClassLoader.findClass(URLClassLoader.java:476)
at io.cdap.cdap.common.lang.InterceptableClassLoader.findClass(InterceptableClassLoader.java:43)
at java.lang.ClassLoader.loadClass(ClassLoader.java:594)
at java.lang.ClassLoader.loadClass(ClassLoader.java:527)
... 47 common frames omitted
2024-05-28 06:08:38,884 - ERROR [WorkflowDriver:i.c.c.d.SmartWorkflow@542] - Pipeline 'PROD__SQL-ALERTS_copy_dbo_My_table_v2' failed.
2024-05-28 06:08:39,092 - ERROR [WorkflowDriver:i.c.c.i.a.r.w.WorkflowProgramController@90] - Workflow service 'workflow.default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.DataPipelineWorkflow.ede05698-1cb7-11ef-b388-76c0559116d9' failed.
java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.Exception: javax/xml/bind/DatatypeConverter
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:191)
at io.cdap.cdap.internal.app.runtime.workflow.WorkflowDriver.executeAction(WorkflowDriver.java:354)
at io.cdap.cdap.internal.app.runtime.workflow.WorkflowDriver.executeNode(WorkflowDriver.java:489)
at io.cdap.cdap.internal.app.runtime.workflow.WorkflowDriver.executeAll(WorkflowDriver.java:667)
at io.cdap.cdap.internal.app.runtime.workflow.WorkflowDriver.run(WorkflowDriver.java:651)
at com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:52)
at java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.RuntimeException: java.lang.Exception: javax/xml/bind/DatatypeConverter
at com.google.common.base.Throwables.propagate(Throwables.java:160)
at io.cdap.cdap.internal.app.runtime.workflow.DefaultProgramWorkflowRunner$1.run(DefaultProgramWorkflowRunner.java:145)
at io.cdap.cdap.internal.app.runtime.workflow.WorkflowDriver$1.call(WorkflowDriver.java:348)
at io.cdap.cdap.internal.app.runtime.workflow.WorkflowDriver$1.call(WorkflowDriver.java:331)
at java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
... 1 common frames omitted
Caused by: java.lang.Exception: javax/xml/bind/DatatypeConverter
at io.cdap.cdap.internal.app.runtime.AbstractContext.lambda$initializeProgram$8(AbstractContext.java:655)
at io.cdap.cdap.internal.app.runtime.AbstractContext.execute(AbstractContext.java:608)
at io.cdap.cdap.internal.app.runtime.AbstractContext.initializeProgram(AbstractContext.java:647)
at io.cdap.cdap.app.runtime.spark.SparkRuntimeService.initialize(SparkRuntimeService.java:550)
at io.cdap.cdap.app.runtime.spark.SparkRuntimeService.startUp(SparkRuntimeService.java:234)
at com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:47)
at io.cdap.cdap.app.runtime.spark.SparkRuntimeService.lambda$null$2(SparkRuntimeService.java:525)
... 1 common frames omitted
Caused by: java.lang.NoClassDefFoundError: javax/xml/bind/DatatypeConverter
at com.microsoft.sqlserver.jdbc.SQLServerConnection.sendLogon(SQLServerConnection.java:4098)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.logon(SQLServerConnection.java:3160)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.access$100(SQLServerConnection.java:43)
at com.microsoft.sqlserver.jdbc.SQLServerConnection$LogonCommand.doExecute(SQLServerConnection.java:3123)
at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:7505)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:2445)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.connectHelper(SQLServerConnection.java:1981)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.login(SQLServerConnection.java:1628)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.connectInternal(SQLServerConnection.java:1459)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.connect(SQLServerConnection.java:773)
at com.microsoft.sqlserver.jdbc.SQLServerDriver.connect(SQLServerDriver.java:1168)
at io.cdap.plugin.JDBCDriverShim.connect(JDBCDriverShim.java:55)
at java.sql.DriverManager.getConnection(DriverManager.java:677)
at java.sql.DriverManager.getConnection(DriverManager.java:228)
at io.cdap.plugin.format.MultiTableConf.getConnection(MultiTableConf.java:306)
at io.cdap.plugin.format.MultiTableDBInputFormat.setInput(MultiTableDBInputFormat.java:82)
at io.cdap.plugin.MultiTableDBSource.setContextForMultiTableDBInput(MultiTableDBSource.java:162)
at io.cdap.plugin.MultiTableDBSource.prepareRun(MultiTableDBSource.java:101)
at io.cdap.plugin.MultiTableDBSource.prepareRun(MultiTableDBSource.java:61)
at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.lambda$prepareRun$0(WrappedBatchSource.java:53)
at io.cdap.cdap.etl.common.plugin.Caller$1.call(Caller.java:30)
at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.prepareRun(WrappedBatchSource.java:52)
at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.prepareRun(WrappedBatchSource.java:35)
at io.cdap.cdap.etl.common.submit.SubmitterPlugin.lambda$prepareRun$2(SubmitterPlugin.java:74)
at io.cdap.cdap.internal.app.runtime.AbstractContext.lambda$execute$5(AbstractContext.java:558)
at io.cdap.cdap.data2.transaction.Transactions$CacheBasedTransactional.finishExecute(Transactions.java:234)
at io.cdap.cdap.data2.transaction.Transactions$CacheBasedTransactional.execute(Transactions.java:221)
at io.cdap.cdap.internal.app.runtime.AbstractContext.execute(AbstractContext.java:554)
at io.cdap.cdap.internal.app.runtime.AbstractContext.execute(AbstractContext.java:540)
at io.cdap.cdap.app.runtime.spark.BasicSparkClientContext.execute(BasicSparkClientContext.java:357)
at io.cdap.cdap.etl.common.submit.SubmitterPlugin.prepareRun(SubmitterPlugin.java:72)
at io.cdap.cdap.etl.common.submit.PipelinePhasePreparer.prepare(PipelinePhasePreparer.java:158)
at io.cdap.cdap.etl.spark.AbstractSparkPreparer.prepare(AbstractSparkPreparer.java:87)
at io.cdap.cdap.etl.spark.batch.SparkPreparer.prepare(SparkPreparer.java:94)
at io.cdap.cdap.etl.spark.batch.ETLSpark.initialize(ETLSpark.java:131)
at io.cdap.cdap.api.spark.AbstractSpark.initialize(AbstractSpark.java:131)
at io.cdap.cdap.api.spark.AbstractSpark.initialize(AbstractSpark.java:33)
at io.cdap.cdap.app.runtime.spark.SparkRuntimeService$1.initialize(SparkRuntimeService.java:192)
at io.cdap.cdap.app.runtime.spark.SparkRuntimeService$1.initialize(SparkRuntimeService.java:187)
at io.cdap.cdap.internal.app.runtime.AbstractContext.lambda$initializeProgram$8(AbstractContext.java:650)
... 7 common frames omitted
Caused by: java.lang.ClassNotFoundException: javax.xml.bind.DatatypeConverter
at java.net.URLClassLoader.findClass(URLClassLoader.java:476)
at io.cdap.cdap.common.lang.InterceptableClassLoader.findClass(InterceptableClassLoader.java:43)
at java.lang.ClassLoader.loadClass(ClassLoader.java:594)
at java.lang.ClassLoader.loadClass(ClassLoader.java:527)
... 47 common frames omitted
2024-05-28 06:08:39,254 - DEBUG [WorkflowDriver:i.c.c.c.l.c.UncaughtExceptionHandler@40] - Uncaught exception in thread Thread[WorkflowDriver,5,main]
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.Exception: javax/xml/bind/DatatypeConverter
at com.google.common.base.Throwables.propagate(Throwables.java:160)
at com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:69)
at java.lang.Thread.run(Thread.java:829)
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.Exception: javax/xml/bind/DatatypeConverter
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:191)
at io.cdap.cdap.internal.app.runtime.workflow.WorkflowDriver.executeAction(WorkflowDriver.java:354)
at io.cdap.cdap.internal.app.runtime.workflow.WorkflowDriver.executeNode(WorkflowDriver.java:489)
at io.cdap.cdap.internal.app.runtime.workflow.WorkflowDriver.executeAll(WorkflowDriver.java:667)
at io.cdap.cdap.internal.app.runtime.workflow.WorkflowDriver.run(WorkflowDriver.java:651)
at com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:52)
... 1 common frames omitted
Caused by: java.lang.RuntimeException: java.lang.Exception: javax/xml/bind/DatatypeConverter
at com.google.common.base.Throwables.propagate(Throwables.java:160)
at io.cdap.cdap.internal.app.runtime.workflow.DefaultProgramWorkflowRunner$1.run(DefaultProgramWorkflowRunner.java:145)
at io.cdap.cdap.internal.app.runtime.workflow.WorkflowDriver$1.call(WorkflowDriver.java:348)
at io.cdap.cdap.internal.app.runtime.workflow.WorkflowDriver$1.call(WorkflowDriver.java:331)
at java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
... 1 common frames omitted
Caused by: java.lang.Exception: javax/xml/bind/DatatypeConverter
at io.cdap.cdap.internal.app.runtime.AbstractContext.lambda$initializeProgram$8(AbstractContext.java:655)
at io.cdap.cdap.internal.app.runtime.AbstractContext.execute(AbstractContext.java:608)
at io.cdap.cdap.internal.app.runtime.AbstractContext.initializeProgram(AbstractContext.java:647)
at io.cdap.cdap.app.runtime.spark.SparkRuntimeService.initialize(SparkRuntimeService.java:550)
at io.cdap.cdap.app.runtime.spark.SparkRuntimeService.startUp(SparkRuntimeService.java:234)
at com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:47)
at io.cdap.cdap.app.runtime.spark.SparkRuntimeService.lambda$null$2(SparkRuntimeService.java:525)
... 1 common frames omitted
Caused by: java.lang.NoClassDefFoundError: javax/xml/bind/DatatypeConverter
at com.microsoft.sqlserver.jdbc.SQLServerConnection.sendLogon(SQLServerConnection.java:4098)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.logon(SQLServerConnection.java:3160)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.access$100(SQLServerConnection.java:43)
at com.microsoft.sqlserver.jdbc.SQLServerConnection$LogonCommand.doExecute(SQLServerConnection.java:3123)
at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:7505)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:2445)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.connectHelper(SQLServerConnection.java:1981)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.login(SQLServerConnection.java:1628)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.connectInternal(SQLServerConnection.java:1459)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.connect(SQLServerConnection.java:773)
at com.microsoft.sqlserver.jdbc.SQLServerDriver.connect(SQLServerDriver.java:1168)
at io.cdap.plugin.JDBCDriverShim.connect(JDBCDriverShim.java:55)
at java.sql.DriverManager.getConnection(DriverManager.java:677)
at java.sql.DriverManager.getConnection(DriverManager.java:228)
at io.cdap.plugin.format.MultiTableConf.getConnection(MultiTableConf.java:306)
at io.cdap.plugin.format.MultiTableDBInputFormat.setInput(MultiTableDBInputFormat.java:82)
at io.cdap.plugin.MultiTableDBSource.setContextForMultiTableDBInput(MultiTableDBSource.java:162)
at io.cdap.plugin.MultiTableDBSource.prepareRun(MultiTableDBSource.java:101)
at io.cdap.plugin.MultiTableDBSource.prepareRun(MultiTableDBSource.java:61)
at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.lambda$prepareRun$0(WrappedBatchSource.java:53)
at io.cdap.cdap.etl.common.plugin.Caller$1.call(Caller.java:30)
at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.prepareRun(WrappedBatchSource.java:52)
at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.prepareRun(WrappedBatchSource.java:35)
at io.cdap.cdap.etl.common.submit.SubmitterPlugin.lambda$prepareRun$2(SubmitterPlugin.java:74)
at io.cdap.cdap.internal.app.runtime.AbstractContext.lambda$execute$5(AbstractContext.java:558)
at io.cdap.cdap.data2.transaction.Transactions$CacheBasedTransactional.finishExecute(Transactions.java:234)
at io.cdap.cdap.data2.transaction.Transactions$CacheBasedTransactional.execute(Transactions.java:221)
at io.cdap.cdap.internal.app.runtime.AbstractContext.execute(AbstractContext.java:554)
at io.cdap.cdap.internal.app.runtime.AbstractContext.execute(AbstractContext.java:540)
at io.cdap.cdap.app.runtime.spark.BasicSparkClientContext.execute(BasicSparkClientContext.java:357)
at io.cdap.cdap.etl.common.submit.SubmitterPlugin.prepareRun(SubmitterPlugin.java:72)
at io.cdap.cdap.etl.common.submit.PipelinePhasePreparer.prepare(PipelinePhasePreparer.java:158)
at io.cdap.cdap.etl.spark.AbstractSparkPreparer.prepare(AbstractSparkPreparer.java:87)
at io.cdap.cdap.etl.spark.batch.SparkPreparer.prepare(SparkPreparer.java:94)
at io.cdap.cdap.etl.spark.batch.ETLSpark.initialize(ETLSpark.java:131)
at io.cdap.cdap.api.spark.AbstractSpark.initialize(AbstractSpark.java:131)
at io.cdap.cdap.api.spark.AbstractSpark.initialize(AbstractSpark.java:33)
at io.cdap.cdap.app.runtime.spark.SparkRuntimeService$1.initialize(SparkRuntimeService.java:192)
at io.cdap.cdap.app.runtime.spark.SparkRuntimeService$1.initialize(SparkRuntimeService.java:187)
at io.cdap.cdap.internal.app.runtime.AbstractContext.lambda$initializeProgram$8(AbstractContext.java:650)
... 7 common frames omitted
Caused by: java.lang.ClassNotFoundException: javax.xml.bind.DatatypeConverter
at java.net.URLClassLoader.findClass(URLClassLoader.java:476)
at io.cdap.cdap.common.lang.InterceptableClassLoader.findClass(InterceptableClassLoader.java:43)
at java.lang.ClassLoader.loadClass(ClassLoader.java:594)
at java.lang.ClassLoader.loadClass(ClassLoader.java:527)
... 47 common frames omitted
2024-05-28 06:08:39,254 - ERROR [TwillContainerService:i.c.c.i.a.r.d.AbstractProgramTwillRunnable@293] - Program DataPipelineWorkflow execution failed.
java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.Exception: javax/xml/bind/DatatypeConverter
at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395)
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999)
at io.cdap.cdap.internal.app.runtime.distributed.AbstractProgramTwillRunnable.run(AbstractProgramTwillRunnable.java:289)
at org.apache.twill.internal.container.TwillContainerService.doRun(TwillContainerService.java:224)
at org.apache.twill.internal.AbstractTwillService.run(AbstractTwillService.java:192)
at com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:52)
at java.lang.Thread.run(Thread.java:829)
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.Exception: javax/xml/bind/DatatypeConverter
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:191)
at io.cdap.cdap.internal.app.runtime.workflow.WorkflowDriver.executeAction(WorkflowDriver.java:354)
at io.cdap.cdap.internal.app.runtime.workflow.WorkflowDriver.executeNode(WorkflowDriver.java:489)
at io.cdap.cdap.internal.app.runtime.workflow.WorkflowDriver.executeAll(WorkflowDriver.java:667)
at io.cdap.cdap.internal.app.runtime.workflow.WorkflowDriver.run(WorkflowDriver.java:651)
... 2 common frames omitted
Caused by: java.lang.RuntimeException: java.lang.Exception: javax/xml/bind/DatatypeConverter
at com.google.common.base.Throwables.propagate(Throwables.java:160)
at io.cdap.cdap.internal.app.runtime.workflow.DefaultProgramWorkflowRunner$1.run(DefaultProgramWorkflowRunner.java:145)
at io.cdap.cdap.internal.app.runtime.workflow.WorkflowDriver$1.call(WorkflowDriver.java:348)
at io.cdap.cdap.internal.app.runtime.workflow.WorkflowDriver$1.call(WorkflowDriver.java:331)
at java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
... 1 common frames omitted
Caused by: java.lang.Exception: javax/xml/bind/DatatypeConverter
at io.cdap.cdap.internal.app.runtime.AbstractContext.lambda$initializeProgram$8(AbstractContext.java:655)
at io.cdap.cdap.internal.app.runtime.AbstractContext.execute(AbstractContext.java:608)
at io.cdap.cdap.internal.app.runtime.AbstractContext.initializeProgram(AbstractContext.java:647)
at io.cdap.cdap.app.runtime.spark.SparkRuntimeService.initialize(SparkRuntimeService.java:550)
at io.cdap.cdap.app.runtime.spark.SparkRuntimeService.startUp(SparkRuntimeService.java:234)
at com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:47)
at io.cdap.cdap.app.runtime.spark.SparkRuntimeService.lambda$null$2(SparkRuntimeService.java:525)
... 1 common frames omitted
Caused by: java.lang.NoClassDefFoundError: javax/xml/bind/DatatypeConverter
at com.microsoft.sqlserver.jdbc.SQLServerConnection.sendLogon(SQLServerConnection.java:4098)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.logon(SQLServerConnection.java:3160)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.access$100(SQLServerConnection.java:43)
at com.microsoft.sqlserver.jdbc.SQLServerConnection$LogonCommand.doExecute(SQLServerConnection.java:3123)
at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:7505)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:2445)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.connectHelper(SQLServerConnection.java:1981)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.login(SQLServerConnection.java:1628)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.connectInternal(SQLServerConnection.java:1459)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.connect(SQLServerConnection.java:773)
at com.microsoft.sqlserver.jdbc.SQLServerDriver.connect(SQLServerDriver.java:1168)
at io.cdap.plugin.JDBCDriverShim.connect(JDBCDriverShim.java:55)
at java.sql.DriverManager.getConnection(DriverManager.java:677)
at java.sql.DriverManager.getConnection(DriverManager.java:228)
at io.cdap.plugin.format.MultiTableConf.getConnection(MultiTableConf.java:306)
at io.cdap.plugin.format.MultiTableDBInputFormat.setInput(MultiTableDBInputFormat.java:82)
at io.cdap.plugin.MultiTableDBSource.setContextForMultiTableDBInput(MultiTableDBSource.java:162)
at io.cdap.plugin.MultiTableDBSource.prepareRun(MultiTableDBSource.java:101)
at io.cdap.plugin.MultiTableDBSource.prepareRun(MultiTableDBSource.java:61)
at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.lambda$prepareRun$0(WrappedBatchSource.java:53)
at io.cdap.cdap.etl.common.plugin.Caller$1.call(Caller.java:30)
at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.prepareRun(WrappedBatchSource.java:52)
at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.prepareRun(WrappedBatchSource.java:35)
at io.cdap.cdap.etl.common.submit.SubmitterPlugin.lambda$prepareRun$2(SubmitterPlugin.java:74)
at io.cdap.cdap.internal.app.runtime.AbstractContext.lambda$execute$5(AbstractContext.java:558)
at io.cdap.cdap.data2.transaction.Transactions$CacheBasedTransactional.finishExecute(Transactions.java:234)
at io.cdap.cdap.data2.transaction.Transactions$CacheBasedTransactional.execute(Transactions.java:221)
at io.cdap.cdap.internal.app.runtime.AbstractContext.execute(AbstractContext.java:554)
at io.cdap.cdap.internal.app.runtime.AbstractContext.execute(AbstractContext.java:540)
at io.cdap.cdap.app.runtime.spark.BasicSparkClientContext.execute(BasicSparkClientContext.java:357)
at io.cdap.cdap.etl.common.submit.SubmitterPlugin.prepareRun(SubmitterPlugin.java:72)
at io.cdap.cdap.etl.common.submit.PipelinePhasePreparer.prepare(PipelinePhasePreparer.java:158)
at io.cdap.cdap.etl.spark.AbstractSparkPreparer.prepare(AbstractSparkPreparer.java:87)
at io.cdap.cdap.etl.spark.batch.SparkPreparer.prepare(SparkPreparer.java:94)
at io.cdap.cdap.etl.spark.batch.ETLSpark.initialize(ETLSpark.java:131)
at io.cdap.cdap.api.spark.AbstractSpark.initialize(AbstractSpark.java:131)
at io.cdap.cdap.api.spark.AbstractSpark.initialize(AbstractSpark.java:33)
at io.cdap.cdap.app.runtime.spark.SparkRuntimeService$1.initialize(SparkRuntimeService.java:192)
at io.cdap.cdap.app.runtime.spark.SparkRuntimeService$1.initialize(SparkRuntimeService.java:187)
at io.cdap.cdap.internal.app.runtime.AbstractContext.lambda$initializeProgram$8(AbstractContext.java:650)
... 7 common frames omitted
Caused by: java.lang.ClassNotFoundException: javax.xml.bind.DatatypeConverter
at java.net.URLClassLoader.findClass(URLClassLoader.java:476)
at io.cdap.cdap.common.lang.InterceptableClassLoader.findClass(InterceptableClassLoader.java:43)
at java.lang.ClassLoader.loadClass(ClassLoader.java:594)
at java.lang.ClassLoader.loadClass(ClassLoader.java:527)
... 47 common frames omitted
2024-05-28 06:08:39,399 - INFO [TwillContainerService:i.c.c.i.a.r.d.AbstractProgramTwillRunnable@296] - Program run program_run:default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.e40cc637-1cb7-11ef-89a3-76c0559116d9.workflow.DataPipelineWorkflow.ede05698-1cb7-11ef-b388-76c0559116d9 completed. Releasing resources.
2024-05-28 06:08:39,406 - DEBUG [TwillContainerService:i.c.c.l.a.LogAppenderInitializer@137] - Stopping log appender TMSLogAppender
2024-05-28 06:08:40,120 - INFO [ApplicationMasterService:o.a.t.i.a.ApplicationMasterService@590] - Container container_1716876289111_0001_01_000002 completed with COMPLETE:[2024-05-28 06:08:39.988]Exception from container-launch.
Container id: container_1716876289111_0001_01_000002
Exit code: 1
[2024-05-28 06:08:40.027]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/hadoop/yarn/nm-local-dir/usercache/root/appcache/application_1716876289111_0001/container_1716876289111_0001_01_000002/application.jar/lib/ch.qos.logback.logback-classic-1.2.11.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/hadoop/yarn/nm-local-dir/usercache/root/appcache/application_1716876289111_0001/container_1716876289111_0001_01_000002/twill.jar/lib/ch.qos.logback.logback-classic-1.2.11.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [ch.qos.logback.classic.util.ContextSelectorStaticBinder]
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/hadoop/yarn/nm-local-dir/usercache/root/appcache/application_1716876289111_0001/container_1716876289111_0001_01_000002/application.jar/lib/ch.qos.logback.logback-classic-1.2.11.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/hadoop/yarn/nm-local-dir/usercache/root/appcache/application_1716876289111_0001/container_1716876289111_0001_01_000002/twill.jar/lib/ch.qos.logback.logback-classic-1.2.11.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [ch.qos.logback.classic.util.ContextSelectorStaticBinder]
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.google.inject.internal.cglib.core.$ReflectUtils$2 (file:/hadoop/yarn/nm-local-dir/usercache/root/appcache/application_1716876289111_0001/container_1716876289111_0001_01_000002/application.jar/lib/com.google.inject.guice-4.0.jar) to method java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int,java.security.ProtectionDomain)
WARNING: Please consider reporting this to the maintainers of com.google.inject.internal.cglib.core.$ReflectUtils$2
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
[2024-05-28 06:08:40.027]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/hadoop/yarn/nm-local-dir/usercache/root/appcache/application_1716876289111_0001/container_1716876289111_0001_01_000002/application.jar/lib/ch.qos.logback.logback-classic-1.2.11.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/hadoop/yarn/nm-local-dir/usercache/root/appcache/application_1716876289111_0001/container_1716876289111_0001_01_000002/twill.jar/lib/ch.qos.logback.logback-classic-1.2.11.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [ch.qos.logback.classic.util.ContextSelectorStaticBinder]
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/hadoop/yarn/nm-local-dir/usercache/root/appcache/application_1716876289111_0001/container_1716876289111_0001_01_000002/application.jar/lib/ch.qos.logback.logback-classic-1.2.11.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/hadoop/yarn/nm-local-dir/usercache/root/appcache/application_1716876289111_0001/container_1716876289111_0001_01_000002/twill.jar/lib/ch.qos.logback.logback-classic-1.2.11.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [ch.qos.logback.classic.util.ContextSelectorStaticBinder]
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.google.inject.internal.cglib.core.$ReflectUtils$2 (file:/hadoop/yarn/nm-local-dir/usercache/root/appcache/application_1716876289111_0001/container_1716876289111_0001_01_000002/application.jar/lib/com.google.inject.guice-4.0.jar) to method java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int,java.security.ProtectionDomain)
WARNING: Please consider reporting this to the maintainers of com.google.inject.internal.cglib.core.$ReflectUtils$2
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
.
2024-05-28 06:08:40,131 - WARN [ApplicationMasterService:o.a.t.i.a.RunningContainers@509] - Container container_1716876289111_0001_01_000002 exited abnormally with state COMPLETE, exit code 1.
2024-05-28 06:08:40,131 - INFO [ApplicationMasterService:o.a.t.i.a.RunningContainers@542] - Retries exhausted for instance 0 of runnable DataPipelineWorkflow.
2024-05-28 06:08:40,134 - INFO [ApplicationMasterService:o.a.t.i.a.ApplicationMasterService@503] - All containers completed. Shutting down application master.
2024-05-28 06:08:40,138 - INFO [ApplicationMasterService:o.a.t.i.a.ApplicationMasterService@337] - Stop application master with spec: {"fsUser":"root","twillAppDir":"hdfs://cdap-prodoptim-ede05698-1cb7-11ef-b388-76c0559116d9-m:8020/workflow.default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.DataPipelineWorkflow/9c45a758-d528-4af4-8c80-224c74643e0b","zkConnectStr":"cdap-prodoptim-ede05698-1cb7-11ef-b388-76c0559116d9-m:42521","twillRunId":"9c45a758-d528-4af4-8c80-224c74643e0b","twillAppName":"workflow.default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.DataPipelineWorkflow","rmSchedulerAddr":"cdap-prodoptim-ede05698-1cb7-11ef-b388-76c0559116d9-m:8030","twillSpecification":{"name":"workflow.default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.DataPipelineWorkflow","runnables":{"DataPipelineWorkflow":{"name":"DataPipelineWorkflow","runnable":{"classname":"io.cdap.cdap.internal.app.runtime.distributed.WorkflowTwillRunnable","name":"DataPipelineWorkflow","arguments":{}},"resources":{"cores":1,"memorySize":1024,"instances":1,"uplink":-1,"downlink":-1},"files":[{"name":"artifacts_archive.jar","uri":"hdfs://cdap-prodoptim-ede05698-1cb7-11ef-b388-76c0559116d9-m:8020/workflow.default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.DataPipelineWorkflow/9c45a758-d528-4af4-8c80-224c74643e0b/artifacts_archive.jar.7d90166b-ba07-4226-a122-6d0fcc88e8a0.jar","lastModified":1716876455893,"size":262425694,"archive":false,"pattern":null},{"name":"py4j-0.10.9.5-src.zip","uri":"hdfs://cdap-prodoptim-ede05698-1cb7-11ef-b388-76c0559116d9-m:8020/workflow.default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.DataPipelineWorkflow/9c45a758-d528-4af4-8c80-224c74643e0b/py4j-0.10.9.5-src.zip.a52ea38e-68c5-4241-b3ce-33d08affb5d5.zip","lastModified":1716876455966,"size":42404,"archive":false,"pattern":null},{"name":"program_fd4eee48431fd8a09e607104171d6143573843d1aec3086d280f09553ddbeb6b_1717977600000.jar","uri":"hdfs://cdap-prodoptim-ede05698-1cb7-11ef-b388-76c0559116d9-m:8020/workflow.default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.DataPipelineWorkflow/9c45a758-d528-4af4-8c80-224c74643e0b/program_fd4eee48431fd8a09e607104171d6143573843d1aec3086d280f09553ddbeb6b_1717977600000.jar.f8813429-ec5f-42d2-bbad-a38fa673a2ea.jar","lastModified":1716876454624,"size":10495223,"archive":false,"pattern":null},{"name":"appSpec.json","uri":"hdfs://cdap-prodoptim-ede05698-1cb7-11ef-b388-76c0559116d9-m:8020/workflow.default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.DataPipelineWorkflow/9c45a758-d528-4af4-8c80-224c74643e0b/appSpec.json.5807d1e8-ad57-4217-8106-cd99b970fefa.json","lastModified":1716876454673,"size":68763,"archive":false,"pattern":null},{"name":"cConf.xml","uri":"hdfs://cdap-prodoptim-ede05698-1cb7-11ef-b388-76c0559116d9-m:8020/workflow.default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.DataPipelineWorkflow/9c45a758-d528-4af4-8c80-224c74643e0b/cConf.xml.27d57509-bd45-446a-9e72-523fae005f26.xml","lastModified":1716876454942,"size":179322,"archive":false,"pattern":null},{"name":"pyspark.zip","uri":"hdfs://cdap-prodoptim-ede05698-1cb7-11ef-b388-76c0559116d9-m:8020/workflow.default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.DataPipelineWorkflow/9c45a758-d528-4af4-8c80-224c74643e0b/pyspark.zip.ecf925ed-d8d5-46b6-9eda-a249e95fe145.zip","lastModified":1716876454460,"size":1532101,"archive":false,"pattern":null},{"name":"hConf.xml","uri":"hdfs://cdap-prodoptim-ede05698-1cb7-11ef-b388-76c0559116d9-m:8020/workflow.default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.DataPipelineWorkflow/9c45a758-d528-4af4-8c80-224c74643e0b/hConf.xml.44549db4-63aa-4423-bb1e-a9dde8d63979.xml","lastModified":1716876
454733,"size":250322,"archive":false,"pattern":null},{"name":"log-appender-ext","uri":"hdfs://cdap-prodoptim-ede05698-1cb7-11ef-b388-76c0559116d9-m:8020/workflow.default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.DataPipelineWorkflow/9c45a758-d528-4af4-8c80-224c74643e0b/log-appender-ext.5250cf46-17a0-4b7e-b01c-ab58e6ff9478.zip","lastModified":1716876454893,"size":22798132,"archive":true,"pattern":null},{"name":"spark.archive-spark3_2.12-3.3.6.zip","uri":"hdfs://cdap-prodoptim-ede05698-1cb7-11ef-b388-76c0559116d9-m:8020/cdap/framework/spark/spark.archive-spark3_2.12-3.3.6.zip","lastModified":1716876441174,"size":727770711,"archive":true,"pattern":null},{"name":"spark-defaults.conf","uri":"hdfs://cdap-prodoptim-ede05698-1cb7-11ef-b388-76c0559116d9-m:8020/workflow.default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.DataPipelineWorkflow/9c45a758-d528-4af4-8c80-224c74643e0b/spark-defaults.conf.dbc03b95-f39d-4466-bab7-6a19bd3deccd.tmp","lastModified":1716876455936,"size":2262,"archive":false,"pattern":null},{"name":"logback.xml","uri":"hdfs://cdap-prodoptim-ede05698-1cb7-11ef-b388-76c0559116d9-m:8020/workflow.default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.DataPipelineWorkflow/9c45a758-d528-4af4-8c80-224c74643e0b/logback.xml.d42115c5-17ff-4a19-93bd-cfbbcbe7909e.xml","lastModified":1716876454505,"size":4013,"archive":false,"pattern":null},{"name":"__spark_conf__","uri":"hdfs://cdap-prodoptim-ede05698-1cb7-11ef-b388-76c0559116d9-m:8020/workflow.default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.DataPipelineWorkflow/9c45a758-d528-4af4-8c80-224c74643e0b/__spark_conf__.e1602647-12ee-435a-8344-f8baacfde82c.zip","lastModified":1716876454544,"size":33398,"archive":true,"pattern":null},{"name":"program.options.json","uri":"hdfs://cdap-prodoptim-ede05698-1cb7-11ef-b388-76c0559116d9-m:8020/workflow.default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.DataPipelineWorkflow/9c45a758-d528-4af4-8c80-224c74643e0b/program.options.json.ff76557d-b629-489e-b4a0-36e1f2000507.json","lastModified":1716876454768,"size":4838,"archive":false,"pattern":null},{"name":"artifacts","uri":"hdfs://cdap-prodoptim-ede05698-1cb7-11ef-b388-76c0559116d9-m:8020/workflow.default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.DataPipelineWorkflow/9c45a758-d528-4af4-8c80-224c74643e0b/artifacts.2d5154c3-bef4-4602-9c6b-1957a0721102.jar","lastModified":1716876457008,"size":262425694,"archive":true,"pattern":null}]}},"orders":[{"names":["DataPipelineWorkflow"],"type":"STARTED"}],"placementPolicies":[],"handler":{"classname":"io.cdap.cdap.common.twill.TwillAppLifecycleEventHandler","configs":{"abortIfNotFull":"false","abortTime":"120000"}}},"logLevels":{"DataPipelineWorkflow":{}},"maxRetries":{"DataPipelineWorkflow":0},"config":{"twill.yarn.max.app.attempts":"1","twill.log.collection.enabled":"false"},"runnableConfigs":{"DataPipelineWorkflow":{}}}
2024-05-28 06:08:40,151 - INFO [ApplicationMasterService:o.a.t.i.a.RunningContainers@393] - Stopping all instances of DataPipelineWorkflow
2024-05-28 06:08:40,153 - INFO [ApplicationMasterService:o.a.t.i.a.RunningContainers@417] - Terminated all instances of DataPipelineWorkflow
2024-05-28 06:08:40,183 - INFO [ApplicationMasterService:o.a.t.i.a.ApplicationMasterService@461] - Application directory deleted: hdfs://cdap-prodoptim-ede05698-1cb7-11ef-b388-76c0559116d9-m:8020/workflow.default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.DataPipelineWorkflow/9c45a758-d528-4af4-8c80-224c74643e0b
2024-05-28 06:08:40,249 - DEBUG [ApplicationMasterService:i.c.c.l.a.LogAppenderInitializer@137] - Stopping log appender TMSLogAppender
2024-05-28 06:08:40,486 - DEBUG [main:i.c.c.l.a.LogAppenderInitializer@137] - Stopping log appender TMSLogAppender
2024-05-28 06:08:43,631 - DEBUG [provisioning-task-4:i.c.c.i.p.t.ProvisioningTask@128] - Executing DEPROVISION subtask REQUESTING_DELETE for program run program_run:default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.e40cc637-1cb7-11ef-89a3-76c0559116d9.workflow.DataPipelineWorkflow.ede05698-1cb7-11ef-b388-76c0559116d9.
2024-05-28 06:08:43,688 - ERROR [provisioning-task-4:i.c.c.r.s.p.d.AbstractDataprocProvisioner@129] - Dataproc job failed with the status details: Job failed with message [java.lang.reflect.InvocationTargetException: null]. Additional details can be found at:
https://console.cloud.google.com/dataproc/jobs/default_PROD__SQL-ALERTS_copy_dbo_My_table_v2_Da_ede05698-1cb7-11ef-b388-76c0559116d9?project=ds-dev-prj-1&region=europe-west1
gcloud dataproc jobs wait 'default_PROD__SQL-ALERTS_copy_dbo_My_table_v2_Da_ede05698-1cb7-11ef-b388-76c0559116d9' --region 'europe-west1' --project 'ds-dev-prj-1'
https://console.cloud.google.com/storage/browser/dataproc-staging-europe-west1-27292267370-rnxtsl25/google-cloud-dataproc-metainfo/172c2ea3-22dd-4af6-bb60-35068beb7abe/jobs/default_PROD__SQL-ALERTS_copy_dbo_My_table_v2_Da_ede05698-1cb7-11ef-b388-76c0559116d9/
gs://dataproc-staging-europe-west1-27292267370-rnxtsl25/google-cloud-dataproc-metainfo/172c2ea3-22dd-4af6-bb60-35068beb7abe/jobs/default_PROD__SQL-ALERTS_copy_dbo_My_table_v2_Da_ede05698-1cb7-11ef-b388-76c0559116d9/driveroutput.*
2024-05-28 06:08:44,736 - DEBUG [provisioning-task-4:i.c.c.i.p.t.ProvisioningTask@133] - Completed DEPROVISION subtask REQUESTING_DELETE for program run program_run:default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.e40cc637-1cb7-11ef-89a3-76c0559116d9.workflow.DataPipelineWorkflow.ede05698-1cb7-11ef-b388-76c0559116d9.
2024-05-28 06:09:14,933 - DEBUG [provisioning-task-8:i.c.c.i.p.t.ProvisioningTask@128] - Executing DEPROVISION subtask POLLING_DELETE for program run program_run:default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.e40cc637-1cb7-11ef-89a3-76c0559116d9.workflow.DataPipelineWorkflow.ede05698-1cb7-11ef-b388-76c0559116d9.
2024-05-28 06:09:15,031 - DEBUG [provisioning-task-8:i.c.c.i.p.t.ProvisioningTask@133] - Completed DEPROVISION subtask POLLING_DELETE for program run program_run:default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.e40cc637-1cb7-11ef-89a3-76c0559116d9.workflow.DataPipelineWorkflow.ede05698-1cb7-11ef-b388-76c0559116d9.
2024-05-28 06:09:44,168 - DEBUG [provisioning-task-8:i.c.c.i.p.t.ProvisioningTask@118] - Completed DEPROVISION task for program run program_run:default.PROD__SQL-ALERTS_copy_dbo_My_table_v2.e40cc637-1cb7-11ef-89a3-76c0559116d9.workflow.DataPipelineWorkflow.ede05698-1cb7-11ef-b388-76c0559116d9.
The NoClassDefFoundError for javax/xml/bind/DatatypeConverter usually occurs due to changes in the classpath or the removal of libraries that were previously available. It indicates that the DatatypeConverter class, which is part of JAXB (Java Architecture for XML Binding), cannot be found. JAXB was deprecated and taken off the default classpath in Java 9, and removed from the JDK entirely in Java 11, so it must now be added as an explicit dependency.
Given the upgrade to the latest version of Google Cloud Data Fusion, it is likely that the new version has a different classpath or dependency management that no longer includes JAXB by default.
To resolve this issue, you can try the following steps:
Ensure that the JAXB dependencies are included in your project. If you are using Maven, you can add the following dependencies to your pom.xml file:
<dependency>
<groupId>javax.xml.bind</groupId>
<artifactId>jaxb-api</artifactId>
<version>2.3.1</version>
</dependency>
<dependency>
<groupId>org.glassfish.jaxb</groupId>
<artifactId>jaxb-runtime</artifactId>
<version>2.3.1</version>
</dependency>
<dependency>
<groupId>javax.activation</groupId>
<artifactId>activation</artifactId>
<version>1.1.1</version>
</dependency>
If you are using Gradle, add the following dependencies to your build.gradle file:
implementation 'javax.xml.bind:jaxb-api:2.3.1'
implementation 'org.glassfish.jaxb:jaxb-runtime:2.3.1'
implementation 'javax.activation:activation:1.1.1'
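If you need the actual JAR files themselves (for example, to upload them into Data Fusion or to a GCS bucket later), they can be downloaded from Maven Central. A minimal sketch, assuming the standard Maven Central repository layout and these exact artifact versions:
# Download the JAXB-related JARs from Maven Central (standard repository layout assumed)
wget https://repo1.maven.org/maven2/javax/xml/bind/jaxb-api/2.3.1/jaxb-api-2.3.1.jar
wget https://repo1.maven.org/maven2/org/glassfish/jaxb/jaxb-runtime/2.3.1/jaxb-runtime-2.3.1.jar
wget https://repo1.maven.org/maven2/javax/activation/activation/1.1.1/activation-1.1.1.jar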
Ensure that your pipeline configurations are updated to include the necessary dependencies. You may need to update your plugin configurations to include these libraries if they are not already part of the runtime classpath.
Verify if there are any known issues or breaking changes with the latest version of Google Cloud Data Fusion. You can check the release notes or documentation for any changes that might affect your pipelines. Sometimes, specific plugins might need updates or replacements due to version incompatibilities.
If the above steps do not resolve the issue, consider reaching out to Google Cloud Support for further assistance. They might have additional insights or patches to address compatibility issues introduced by the latest upgrade.
Here is how you might configure your pipeline to ensure that the necessary dependencies are included:
{
"name": "PROD_TEST_SQL-ALERTS_dbo",
"artifact": {
"name": "cdap-data-pipeline",
"version": "6.10.1",
"scope": "SYSTEM"
},
"config": {
"resources": {
"memoryMB": 2048,
"virtualCores": 1
},
"stages": [
{
"name": "Multiple Database Tables v2",
"plugin": {
"name": "MultiTableDatabase",
"type": "batchsource",
"properties": {
"connectionString": "jdbc:sqlserver://;serverName={my_ip};databaseName={my_database}",
"jdbcPluginName": "sqlserver42",
"user": "gcp_elt",
"password": "my_password",
"additionalJars": "path/to/jaxb-api-2.3.1.jar,path/to/jaxb-runtime-2.3.1.jar,path/to/activation-1.1.1.jar"
}
}
},
{
"name": "BigQuery Multi Table ERP",
"plugin": {
"name": "BigQueryMultiTable",
"type": "batchsink",
"properties": {
"dataset": "ERP_My_table",
"truncateTable": "true"
}
}
}
]
}
}
Continue to monitor the logs for any other related errors or warnings that might indicate additional issues. Adjust the configurations as necessary based on the error messages and stack traces provided in the logs.
Thank you very much for your answer. Please verify whether I have understood the steps correctly.
I added the JAXB dependencies manually, but I don't know what to do next, i.e. where to modify the pipeline.
1. First, I added additional libraries in Data Fusion.
Configuration:
#1 jaxb-api-2.3.1.jar
Name: jaxb-api
Type: spark
Class name: javax.xml.bind.annotation.adapters.XmlAdapter
Version: 2.3.1
Description: JAXB API library
#2 jaxb-runtime-2.3.1.jar
Name: jaxb-runtime
Type: spark
Class name: com.sun.xml.bind.v2.ContextFactory
Version: 2.3.1
Description: JAXB Runtime library
#3 activation-1.1.1.jar:
Name: activation
Type: spark
Class name: javax.activation.DataHandler
Version: 1.1.1
Description: Activation library
They now appear as available in the Data Fusion Studio GUI. I don't know whether I should have added them to the flow.
I'm attaching a screenshot.
If I try to add them manually to my file, as in the example below, and then import the .json file, these additions no longer exist in the file.
If I am missing something, please advise where I can add the dependency to the flow in the MultiTable object, and what other object (plugin) I should use.
"properties": {
"connectionString": "my_string;",
"schemaNamePattern": "dbo",
"jdbcPluginName": "sqlserver42",
"splitsPerTable": "1",
"password": "my_pass",
"additionalJars": "lib:///jaxb-api-2.3.1.jar,lib:///jaxb-runtime-2.3.1.jar,lib:///activation-1.1.1.jar",
"user": "my_user",
"referenceName": "multi-table-custom-plugin",
"whiteList": "my_table"
}
The next step is to ensure that these dependencies are included in your pipeline configuration so that they are available during pipeline execution.
Here are the detailed steps to modify your pipeline configuration to include the added libraries:
Add Libraries to Data Fusion: You've correctly added the JAXB dependencies (jaxb-api, jaxb-runtime, and activation) to Data Fusion. This ensures that these libraries are available for use in your pipeline.
Include Libraries in Pipeline Configuration: Modify your pipeline JSON configuration to include the additionalJars property in the properties section of the relevant stages (plugins) that require these dependencies.
Here is how you can modify your pipeline configuration:
Updated Pipeline Configuration
Make sure to include the additionalJars property in the relevant plugin configurations. For example, in your MultiTableDatabase source plugin:
{
"name": "PROD_TEST_SQL-ALERTS_dbo",
"description": "PROD_TEST_SQL-ALERTS_dbo",
"artifact": {
"name": "cdap-data-pipeline",
"version": "6.10.1",
"scope": "SYSTEM"
},
"config": {
"resources": {
"memoryMB": 2048,
"virtualCores": 1
},
"driverResources": {
"memoryMB": 2048,
"virtualCores": 1
},
"connections": [
{
"from": "Multiple Database Tables v2",
"to": "BigQuery Multi Table ERP TEST v2"
}
],
"stages": [
{
"name": "Multiple Database Tables v2",
"plugin": {
"name": "MultiTableDatabase",
"type": "batchsource",
"label": "Multiple Database Tables v2",
"artifact": {
"name": "multi-table-plugins",
"version": "1.4.0",
"scope": "USER"
},
"properties": {
"referenceName": "multitable-database-erp",
"connectionString": "jdbc:sqlserver://;serverName={my_ip};databaseName={my_database}",
"jdbcPluginName": "sqlserver42",
"user": "gcp_elt",
"password": "my_password",
"dataSelectionMode": "allow-list",
"schemaNamePattern": "dbo",
"whiteList": "My_table_name",
"enableAutoCommit": "false",
"splitsPerTable": "1",
"fetchSize": "2000",
"transactionIsolationLevel": "TRANSACTION_NONE",
"errorHandlingMode": "fail-pipeline",
"additionalJars": "lib:///jaxb-api-2.3.1.jar,lib:///jaxb-runtime-2.3.1.jar,lib:///activation-1.1.1.jar"
}
},
"outputSchema": [
{
"name": "etlSchemaBody",
"schema": ""
}
],
"id": "Multiple-Database-Tables-v2",
"type": "batchsource",
"label": "Multiple Database Tables v2",
"icon": "fa-plug",
"isPluginAvailable": true,
"_uiPosition": {
"left": "710px",
"top": "336.5px"
}
},
{
"name": "BigQuery Multi Table ERP",
"plugin": {
"name": "BigQueryMultiTable",
"type": "batchsink",
"label": "BigQuery Multi Table ERP",
"artifact": {
"name": "google-cloud",
"version": "0.23.1",
"scope": "SYSTEM"
},
"properties": {
"useConnection": "true",
"connection": "S{conn(BigQuery Default)}",
"referenceName": "ref-bq-erp",
"dataset": "ERP_My_table",
"truncateTable": "true",
"allowFlexibleSchema": "true",
"allowSchemaRelaxation": "true",
"location": "europe-west1"
}
},
"outputSchema": [
{
"name": "etlSchemaBody",
"schema": ""
}
],
"inputSchema": [
{
"name": "Multiple Database Tables v2",
"schema": ""
}
],
"id": "BigQuery-Multi-Table-ERP",
"type": "batchsink",
"label": "BigQuery Multi Table ERP",
"icon": "fa-plug",
"isPluginAvailable": true,
"_uiPosition": {
"left": "1010px",
"top": "336.5px"
}
}
],
"schedule": "0 1 */1 * *",
"engine": "spark",
"numOfRecordsPreview": 100,
"rangeRecordsPreview": {
"min": 1,
"max": "5000"
},
"description": "PROD_FLOW_TEST",
"maxConcurrentRuns": 1,
"pushdownEnabled": false,
"transformationPushdown": {}
},
"version": "c98d4925-1cb6-11ef-b4b7-76c0559116d9"
}
Steps to Import Modified Pipeline
1. Add the additionalJars property with the paths to your JAXB JAR files in the relevant plugin properties.
2. Import the modified JSON back into Data Fusion.
Additional Tips
Check File Paths: Ensure that the paths to the JAR files (lib:///jaxb-api-2.3.1.jar, etc.) are correct and accessible in the environment where the pipeline will run.
Other Plugins: If other plugins also need these libraries, add the additionalJars property to those plugins as well.
Thanks for your support in solving the problem.
Regarding "Check File Paths: Ensure that the paths to the JAR files (lib:///jaxb-api-2.3.1.jar, etc.) are correct and accessible in the environment where the pipeline will run."
I don't know how to check the path. Please give me a hint.
STEPS
1. Export the pipeline.
2. Add the line (lib:///jaxb-api-2.3.1.jar, etc.):
"properties": {
"connectionString": "jdbc:sqlserver://my_host_name:1433;databaseName=ERP;",
"schemaNamePattern": "dbo",
"jdbcPluginName": "sqlserver42",
"splitsPerTable": "1",
"password": "my_pass",
"user": "my_login",
"referenceName": "multi-table-custom-plugin",
"whiteList": "my_table_or_view",
"additionalJars": "lib:///jaxb-api-2.3.1.jar,lib:///jaxb-runtime-2.3.1.jar,lib:///activation-1.1.1.jar"
}
4. Save.
5. There is no sign of the added "additionalJars": "lib (..) entry. When exporting the flow, the imported line is missing.
Please provide a screenshot if I should add these libraries to the flow differently.
To resolve the issue of adding external libraries to your Data Fusion pipeline, it's important to ensure the paths are correctly specified and the JAR files are accessible. However, it seems the additionalJars property might not be correctly reflected or supported in the GUI. Here are more details on how to properly include external dependencies in your Data Fusion pipeline.
Upload JAR Files to GCS: Upload the JAR files to a GCS bucket. Make sure they are accessible from your Data Fusion environment.
Get the GCS Paths: Once uploaded, get the GCS URIs for these JAR files. They will look something like gs://your-bucket-name/path/to/jaxb-api-2.3.1.jar.
Modify the Pipeline JSON: Use these GCS paths in your pipeline JSON configuration.
Upload JAR Files to GCS: Upload jaxb-api-2.3.1.jar, jaxb-runtime-2.3.1.jar, and activation-1.1.1.jar to your bucket.
Get the GCS URIs: After uploading, get the URIs, which will look like gs://your-bucket-name/jaxb-api-2.3.1.jar.
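As an illustration, the upload and URI check can be done with gsutil; the bucket name here is only a placeholder, not a value from your environment:
# Copy the three JARs into the bucket
gsutil cp jaxb-api-2.3.1.jar jaxb-runtime-2.3.1.jar activation-1.1.1.jar gs://your-bucket-name/
# List the bucket to confirm the exact gs:// URIs to paste into the pipeline JSON
gsutil ls gs://your-bucket-name/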
Modify and Import Pipeline JSON: Here’s how you modify your pipeline JSON:
{
"name": "PROD_TEST_SQL-ALERTS_dbo",
"description": "PROD_TEST_SQL-ALERTS_dbo",
"artifact": {
"name": "cdap-data-pipeline",
"version": "6.10.1",
"scope": "SYSTEM"
},
"config": {
"resources": {
"memoryMB": 2048,
"virtualCores": 1
},
"driverResources": {
"memoryMB": 2048,
"virtualCores": 1
},
"connections": [
{
"from": "Multiple Database Tables v2",
"to": "BigQuery Multi Table ERP TEST v2"
}
],
"stages": [
{
"name": "Multiple Database Tables v2",
"plugin": {
"name": "MultiTableDatabase",
"type": "batchsource",
"label": "Multiple Database Tables v2",
"artifact": {
"name": "multi-table-plugins",
"version": "1.4.0",
"scope": "USER"
},
"properties": {
"referenceName": "multitable-database-erp",
"connectionString": "jdbc:sqlserver://;serverName={my_ip};databaseName={my_database}",
"jdbcPluginName": "sqlserver42",
"user": "gcp_elt",
"password": "my_password",
"dataSelectionMode": "allow-list",
"schemaNamePattern": "dbo",
"whiteList": "My_table_name",
"enableAutoCommit": "false",
"splitsPerTable": "1",
"fetchSize": "2000",
"transactionIsolationLevel": "TRANSACTION_NONE",
"errorHandlingMode": "fail-pipeline",
"additionalJars": "gs://your-bucket-name/jaxb-api-2.3.1.jar,gs://your-bucket-name/jaxb-runtime-2.3.1.jar,gs://your-bucket-name/activation-1.1.1.jar"
}
},
"outputSchema": [
{
"name": "etlSchemaBody",
"schema": ""
}
],
"id": "Multiple-Database-Tables-v2",
"type": "batchsource",
"label": "Multiple Database Tables v2",
"icon": "fa-plug",
"isPluginAvailable": true,
"_uiPosition": {
"left": "710px",
"top": "336.5px"
}
},
{
"name": "BigQuery Multi Table ERP",
"plugin": {
"name": "BigQueryMultiTable",
"type": "batchsink",
"label": "BigQuery Multi Table ERP",
"artifact": {
"name": "google-cloud",
"version": "0.23.1",
"scope": "SYSTEM"
},
"properties": {
"useConnection": "true",
"connection": "S{conn(BigQuery Default)}",
"referenceName": "ref-bq-erp",
"dataset": "ERP_My_table",
"truncateTable": "true",
"allowFlexibleSchema": "true",
"allowSchemaRelaxation": "true",
"location": "europe-west1"
}
},
"outputSchema": [
{
"name": "etlSchemaBody",
"schema": ""
}
],
"inputSchema": [
{
"name": "Multiple Database Tables v2",
"schema": ""
}
],
"id": "BigQuery-Multi-Table-ERP",
"type": "batchsink",
"label": "BigQuery Multi Table ERP",
"icon": "fa-plug",
"isPluginAvailable": true,
"_uiPosition": {
"left": "1010px",
"top": "336.5px"
}
}
],
"schedule": "0 1 */1 * *",
"engine": "spark",
"numOfRecordsPreview": 100,
"rangeRecordsPreview": {
"min": 1,
"max": "5000"
},
"description": "PROD_FLOW_TEST",
"maxConcurrentRuns": 1,
"pushdownEnabled": false,
"transformationPushdown": {}
},
"version": "c98d4925-1cb6-11ef-b4b7-76c0559116d9"
}
Upload JAR files to GCS.
Get GCS URIs.
Modify JSON: You have already done this step. Just ensure you use the correct GCS URIs.
Import JSON in Data Fusion: Import the modified JSON and verify that it still includes the additionalJars property in the plugin configuration.
Thank you ms4446
I followed your tutorial step by step, but it doesn't work for me.
The service account has access (including GCS admin rights); e.g. using the GCP source plugin I can download the contents of a folder from Google Storage, so the location looks right.
"additionalJars": "gs://data_fusion_library/jaxb-api-2.3.1.jar,gs://data_fusion_library/jaxb-runtime-2.3.1.jar,gs://data_fusion_library/activation-1.1.1.jar "
Unfortunately, the imported file is not saved in Data Fusion.
The file is loaded into DF, but the added line of code is not saved in the json file.
When exporting after saving the pipeline, there is no trace of the addition.
This means that DF does see the location in GCP where the libraries were added, but DF does not allow me to add the line of code "additionalJars":
If it works for someone, please send me screenshots of the service account settings or other tips.
It seems that the additionalJars property might not be recognized or supported directly in Data Fusion's pipeline configuration JSON. Here are some alternative methods to ensure your dependencies are correctly included:
Alternative Methods
Method 1: Custom Plugins
Steps to Create a Custom Plugin: bundle the JAXB libraries with your plugin by declaring them in the plugin's pom.xml:
<dependencies>
<dependency>
<groupId>javax.xml.bind</groupId>
<artifactId>jaxb-api</artifactId>
<version>2.3.1</version>
</dependency>
<dependency>
<groupId>org.glassfish.jaxb</groupId>
<artifactId>jaxb-runtime</artifactId>
<version>2.3.1</version>
</dependency>
<dependency>
<groupId>javax.activation</groupId>
<artifactId>activation</artifactId>
<version>1.1.1</version>
</dependency>
</dependencies>
Then build the plugin: mvn clean package
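After the build, the resulting plugin JAR still has to be deployed to Data Fusion, either through the Studio UI (the same place where you added the JAXB libraries) or via the CDAP artifact REST API. A rough sketch of the REST call, where the instance endpoint, plugin name, version, and JAR path are placeholders, not values from your environment:
# Deploy the built plugin JAR through the CDAP artifact REST API (all names below are illustrative)
curl -X POST "https://YOUR_DATAFUSION_API_ENDPOINT/v3/namespaces/default/artifacts/my-multitable-plugin" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Artifact-Version: 1.0.0" \
  --data-binary @target/my-multitable-plugin-1.0.0.jar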
Method 2: Using Wrangler Transform
Method 3: Using GCS Connector
{
"name": "GCS Connector",
"plugin": {
"name": "GCS",
"type": "batchsource",
"properties": {
"referenceName": "gcs-connector",
"project": "your-project-id",
"path": "gs://your-bucket-name/path/to/jars/"
}
}
}