Hi everyone,
I am setting up Datastream to transfer data from AWS RDS (MariaDB engine) to BigQuery, using the documentation at https://cloud.google.com/datastream/docs/configure-your-source-mysql-database for guidance. However, I have come across three source-side configurations that I am not sure about:
Variable | Value (seconds) |
---|---|
net_read_timeout | 3600 |
net_write_timeout | 3600 |
wait_timeout | 86400 |
The values for these configurations are much higher than the default value for MariaDB, and the net_write_timeout is causing an "Incompatible Parameters" error for the RDS replica. Are they safe? Can anyone tell me if these configurations are necessary for Datastream to function correctly? I would really appreciate any advice or insight you may offer.
Thank you,
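The three source-side configurations you mention are server variables of the MariaDB (or MySQL) database itself, not Datastream settings. net_read_timeout is how long the server waits for more data from a connection before aborting a read, net_write_timeout is how long it waits for a block to be written to a connection before aborting a write, and wait_timeout is how long it keeps an idle non-interactive connection open before closing it. Datastream recommends specific values for them to ensure smooth communication with the source database.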
The default values for these configurations in MariaDB are:

- net_read_timeout: 30 seconds
- net_write_timeout: 60 seconds
- wait_timeout: 28800 seconds (8 hours)

Datastream recommends the following values for these configurations:

- net_read_timeout: 3600 seconds (1 hour)
- net_write_timeout: 3600 seconds (1 hour)
- wait_timeout: 86400 seconds (24 hours)

Datastream recommends these higher values because it streams data from the source database to BigQuery in real time. This means that Datastream needs to be able to handle long-running queries and network outages without losing data.
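As a rough way to check a source against the recommended values, the sketch below compares a set of current settings with them. The function name `find_mismatches` and the sample dictionary are illustrative assumptions; in practice the current values would come from running `SHOW VARIABLES` on the source database.

```python
# Recommended values from the Datastream documentation, in seconds.
RECOMMENDED = {
    "net_read_timeout": 3600,
    "net_write_timeout": 3600,
    "wait_timeout": 86400,
}


def find_mismatches(current):
    """Return {name: (current_value, recommended_value)} for settings that differ.

    `current` maps variable names to values, e.g. parsed from SHOW VARIABLES.
    A variable missing from `current` is reported with a current value of None.
    """
    mismatches = {}
    for name, wanted in RECOMMENDED.items():
        raw = current.get(name)
        actual = int(raw) if raw is not None else None
        if actual != wanted:
            mismatches[name] = (actual, wanted)
    return mismatches


if __name__ == "__main__":
    # Hypothetical sample: the MariaDB defaults for all three variables.
    defaults = {
        "net_read_timeout": "30",
        "net_write_timeout": "60",
        "wait_timeout": "28800",
    }
    for name, (actual, wanted) in find_mismatches(defaults).items():
        print(f"{name}: current={actual}, recommended={wanted}")
```

Values are converted with `int()` because `SHOW VARIABLES` returns them as strings.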
The net_write_timeout configuration is causing the "Incompatible Parameters" error for your RDS replica; the default value for this configuration on RDS replicas is 60 seconds, which suggests the replica is not accepting 3600. Because this is a database-side variable rather than a Datastream setting, the change has to be made on the RDS side: either get the replica to accept the higher net_write_timeout, or fall back to a lower value. However, I recommend that you only lower this value if you are absolutely sure that it is necessary.
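On RDS, server variables like these are normally changed through a custom DB parameter group attached to the instance (the default parameter group cannot be modified). Here is a minimal sketch, assuming boto3 is installed and AWS credentials are configured. The group name `my-mariadb-params` and both function names are hypothetical, and `ApplyMethod="immediate"` assumes RDS treats these as dynamic parameters, which is worth confirming in the console.

```python
def build_timeout_parameters(net_read=3600, net_write=3600, wait=86400,
                             apply_method="immediate"):
    """Build the Parameters list expected by RDS modify_db_parameter_group."""
    values = {
        "net_read_timeout": net_read,
        "net_write_timeout": net_write,
        "wait_timeout": wait,
    }
    return [
        {
            "ParameterName": name,
            "ParameterValue": str(value),  # RDS expects parameter values as strings
            "ApplyMethod": apply_method,   # "immediate" or "pending-reboot"
        }
        for name, value in values.items()
    ]


def apply_timeouts(parameter_group_name, parameters):
    """Send the change to RDS. Needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so building the payload needs no AWS setup

    rds = boto3.client("rds")
    return rds.modify_db_parameter_group(
        DBParameterGroupName=parameter_group_name,
        Parameters=parameters,
    )


if __name__ == "__main__":
    # Build and inspect the payload only; nothing is sent to AWS here.
    for parameter in build_timeout_parameters():
        print(parameter)
    # To apply, with a hypothetical custom parameter group name:
    # apply_timeouts("my-mariadb-params", build_timeout_parameters())
```

If the replica keeps rejecting 3600 for net_write_timeout, passing a lower value such as `build_timeout_parameters(net_write=60)` keeps the other two settings at the recommended values.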
Are the Datastream recommended values for these configurations safe?
Yes, the Datastream recommended values for these configurations are safe. Datastream has been tested with these values and has been found to be stable and reliable in typical use-cases and data loads.
Are these configurations necessary for Datastream to function correctly?
No, these configurations are not necessary for Datastream to function correctly. However, using the Datastream recommended values for these configurations can help to improve the performance and reliability of Datastream.
Recommendations
Based on Datastream's documentation, using the recommended values for the net_read_timeout, net_write_timeout, and wait_timeout configurations can help ensure that Datastream is able to stream data from your AWS RDS database to BigQuery in a reliable and efficient manner.
If you are unable to increase the value of the net_write_timeout configuration on your RDS replica, you can leave it at a lower value on the database side. However, I recommend that you only do this if you are absolutely sure that it is necessary.
Additionally, the longer timeouts help Datastream to handle long-running queries and delays in capturing changes. This ensures that Datastream does not prematurely disconnect or lose data.