I am experiencing an issue with my BigQuery data streaming application, which uses the following dependency:
Issue Details:
Action Performed: The application uses JsonStreamWriter and BigQueryWriteClient to stream data to BigQuery. On 25/07/2024, it attempted to stream data to BigQuery.
Error Encountered: On 29/07/2024, we received an exception related to this operation.
Delay Concern:
Request for Support:
Retry Mechanism:
Issue Diagnosis:
We appreciate your assistance in resolving this issue. Please let us know if any additional information is required.
Best regards,
Sanjay Pratap T K.
In version 2.24.2 of the google-cloud-bigquerystorage library, the JsonStreamWriter's internal retry mechanism within the ConnectionWorker's appendLoop() did not include an explicit timeout. As a result, if a transient error such as an SSL handshake failure kept recurring, the retries could continue indefinitely, and the failure might reach the caller only much later.
The upgrade to version 3.5.1 addresses this limitation by introducing a default timeout for in-flight requests, set to approximately five minutes. This improvement prevents retries from continuing indefinitely in the face of persistent errors, making the retry behavior more robust and predictable.
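To make the difference concrete, here is a minimal sketch of a retry loop that is bounded by a total deadline. It is not the library's code: `DeadlineRetry`, `callWithDeadline`, and the fixed pause are hypothetical names and choices made for illustration, and the sketch assumes every `RuntimeException` is transient.

```java
import java.time.Duration;
import java.util.function.Supplier;

/** Hypothetical helper for illustration; not part of google-cloud-bigquerystorage. */
final class DeadlineRetry {
    private DeadlineRetry() {}

    /**
     * Calls op until it succeeds or totalTimeout has elapsed since the first attempt.
     * Any RuntimeException from op is treated as transient (an assumption of this sketch).
     */
    static <T> T callWithDeadline(Supplier<T> op, Duration totalTimeout, Duration pause) {
        long start = System.nanoTime();
        while (true) {
            try {
                return op.get();
            } catch (RuntimeException e) {
                long elapsed = System.nanoTime() - start;
                // Without this check the loop would retry forever on a persistent error.
                if (elapsed >= totalTimeout.toNanos()) {
                    throw new IllegalStateException("gave up after " + totalTimeout, e);
                }
            }
            try {
                Thread.sleep(pause.toMillis());
            } catch (InterruptedException ie) {
                // Restore the interrupt flag so callers can still observe it.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting to retry", ie);
            }
        }
    }
}
```

With a persistent failure, the caller gets an exception shortly after the deadline rather than days later. Note that the deadline is only checked between attempts, so a single attempt that hangs is not cut short by this sketch.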
Both versions let you customize retry behavior through RetrySettings, and both use the standard exponential backoff with jitter between attempts. The advantage of version 3.5.1 is its built-in timeout, which guards against indefinite retry loops when an error persists and makes BigQuery data streaming operations more reliable and predictable.
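For readers unfamiliar with exponential backoff with jitter, the following is a rough sketch of how such delays are commonly computed. It uses the "full jitter" variant, which may differ from what the library does internally; the class `Backoff`, its method names, and the parameter values in the usage note are assumptions for illustration only.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper for illustration; not the library's retry algorithm. */
final class Backoff {
    private Backoff() {}

    /** Delay before retry number `attempt` (0-based), grown geometrically and capped at maxMillis. */
    static long cappedDelayMillis(long initialMillis, double multiplier, long maxMillis, int attempt) {
        double raw = initialMillis * Math.pow(multiplier, attempt);
        // Math.min also handles overflow to infinity for very large attempt counts.
        return (long) Math.min(raw, (double) maxMillis);
    }

    /** "Full jitter": a uniformly random delay between 0 and the capped delay, inclusive. */
    static long jitteredDelayMillis(long initialMillis, double multiplier, long maxMillis, int attempt) {
        long cap = cappedDelayMillis(initialMillis, multiplier, maxMillis, attempt);
        return ThreadLocalRandom.current().nextLong(cap + 1);
    }
}
```

For example, with an initial delay of 100 ms, a multiplier of 2.0, and a 1000 ms cap, the capped delays for attempts 0 to 4 are 100, 200, 400, 800, and 1000 ms. The jitter spreads retries from many clients so they do not all hit the service at the same moment.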
Given the timeout for in-flight requests introduced in version 3.5.1, I strongly recommend that you continue using the newer version of the library for your BigQuery data streaming applications.