Setting writeDisposition to "WRITE_TRUNCATE" when using the Storage Write API (Pending mode) in Java

sasubr · 11-23-2023 01:37 AM

Hello,

I'm developing a custom tool in Java for writing data to BigQuery. The tool sends data to the Storage Write API in "Pending" mode. My understanding is that setting writeDisposition to WRITE_TRUNCATE truncates the target table before the writes are committed. My question is: how do I set writeDisposition when using the Java client for the Storage Write API? Should it be set on the BatchCommitWriteStreamsRequest, and if so, how?

sasubr

Thanks for the help! This clears things up.

alvaroviebrantz

The BigQuery Storage Write API doesn't support any kind of WriteDisposition configuration. WriteDisposition is a setting on Query, Load, and Copy jobs in the BigQuery v2 API, so there is nothing to set on BatchCommitWriteStreamsRequest.

If you want to write with the BigQuery Storage Write API but truncate the table as part of the load, one way to achieve that is to write the data to a temporary table using the Storage Write API and then run a COPY job that overwrites the target table with the contents of the temporary table. The COPY job accepts a WriteDisposition, so setting it to WRITE_TRUNCATE replaces the target table's existing rows.
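To make the COPY step concrete, here is a minimal sketch of the request body such a job could use with the BigQuery v2 jobs.insert endpoint. It only builds the JSON; it does not send it. The class name, method names, and the project, dataset, and table names are all made up for illustration, and the field names are taken from the v2 copy job configuration as I understand it, so please check them against the API reference before relying on this.

```java
// Sketch only: builds the JSON body for a BigQuery v2 jobs.insert call that
// copies a staging table over a target table with WRITE_TRUNCATE.
// TruncatingCopyJob and every project/dataset/table name are hypothetical.
final class TruncatingCopyJob {

    // Renders a v2 table reference object as JSON.
    // Assumes the identifiers contain no characters that need JSON escaping.
    static String tableRef(String project, String dataset, String table) {
        return String.format(
            "{\"projectId\":\"%s\",\"datasetId\":\"%s\",\"tableId\":\"%s\"}",
            project, dataset, table);
    }

    // Builds the job body: source is the staging table that the Storage Write
    // API stream was committed to, destination is the table to overwrite.
    static String buildRequestBody(String project, String dataset,
                                   String stagingTable, String targetTable) {
        return "{\"configuration\":{\"copy\":{"
            + "\"sourceTable\":" + tableRef(project, dataset, stagingTable) + ","
            + "\"destinationTable\":" + tableRef(project, dataset, targetTable) + ","
            + "\"createDisposition\":\"CREATE_IF_NEEDED\","
            + "\"writeDisposition\":\"WRITE_TRUNCATE\""
            + "}}}";
    }

    public static void main(String[] args) {
        // Hypothetical names; print the body that would be POSTed to jobs.insert.
        System.out.println(
            buildRequestBody("my-project", "my_dataset", "events_staging", "events"));
    }
}
```

If you use the google-cloud-bigquery Java client instead of raw REST, the same setting should be available through CopyJobConfiguration.newBuilder(destination, source).setWriteDisposition(JobInfo.WriteDisposition.WRITE_TRUNCATE). Either way, run the copy only after the pending streams have been committed to the staging table, otherwise the copy will not see the new rows.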

The Spark BigQuery connector (GoogleCloudDataproc) follows a similar pattern, but it uses a QUERY job with a MERGE SQL statement instead of a COPY job. For reference: https://github.com/GoogleCloudDataproc/spark-bigquery-connector/blob/14fa9b879b62a535c37c906ffb76386...
