If the Spark job is the ONLY job appending to the BigQuery table, then you can indeed use time travel to return the table to a given moment in time (i.e. before the Spark job started). If you then re-ran the Spark job after "rolling back" the changes, you would have idempotency. Another thought might be to have your Spark job write the data frame out as a Google Cloud Storage object in a bucket and then perform a BigQuery LOAD of the object into the table. That is a transactional activity (it's all or nothing).
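For what it's worth, here is a rough sketch of the time travel idea in Python. It only builds the SQL text; running it (for example with the BigQuery client library or the console) is left out. The function name and the table names are made up for illustration, and it assumes you recorded a timestamp just before the Spark job started and that the timestamp is still inside the table's time travel window (7 days by default). It restores into a new table rather than overwriting the original, so you can inspect the result before swapping anything.

```python
from datetime import datetime, timezone


def restore_statement(table: str, restored_table: str, as_of: datetime) -> str:
    """Build SQL that copies `table`, as it was at `as_of`, into `restored_table`.

    Hypothetical helper: both table names are placeholders in
    `project.dataset.table` form and are not validated here.
    """
    if as_of.tzinfo is None:
        # A naive datetime would be ambiguous, so refuse it.
        raise ValueError("as_of must be timezone-aware")

    # Normalise to UTC and format as a BigQuery TIMESTAMP literal.
    ts = as_of.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")

    return (
        f"CREATE TABLE `{restored_table}` AS\n"
        f"SELECT *\n"
        f"FROM `{table}`\n"
        f"  FOR SYSTEM_TIME AS OF TIMESTAMP '{ts}+00:00'"
    )


# Assumption: this timestamp was captured just before the Spark job began.
job_start = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
stmt = restore_statement(
    "my-project.my_dataset.events",           # placeholder source table
    "my-project.my_dataset.events_restored",  # placeholder destination table
    job_start,
)
print(stmt)
```

Note that a plain `CREATE TABLE ... AS SELECT` like this does not carry over partitioning or clustering settings, so you would need to add those to the statement if the original table uses them.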
That's good 🙂, but can I also do that from my Android phone? I hope it works.