Let's say I create a table with a TIMESTAMP column whose default value is set to CURRENT_TIMESTAMP().
I then keep inserting data into it using the Storage Write API (e.g. via a Pub/Sub-to-BigQuery subscription).
I then use the Storage Read API to read the whole table and keep the maximum timestamp I saw during this read.
If I then do another Storage Read with row filter "timestamp >= max_timestamp_i_have_seen", am I guaranteed to not miss any rows? In other words, do the rows become visible to the Read API in the same order as they have been written and timestamped?
If not, is there any way to achieve similar behavior, i.e. ensuring that consecutive reads do not miss any rows (moderate duplication is tolerable)?
Hi @sergiyprotsiv
Good question. From what I've seen, BigQuery doesn't guarantee that rows will appear in the same order they were written, especially when using CURRENT_TIMESTAMP().
Because of how data is buffered and committed, a row can become visible a bit later than another row that carries a newer timestamp. So yes, if you filter on timestamp >= max_seen, there's a small chance you might miss some late-arriving rows whose timestamps fall just below max_seen.
Try to use a small safety window, like:
WHERE timestamp >= TIMESTAMP_SUB(max_seen, INTERVAL 10 SECOND)
That way you re-read the last few seconds on every pass and pick up rows that came in slightly late.
You might get a few duplicates, but that’s easier to handle. Hope this helps!
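To make the overlap-and-deduplicate idea concrete, here is a minimal Python sketch. It is not tied to any BigQuery client library: it only builds the filter string and drops duplicates from whatever rows a read returns. The names build_row_filter and take_new_rows, the 10-second window, and the "id" field used as a unique row key are all assumptions for illustration, and whether the Storage Read API row filter accepts this exact TIMESTAMP(...) expression is something to verify against your setup.

```python
from datetime import datetime, timedelta, timezone

# Assumed safety window; tune it to the lateness you actually observe.
OVERLAP = timedelta(seconds=10)


def build_row_filter(max_seen: datetime, overlap: timedelta = OVERLAP) -> str:
    """Build a filter that re-reads the last `overlap` before max_seen."""
    lower = (max_seen - overlap).astimezone(timezone.utc)
    stamp = lower.strftime("%Y-%m-%d %H:%M:%S.%f")
    # Assumes the column is literally named "timestamp", as in the question.
    return f"timestamp >= TIMESTAMP('{stamp}+00')"


def take_new_rows(rows, seen, max_seen):
    """Drop rows already processed; return (fresh rows, updated max timestamp).

    `seen` maps an assumed unique row key ("id") to that row's timestamp.
    """
    fresh = []
    for row in rows:
        key = row["id"]
        if key in seen:
            continue  # duplicate from the overlapping window
        seen[key] = row["timestamp"]
        fresh.append(row)
        if max_seen is None or row["timestamp"] > max_seen:
            max_seen = row["timestamp"]
    return fresh, max_seen
```

Each pass would call build_row_filter with the current maximum, run the read, and feed the result through take_new_rows. This only catches rows that arrive less than the overlap window late, so the window is a trade-off between duplicates and the risk of a miss, not a hard guarantee.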