BigQuery rows in random order after data loading

I'm building an ETL using Cloud Composer/Airflow and GoogleCloudStorageToBigQueryOperator. Whether the file is CSV or Parquet, when I load it with the WRITE_TRUNCATE method, the rows end up in a random order. For example:

The raw CSV has 100 rows with sequential row IDs such as 1, 2, 4, 5, 6. After loading into BigQuery, the rows come back in a random order, for example starting with 50, 22, 15, 59, 5, 1, 7, etc.

Is there a solution for this issue, or a way to avoid it?

3 REPLIES

Why are you trying to avoid this?

Sure. I need to avoid this because the data is not ordered properly.

Why do you need to have them ordered "properly" in the table?
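
For context on why the question matters: BigQuery tables have no inherent row order, so rows only come back in a guaranteed order when the query itself includes ORDER BY. Below is a minimal sketch of that idea, assuming the table has a sortable column. The table name `my_project.my_dataset.my_table`, the column `row_id`, and the helper `build_ordered_query` are all hypothetical and not taken from the original post.

```python
from typing import Optional


def build_ordered_query(table: str, order_column: str,
                        limit: Optional[int] = None) -> str:
    """Build a SELECT that reads rows back in a deterministic order.

    Hypothetical helper for illustration: `table` and `order_column`
    are placeholder names, not values from the original thread.
    """
    # Ordering is requested at read time; the load step does not preserve it.
    query = f"SELECT * FROM `{table}` ORDER BY {order_column}"
    if limit is not None:
        query += f" LIMIT {int(limit)}"
    return query


# Assumed table and column names, used only as an example.
sql = build_ordered_query("my_project.my_dataset.my_table", "row_id", limit=100)
print(sql)
```

The resulting string could then be run with any BigQuery client, for example `bigquery.Client().query(sql)` from the google-cloud-bigquery library. The point of the sketch is only that the ordering lives in the query, not in the load job.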