
How to find the failed inserts from Dataflow to Bigtable?

 

I am writing a pipeline to migrate data from GCS to Bigtable. The data is in JSON format. The pipeline runs fine, but the number of records written by the Dataflow job doesn't match the count I get from BigQuery when querying Bigtable as an external table.

I have set "onlyReadLatest": false so that all versions of each record are returned when I read from BigQuery.
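For reference, a Bigtable external table definition with that flag set looks roughly like this (cf, PROJECT_ID, INSTANCE_ID, and TABLE_ID are placeholders for my own values):

{
  "sourceFormat": "BIGTABLE",
  "sourceUris": [
    "https://googleapis.com/bigtable/projects/PROJECT_ID/instances/INSTANCE_ID/tables/TABLE_ID"
  ],
  "bigtableOptions": {
    "readRowkeyAsString": true,
    "columnFamilies": [
      { "familyId": "cf", "onlyReadLatest": false }
    ]
  }
}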

import com.google.cloud.bigtable.beam.CloudBigtableIO;
import com.google.cloud.bigtable.beam.CloudBigtableTableConfiguration;
import org.apache.beam.sdk.values.PDone;

// Target table for the Bigtable sink.
CloudBigtableTableConfiguration bigtableTableConfig =
    new CloudBigtableTableConfiguration.Builder()
        .withProjectId(options.getBigtableProjectId())
        .withInstanceId(options.getBigtableInstanceId())
        .withTableId(options.getBigtableTableId())
        .build();

// Write only the successfully parsed rows to Bigtable.
PDone tableRows = btRow.get(successTag)
    .apply("WriteToBT", CloudBigtableIO.writeToTable(bigtableTableConfig));

I recommend checking the logs to rule out a Dataflow job issue. To do so, you can use the Dataflow page in the Cloud Console. Alternatively, you can run this command in Cloud Shell:

 

gcloud logging read "resource.type=dataflow_step AND resource.labels.job_id=JOB_ID" --project=PROJECT_ID
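
If the job did log write failures, you can narrow the same query to error-level entries (JOB_ID and PROJECT_ID are placeholders for your own values):

gcloud logging read "resource.type=dataflow_step AND resource.labels.job_id=JOB_ID AND severity>=ERROR" --project=PROJECT_ID --limit=50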