Need a workaround to improve BigQuery table reads in Apache Beam SDK 2.50 (Java) [Urgent]

A full table scan of a BigQuery table with 19 rows and about 10 columns (15 KB in total) takes 1 min 5 s with Apache Beam SDK 2.50. With Apache Beam 2.44 it takes just 4 seconds. The behaviour is the same with the direct runner and the Dataflow runner. Is there a workaround? I cannot avoid the full table scan.

I am using the code below:

PCollection<TableRow> tableRows =
    pipeline.apply(
        "Read from BigQuery",
        BigQueryIO.readTableRows()
            .from(
                new TableReference()
                    .setProjectId(params.getTableProjectName())
                    .setDatasetId(params.getTableDatasetName())
                    .setTableId(params.getTableName()))
            .withMethod(Method.DIRECT_READ));

ACCEPTED SOLUTION

Hi @dheerajpanyam,

I understand your concerns and the complexity of your production environment. Given the constraints and the information provided, it might be beneficial to reach out directly to Google Cloud support with detailed logs and metrics to get more specialized assistance.
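
For the metrics part, one option is to time the same read several times under each SDK version and attach the numbers to the support case. The following is only a minimal sketch using the Java standard library: the class name ReadTimer and its methods are hypothetical helpers, not part of Apache Beam, and the sleep in main is a stand-in workload. In a real job the Runnable would presumably build the pipeline and call pipeline.run().waitUntilFinish().

```java
import java.util.Arrays;

// Hypothetical helper (not part of Apache Beam): collects wall-clock timings
// so the same read can be compared across SDK versions.
public class ReadTimer {

    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    public static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000L;
    }

    /** Runs the task the given number of times and returns each elapsed time. */
    public static long[] sample(Runnable task, int runs) {
        long[] elapsed = new long[runs];
        for (int i = 0; i < runs; i++) {
            elapsed[i] = timeMillis(task);
        }
        return elapsed;
    }

    /** Returns the median of the samples (the upper median for an even count). */
    public static long median(long[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        return sorted[sorted.length / 2];
    }

    public static void main(String[] args) {
        // Assumption: placeholder workload. Replace with a lambda that builds
        // the pipeline and calls pipeline.run().waitUntilFinish().
        Runnable task = () -> {
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        long[] samples = sample(task, 3);
        System.out.println("samples (ms) = " + Arrays.toString(samples)
            + ", median (ms) = " + median(samples));
    }
}
```

Reporting the median of a few runs rather than a single run makes it less likely that one slow start-up skews the comparison between 2.44 and 2.50.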
