Solved: Need (workaround) solution to improve BQ table rea... - Page 2

dheerajpanyam · 10-03-2023 02:29 PM

Reading 19 rows from BigQuery with a full table scan (around 10 columns, about 15 KB in total) takes 1 min 5 s with Apache Beam SDK 2.50. With Apache Beam 2.44 it takes just 4 seconds. The behaviour is the same with the direct runner and the Dataflow runner. Is there a workaround? I cannot avoid the full table scan.

I am using the code below:

PCollection<TableRow> tableRows =
    pipeline.apply(
        "Read from BigQuery",
        BigQueryIO.readTableRows()
            .from(
                new TableReference()
                    .setProjectId(params.getTableProjectName())
                    .setDatasetId(params.getTableDatasetName())
                    .setTableId(params.getTableName()))
            .withMethod(Method.DIRECT_READ));

ms4446

Hi @dheerajpanyam,

I understand your concerns and the complexity of your production environment. Given the constraints and the information provided, it might be beneficial to reach out directly to Google Cloud support with detailed logs and metrics to get more specialized assistance.
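As a rough illustration of the kind of metric that could be attached to a support case, here is a minimal sketch that times one step and computes the slowdown between two runs. The class `ReadTimer` and its methods are hypothetical names, not part of Beam, and the empty `Runnable` is only a placeholder for the real pipeline run (assumed here to be something like `pipeline.run().waitUntilFinish()`).

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Apache Beam: measures wall-clock time
// for one step so that runs on different SDK versions can be compared.
class ReadTimer {

    /** Runs the given step once and returns its wall-clock duration in milliseconds. */
    static long timeMillis(Runnable step) {
        long start = System.nanoTime();
        step.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    /** Returns how many times longer the slow run took than the fast run. */
    static double slowdown(long slowMillis, long fastMillis) {
        return (double) slowMillis / fastMillis;
    }

    public static void main(String[] args) {
        // Placeholder step; in the real job this would wrap the pipeline run
        // (assumption: pipeline.run().waitUntilFinish()).
        long elapsed = timeMillis(() -> { });
        System.out.println("Read step took " + elapsed + " ms");

        // Figures reported in the question: 1 min 5 s versus 4 s.
        System.out.println("Slowdown: " + slowdown(65_000, 4_000) + "x");
    }
}
```

With the timings from the question, the slowdown works out to roughly 16x. Recording the same measurement on both SDK versions and on both runners would give support a concrete comparison to work from.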


Need (workaround) solution to improve BQ table reads in Apache beam sdk 250 java [Urgent]