Hello,
We are seeing Dataflow pipelines take 2x to 3x longer to run on Apache Beam SDK 2.50 than on Apache Beam SDK 2.44. While troubleshooting, we compared the DAGs for 2.44 and 2.50. The BQ read-from-table step (a full table scan using DIRECT_TABLE_ACCESS) takes 3 sec to read 19 records / 13 KB in 2.44, while the exact same pipeline, reading the same 19 records / 13 KB, takes 1 min 5 sec in 2.50. Has this API degraded in 2.50? I also see that throughput for this DAG step is much higher in 2.44 than in 2.50. Please find the throughput graphs (elements/sec) for both versions below.
Throughput in ver 2.44 --> 0.15 elements/sec (High)
Throughput in ver 2.50 --> 0.083 elements/sec (Low)
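To put the numbers above side by side, here is a quick sketch of the arithmetic. The variable names are only illustrative, and it assumes the throughput figures are in elements/sec as read off the graphs.

```python
# Figures quoted in this post; variable names are illustrative, not from the pipeline.
read_seconds_244 = 3.0    # BQ read step duration on SDK 2.44
read_seconds_250 = 65.0   # 1 min 5 sec on SDK 2.50
throughput_244 = 0.15     # assumed elements/sec, from the 2.44 graph
throughput_250 = 0.083    # assumed elements/sec, from the 2.50 graph

# How many times longer the read step takes on 2.50
step_slowdown = read_seconds_250 / read_seconds_244

# How many times higher the reported throughput is on 2.44
throughput_ratio = throughput_244 / throughput_250

print(f"Read step slowdown (2.50 vs 2.44): {step_slowdown:.1f}x")
print(f"Throughput ratio (2.44 / 2.50): {throughput_ratio:.1f}x")
```

The step duration differs by a much larger factor than the reported throughput does, so the two measurements may not be capturing the same thing. That may be worth mentioning when raising this with support.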
[Throughput graph: Apache Beam ver 2.44]
[Throughput graph: Apache Beam ver 2.50]
Thanks very much, @ms4446. I will open a ticket with Beam support.