Get hands-on experience with 20+ free Google Cloud products and $300 in free credit for new customers.

BigQuery Storage Read select columns from sub-records typed ARRAY<STRUCT<...>>

I am using the BigQuery Read API and have a question on using SelectedFields.

My data is of the shape {message_id : string, events: [ { time: timestamp, value: string } ] }, i.e. a row contains one field "message_id" and another field called "events" that is in turn an array of records each with two fields, "time" and "value".

I want to drop the "time" column and only select the "message_id" and "value", so I specify SelectedFields as ["message_id", "events.value"], because this is what should work in case "events" was just a sub-record, not an array.

However, I get the error "request failed: Query error: Cannot access field value on a value with type ARRAY<STRUCT<time TIMESTAMP, value STRING ...>> at [1:29]". 

Is there a syntax for SelectedFields that will work on selecting from ARRAY<STRUCT<...>>?

Solved Solved
0 1 218
1 ACCEPTED SOLUTION

Hi @sergiyprotsiv,

Welcome to Google Cloud Community!

As of now, BigQuery Read API selectedFields only supports STRUCT format with data type of the column RECORD. You can consider using the UNNEST function to flatten the repeated data and store your data to a temporary table(view table) or you can use ["message_id", "events"] then filter the only data you need(value) by transforming your own code.

Alternatively, you might consider submitting a feature request regarding this for enhanced functionality. Before filing, please take note on what to expect when opening an issue.

Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.

View solution in original post

1 REPLY 1

Hi @sergiyprotsiv,

Welcome to Google Cloud Community!

As of now, BigQuery Read API selectedFields only supports STRUCT format with data type of the column RECORD. You can consider using the UNNEST function to flatten the repeated data and store your data to a temporary table(view table) or you can use ["message_id", "events"] then filter the only data you need(value) by transforming your own code.

Alternatively, you might consider submitting a feature request regarding this for enhanced functionality. Before filing, please take note on what to expect when opening an issue.

Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.