
Nested Pub/Sub message to BigQuery

Is it possible to flatten a nested Pub/Sub payload and insert it into BigQuery?

E.g. payload:

{"data": [
{
"field1": "some_value1",
"field2": "some_value2",
"details": {
"field3": "some_value3",
"field4": "some_value4"
}
}
]
}

 

I want the BigQuery columns to be field1, field2, field3, field4, etc.
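
In other words, the target is a flat table whose schema would look something like this (the dataset and table names are just placeholders, and the STRING types assume the values stay strings):

CREATE TABLE my_dataset.flat_table (
  field1 STRING,
  field2 STRING,
  field3 STRING,
  field4 STRING
)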

Solved

3 REPLIES

Maybe we take Pub/Sub out of the equation (just for contemplation purposes). I'm sensing that you have a nested and repeated JSON document and you want to expose the records in that document as the target of a SQL SELECT statement. We might want to look into importing the JSON data without transformation. We would then end up with a column of type JSON in the table, where you can refer to fields such as:

SELECT doc.field1, doc.field2 .... FROM my_table

If you are looking to flatten the record structure, consider creating a view:

CREATE VIEW my_view AS (
  SELECT doc.field1 as field1, doc.field2 as field2 ... FROM my_table
)
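
To make that concrete with the payload from the question, here is a minimal sketch. It assumes the raw messages land in a hypothetical table my_dataset.raw_messages with a single JSON column named doc holding the whole payload; the "data" array is expanded with UNNEST and the JSON functions:

CREATE VIEW my_dataset.flat_view AS (
  SELECT
    -- one output row per element of the "data" array
    JSON_VALUE(item, '$.field1')         AS field1,
    JSON_VALUE(item, '$.field2')         AS field2,
    JSON_VALUE(item, '$.details.field3') AS field3,
    JSON_VALUE(item, '$.details.field4') AS field4
  FROM my_dataset.raw_messages,
       UNNEST(JSON_QUERY_ARRAY(doc, '$.data')) AS item
)

Note that JSON_VALUE returns STRING, so add casts if the target columns need other types.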

 

Yes, this works. But is it possible to do this with a topic schema? Suppose I have a nested message in the topic. Is it possible to flatten the payload directly using the schema and sink to BigQuery?

If the message being written to the Pub/Sub topic is nested ... for example:

{
  "A": "Hello",
  "B": {
    "X": "Val1",
    "Y": "Val2"
  },
  "C": 123
}

And the table you want to populate looks like:

A      X     Y     C
Hello  Val1  Val2  123

Then I can't think of a way to "automatically" flatten the structured content of the topic message. What you appear to be describing is a mapping / transformation. Technologies such as Dataflow, Data Fusion, Workflows, App Integration, Cloud Functions or Cloud Run could sit between the topic containing the messages and the insertion into the table. Alternatively, land the structured data into BigQuery directly and then perform ELT to transform it within the data warehouse.
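
As an illustration of the ELT route, here is a minimal sketch (my_dataset.raw_events and its column types are assumptions, not something the subscription creates for you): if the message lands in a staging table with the nested part stored as a STRUCT/RECORD column, the flattening is a simple view:

CREATE VIEW my_dataset.flat_events AS (
  -- B is assumed to be a STRUCT<X STRING, Y STRING> column in the staging table
  SELECT
    A,
    B.X AS X,
    B.Y AS Y,
    C
  FROM my_dataset.raw_events
)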

To answer your question ... my answer to "Is it possible to flatten the payload directly using the schema and sink to BigQuery?" is then "no".