Defining JSON Objects as LookML Dimensions

When you come across JSON objects in Postgres and Snowflake, the obvious thing to do is to use a JSON parsing function to select JSON keys as LookML dimensions. You’ll be able to create a lot of dimensions without any issues, but there are some nuances to note. This article will outline the process of defining dimensions and iron out some issues you may encounter in the process.

Step 1: Select the raw JSON dimension to see what key-value pairs are included

Say there is a dimension defined as the following:

Postgres or Snowflake:
- dimension: test_json_object
  sql: ${TABLE}.test_json_object

If you select this dimension in an explore, you’ll get something like this:

{"key1":"abc","key2":"a44g6jX3","key3":"12345","key4":"2015-01-01 12:33:24"}

Now we know that our new dimensions will come from key1 through key4, with data types ranging from strings to a timestamp.

Step 2: Manually define each key using a JSON parsing function or a single colon (':')

To define key2 in the above-described example, you would write the following:

Postgres:
- dimension: key2
  sql: json_extract_path(${TABLE}.test_json_object, 'key2')

Snowflake:
- dimension: key2
  sql: ${TABLE}.test_json_object:key2

This will return a string value by default, still in quotes:

"a44g6jX3"
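The remaining keys can be defined the same way. As a sketch, using the Snowflake colon syntax shown above (the key names follow the example object from step 1):

Snowflake:
- dimension: key1
  sql: ${TABLE}.test_json_object:key1

- dimension: key3
  sql: ${TABLE}.test_json_object:key3

- dimension: key4
  sql: ${TABLE}.test_json_object:key4

In Postgres, each of these would instead wrap the column in json_extract_path with the corresponding key name.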

Step 3: Cast data types for each dimension

Now that we have a baseline list of dimensions defined, we'll explicitly cast each dimension to the appropriate data type in the sql parameter. In Postgres, use json_extract_path_text, which returns unquoted text (Postgres has no string type to cast to, and casting the json value directly would keep the quotes); in Snowflake, use the :: cast operator:

Postgres:
- dimension: key2
  sql: json_extract_path_text(${TABLE}.test_json_object, 'key2')

Snowflake:
- dimension: key2
  sql: ${TABLE}.test_json_object:key2::string

This will now result in quotes being removed:

a44g6jX3

Simply declaring a LookML dimension type (string, number, etc.) may NOT remove the quotes (specifically in Snowflake). Even worse, if you have an integer dimension defined as the following (type declared, but not explicitly cast)…

Snowflake:
- dimension: key3
  type: number
  sql: ${TABLE}.test_json_object:key3

… you risk obtaining nulls for that dimension.
As such, explicitly casting data types at this stage is crucial.
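To avoid the nulls described above, cast key3 explicitly in both dialects. A sketch following the same pattern as key2 (the integer and number target types are assumptions based on key3's sample value of 12345):

Postgres:
- dimension: key3
  type: number
  sql: CAST(json_extract_path_text(${TABLE}.test_json_object, 'key3') AS integer)

Snowflake:
- dimension: key3
  type: number
  sql: ${TABLE}.test_json_object:key3::number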

Step 4: Make sure timestamps come through correctly

This is an extension of step 3. LookML offers native timestamp conversion with the dimension_group parameter and a type: time declaration:

Postgres and Snowflake:
- dimension_group: test
  type: time
  timeframes: [time, date, week, month, year]

While this might work for dates in Snowflake, you will most likely see errors cropping up when you try to select Time, Month, Year, etc.

Instead of assuming accurate timestamp handling by Looker, you should be explicitly casting your newly-defined dimensions, in this case as the following:

Postgres:
- dimension_group: new_time
  type: time
  timeframes: [time, date, week, month, year]
  sql: CAST(JSON_EXTRACT_PATH(json_extract_path(${TABLE}.test_json_object, 'key4') AS timestamp)

Snowflake:
- dimension_group: new_time
  type: time
  timeframes: [time, date, week, month, year]
  sql: ${TABLE}.test_json_object:key4::timestamp
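Putting the steps together, the Snowflake side of the example might look like the following (a sketch assembled from the snippets above; the view name is hypothetical, and key1 would follow the same pattern as key2):

Snowflake:
- view: test_view
  fields:
  - dimension: test_json_object
    sql: ${TABLE}.test_json_object

  - dimension: key2
    sql: ${TABLE}.test_json_object:key2::string

  - dimension: key3
    type: number
    sql: ${TABLE}.test_json_object:key3::number

  - dimension_group: new_time
    type: time
    timeframes: [time, date, week, month, year]
    sql: ${TABLE}.test_json_object:key4::timestamp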