Note: there is a newer version of this analytical pattern available: [Analytic Block] Daily, Weekly, Monthly Active Users. Check it out for a more detailed walkthrough and additional features!
We derive two tables: a simple date table with a single column of dates, and an activity table with two columns, activity_date and user_id.
We then cross join these two tables (restricting the join with a date-range condition) to create a table with three columns: user_id, activity_date, and window_date. In the destination table there is a row for every day in the 30 days following each transaction.
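For example, a user whose only order was on 2018-01-05 (an illustrative date, not from the original data) fans out to 30 rows in the destination table:

user_id | activity_date | window_date
--------|---------------|------------
101     | 2018-01-05    | 2018-01-05
101     | 2018-01-05    | 2018-01-06
...     | ...           | ...
101     | 2018-01-05    | 2018-02-03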
Here is the LookML; the numbered comments (## 1 through ## 4) call out each step:
- include: ecommerce.view.lookml

- view: dates
  derived_table:
    sql: |
      -- ## 1) Create a date table with a row for each date.
      SELECT '2001-01-01'::DATE + d AS date
      FROM
        -- generate_series should work, but it isn't supported here in Redshift:
        --   generate_series(0, 20000) d
        (SELECT ROW_NUMBER() OVER (ORDER BY id) - 1 AS d
         FROM orders ORDER BY id LIMIT 20000) AS d
    sortkeys: [date]
    persist_for: 10 minutes
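As an aside, on dialects where generate_series is available as a table function (Postgres, for example), the date spine can be built directly; a minimal sketch, not part of the original block:

-- Build the same date spine with generate_series (works in Postgres;
-- Redshift doesn't support it here, hence the ROW_NUMBER() workaround above).
SELECT '2001-01-01'::DATE + d AS date
FROM generate_series(0, 20000) AS d;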
- explore: use_rolling_30_day_window
  joins:
    - join: users
      foreign_key: user_id
- view: use_rolling_30_day_window
  derived_table:
    sql: |
      WITH daily_use AS (
        -- ## 2) Create a table of days and activity by user_id.
        SELECT
          user_id
          , DATE_TRUNC('day', created_at) AS activity_date
        FROM orders
      )
      -- ## 3) Cross join activity and dates to get a row for each
      --       date in the window.
      SELECT
        daily_use.user_id
        , wd.date AS window_date
        , daily_use.activity_date
      FROM daily_use, ${dates.SQL_TABLE_NAME} AS wd
      WHERE
        wd.date >= daily_use.activity_date
        AND wd.date < daily_use.activity_date + INTERVAL '30 day'
        -- ## 4) Filter the date range so we don't have to scan all of history.
        AND {% condition activity_date %} daily_use.activity_date {% endcondition %}
        AND {% condition window_date %} wd.date {% endcondition %}
  fields:
    - dimension_group: activity_date
      type: date
      sql: ${TABLE}.activity_date

    - dimension_group: window_date
      type: date
      sql: ${TABLE}.window_date

    - dimension: user_id

    - dimension: active_day
      type: yesno
      sql: ${TABLE}.activity_date = ${TABLE}.window_date

    - measure: user_count_active_30_days
      type: count_distinct
      sql: ${user_id}
      detail: [users.id, users.name]

    - measure: user_count_active_this_day
      type: count_distinct
      sql: ${user_id}
      detail: [users.id, users.name]
      filters:
        active_day: yes
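To make the mechanics concrete, here is roughly what a query for user_count_active_30_days grouped by window date boils down to, written as plain SQL (a hand-written sketch; "dates" stands in for the persisted ${dates.SQL_TABLE_NAME}):

-- Rolling 30-day distinct active users per day, as plain SQL.
SELECT
  wd.date AS window_date
  , COUNT(DISTINCT daily_use.user_id) AS user_count_active_30_days
FROM (
  SELECT user_id, DATE_TRUNC('day', created_at) AS activity_date
  FROM orders
) AS daily_use
JOIN dates AS wd
  ON wd.date >= daily_use.activity_date
 AND wd.date < daily_use.activity_date + INTERVAL '30 day'
GROUP BY 1
ORDER BY 1;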
The user_count_active_30_days measure counts users as active on a given date if they have bought anything in the previous 30 days; the raw output of the rolling-window table has one row per user per day of each 30-day window.
@errows asked how we might add a 7-day rolling count to this pattern.
This can be done by adding a couple of LookML fields:
- dimension: active_7_day
  type: yesno
  sql: (${TABLE}.window_date::date - ${TABLE}.activity_date::date) < 7

- measure: user_count_active_7_days
  type: count_distinct
  sql: ${user_id}
  drill_fields: [users.id, users.name]
  filters:
    active_7_day: yes
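To sanity-check the 7-day and 30-day counts side by side, the equivalent raw SQL would look something like this (a sketch; use_rolling_30_day_window stands in for the persisted derived table):

-- 7-day and 30-day rolling distinct active users from the fanned-out table.
-- window_date - activity_date is how many days old the activity is.
SELECT
  window_date
  , COUNT(DISTINCT CASE WHEN window_date::date - activity_date::date < 7
                        THEN user_id END) AS user_count_active_7_days
  , COUNT(DISTINCT user_id)               AS user_count_active_30_days
FROM use_rolling_30_day_window
GROUP BY 1
ORDER BY 1;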
Hi Lloyd,
We've used this exact approach, but we've noticed that the numbers we get for MAU, WAU, and DAU change from report to report. For example, we'll get an MAU figure of 50,000 on Monday, and later in the week that number is +/- 5-10%.
We are racking our brains over what could cause that on our side, but can you think of anything in how the SQL is designed that could cause those changes?
Cheers,
Justin
To debug this, I would group by activity_date and window_date, look at the user count, and make sure I'm seeing what I expect. Is it possible you are using a forward-looking window instead of a backward-looking one?
The other thing that can be weird is timezones. If you have timezone conversion turned on, records can show up in a window even though the values don't line up when you look at the normal transactions. When truncating to a date in a derived table, you need to be careful with timezones.
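Concretely, a debug query along those lines might look like this (a sketch; the table name stands in for the persisted derived table, and CONVERT_TIMEZONE is the Redshift/Snowflake function):

-- Inspect the fan-out directly: for each (activity_date, window_date) pair,
-- count the distinct contributing users. With a backward-looking window,
-- window_date should never be earlier than activity_date.
SELECT
  activity_date
  , window_date
  , COUNT(DISTINCT user_id) AS user_count
FROM use_rolling_30_day_window
GROUP BY 1, 2
ORDER BY 1, 2;

-- If timezones are the suspect, truncate in an explicit zone in the
-- derived table instead, e.g.:
--   DATE_TRUNC('day', CONVERT_TIMEZONE('UTC', 'America/Los_Angeles', created_at))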
Hi,
I am new to Looker and have been trying to build the 30-day rolling distinct count of user IDs. I tried following these instructions, but I am stuck. I know this is an old post, but I would appreciate some input on how it can be done in Snowflake.
I know about the newer analytical block for MAU and DAU, but that one calculates these as dimensions. I want to have them as measures for my end users. The table shown in the screenshot at the end of this post is exactly what I want to achieve.
Thanks,
Fizza