I just wanted to share something I built recently to analyze a customer or user journey using events data:
This post assume you have a big table with event type, event timestamp and a user_id.
I reused some code from this post
The only thing we need on the modeling side is to get, for each user, what was their first event, second event,...
SELECT
events.user_id AS user_id,
events.event_type AS event_type,
events.created_at AS event_date,
lead(events.event_type,0) over (partition by user_id order by created_at asc) as first_event,
lead(events.event_type,1) over (partition by user_id order by created_at asc)as second_event,
lead(events.event_type,2) over (partition by user_id order by created_at asc) as third_event,
lead(events.event_type,3) over (partition by user_id order by created_at asc) as fourth_event,
lead(events.event_type,4) over (partition by user_id order by created_at asc) as fifth_event,
lead(events.event_type,5) over (partition by user_id order by created_at asc) as sixth_event,
lead(events.event_type,6) over (partition by user_id order by created_at asc) as seventh_event
FROM looker-private-demo.ecomm.events AS events
GROUP BY
1,2,3
ORDER BY
3 DESC
then we just add it to a derived table:
view: sankey {
derived_table: {
sql:
SELECT
events.user_id AS user_id,
events.event_type AS event_type,
events.created_at AS event_date,
lead(events.event_type,0) over (partition by user_id order by created_at asc) as first_event,
lead(events.event_type,1) over (partition by user_id order by created_at asc)as second_event,
lead(events.event_type,2) over (partition by user_id order by created_at asc) as third_event,
lead(events.event_type,3) over (partition by user_id order by created_at asc) as fourth_event,
lead(events.event_type,4) over (partition by user_id order by created_at asc) as fifth_event,
lead(events.event_type,5) over (partition by user_id order by created_at asc) as sixth_event,
lead(events.event_type,6) over (partition by user_id order by created_at asc) as seventh_event
FROM looker-private-demo.ecomm.events AS events
GROUP BY
1,2,3
ORDER BY
3 DESC
;;
}
measure: count {
type: count_distinct
sql: ${user_id} ;;
}
dimension_group: event {
type: time
sql: ${TABLE}.event_date ;;
}
dimension: first_event {
type: string
sql: case when ${TABLE}.first_event is null then 'Bounce' else ${TABLE}.first_event end ;;
}
dimension: second_event {
type: string
sql: case when ${TABLE}.second_event is null then 'Bounce' else ${TABLE}.second_event end ;;
}
dimension: third_event {
type: string
sql: case when ${TABLE}.third_event is null then 'Bounce' else ${TABLE}.third_event end ;;
}
dimension: fourth_event {
type: string
sql: case when ${TABLE}.fourth_event is null then 'Bounce' else ${TABLE}.fourth_event end ;;
}
dimension: fifth_event {
type: string
sql: case when ${TABLE}.fifth_event is null then 'Bounce' else ${TABLE}.fifth_event end ;;
}
dimension: sixth_event {
type: string
sql: case when ${TABLE}.sixth_event is null then 'Bounce' else ${TABLE}.sixth_event end ;;
}
dimension: seventh_event {
type: string
sql: case when ${TABLE}.seventh_event is null then 'Bounce' else ${TABLE}.seventh_event end ;;
}
dimension:user_id {
type: number
sql: ${TABLE}.user_id ;;
}
}
Then you can explore it and get this sankey diagram:
you can go further be joining your user table to look at these funnel by device, country,....
This is awesome, thanks for sharing!