Customer cohort
1with
2 time_series_source as (
3 select
4 week
5 from
6 unnest (
7 generate_date_array(
8 date_trunc('2019-01-01', week(monday))
9 , current_date
10 , interval 1 week
11 )
12 ) as week
13 )
14 , time_series_sum as (
15 select
16 *
17 from
18 (
19 select distinct
20 (email)
21 from
22 {{core.shopify_orders}}
23 where
24 email is not null
25 and email <> ''
26 )
27 cross join time_series_source week
28 order by
29 week asc
30 )
31 , cohort as (
32 select
33 email
34 , cast(date_trunc(min(created_at), week(monday)) as date) customer_week
35 , from
36 {{core.shopify_orders}}
37 group by
38 email
39 )
40 , customer as (
41 select
42 email
43 , cast(date_trunc(created_at, week(monday)) as date) week
44 , sum(total_line_price_ex_tax) revenue
45 from
46 {{core.website.shopify_orders}}
47 group by
48 email
49 , cast(date_trunc(created_at, week(monday)) as date)
50 )
51 , customer_weeks as (
52 select
53 t.email
54 , t.week
55 , customer_week
56 , revenue
57 , case
58 when t.week = customer_week then 1
59 else 0
60 end became_customer
61 , from
62 time_series_sum t
63 left join customer c on t.email = c.email
64 and t.week = c.week
65 left join cohort h on t.email = h.email
66 )
67 , customer_cohorts as (
68 select
69 email
70 , customer_week
71 , week
72 , revenue
73 , became_customer
74 , sum(became_customer) over (
75 partition by
76 email
77 order by
78 week asc rows between unbounded preceding
79 and current ROW
80 ) customer_rt
81 , sum(revenue) over (
82 partition by
83 email
84 order by
85 week asc rows between unbounded preceding
86 and current ROW
87 ) revenue_rt
88 from
89 customer_weeks
90 )
91select
92 email
93 , revenue
94 , week
95 , row_number() over (
96 partition by
97 email
98 order by
99 week asc
100 ) as week_num
101 , revenue_rt
102 , customer_rt
103 , became_customer customer
104 , first_value(customer_week ignore nulls) over (
105 partition by
106 email
107 order by
108 week asc
109 ) cohort
110from
111 customer_cohorts
112where
113 customer_rt <> 0
114order by
115 week asc
+------------------------+---------+------------+----------+------------+--------------+----------+
| email | revenue | week | week_num | revenue_rt | customer_rt | customer |
+------------------------+---------+------------+----------+------------+--------------+----------+
| john.doe@example.com | 123.45 | 2019-01-07 | 1 | 123.45 | 1 | 1 |
| john.doe@example.com | 678.90 | 2019-01-14 | 2 | 802.35 | 1 | 0 |
| jane.doe@example.com | 234.56 | 2019-01-14 | 1 | 234.56 | 1 | 1 |
| jane.doe@example.com | 789.01 | 2019-01-21 | 2 | 1023.57 | 2 | 0 |
| jane.doe@example.com | 456.78 | 2019-01-28 | 3 | 1480.35 | 2 | 0 |
| bob.smith@example.com | 345.67 | 2019-01-28 | 1 | 345.67 | 1 | 1 |
| bob.smith@example.com | 901.23 | 2019-02-04 | 2 | 1246.90 | 2 | 0 |
| bob.smith@example.com | 678.90 | 2019-02-11 | 3 | 1925.80 | 2 | 0 |
| bob.smith@example.com | 234.56 | 2019-02-18 | 4 | 2160.36 | 2 | 0 |
| bob.smith@example.com | 789.01 | 2019-02-25 | 5 | 2949.37 | 2 | 0 |
| bob.smith@example.com | 456.78 | 2019-03-04 | 6 | 3406.15 | 2 | 0 |
+------------------------+---------+------------+----------+------------+--------------+----------+
This SQL template is designed to analyze customer behavior and revenue over time. It uses a time series source to generate a list of weeks, and then cross-joins this with a list of distinct customer emails from the Shopify orders table. The cohort is then created by grouping customers by the week they made their first purchase. The customer table is created by grouping customers by the week they made a purchase, and then summing their revenue. The customer weeks table is created by joining the time series and customer tables, and then joining the cohort table to identify which week each customer made their first purchase. The customer cohorts table is created by grouping customers by email and week, and then calculating the running total of revenue and the number of customers who made their first purchase in each week. Finally, the output is filtered to only include customers who have made a purchase, and then sorted by week. The output includes the customer email, revenue, week, running total of revenue, running total of customers, whether the customer made their first purchase in that week, and the week they made their first purchase.