Hello Guys I have a hard problem which I need your help in
We are trying to make one common place for all our metrics and there are two ways to do that
The second option have way more benefits for us but I have one problem with the second option
lets say a stakeholder want a daily and monthly chart for the number of users
1. with the first approach, I will do this
SELECT * FROM orders
and after that i will create a calculated filed called nbr_users
COUNT_DISINCT(userID)
this way it will work perfectly with the first approach
e.g.: I have user order 1 time for 3 days in a row so when i do daily count i should see 1 user one each day and when i do monthly I get only 1 user (it works this way with the first approach)
2. but with the second approach the problem appear
i will do this query
SELECT
COUNT(DISTINCT(userID)),
DATE(creationDate) AS day
FROM
orders
GROUP BY
DATE(creationDate)
and then i do nothing on the data source i just connect it with this custom BigQuery SQL
but this way when we get back to our example we will have an issue
on the daily we will get the user once daily for three days
but when we do monthly we will get 3 users but the truth its only one user that had orders on different days
The only way to fix this i can think about is doing different sql for each groupping (monthly or daily or weekly, ...)
but this way is hard to handle creating 5 sql just to do the same thing with different grouping
lets say stakeholders want daily, weekly, monthly, quarterly, and yearly its like 5 SQLs
Hi @DZlAmir
can you please clarify exactly what you are trying to achieve?
On your first point, that makes total sense- you are doing a distinct count and group it by different aggregates. A user that has done 3 orders on different dates within a month will have a count of 3 if grouped by date, and a count of 1 if grouped by month.
I cant figure out your second point. You are using a custom query that returns two columns only - date and distinct count, so it returns a date and an integer...how do you anticipate the count distinct on that integer to work if you group by monthly?
I have mocked up a quick table with what I think you are trying to achieve
I have date and customer ID and I `select * from table` as my datasource. I can use this then to show both use cases you want to show:
all data visible
data grouped by date, count distinct on user id
data grouped by month (same dimension as source), count distinct on user_id
Hey Thanks for your response, I know this is possible by using `SELECT * FROM table` and after that using the calculated fields to do both monthly and daily
I want to achieve the same thing but only using bigquery and this is like impossible
the aim of doing this is to create a robust view that have all the needed aggregations for the business and with this way I can use dataplex and its cataloging version control with git, but I couldn't find anyway to do that with only one custom SQL, while its possible in looker studio data source with the calculated fields
I am not sure I understand.
This is one view in BigQuery, no calculated fields required, to be used in Looker Studio.
Yeah I know but this is just a simple example metric lets we are trying to do order frequency it will need a calculated field but am trying to do it just by SQL and I don't think its possible
User | Count |
---|---|
3 | |
2 | |
2 | |
1 | |
1 |