Get hands-on experience with 20+ free Google Cloud products and $300 in free credit for new customers.

Why do I get different result with this queries

Hi,

I am currently completing the Data Analytics Google Career Certificate. As part of it, they are exercises with walk-through. I always try completing it on my own before reading the walk-through. The question was to find the bike had the most usage hours and from which station said bike departed the most often. I found the answer to this question through my method: bike 370 from station 3798.

However, out of curiosity I wanted to add an extra question: how many trips of that most used bike started from this departs most often from station.

With my method, I get 167 but the real answer is 177.

I cannot figure out why I have this discrepancy. Please help me understand/fix the issue in my script. Lots of thanks!

Here my script:

##Find the most used bike and create a temporary table to store the information for later##
WITH
longest_used_bike AS

(
SELECT

bike_id,
SUM(duration_minutes) AS total_use_time

FROM `bigquery-public-data.austin_bikeshare.bikeshare_trips`
GROUP BY bike_id
ORDER BY total_use_time DESC
LIMIT 1
)


##Find the station the most used bike departs from the most often##
SELECT

original_set.start_station_name,
original_set.start_station_id,
original_set.bike_id,
COUNT (start_station_id) AS unique_starts,

FROM `bigquery-public-data.austin_bikeshare.bikeshare_trips` AS original_set
INNER JOIN longest_used_bike ON original_set.bike_id = longest_used_bike.bike_id
GROUP BY start_station_name,start_station_id, bike_id
ORDER BY unique_starts DESC
LIMIT 1

Here Google Data Analytics Course' script:

WITH
longest_used_bike AS

(
SELECT


bike_id,
SUM(duration_minutes) AS total_use_time


FROM `bigquery-public-data.austin_bikeshare.bikeshare_trips`
GROUP BY bike_id
ORDER BY total_use_time DESC
LIMIT 1
)


##Find the station##
SELECT


original_set.start_station_id,
COUNT (*) AS unique_starts


FROM longest_used_bike
INNER JOIN `bigquery-public-data.austin_bikeshare.bikeshare_trips` AS original_set
ON longest_used_bike.bike_id = original_set.bike_id
GROUP BY original_set.start_station_id
ORDER BY unique_starts DESC

0 1 279
1 REPLY 1

Given the previous queries, the following SQL query should accomplish what you're asking for:

WITH
longest_used_bike AS
(
SELECT
bike_id,
SUM(duration_minutes) AS total_use_time
FROM `bigquery-public-data.austin_bikeshare.bikeshare_trips`
GROUP BY bike_id
ORDER BY total_use_time DESC
LIMIT 1
),
most_used_station AS
(
SELECT
original_set.start_station_id,
COUNT (*) AS unique_starts
FROM longest_used_bike
INNER JOIN `bigquery-public-data.austin_bikeshare.bikeshare_trips` AS original_set
ON longest_used_bike.bike_id = original_set.bike_id
GROUP BY original_set.start_station_id
ORDER BY unique_starts DESC
LIMIT 1
)

SELECT
COUNT (*) AS trips_from_most_used_station
FROM `bigquery-public-data.austin_bikeshare.bikeshare_trips`
WHERE bike_id = (SELECT bike_id FROM longest_used_bike)
AND start_station_id = (SELECT start_station_id FROM most_used_station)