Write a solution to report all the duplicate emails. Note that it's guaranteed that the email field is not NULL.
Column Name | Type |
---|---|
id | int |
varchar |
id is the primary key (column with unique values)
DISTINCT
를 사용하면 안된다! 중복이 존재하는 이메일만 출력해야하므로, 먼저 이메일로 묶어서 HAVING
으로 해당 이메일의 중복된 개수를 세자.SELECT Email
FROM Person
GROUP BY Email
HAVING COUNT(Email) > 1;
The scores should be ranked from the highest to the lowest.
If there is a tie between two scores, both should have the same ranking.
After a tie, the next ranking number should be the next consecutive integer value. In other words, there should be no holes between ranks.
Return the result table ordered by score in descending order.
Column Name | Type |
---|---|
id | int |
score | decimal |
윈도우 함수에서 그룹 내 순위 함수들은 '중복' 여부로 구분 가능하다.
RANK() OVER
: 동점자는 같은 순위이고, 동점자 수 만큼 건너뛰고 다음 순위를 부여한다.DENSE_RANK() OVER
: 동점자는 같은 순위이고, 이어서 다음 순위를 부여한다.ROW_NUMBER() OVER
: 동점 상관 없이 순차적으로 순위 부여SELECT score
, RANK() OVER(ORDER BY score DESC) AS rank_score
, DENSE_RANK() OVER (ORDER BY score DESC) AS dense_rank_score
, ROW_NUMBER() OVER (ORDER BY score DESC) AS row_number_score
FROM Scores
score | rank_score | dense_rank_score | row_number_score |
---|---|---|---|
4 | 1 | 1 | 1 |
4 | 1 | 1 | 2 |
3.85 | 3 | 2 | 3 |
3.65 | 4 | 3 | 4 |
3.65 | 4 | 3 | 5 |
3.5 | 6 | 4 | 6 |
문제에서 요구하는건
DENSE_RANK
SELECT score
, DENSE_RANK() OVER (ORDER BY score DESC) AS "rank"
FROM Scores
이때, "rank"로 써야 함수명과 겹치지 않고 출력되는 컬럼명 지정이 가능하다.
Table: Trips
Column Name | Type |
---|---|
id | int |
client_id | int |
driver_id | int |
city_id | int |
status | enum |
request_at | date |
id is the primary key (column with unique values) for this table.
The table holds all taxi trips. Each trip has a unique id, while client_id and driver_id are foreign keys to the users_id at the Users table.
Status is an ENUM (category) type of ('completed', 'cancelled_by_driver', 'cancelled_by_client').
Table: Users
Column Name | Type |
---|---|
users_id | int |
banned | enum |
role | enum |
users_id is the primary key (column with unique values) for this table.
The table holds all users. Each user has a unique users_id, and role is an ENUM type of ('client', 'driver', 'partner').
banned is an ENUM (category) type of ('Yes', 'No').
Input:
Trips table:
id | client_id | driver_id | city_id | status | request_at |
---|---|---|---|---|---|
1 | 1 | 10 | 1 | completed | 2013-10-01 |
2 | 2 | 11 | 1 | cancelled_by_driver | 2013-10-01 |
3 | 3 | 12 | 6 | completed | 2013-10-01 |
4 | 4 | 13 | 6 | cancelled_by_client | 2013-10-01 |
5 | 1 | 10 | 1 | completed | 2013-10-02 |
6 | 2 | 11 | 6 | completed | 2013-10-02 |
7 | 3 | 12 | 6 | completed | 2013-10-02 |
8 | 2 | 12 | 12 | completed | 2013-10-03 |
9 | 3 | 10 | 12 | completed | 2013-10-03 |
10 | 4 | 13 | 12 | cancelled_by_driver | 2013-10-03 |
Users table:
users_id | banned | role |
---|---|---|
1 | No | client |
2 | Yes | client |
3 | No | client |
4 | No | client |
10 | No | driver |
11 | No | driver |
12 | No | driver |
13 | No | driver |
Output:
Day | Cancellation Rate |
---|---|
2013-10-01 | 0.33 |
2013-10-02 | 0.00 |
2013-10-03 | 0.50 |
WITH status_table AS (
SELECT t.request_at AS Day
, COUNT(*) AS total_request
, COUNT(CASE WHEN t.status = "cancelled_by_client" OR t.status = "cancelled_by_driver" THEN 1 ELSE NULL END) AS canceled_request
FROM Trips t
LEFT JOIN Users client ON t.client_id = client.users_id
LEFT JOIN Users driver ON t.driver_id = driver.users_id
WHERE t.request_at BETWEEN "2013-10-01" and "2013-10-03"
AND client.banned = "No"
AND driver.banned = "No"
GROUP BY t.request_at
)
SELECT Day
, ROUND(canceled_request / total_request, 2) AS "Cancellation Rate"
FROM status_table
내 풀이와 비슷하지만, WITH 절이 아닌 FROM 절에서 서브쿼리를 사용함.
SELECT Day
, ROUND(cancel_count / total_count, 2) AS "Cancellation Rate"
FROM (
SELECT request_at AS Day
, COUNT(*) AS total_count
, SUM(CASE WHEN status != "completed" THEN 1 ELSE 0 END) AS cancel_count
FROM Trips t
INNER JOIN Users uc ON t.client_id = uc.users_id
INNER JOIN Users ud ON t.driver_id = ud.users_id
WHERE t.request_at BETWEEN "2013-10-01" AND "2013-10-03"
AND uc.banned = "No"
AND ud.banned = "No"
GROUP BY request_at
) t
그렇다면, WITH 절 vs FROM절 서브쿼리는?
문제의 조건을 잘 살피자.
기간은 "2013-10-01" ~ "2013-10-03"만 찾아야한다!!
id를 짝수와 홀수로 나누어 각각 WITH 절로 만든 후, 이를 UNION으로 통합하여 id 오름차순으로 정렬했다.
그러나.. 학생 수가 홀수 일 경우 마지막 id는 그대로 놔둬야하는데, 모든 홀수를 +1 해서 짝수로 만드는 바람에 실패.
-- 1, 3, 5: odd
-- 2, 4: even
WITH Odd_table AS (
SELECT id + 1 AS id
, student
FROM Seat
WHERE MOD(id, 2) = 1
),
Even_table AS (
SELECT id - 1 AS id
, student
FROM Seat
WHERE MOD(id, 2) = 0
),
Total_table AS (
SELECT *
FROM Odd_table
UNION
SELECT *
FROM Even_table
ORDER BY id
)
SELECT *
FROM Total_table;
전체 학생이 짝수라고 가정하고 풀면:
SELECT CASE
WHEN MOD(id, 2) = 1 THEN id + 1
ELSE id - 1
END new_id
, id
, student
FROM Seat
전체 학생이 홀수일 수도 있는 경우
SELECT id
, student
, COUNT(*) OVER () AS total_rows
FROM Seat
OVER안에 어떠한 파티션이나 오더도 없어서 그냥 전체 테이블의 테이블 개수를 세서 매 row마다 표시함.
그렇다면, 이 id와 total_rows가 같은 경우, 즉 마지막 학생이 같은 경우를 예외 케이스로 봐서 처리하자!
학생 수가 홀수이면서 and 마지막 학생이 아닌 경우, id + 1
학생 수가 홀수이면서 and 마지막 학생인 경우, id 그대로
학생 수가 짝수인 경우, id 그대로
SELECT CASE
WHEN MOD(id, 2) = 1 AND id != total_rows THEN id + 1
WHEN MOD(id, 2) = 1 AND id = total_rows THEN id
ELSE id - 1
END id
, student
FROM (
SELECT id
, student
, COUNT(*) OVER () AS total_rows
FROM Seat
) t
ORDER BY id