[SQL] 해커랭크: Advanced Join 문제풀이

juyeon·2024년 2월 27일
0

SQL

목록 보기
6/9

Medium

1. SQL Project Planning ⭐

You are given a table, Projects, containing three columns: Task_ID, Start_Date and End_Date. It is guaranteed that the difference between the End_Date and the Start_Date is equal to 1 day for each row in the table.

If the End_Date of the tasks are consecutive, then they are part of the same project. Samantha is interested in finding the total number of different projects completed.

Write a query to output the start and end dates of projects listed by the number of days it took to complete the project in ascending order. If there is more than one project that have the same number of completion days, then order by the start date of the project.

Sample Input

Sample Output

2015-10-28 2015-10-29
2015-10-30 2015-10-31
2015-10-13 2015-10-15
2015-10-01 2015-10-04

Explanation

The example describes following four projects:

  • Project 1: Tasks 1, 2 and 3 are completed on consecutive days, so these are part of the project. Thus start date of project is 2015-10-01 and end date is 2015-10-04, so it took 3 days to complete the project.
  • Project 2: Tasks 4 and 5 are completed on consecutive days, so these are part of the project. Thus, the start date of project is 2015-10-13 and end date is 2015-10-15, so it took 2 days to complete the project.
  • Project 3: Only task 6 is part of the project. Thus, the start date of project is 2015-10-28 and end date is 2015-10-29, so it took 1 day to complete the project.
  • Project 4: Only task 7 is part of the project. Thus, the start date of project is 2015-10-30 and end date is 2015-10-31, so it took 1 day to complete the project.

풀이

  • 프로젝트 시작 날짜, 종료 날짜 출력
  • 정렬: 프로젝트 완성에 걸린 날짜로 오름차순, 시작 날짜로 내림차순

1. 실패

SELECT Start_Date, End_Date
FROM Projects
WHERE Start_Date NOT IN (SELECT End_Date FROM Projects)
AND End_Date NOT IN (SELECT Start_Date FROM Projects)
  • 검색해서 조건을 찾긴 했는데.. 이게 완성된 쿼리가 아닌건 알지만, 왜 이렇게 했을때 데이터가 몇개 안 나오는거지? 이 쿼리의 문제점은 뭘까?

데이터리안 풀이

핵심은 연속되지 않는 날짜를 찾는 것!

먼저, 다른 task의 종료일과 연속되지 않는 시작일(즉, 새로운 task) 찾아서 번호를 매김

SELECT Start_Date
     , ROW_NUMBER() OVER (ORDER BY Start_Date) rn
FROM Projects
WHERE Start_Date NOT IN (SELECT DISTINCT End_Date FROM Projects);

그리고 다른 task의 시작일과 연속되지 않는 종료일(즉, 뒤에 더이상 연속된 task가 없는 찐 종료일) 찾아서 번호를 매김

SELECT End_Date
     , ROW_NUMBER() OVER (ORDER BY End_Date) rn
FROM Projects
WHERE End_Date NOT IN (SELECT DISTINCT Start_Date FROM Projects);

이 둘을 JOIN하면 된다!

  • INNER JOIN 사용
  • '기간' 정렬은 DATEDIFF 함수 사용
SELECT Start_Date
     , End_Date
FROM (
    SELECT Start_Date
         , ROW_NUMBER() OVER (ORDER BY Start_Date) rn
    FROM Projects
    WHERE Start_Date NOT IN (SELECT DISTINCT End_Date FROM Projects)
) s_date
    INNER JOIN (
        SELECT End_Date
             , ROW_NUMBER() OVER (ORDER BY End_Date) rn
        FROM Projects
        WHERE End_Date NOT IN (SELECT DISTINCT Start_Date FROM Projects)
) e_date ON s_date.rn = e_date.rn
ORDER BY DATEDIFF (End_Date, Start_Date), Start_Date;

2. Placements

Write a query to output the names of those students whose best friends got offered a higher salary than them. Names must be ordered by the salary amount offered to the best friends. It is guaranteed that no two students got same salary offer.

Sample Input


Sample Output

Samantha
Julia
Scarlet
  • Explanation

풀이

  • 이름 출력
  • 조건: 친구가 자신보다 더 높은 급여인 경우
  • 정렬: 친구의 급여 오름차순
SELECT sub.Name
FROM (SELECT S.Name, F.Friend_ID, P.Salary AS my_salary
      FROM Students AS S
       INNER JOIN Friends AS F ON S.ID = F.ID
       INNER JOIN Packages AS P ON S.ID = P.ID) AS sub
 INNER JOIN Packages AS P ON sub.Friend_ID = P.ID
WHERE sub.my_salary < P.Salary
ORDER BY P.Salary;

다시풀기

다시 풀어보니, 잠시 헷갈렸다. 내 월급보다 높은 친구 월급을 조건으로 찾으면 되는데,SELECT절에서 해야하나? 서브쿼리를 써야하나? 하면서 헷갈렸기 때문이다. 다시 찬찬히 생각해보니, WHERE절이라는 쉬운 정답이 있었다.

예시의 해설표처럼 나타내려면:

SELECT s.ID
     , s.Name
     , p.Salary
     , f.Friend_ID
     , p2.Salary
FROM Students s
	 JOIN Packages p ON s.ID = p.ID
	 JOIN Friends f ON s.ID = f.ID
     JOIN Packages p2 ON f.Friend_ID = p2.ID
WHERE p.Salary < p2.Salary
ORDER BY p2.Salary

정답 쿼리:

SELECT s.Name
FROM Students s
	 INNER JOIN Packages p ON s.ID = p.ID
	 INNER JOIN Friends f ON s.ID = f.ID
	 INNER JOIN Packages p2 ON f.Friend_ID = p2.ID
WHERE p.Salary < p2.Salary
ORDER BY p2.Salary

3. Symmetric Pairs

You are given a table, Functions, containing two columns: X and Y.

Two pairs (X1, Y1) and (X2, Y2) are said to be symmetric pairs if X1 = Y2 and X2 = Y1.

Write a query to output all such symmetric pairs in ascending order by the value of X. List the rows such that X1 ≤ Y1.

Sample Input

Sample Output

20 20
20 21
22 23

풀이

  • 조건 이해하기가 순간 어려웠지만..
    • x가 y에도 있고, y가 x에도 있는 값인 경우
    • y는 x보다 같거나 커야함
  • 정렬: x 기준으로 오름차순
SELECT F1.X, F1.Y
FROM FUNCTIONS AS F1, FUNCTIONS AS F2
WHERE F1.X = F2.Y
AND F1.Y = F2.X
GROUP BY F1.X, F1.Y
HAVING COUNT(F1.X) != 1 OR F1.X < F1.Y
ORDER BY F1.X;

Hard

1. Interviews

Samantha interviews many candidates from different colleges using coding challenges and contests. Write a query to print the contest_id, hacker_id, name, and the sums of total_submissions, total_accepted_submissions, total_views, and total_unique_views for each contest sorted by contest_id. Exclude the contest from the result if all four sums are 0.

Note: A specific contest can be used to screen candidates at more than one college, but each college only holds 1 screening contest.

Input Format

The following tables hold interview data:

  • Contests: The contest_id is the id of the contest, hacker_id is the id of the hacker who created the contest, and name is the name of the hacker.

  • Colleges: The college_id is the id of the college, and contest_id is the id of the contest that Samantha used to screen the candidates.

  • Challenges: The challenge_id is the id of the challenge that belongs to one of the contests whose contest_id Samantha forgot, and college_id is the id of the college where the challenge was given to candidates.

  • View_Stats: The challenge_id is the id of the challenge, total_views is the number of times the challenge was viewed by candidates, and total_unique_views is the number of times the challenge was viewed by unique candidates.

  • Submission_Stats: The challenge_id is the id of the challenge, total_submissions is the number of submissions for the challenge, and total_accepted_submission is the number of submissions that achieved full scores.

Sample Input

Contests Table:
Colleges Table:
Challenges Table:
View_Stats Table:
Submission_Stats Table:

Sample Output

66406 17973 Rose 111 39 156 56
66556 79153 Angela 0 0 11 10
94828 80275 Frank 150 38 41 15

Explanation

The contest 66406 is used in the college 11219. In this college 11219, challenges 18765 and 47127 are asked, so from the view and submission stats:

  • Sum of total submissions = 27 + 56 + 28 = 111
  • Sum of total accepted submissions = 10 + 18 + 11 = 39
  • Sum of total views = 43 + 72 + 26 + 15 = 156
  • Sum of total unique views = 10 + 13 + 19 + 14 = 56

Similarly, we can find the sums for contests 66556 and 94828.

풀이

  • contest_id, hacker_id, name, and the sums of total_submissions, total_accepted_submissions, total_views, and total_unique_views을 출력
  • 조건: sorted by contest_id
  • 단, Exclude the contest from the result if all four sums are 0.(4개의 합이 모두 0이면 해당 contest를 결과에서 제외)
  • View_Stats 과 Submission_Stats에는 챌린지 아이디가 유니크하지 않으니까, 그룹핑 해줘야함.
SELECT Contests.contest_id, 
        Contests.hacker_id, 
        Contests.name, 
        SUM(total_submissions), 
        SUM(total_accepted_submissions), 
        SUM(total_views), 
        SUM(total_unique_views)
FROM Contests
join Colleges ON Contests.contest_id = Colleges.contest_id
join Challenges ON Colleges.college_id = Challenges.college_id
left join (SELECT challenge_id, 
           SUM(total_views) AS total_views, 
           SUM(total_unique_views) AS total_unique_views
           FROM View_stats 
           GROUP BY challenge_id) AS view_s
           ON Challenges.challenge_id = view_s.challenge_id 
left join (SELECT challenge_id, 
            SUM(total_submissions) AS total_submissions, 
            SUM(total_accepted_submissions) AS total_accepted_submissions 
           FROM Submission_stats
           GROUP BY challenge_id) AS sub_s
           ON Challenges.challenge_id = sub_s.challenge_id
GROUP BY Contests.contest_id, Contests.hacker_id, Contests.name
HAVING SUM(total_submissions)
       + SUM(total_accepted_submissions)
       + SUM(total_views)
       + SUM(total_unique_views) > 0
ORDER BY Contests.contest_id;

2. 15 Days of Learning SQL

Julia conducted a 15 days of learning SQL contest. The start date of the contest was March 01, 2016 and the end date was March 15, 2016.

Write a query to print total number of unique hackers who made at least 1 submission each day (starting on the first day of the contest), and find the hacker_id and name of the hacker who made maximum number of submissions each day. If more than one such hacker has a maximum number of submissions, print the lowest hacker_id. The query should print this information for each day of the contest, sorted by the date.

tables

  • Hackers: The hacker_id is the id of the hacker, and name is the name of the hacker.

  • Submissions: The submission_date is the date of the submission, submission_id is the id of the submission, hacker_id is the id of the hacker who made the submission, and score is the score of the submission.

Sample Input

For the following sample input, assume that the end date of the contest was March 06, 2016.
Hackers Table:

Submissions Table:

Sample Output

2016-03-01 4 20703 Angela
2016-03-02 2 79722 Michael
2016-03-03 2 20703 Angela
2016-03-04 2 20703 Angela
2016-03-05 1 36396 Frank
2016-03-06 1 20703 Angela

Explanation

On March 01, 2016 hackers 20703, 36396, 53473, and 79722 made submissions. There are unique hackers who made at least one submission each day. As each hacker made one submission, 20703 is considered to be the hacker who made maximum number of submissions on this day. The name of the hacker is Angela.

On March 02, 2016 hackers 15758, 20703, and 79722 made submissions. Now 20703 and 79722 were the only ones to submit every day, so there are 2 unique hackers who made at least one submission each day. 79722 made 2 submissions, and name of the hacker is Michael.

On March 03, 2016 hackers 20703, 36396, 79722 and 79722 made submissions. Now 20703 and 79722 were the only ones, so there are 2 unique hackers who made at least one submission each day. As each hacker made one submission so 20703 is considered to be the hacker who made maximum number of submissions on this day. The name of the hacker is Angela.

On March 04, 2016 hackers 20703, 44065, 53473, and 79722 made submissions. Now 20703 and 79722 only submitted each day, so there are 2 unique hackers who made at least one submission each day. As each hacker made one submission so 20703 is considered to be the hacker who made maximum number of submissions on this day. The name of the hacker is Angela.

On March 05, 2016 hackers 20703, 36396, 38289 and 62529 made submissions. Now 20703 only submitted each day, so there is only 1 unique hacker who made at least one submission each day. 36396 made 2 submissions and name of the hacker is Frank.

On March 06, 2016 only 20703 made submission, so there is only 1 unique hacker who made at least one submission each day. 20703 made 1 submission and name of the hacker is Angela.

풀이

  • 매일 1회 이상 제출한 해커의 총 인원수(중복 제외), 매일 제출 수 최댓값인 해커 id와 이름을 출력(동점이면, 가장 낮은 해커 id 출력)
  • 중복 제거해서 뽑고, 중복 제거해서 뽑고..
SELECT submission_date ,
(SELECT COUNT(DISTINCT hacker_id)  
 FROM Submissions s2  
 WHERE s2.submission_date = s1.submission_date 
 AND (SELECT COUNT(DISTINCT s3.submission_date) 
 FROM Submissions s3 
 WHERE s3.hacker_id = s2.hacker_id 
 AND s3.submission_date < s1.submission_date) = dateDIFF(s1.submission_date , '2016-03-01')) ,
(SELECT hacker_id  
 FROM submissions s2 
 WHERE s2.submission_date = s1.submission_date 
 GROUP BY hacker_id 
 ORDER BY count(submission_id) DESC , hacker_id limit 1) AS shit,
(SELECT name FROM hackers WHERE hacker_id = shit)
FROM (SELECT DISTINCT submission_date FROM submissions) s1
GROUP BY submission_date;
profile
내 인생의 주연

0개의 댓글