인프런 강의 <데이터 분석을 위한 고급 SQL>을 듣고, 중요한 점을 정리한 글입니다.
Table: Employee
+--------------+---------+
| Column Name | Type |
+--------------+---------+
| id | int |
| name | varchar |
| salary | int |
| departmentId | int |
+--------------+---------+
id is the primary key (column with unique values) for this table.
departmentId is a foreign key (reference columns) of the ID from the Department table.
Each row of this table indicates the ID, name, and salary of an employee. It also contains the ID of their department.
Table: Department
+-------------+---------+
| Column Name | Type |
+-------------+---------+
| id | int |
| name | varchar |
+-------------+---------+
id is the primary key (column with unique values) for this table. It is guaranteed that department name is not NULL.
Each row of this table indicates the ID of a department and its name.
Write a solution to find employees who have the highest salary in each of the departments.
Return the result table in any order.
The result format is in the following example.
Example 1:
Input:
Employee table:
+----+-------+--------+--------------+
| id | name | salary | departmentId |
+----+-------+--------+--------------+
| 1 | Joe | 70000 | 1 |
| 2 | Jim | 90000 | 1 |
| 3 | Henry | 80000 | 2 |
| 4 | Sam | 60000 | 2 |
| 5 | Max | 90000 | 1 |
+----+-------+--------+--------------+
Department table:
+----+-------+
| id | name |
+----+-------+
| 1 | IT |
| 2 | Sales |
+----+-------+
Output:
+------------+----------+--------+
| Department | Employee | Salary |
+------------+----------+--------+
| IT | Jim | 90000 |
| Sales | Henry | 80000 |
| IT | Max | 90000 |
+------------+----------+--------+
Explanation: Max and Jim both have the highest salary in the IT department and Henry has the highest salary in the Sales department.
나의 풀이
: 윈도우 함수 개념을 완전히 이해하지 못해서 어떻게 구현해야할 지 감이 오지 않았다.
선생님 풀이
SELECT ms.department
, ms.name AS Employee
, ms.salary
FROM (
SELECT employee.name
, employee.salary
, department.name AS department
, MAX(salary) OVER (PARTITION BY departmentid) AS max_salary
FROM employee
INNER JOIN department ON employee.departmentid = department.id
) ms
WHERE ms.salary = ms.max_salary
SELECT에서 연산한 결과물을 바로 WHERE절에 쓸 수 없기 때문에 FROM 서브쿼리로 감싸준 것.
서브쿼리에 윈도우함수까지 같이 써줬더니, 이해는 되는데, 정확히 내가 다시 활용할 수 있을지 확신이 안 선다. 이 문제는 여러번 풀어봐야 그 원리를 체화할 수 있겠다.