SQL 6일차

한희수·2023년 3월 26일
0

빅데이터 분석 SQL

목록 보기
6/17

20230321 SQL

● 오라클 내 테이블 옵션 중 ‘model’ 들어가면 ERD (or table description)

  • 화살표 : 키
  • P : primary key of the table
  • 도메인 만드는 공부 필요!
  • 실제 회사에서는 column 이름이 다를 수도 있음 => 값으로 체크해야 함
  • 테이블 구조, PK(Primary Key), relation 먼저 다 살펴보아야 함

★프로그래밍 중요!!★

■ JOIN 종류 5가지
1) EQUI JOIN
2) INNER JOIN
3) OUTER JOIN
4) SELF JOIN
5) NON EQUI JOIN

■ ANSI 표준
1) NATURAL JOIN
2) JOIN USING
3) JOIN ON
4) (LEFT, RIGHT)OUTER JOIN
5) CROSS JOIN(CARTESIAN PRODUCT)

[문제33] 사원들의 employee_id, country_name 출력해주세요.

SELECT e.employee_id, c.country_name
FROM employees e, departments d, locations l, countries c
WHERE e.department_id = d.department_id
AND d.location_id = l.location_id
AND l.country_id = c.country_id;

● e d l c 테이블이 연결되는 과정
1)
SELECT e.employee_id, d.location_id
FROM employees e, departments d
WHERE e.department_id = d.department_id;

2)
SELECT e.employee_id, l.country_id
FROM (1)번 조인 결과 집합, locations l
WHERE d.location_id = l.location_id;

3)
SELECT e.employee_id, c.country_name
FROM (2)번 조인 결과 집합, countries c
WHERE l.country_id = c.country_id;

[문제34] 80 부서에 근무하는 사원들의 last_name, job_id, department_name, city 출력해주세요.

SELECT e.last_name, e.job_id, d.department_name, l.city
FROM employees e, departments d, locations l
WHERE e.department_id = d.department_id
AND d.location_id = l.location_id
AND d.department_id = 80; => 내가 한 것, 처리 순서 상관없음
▽▽▽
SELECT e.last_name, e.job_id, d.department_name, l.city
FROM employees e, departments d, locations l
WHERE e.department_id = 80
AND e.department_id = d.department_id
AND d.location_id = l.location_id;
▽▽▽
SELECT e.last_name, e.job_id, d.department_name, l.city
FROM employees e, departments d, locations l
WHERE d.department_id = 80 => 이게 제일 나은 방법
AND e.department_id = 80
AND d.location_id = l.location_id;

[문제35] locations 테이블에 있는 city컬럼에 Toronto도시에서 근무하는 모든 사원의 last_name, job_id, department_id, department_name 출력해주세요.

SELECT e.last_name, e.job_id, d.department_id, d.department_name
FROM employees e, departments d, locations l
WHERE l.city = 'Toronto'
AND e.department_id = d.department_id
AND d.location_id = l.location_id;

SELECT e.*, d.*
FROM employees e, departments d
WHERE e.department_id = d.department_id;

■ outer join

  • 키 값이 일치되는 데이터(equi join) 또는 키 값이 일치되지 않는 데이터도 출력하는 조인
  • (+)를 이용하여 outer join을 표현함(양쪽에 전부 쓰면 오류남)
  • 뽑으려는 데이터의 반대쪽에 (+)표현(결과가 덜 나오는쪽에 표현)

SELECT e.employee_id, e.department_id, d.department_id, d.department_name
FROM employees e, departments d
WHERE e.department_id = d.department_id(+);

(+)부호가 없는 곳(e) 키 값 일치하지 않는 사원 테이블에 있는 데이터는 모두 출력하겠다.
107개 행 출력됨(소속부서 없는 사원 데이터도 출력)

SELECT e.employee_id, e.department_id, d.department_id, d.department_name
FROM employees e, departments d
WHERE e.department_id(+) = d.department_id;

(+)부호가 없는 곳(d) 키 값 일치하지 않는 사원 테이블에 있는 데이터는 모두 출력하겠다.
122개 행 출력됨

(예시)
SELECT e.employee_id, e.department_id, d.department_name, l.city
FROM employees e, departments d, locations l
WHERE e.department_id = d.department_id(+)
AND d.location_id = l.location_id(+);

=> 키 값 일치하는 데이터(equi join) + 정보 없는 사원의 데이터도 출력(outer join)

[문제36] 부서장 정보를 employee_id, last_name, department_id, department_name을 표시하기 위한 query를 작성합니다.

SELECT e.employee_id, e.last_name, d.department_id, d.department_name
FROM employees e, departments d
WHERE e.employee_id(+) = d.manager_id
ORDER BY 3; => 내가 한 것

SELECT e.employee_id, e.last_name, d.department_id, d.department_name
FROM employees e, departments d
WHERE e.employee_id = d.manager_id;

SELECT count(*)
FROM departments
WHERE manager_id is not null; => 11개 나오는게 맞음[건수 검증]

[문제37] commission_pct이 null 이 아닌 사원들의 last_name, commission_pct, department_name을 출력해주세요.

SELECT e.last_name, e.commission_pct, d.department_name
FROM employees e, departments d
WHERE e.commission_pct IS NOT NULL
AND e.department_id = d.department_id; => 내가 한 것, 34개 나옴

SELECT count(*)
FROM employees
WHERE commission_pct is not null; => 35개 나옴[건수 검증]

SELECT e.last_name, e.commission_pct, d.department_name
FROM employees e, departments d
WHERE e.commission_pct IS NOT NULL
AND e.department_id = d.department_id(+); => 35개 나옴

[문제38] last_name에 소문자 ‘a’가 포함된 사원들의 last_name, department_name을 출력해주세요.

SELECT e.last_name, d.department_name
FROM employees e, departments d
WHERE last_name LIKE '%a%'
AND e.department_id = d.department_id; => 내가 한 것, 51개 나옴

SELECT count(*)
FROM employees
WHERE last_name LIKE '%a%'; => 52개 나옴[건수 검증]

SELECT e.last_name, d.department_name
FROM employees e, departments d
WHERE last_name LIKE '%a%'
AND e.department_id = d.department_id(+); => 52개 나옴

SELECT *
FROM employees
WHERE instr(last_name,'a',1,1) > 0; => 52개 나옴[건수 검증]

SELECT e.last_name, d.department_name
FROM employees e, departments d
WHERE instr(last_name,'a',1,1) > 0 => 데이터 多시, 함수 쓰는 것이 좋음
AND e.department_id = d.department_id(+);

=> 비조인조건이 데이터의 대부분일 경우 먼저 돌릴 필요 X 미비할 경우 먼저 돌려서 분류 O

■ self join

  • 자신의 테이블을 참조할 때 사용하는 조인

(일반 직원 테이블)
SELECT employee_id, last_name, manager_id
FROM employees;

(관리자 테이블)
SELECT employee_id, last_name
FROM employees;

▽▽▽
SELECT w.employee_id, w.last_name, m.employee_id, m.last_name
FROM employees w, employees m
WHERE w.manager_id = m. employee_id;
=> w 일반사원의 m 관리자 정보(106개 나옴, 최고경영자의 관리자 없으므로 1명 누락)

SELECT w.employee_id, w.last_name, m.employee_id, m.last_name
FROM employees w, employees m
WHERE w.manager_id = m. employee_id(+);
=> 107개 나옴

http://192.168.11.31/job_grades.txt => hr 하위 테이블 추가

DROP TABLE job_grades purge;
CREATE TABLE job_grades
( grade_level varchar2(3),
lowest_sal number,
highest_sal number);

INSERT INTO job_grades VALUES ('A',1000,2999);
INSERT INTO job_grades VALUES ('B',3000,5999);
INSERT INTO job_grades VALUES ('C',6000,9999);
INSERT INTO job_grades VALUES ('D',10000,14999);
INSERT INTO job_grades VALUES ('E',15000,24999);
INSERT INTO job_grades VALUES ('F',25000,40000);
commit;

■ non equi join : 비등가 조인

  • equi join (=) 할 수 없는 다른 비교연산자를 사용하는 조인의 유형
  • 값을 범위로 조인하려는 경우 많이 사용됨

SELECT e.employee_id, e.salary, j.grade_level
FROM employees e, job_grades j
WHERE e.salary >= j.lowest_sal
AND e.salary <= j.highest_sal;

SELECT e.employee_id, e.salary, j.grade_level
FROM employees e, job_grades j
WHERE e.salary BETWEEN j.lowest_sal AND j.highest_sal;

[문제39] 사원들의 급여의 등급 레이블의 빈도수를 출력해주세요.

SELECT count(decode(j.grade_level, 'A', '01'))"GRADE A",
count(decode(j.grade_level, 'B', '01'))"GRADE B",
count(decode(j.grade_level, 'C', '01'))"GRADE C",
count(decode(j.grade_level, 'D', '01'))"GRADE D",
count(decode(j.grade_level, 'E', '01'))"GRADE E",
count(decode(j.grade_level, 'F', '01'))"GRADE F"
FROM employees e, job_grades j
WHERE e.salary BETWEEN j.lowest_sal AND j.highest_sal;

SELECT j.grade_level 등급레이블, count(*) 빈도수
FROM employees e, job_grades j
WHERE e.salary BETWEEN j.lowest_sal AND j.highest_sal
GROUP BY j.grade_level
ORDER BY 1;

[문제40] 사원들의 사번, 급여, 급여등급, 부서이름을 출력해주세요.
부서배치를 받지 않은 사원은 제외시켜 주세요.

SELECT e.employee_id, e.salary, j.grade_level, d.department_name
FROM employees e, departments d, job_grades j
WHERE e.department_id = d.department_id
AND e.salary BETWEEN j.lowest_sal AND j.highest_sal;

[문제41] 사원들의 사번, 급여, 급여등급, 부서이름, 근무 도시 정보를 출력해주세요.
부서배치를 받지 않은 사원도 포함시켜 주세요.

SELECT e.employee_id, e.salary, j.grade_level, d.department_name, l.city
FROM employees e, departments d, job_grades j, locations l
WHERE e.department_id = d.department_id(+)
AND d.location_id = l.location_id(+)
AND e.salary BETWEEN j.lowest_sal AND j.highest_sal;

SELECT e.employee_id, d.department_name
FROM employees e, departments d
WHERE e.department_id = d.department_id(+); => 내가 한 것 [건수 검증]

■ ANSI(American National Standards Institute) SQL(Structured Query Language)

  • 안씨표준

1) natural join

  • equi join
  • 조인 조건 술어를 자동으로 만들어 줌(현장에서 잘 안 씀!)
  • 다만, 양쪽 테이블의 동일한 이름의 모든 컬럼을 기준으로 조인 조건 술어를 만듦
  • 동일한 이름의 컬럼이 한 개일 경우 사용하기 좋음
  • 양쪽 테이블에 동일한 이름의 컬럼이 타입이 다른 경우는 오류 발생

SELECT e.last_name, d.department_name
FROM employees e, departments d
WHERE e.department_id = d.department_id; -> 이런 조인 조건 술어를 자동으로!

SELECT e.last_name, d.department_name
FROM employees e NATURAL JOIN departments d;
▽▽▽
SELECT e.last_name, d.department_name
FROM employees e, departments d
WHERE e.department_id = d.department_id
AND e.manager_id = d.manager_id; -> 이름 동일 데이터 다른 컬럼!!!(주의)

SELECT department_name, city => 테이블 별칭 안 씀
FROM departments NATURAL JOIN locations; => 모호성 없으므로 괜춘함
▽▽▽
SELECT d.department_name, l.city
FROM departments d NATURAL JOIN locations l; => but, 작업 효율성에 좋음

2) join using

  • equi join
  • 조인조건의 기준 컬럼을 지정함
  • USING 절에 사용된 기준 컬럼은 양쪽 테이블에 동일한 이름의 컬럼
  • USING 절에 사용된 기준 컬럼을 테이블 지정하면 오류 발생 (테이블 지정XXXXX)

SELECT e.last_name, d.department_name
FROM employees e JOIN departments d
USING(department_id);

SELECT e.last_name, d.department_name
FROM employees e JOIN departments d
USING(department_id)
WHERE department_id IN (20, 30);

SELECT e.last_name, d.department_name, department_id, location_id, l.city
FROM employees e JOIN departments d
USING(department_id)
JOIN locations l
USING(location_id)
WHERE department_id IN (20, 30);

SELECT e.last_name, d.department_name, department_id, location_id, l.city
FROM employees e JOIN departments d
USING(department_id)
JOIN locations l
USING(location_id)
JOIN countries c
USING(country_id)
WHERE department_id IN (20, 30);

3) join on (꼭 기억 ★)

  • equi join, self join, non equi join
  • on 절을 이용해서 조인조건 술어를 직접 만들어서 사용함
  • on 절에도 테이블 별칭 써주어야 함

3-1) equi join
SELECT e.last_name, d.department_name
FROM employees e, departments d
WHERE e.department_id = d.department_id;
▽▽▽
SELECT e.last_name, d.department_name
FROM employees e JOIN departments d
ON e.department_id = d.department_id;
▽▽▽
SELECT e.last_name, d.department_name, l.city
FROM employees e JOIN departments d
ON e.department_id = d.department_id
JOIN locations l
ON d.location_id = l.location_id;
▽▽▽
SELECT e.last_name, d.department_name, l.city, c.country_name
FROM employees e JOIN departments d
ON e.department_id = d.department_id
JOIN locations l
ON d.location_id = l.location_id
JOIN countries c
ON l.country_id = c.country_id;
▽▽▽
SELECT e.last_name, d.department_name, l.city, c.country_name
FROM employees e JOIN departments d
ON e.department_id = d.department_id
JOIN locations l
ON d.location_id = l.location_id
JOIN countries c
ON l.country_id = c.country_id
WHERE e.last_name LIKE '%a%'
AND e.salary >= 10000; => 비 조인조건 술어는 조인조건 술어 다음에!!!

3-2) non equi join

SELECT e.employee_id, e.salary, j.grade_level
FROM employees e, job_grades j
WHERE e.salary BETWEEN j.lowest_sal AND j.highest_sal;
▽▽▽
SELECT e.employee_id, e.salary, j.grade_level, d.department_name
FROM employees e JOIN job_grades j
ON e.salary BETWEEN j.lowest_sal AND j.highest_sal;
JOIN departments d
ON e.department_id = d.department_id;

3-3) self join => 테이블 별칭 꼬옥 만들기!

SELECT w.employee_id, w.last_name, m.employee_id, m.last_name
FROM employees w, employees m
WHERE w.manager_id = m. employee_id(+);
▽▽▽(동일함)
SELECT w.employee_id, w.last_name, m.employee_id, m.last_name
FROM employees w JOIN employees m
ON w.manager_id = m. employee_id;

4) outer join (꼭 기억 ★)

SELECT e.employee_id, e.department_id, d.department_id, d.department_name
FROM employees e, departments d
WHERE e.department_id = d.department_id(+);
▽▽▽(동일함)
SELECT e.employee_id, e.department_id, d.department_id, d.department_name
FROM employees e LEFT OUTER JOIN departments d
ON e.department_id = d.department_id;

SELECT e.employee_id, e.department_id, d.department_id, d.department_name
FROM employees e, departments d
WHERE e.department_id(+) = d.department_id;
▽▽▽(동일함)
SELECT e.employee_id, e.department_id, d.department_id, d.department_name
FROM employees e RIGHT OUTER JOIN departments d
ON e.department_id = d.department_id;

SELECT e.employee_id, e.department_id, d.department_id, d.department_name, l.city
FROM employees e, departments d, locations l
WHERE e.department_id = d.department_id(+)
AND d.location_id = l.location_id(+);
▽▽▽(동일함)
SELECT e.employee_id, e.department_id, d.department_id, d.department_name
FROM employees e LEFT OUTER JOIN departments d
ON e.department_id = d.department_id
LEFT OUTER JOIN locations l
ON d.location_id = l.location_id;

SELECT e.employee_id, e.department_id, d.department_id, d.department_name
FROM employees e, departments d
WHERE e.department_id(+) = d.department_id(+); => 오류 발생
▽▽▽(동일함)
SELECT e.employee_id, e.department_id, d.department_id, d.department_name
FROM employees e FULL OUTER JOIN departments d => 해결!!
ON e.department_id = d.department_id;

=> 오라클에서 양쪽 (+)사용시 오류 발생, 안시 표준 문법의 FULL OUTER JOIN으로 해결!!

5) cartesian product(cross join)

  • 조인조건이 부적합하게 만들어진 경우
  • 카테시안 곱이 만들어짐

SELECT e.employee_id, e.department_id, d.department_id, d.department_name
FROM employees e CROSS JOIN departments d;

[문제42] 2006년도에 입사한 사원들의 부서이름별 급여의 총액, 평균을 출력해주세요.
1) 오라클 전용

SELECT d.department_name, sum(e.salary), avg(e.salary)
FROM employees e, departments d
WHERE e.department_id = d.department_id
AND e.hire_date BETWEEN '2006/01/01' AND '2007/01/01'
GROUP BY d.department_name
ORDER BY 1; => 내가 한 것

SELECT d.department_name 부서명, sum(e.salary) 급여총액, round(avg(e.salary)) 평균급여
FROM employees e, departments d
WHERE e.hire_date >= to_date('2006/01/01', 'yyyy/mm/dd')
AND e.hire_date < to_date('2007/01/01', 'yyyy/mm/dd')
AND e.department_id = d.department_id
GROUP BY d.department_name;

2) ANSI 표준

SELECT d.department_name 부서명, sum(e.salary) 급여총액, round(avg(e.salary)) 평균급여
FROM employees e JOIN departments d
ON e.department_id = d.department_id
WHERE e.hire_date >= to_date('2006/01/01', 'yyyy/mm/dd')
AND e.hire_date < to_date('2007/01/01', 'yyyy/mm/dd')
GROUP BY d.department_name;

SELECT d.department_name 부서명, sum(e.salary) 급여총액, round(avg(e.salary)) 평균급여
FROM employees e JOIN departments d
USING(department_id)
WHERE e.hire_date >= to_date('2006/01/01', 'yyyy/mm/dd')
AND e.hire_date < to_date('2007/01/01', 'yyyy/mm/dd')
GROUP BY d.department_name;

[문제43] 2006년도에 입사한 사원들의 도시이름별 급여의 총액, 평균을 출력해주세요.
1) 오라클 전용

SELECT l.city, sum(e.salary), avg(e.salary)
FROM employees e, departments d, locations l
WHERE e.department_id = d.department_id
AND d.location_id = l.location_id
AND e.hire_date BETWEEN '2006/01/01' AND '2007/01/01'
GROUP BY l.city
ORDER BY 1;

SELECT l.city 도시이름, sum(e.salary) 급여총액, round(avg(e.salary)) 평균급여
FROM employees e, departments d, locations l
WHERE e.hire_date >= to_date('2006/01/01', 'yyyy/mm/dd')
AND e.hire_date < to_date('2007/01/01', 'yyyy/mm/dd')
AND e.department_id = d.department_id
AND d.location_id = l.location_id
GROUP BY l.city;

2) ANSI 표준

SELECT l.city, sum(e.salary), avg(e.salary)
FROM employees e JOIN departments d
ON e.department_id = d.department_id
JOIN locations l
ON d.location_id = l.location_id
WHERE e.hire_date BETWEEN '2006/01/01' AND '2007/01/01'
GROUP BY l.city;

SELECT l.city 도시이름, sum(e.salary) 급여총액, round(avg(e.salary)) 평균급여
FROM employees e JOIN departments d
ON e.department_id = d.department_id
JOIN locations l
ON d.location_id = l.location_id
WHERE e.hire_date >= to_date('2006/01/01', 'yyyy/mm/dd')
AND e.hire_date < to_date('2007/01/01', 'yyyy/mm/dd')
GROUP BY l.city;

SELECT l.city 도시이름, sum(e.salary) 급여총액, round(avg(e.salary)) 평균급여
FROM employees e JOIN departments d
USING e.department_id = d.department_id
JOIN locations l
USING d.location_id = l.location_id
WHERE e.hire_date >= to_date('2006/01/01', 'yyyy/mm/dd')
AND e.hire_date < to_date('2007/01/01', 'yyyy/mm/dd')
GROUP BY l.city;

0개의 댓글