[SQL 문법 정리]

김성수·2023년 5월 1일

sql 문법 정리

SQL

목록 보기

1/3

SQL 문법은 한번 정리해놓으면 나중에 필요할 때 사용하기 좋다.
사람의 기억은 유한하니까!

DBeaver

데이터베이스 ServerHost를 연결하면

데이터베이스를 쉽게 CRUD할 수 있는 데이터베이스 관리 도구이다.

엑셀보다 쉬운 SQL 강의를 수강하면서 사용했다.

UI도 괜찮고, 단축키도 잘되어 있어서 SQL 쿼리 작성 연습을 할 때 큰 도움이 된다.

group by

동일한 범주의 갯수, 최솟값, 최댓값, 평균, 합계를 구하는데 유용하다.

부서가 '개발'인 그룹의 인원 수를 구하고 싶다

또는 부서가 '인사'인 그룹의 인원 수를 구하고 싶다.

또는 부서가 '운영'인 그룹의 인원 수를 구하고 싶다.

라고 했을 때, group by를 사용하지 않으면

세번의 쿼리문을 작성해야 한다.

아래 코드와 같이 말이다.

select count(*) from employee
where department = '개발';

select count(*) from employee
where department = '인사';

select count(*) from employee
where department = '운영';

만약 부서의 수가 50개라고 가정해보자.

각각의 부서의 직원 인원수를 모두 알아보고 싶다면 50번의 쿼리를 작성해야 한다.

매우 번거롭다.

이때 group by가 유용하다는 것이다.

group by는 각각의 부서의 인원수를 구하고 싶다면

select department, count(*) from employee
group by department;

그러면 세개의 부서 별로 인원수가 출력이될 것이다.

마찬가지로 부서 내에서 salary가 가장 높은 사람

또는 가장 낮은 사람을 출력하고 싶다면 아래와 같이 코드를 작성하면된다.

select department, min(salary) from employee
group by department;

group by 쿼리가 실행되는 순서

select name, count() from users
group by name
order by count()

위 쿼리에서 실행 순서는

from -> group by -> select -> order by이다.

일단 users 테이블을 찾는다. (from)
-> 테이블의 name 필드를 그룹핑한다 (group by)
-> 그룹핑된 데이터들을 출력한다 (select)
-> 출력된 데이터들을 정렬해서 보여준다. (order by)

만약 where 절이 있다면

select payment_method , COUNT() from orders
where course_title = '웹개발 종합반'
GROUP BY payment_method
ORDER BY COUNT()

위 코드의 실행 순서는
from -> where -> group by -> select -> order

만약 having과 join 절이 있다면

from -> join -> where -> group by -> having -> select -> order

쿼리가 실행되는 순서가 왜 중요할까?

쿼리문의 실행 순서가 왜 중요할까?

쿼리문의 실행 순서를 파악하는 것은 쿼리문을 작성할 때 큰 도움이 된다.

예를 들어 웹개발 종합반의 결제수단별 주문건수 세어보기라는 조건이 있을 때

일단 웹개발 종합반만을 출력해야 하기 때문에

select * from orders
where courses_title = '웹개발 종합반'

출력된 결과가 웹개발 종합반만 출력되었다면 그 다음으로 결제수단을 그룹화해야 한다.

select *
from orders
where course_title = '웹개발 종합반'
group by payment_method

주문건수 세어보기 이므로

select payment_method, count(*)
from orders
where course_title = '웹개발 종합반'
group by payment_method

로 선언한 뒤에 마지막으로 order by를 추가해주면 된다.

select payment_method , count()
from orders
where course_title = '웹개발 종합반'
group by payment_method
order by count();

inner join

inner join은 다른 테이블에 존재하는 컬럼을 함께 출력하고 싶을 때 사용할 수 있다.

유저들의 결제 수단과 이름을 함께 보고 싶을 때

select * from users u
inner join orders o
on u.user_id = o.user_id

join 쿼리가 실행되는 순서

select u.name, count(*)
from orders o
inner join users u
on u.user_id = o.user_id
where o.email like '%naver.com'
GROUP BY u.name

from -> join -> where -> group by -> select

left join

select * from users u
left join point_users pu
on pu.user_id = u.user_id

point_users 테이블에는 point를 갖고 있는 유저 속성이 있다.

하지만 users 테이블에는 없다.

"유저들중에서 point를 갖고 있지 않는 유저가 몇명일까?"

아래와 같이 코드를 작성할 수 있다.

select count(*) from users u
left join point_users pu
on pu.user_id = u.user_id
where pu.point_user_id is null;

그런데, 이번엔 유저들의 성씨별로 point를 갖고 있지 않는 유저들의 수를 알고 싶다면 아래와 같이 코드를 작성할 수 있다.

select u.name, count(*) from users u
left join point_users pu
on pu.user_id = u.user_id
where pu.point_user_id is null
group by u.name

inner join과 left join의 차이

inner join은 두 테이블간의 공통점만 출력한다.

left join은 두 테이블에서 왼쪽 테이블을 모두 출력함과 동시에 오른쪽 테이블에서 공통된 요소들만 출력한다.

쉽게 설명해보자.

유저들의 이름, 이메일, 나이 속성을 가지고 있는 user 테이블과

유저들의 포인트 정보를 가지고 있는 Point 테이블이 있다.

만약 inner join 한다면 어떤 정보를 얻고 싶은걸까?

모든 포인트를 가지고 있는 유저들의 정보만 가져오고 싶을 때

반대로, left join을 한다면 어떤 정보를 얻고 싶은걸까?

Point를 가진 사람만 가져오는게 아니라, Point를 가지지 않은 사람도 가져오고 싶을 때.

inner join은 포인트를 가진 사람만 셀 수 있지만,

left join은 포인트를 가진 사람도 셀 수 있고, 가지지 않은 사람도 셀 수 있다.

한번 더 쉽게 표현하자면,

inner join : 나는 포인트를 가진 사람만 필요해

left join : 나는 포인트를 가지고 있는 사람과 가지지 않은 사람의 정보 모두 필요해.

inner join -> 욕심이 덜 하다.

left join -> 욕심이 많다.

union all

두 쿼리문의 결과를 합치고 싶다?

그럴 땐 union all을 사용할 수 있다.

(
	select '7월' as month, c1.title, c2.week, count(*) as cnt from courses c1
	inner join checkins c2 on c1.course_id = c2.course_id
	inner join orders o on c2.user_id = o.user_id
	where o.created_at < '2020-08-01'
	group by c1.title, c2.week
	order by c1.title, c2.week // 적용 x
	)
union ALL 
(	
	select '8월' as month, c1.title, c2.week, count(*) as cnt from courses c1
	inner join checkins c2 on c1.course_id = c2.course_id
	inner join orders o on c2.user_id = o.user_id
	where o.created_at >= '2020-08-01'
	group by c1.title, c2.week
	order by c1.title, c2.week // 적용 x
)

주의 : union all은 쿼리문 내에서 order by가 먹히지 않는다.

그러면 어떻게 oder by를 적용할 수 있을까?

from절에 서브쿼리로 order by를 적용할 수 있다.

SELECT * FROM (
    SELECT '7월' AS month, c1.title, c2.week, 	COUNT(*) AS cnt 
    FROM courses c1
    INNER JOIN checkins c2 ON c1.course_id = c2.course_id
    INNER JOIN orders o ON c2.user_id = o.user_id
    WHERE o.created_at < '2020-08-01'
    GROUP BY c1.title, c2.week
    UNION ALL 
    SELECT '8월' AS month, c1.title, c2.week, COUNT(*) AS cnt 
    FROM courses c1
    INNER JOIN checkins c2 ON c1.course_id = c2.course_id
    INNER JOIN orders o ON c2.user_id = o.user_id
    WHERE o.created_at >= '2020-08-01'
    GROUP BY c1.title, c2.week
) AS result
ORDER BY result.title, result.week;

서브쿼리

서브쿼리는 where, select, from절에 들어갈 수 있다.

아래는 from절에 들어가는 서브쿼리 예시다.

point를 많이 쌓은 유저는 likes도 많을까?

select 
pu.user_id , 
pu.`point` , 
avg_likes_user 
from point_users pu
inner join (
	select c.user_id , 
    	c.likes , 
    	avg(c.likes) as avg_likes_user 
    from checkins c 
	group by c.user_id 
) a 
on pu.user_id = a.user_id
order by point desc

아래는 where절에 들어가는 서브쿼리 예시

유저별 likes 수와 likes 평균을 구하라.

select 	c.checkin_id, 
		c.user_id ,
		c.likes ,
		(
			select avg(likes) from checkins
			where user_id = c.user_id 
		) as avg_likes_user
from checkins c

select절에 들어가는 서브쿼리

과목별 평균 likes 수를 필드 우측에 붙여보기

select c.checkin_id, 
		c.course_id, 
		c.user_id, 
		c.likes,
		(
			select ROUND(avg(likes),1) 
            from checkins
			where course_id = c.course_id 
		) as avg		
from checkins c

with

with를 사용하여 가독성 증가시키기

with table1 as(
	select course_id, 
    	count(DISTINCT(user_id)) as cnt_user 
	from checkins 
	group by course_id
), table2 as(
	select course_id, 
    	count(*) as cnt_all_user 
    from orders 
	group by course_id
)
select  c.title, 
		a.cnt_user, 
		b.cnt_all_user , 
		(a.cnt_user/b.cnt_all_user) as ratio 
from table1 a
inner JOIN table2 b on a.course_id = b.course_id 
inner join courses c on c.course_id = a.course_id

with로 table1과 table2를 설정한 뒤 사용하면 가독성이 더 좋아진다.

SQL 문자열 쪼개보기, SUBSTRING_INDEX()

email을 @를 기준으로 id만 나오게할 때

select email, 
	user_id, 
	SUBSTRING_INDEX(email,'@', 1) 
from users

도메인만 나오게 할 때

select email, 
	user_id, 
	SUBSTRING_INDEX(email,'@', -1) 
from users

SQL 문자열 자르기, SUBSTRING()

생성일자에서 년-월-일만 출력하고 싶을 때

select order_no, created_at, SUBSTRING(created_at,1,10) from orders

해석 : created_at을 1번 인덱스 부터(SQL은 0번이 아닌 1번부터 시작) 1+10번 인덱스 전 까지 잘라내서 출력하라.

매일 주문된 건수를 확인하고 싶을 때(응용)

select order_no, 
	SUBSTRING(created_at,1,10) as date, 
	count(*) 
from orders
group by date

주문이 들어온 시간만 알고 싶을 때

select order_no, 
	SUBSTRING(created_at,12,8) as date 
from orders

해석 : created_at을 12번 인덱스부터 8자리 12+8 인덱스 전 까지 잘라내서 출력하라.

SQL case 문

프로그래밍 언어의 if문과 유사한 기능이라고 볼 수 있다.

select pu.user_id, pu.`point`,  
	(case when pu.`point` > 10000 then '잘하고 있어요'
	else '조금만더 힘냅시다!' END) as msg
	from point_users pu

문법 : select case when 컬럼 조건 조건 then 결과 else 결과 end

통계내는 방법(응용)

select a.lv, count(*) from(   
	select pu.user_id, pu.`point`,  
		(case when pu.`point` >= 10000 then '10000 이상'
			when pu.`point` >= 5000 then '5000 이상'
			else '5000 미만' END) as lv
		from point_users pu 		
) a
group by lv

해석 : from절에 서브쿼리로 넣고 lv별로 묶어서 수를 체크함으로써 통계를 낼 수 있다.

마지막으로 서브쿼리를 with로 더 예쁘게

with table1 as(
		select pu.user_id, pu.`point`,  
		(case when pu.`point` >= 10000 then '10000 이상'
			when pu.`point` >= 5000 then '5000 이상'
			else '5000 미만' END) as lv
		from point_users pu 	
)
select a.lv, count(*) from table1 a
group by lv

김성수

깊이 있는 소프트웨어 개발자가 되고 싶습니다.

다음 포스트