튜닝전 / 후 쿼리
아래 좋은 article 을 발견해서 두고 두고 보려고 내 velog 에 정리하고자 한다.
글을 읽으면서 그동안 리소스 관점에서 놓치고 짠 쿼리가 스쳐 지나갔다.. 🙉
특히 LIKE 구문에 대한 내용을 다룬 3 번은 내가 자주 했던 실수인데, 이번 기회에 좀 더 좋은 쿼리를 짤 수 있을 것 같다.
-- Inefficient
SELECT * FROM movie;
-- Improved
SELECT id FROM movie;
-- Inefficient
SELECT m.id, ANY_VALUE(m.title) title, COUNT(r.id) r_count
FROM movie m
INNER JOIN rating r
ON m.id = r.movie_id
WHERE FLOOR(r.value/2) = 2 -- <- 추가 연산
GROUP BY m.id;
-- Improved
SELECT m.id, ANY_VALUE(m.title) title, COUNT(r.id) r_count
FROM movie m
INNER JOIN rating r
ON m.id = r.movie_id
WHERE r.value BETWEEN 4 AND 5
GROUP BY m.id;
r.value
가 가지고 있는 index 를 그대로 활요할 수 있기 때무넹 모든 필드 값을 탐색할 필요가 없어 더 짧은 Running Time 을 가짐.value IN (...), value="...", value LIKE "...%"
와는 달리, value LIKE "%..."
는 Full Table Scan 을 활용한다. 따라서 같은 결과를 낸다면 와일드카드를 앞부분에 배치하는 것은 피하는 것이 좋다.아래 Comedy 와 Romantic Comedy 를 추출하려는 예시를 살펴보자.
-- Inefficient
SELECT g.value genre, COUNT(r.movie_id) r_cnt
FROM rating r
INNER JOIN genre g
ON r.movie_id = g.movie_id
WHERE g.value LIKE "%Comedy"
GROUP BY g.value;
-- Improved(1): value IN (...)
SELECT g.value genre, COUNT(r.movie_id) r_cnt
FROM rating r
INNER JOIN genre g
ON r.movie_id = g.movie_id
WHERE g.value IN ("Romantic Comedy", "Comedy")
GROUP BY g.value;
-- Improved(2): value = "..."
SELECT g.value genre, COUNT(r.movie_id) r_cnt
FROM rating r
INNER JOIN genre g
ON r.movie_id = g.movie_id
WHERE g.value = "Romantic Comedy" OR g.value = "Comedy"
GROUP BY g.value;
-- Improved(3): value LIKE "...%"
-- 🌟 모든 문자열을 탐색할 필요가 없어, 가장 좋은 성능을 냄
SELECT g.value genre, COUNT(r.movie_id) r_cnt
FROM rating r
INNER JOIN genre g
ON r.movie_id = g.movie_id
WHERE g.value LIKE "Romantic%" OR g.value LIKE "Comed%"
GROUP BY g.value
EXISTS
활용-- Inefficient
SELECT DISTINCT m.id, title
FROM movie m
INNER JOIN genre g
ON m.id = g.movie_id;
-- Improved
SELECT m.id, title
FROM movie m
WHERE EXISTS (SELECT 'X' FROM rating r WHERE m.id = r.movie_id
-- Inefficient
SELECT m.id, COUNT(r.id) AS rating_cnt, AVG(r.value) AS avg_rating
FROM movie m
INNER JOIN rating r
ON m.id = r.movie_id
GROUP BY id
HAVING m.id > 1000;
-- Improved
SELECT m.id, COUNT(r.id) AS rating_cnt, AVG(r.value) AS avg_rating
FROM movie m
INNER JOIN rating r
ON m.id = r.movie_id
WHERE m.id > 1000
GROUP BY id;
-- Query (A)
SELECT m.title, r.value_rating, g.value genre
FROM rating r
INNER JOIN genre g
ON. g.movei_id = r.movie_id
INNER JOIN movie m
ON m.id = r.movie_id:
-- Query (B) 🌟
SELECT m.title, r.value rating, g.value genre
FROM rating r -- 10,000,000 records
INNER JOIN movie m -- 100 records
ON r.movie_id = m.id
INNER JOIN genre g -- 100,000 records
ON r.movie_id = g.movie_id;