DataBase : instacart 설명
- orders (3.4m rows, 206k users):
.
order_id: order identifier
user_id: customer identifier
eval_set: which evaluation set this order belongs in (see SET described below)
order_number: the order sequence number for this user (1 = first, n = nth)
order_dow: the day of the week the order was placed on
order_hour_of_day: the hour of the day the order was placed on
days_since_prior: days since the last order, capped at 30 (with NAs for order_number = 1)
.- products (50k rows):
.
product_id: product identifier
product_name: name of the product
aisle_id: foreign key
department_id: foreign key
- aisles (134 rows):
.
aisle_id: aisle identifier
aisle: the name of the aisle
*deptartments (21 rows):
.
department_id: department identifierdepartment: the name of the department
- order_products__SET (30m+ rows):
.
order_id: foreign key
product_id: foreign key
add_to_cart_order: order in which each product was added to cart
reordered: 1 if this product has been ordered by this user in the past, 0 otherwise
where SET is one of the four following evaluation sets (eval_set in orders):
"prior": orders prior to that users most recent order (~3.2m orders)
문제
- 전체 주문 건수
- 구매자 수
- 상품별 주문 건수
- 카트에 가장 먼저 넣는 상품 10개
.- order_products_prior의 add_to_cart_order라는 컬럼은 상품이 몇 번째로 장바구니에 담겼는지 의미- 시간별 주문 건수
1. 전체 주문 건수
SELECT count(order_id) FROM orders;
2. 구매자 수
- SELECT count(DISTINCT(user_id)) FROM orders ;
3. 상품별 주문 건수
SELECT B.product_name, COUNT(DISTINCT A.order_id) F
FROM order_products__prior A
LEFT JOIN
products B
ON A.product_id = B.product_id
GROUP BY 1;
#위 코드와 동일( having)
SELECT B.product_name, COUNT(DISTINCT A.order_id) F
FROM order_products__prior A
LEFT JOIN
products B
USING (product_id) # 키로 잡는 컬럼이 같다면 having을 써도 됨
GROUP BY 1;
4. 카트에 가장 먼저 넣는 상품 10개
- order_products_prior의 add_to_cart_order라는 컬럼은 상품이 몇 번째로 장바구니에 담겼는지 의미
# add_to_cart_order가 1인 경우를 F_1st 칼럼에 1을 할당하고 아닌 경우는 0을 할당함
SELECT * FROM
(SELECT product_id,
case when add_to_cart_order = 1 then 1 ELSE 0 END F_1st
FROM order_products__prior) A
WHERE F_1st = 1;
SELECT product_id, SUM(case when add_to_cart_order = 1 then 1 ELSE 0 END) F_1st
FROM order_products__prior
GROUP BY product_id;
SELECT * FROM
(SELECT *, ROW_NUMBER() OVER(ORDER BY F_1st DESC) rnk
FROM
(SELECT product_id, SUM(case when add_to_cart_order = 1 then 1 ELSE 0 END) F_1st
FROM order_products__prior
GROUP BY product_id) A LIMIT 10) B
LEFT JOIN products C
ON B.product_id = C.product_id
ORDER BY rnk ;
5. 시간별 주문 건수
SELECT order_hour_of_day, COUNT(DISTINCT order_id) F
FROM orders
GROUP BY order_hour_of_day
ORDER BY F DESC;