관련 코호트 분석 - 생존자

ddoddo·2023년 1월 21일
0

생존자

id별 첫 임기 시작날짜와 마지막 임기 시작 날짜

SELECT
	id_bioguide,
	MIN(term_start) AS first_term,
	MAX(term_start) AS last_term
FROM legislators_terms
GROUP BY 1

tenure 계산

SELECT
	id_bioguide,
	MIN(term_start) AS first_term,
	MAX(term_start) AS last_term,
	DATE_PART('year', AGE(MAX(term_start), MIN(term_start))) AS tenure
FROM legislators_terms
GROUP BY 1

century별 코호트 숫자와 임기가 10년 이상인 경우의 비율

SELECT
	first_century,
	COUNT(DISTINCT id_bioguide) AS cohort_size,
	COUNT(DISTINCT CASE WHEN tenure >= 10 THEN id_bioguide END) AS survived_10,
	ROUND(COUNT(DISTINCT CASE WHEN tenure >= 10 THEN id_bioguide END)*1.0/COUNT(DISTINCT id_bioguide), 2) AS pct_survived_10
FROM
(
SELECT
	id_bioguide,
	DATE_PART('century', MIN(term_start)) AS first_century,
	MIN(term_start) AS first_term,
	MAX(term_start) AS last_term,
	DATE_PART('year', AGE(MAX(term_start), MIN(term_start))) AS tenure
FROM legislators_terms
GROUP BY 1
) AS sub
GROUP BY 1

century별 코호트 숫자와 임기가 5년 이상인 경우의 비율

SELECT
	first_century,
	COUNT(DISTINCT id_bioguide) AS cohort_size,
	COUNT(DISTINCT CASE WHEN total_terms >= 5 THEN id_bioguide END) AS survived_5,
	ROUND(COUNT(DISTINCT CASE WHEN total_terms >= 5 THEN id_bioguide END)*1.0/COUNT(DISTINCT id_bioguide), 2) AS pct_survived_5
FROM
(
SELECT
	id_bioguide,
	DATE_PART('century', MIN(term_start)) AS first_century,
	COUNT(term_start) AS total_terms
FROM legislators_terms
GROUP BY 1
) AS sub
GROUP BY 1

century, terms별 코호트 숫자와 비율

  • 1년부터 20년까지 terms을 생성하고 JOIN한다.
  • 각 세기별 고유한 id의 개수를 센다.
  • 각 terms에 해당하는 임기에 대해 고유한 id의 개수를 센다.
  • 비율을 계산한다.
SELECT
	A.first_century,
	B.terms,
	COUNT(DISTINCT A.id_bioguide) AS cohort,
	COUNT(DISTINCT CASE WHEN A.total_terms >= B.terms THEN A.id_bioguide END) AS cohort_survived,
	ROUND(COUNT(DISTINCT CASE WHEN A.total_terms >= B.terms THEN A.id_bioguide END)*1.0/COUNT(DISTINCT id_bioguide), 2) AS pct_cohort_survived
FROM
(
SELECT
	id_bioguide,
	DATE_PART('century', MIN(term_start)) AS first_century,
	COUNT(term_start) AS total_terms
FROM legislators_terms
GROUP BY 1
) AS A
	JOIN (SELECT generate_series AS terms FROM generate_series(1, 20, 1)) AS B
		ON 1 = 1
GROUP BY 1, 2

0개의 댓글