[R] Assigning an order to factor levels

: ) YOUNG · June 7, 2022

Running Spearman's rank correlation analysis on factor data

Example data

temp 


> temp
  [1] Graduate Graduate Masters  Graduate Masters  Graduate Graduate Graduate
  [9] Graduate Masters  Graduate Graduate Graduate Graduate Masters  Phd
 [17] Graduate Masters  Phd      Masters  Masters  Masters  Graduate Masters
 [25] Graduate Graduate Graduate Masters  Graduate Graduate Graduate Graduate
 [33] Graduate Graduate Graduate Graduate Graduate Masters  Graduate Graduate
 [41] Masters  Graduate Graduate Graduate Graduate Graduate Graduate Graduate
 [49] Masters  Graduate Graduate Masters  Graduate Graduate Graduate Graduate
 [57] Graduate Graduate Graduate Masters  Graduate Masters  Graduate Graduate
 [65] Phd      Graduate Graduate Graduate Graduate Graduate Graduate Graduate
 [73] Graduate Masters  Graduate Graduate Graduate Graduate Graduate Graduate
 [81] Graduate Masters  Masters  Masters  Graduate Graduate Graduate Graduate
 [89] Graduate Graduate Graduate Graduate Masters  Graduate Masters  Graduate
 [97] Graduate Graduate Graduate Graduate
Levels: Graduate Masters Phd

At this point the data is nominal, split into three categories: Graduate, Masters, and Phd.
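To follow along without the original data, a minimal sketch like the one below builds a small stand-in vector and confirms that a plain factor carries no order. The name edu and its values are made up for illustration and are not the post's data.

```r
# Hypothetical stand-in for temp; values are made up for illustration
edu <- factor(c("Graduate", "Masters", "Graduate", "Phd", "Masters"))

levels(edu)      # level labels, sorted alphabetically by default
is.ordered(edu)  # FALSE: a plain factor is nominal

# Comparing levels of an unordered factor is not meaningful:
# R returns NA (with a warning) instead of TRUE/FALSE
cmp <- suppressWarnings(edu[1] < edu[2])
cmp
```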


However, even nominal data like this can be given an order by using the ordered argument of factor().

temp <- factor(temp, ordered = TRUE)
temp

> temp
  [1] Graduate Graduate Masters  Graduate Masters  Graduate Graduate Graduate
  [9] Graduate Masters  Graduate Graduate Graduate Graduate Masters  Phd
 [17] Graduate Masters  Phd      Masters  Masters  Masters  Graduate Masters
 [25] Graduate Graduate Graduate Masters  Graduate Graduate Graduate Graduate
 [33] Graduate Graduate Graduate Graduate Graduate Masters  Graduate Graduate
 [41] Masters  Graduate Graduate Graduate Graduate Graduate Graduate Graduate
 [49] Masters  Graduate Graduate Masters  Graduate Graduate Graduate Graduate
 [57] Graduate Graduate Graduate Masters  Graduate Masters  Graduate Graduate
 [65] Phd      Graduate Graduate Graduate Graduate Graduate Graduate Graduate
 [73] Graduate Masters  Graduate Graduate Graduate Graduate Graduate Graduate
 [81] Graduate Masters  Masters  Masters  Graduate Graduate Graduate Graduate
 [89] Graduate Graduate Graduate Graduate Masters  Graduate Masters  Graduate
 [97] Graduate Graduate Graduate Graduate
Levels: Graduate < Masters < Phd

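The last line now reads Levels: Graduate < Masters < Phd, which differs from how the levels of an ordinary factor are printed.

This means an order has been assigned: Graduate is the lowest level,
Masters is the second level, and Phd is the highest.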
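Note that factor(temp, ordered = TRUE) keeps whatever level order the factor already has, which for a factor built from strings is alphabetical by default. Here that order happens to match the intended ranking. When it does not, the levels argument can state the ranking explicitly. A minimal sketch, using a made-up vector (size) whose labels are chosen so that alphabetical order is wrong:

```r
# Hypothetical labels whose alphabetical order (High, Low, Medium) is not the ranking
size <- c("Low", "High", "Medium", "Low", "High")

# Relying on the default gives High < Low < Medium
default_order <- factor(size, ordered = TRUE)
levels(default_order)

# Stating the levels explicitly gives Low < Medium < High
explicit_order <- factor(size, levels = c("Low", "Medium", "High"), ordered = TRUE)
levels(explicit_order)

# With an ordered factor, comparisons follow the level order
explicit_order[1] < explicit_order[2]   # compares Low with High
```

Spelling out the levels is the safer habit whenever the numeric codes are later used in a rank-based analysis, because the codes follow the level order.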


The ordered factor can also be converted to numbers.
Graduate becomes 1, Masters becomes 2, and Phd becomes 3.


temp2 <- as.numeric(factor(
    temp, ordered = TRUE
))



> temp2
  [1] 1 1 2 1 2 1 1 1 1 2 1 1 1 1 2 3 1 2 3 2 2 2 1 2 1 1 1 2 1 1 1 1 1 1 1 1 1
 [38] 2 1 1 2 1 1 1 1 1 1 1 2 1 1 2 1 1 1 1 1 1 1 2 1 2 1 1 3 1 1 1 1 1 1 1 1 2
 [75] 1 1 1 1 1 1 1 2 2 2 1 1 1 1 1 1 1 1 2 1 2 1 1 1 1 1
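The numbers returned by as.numeric() are positions in levels(), so the mapping depends entirely on the level order. A quick sketch for checking which label received which number, with a made-up vector edu standing in for temp:

```r
# Hypothetical stand-in for temp
edu <- factor(c("Graduate", "Masters", "Phd", "Graduate", "Masters"),
              levels = c("Graduate", "Masters", "Phd"), ordered = TRUE)

codes <- as.numeric(edu)

# Cross-tabulate labels against codes to confirm the mapping
table(label = edu, code = codes)

# The same mapping as a named lookup vector
mapping <- setNames(seq_along(levels(edu)), levels(edu))
mapping
```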

Checking the rank correlation coefficient


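# Convert the ordered factor to its integer codes, then run Spearman's test.
# Note: order = TRUE works only through partial matching of factor()'s ordered argument.
cor.test(
    as.numeric(
        factor(main$edu_level, order = TRUE)
    ),
    main$city_dev_idx,
    method = 'spearman'
)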


        Spearman's rank correlation rho

data:  as.numeric(factor(main$edu_level, order = TRUE)) and main$city_dev_idx
S = 558082720, p-value = 0.7612
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho
0.0078525
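
In this output rho is about 0.008 with a p-value of 0.76, so the test gives no evidence of a monotonic association between the two variables. Spearman's rho is the Pearson correlation computed on ranks, which the sketch below checks on simulated data. The data frame df and its columns are made up and unrelated to main. Because an education variable has many ties, exact = FALSE is passed so that cor.test() uses the asymptotic p-value without warning that an exact one cannot be computed.

```r
set.seed(1)  # simulated data, for illustration only

df <- data.frame(
  edu = factor(sample(c("Graduate", "Masters", "Phd"), 50, replace = TRUE),
               levels = c("Graduate", "Masters", "Phd"), ordered = TRUE),
  idx = runif(50)
)

edu_num <- as.numeric(df$edu)

# exact = FALSE: use the asymptotic p-value, since ties rule out an exact one
res <- cor.test(edu_num, df$idx, method = "spearman", exact = FALSE)

# Spearman's rho equals Pearson's r computed on the (average) ranks
rho_by_hand <- cor(rank(edu_num), rank(df$idx))

unname(res$estimate)
rho_by_hand
```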
