[ML] Telecom Customer Churn Prediction Model_1

YJ · June 16, 2023

[Zerobase Data Job School]

Correlation matrix

Model design

  • Numeric features: StandardScaler
  • Class imbalance: oversampling (SMOTE)
  • Train:Test size (%) = 80:20
  • GridSearchCV(cv=5)
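The four design choices above can be sketched end to end as follows. This is a minimal illustration, not the post's actual code: the data is synthetic (`make_classification`), `smote_like_oversample` is a hypothetical, simplified stand-in for SMOTE (the original most likely used a library implementation such as imbalanced-learn), and the stratified split plus oversampling only the training split are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler


def smote_like_oversample(X, y, minority_label=1, k=5, seed=0):
    """Simplified SMOTE-style oversampling for a binary target (illustrative only).

    New minority rows are interpolated between a minority sample and one of
    its k nearest minority neighbours until both classes have equal counts.
    """
    rng = np.random.default_rng(seed)
    X_min = X[y == minority_label]
    n_new = int((y != minority_label).sum() - len(X_min))
    if n_new <= 0:
        return X, y
    # Ask for k + 1 neighbours because each point is its own nearest neighbour.
    nn = NearestNeighbors(n_neighbors=min(k + 1, len(X_min))).fit(X_min)
    neighbours = nn.kneighbors(X_min, return_distance=False)[:, 1:]
    base = rng.integers(0, len(X_min), size=n_new)
    picked = neighbours[base, rng.integers(0, neighbours.shape[1], size=n_new)]
    gap = rng.random((n_new, 1))  # interpolation weight in [0, 1)
    synthetic = X_min[base] + gap * (X_min[picked] - X_min[base])
    X_out = np.vstack([X, synthetic])
    y_out = np.concatenate([y, np.full(n_new, minority_label, dtype=y.dtype)])
    return X_out, y_out


# Synthetic, imbalanced stand-in for the churn table (assumption).
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.85, 0.15], random_state=42
)

# 80:20 split; stratification is an assumption, the post does not mention it.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Fit the scaler on the training split only, then apply it to both splits.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Oversample the training split only, so the test set keeps its real class ratio.
X_bal, y_bal = smote_like_oversample(X_train_s, y_train)

grid = GridSearchCV(
    LogisticRegression(max_iter=1000),
    {"C": [0.01, 0.1, 1, 5, 10]},
    cv=5,
    scoring="accuracy",
)
grid.fit(X_bal, y_bal)

print("best params:", grid.best_params_)
print("train ACC:", grid.score(X_bal, y_bal))
print("test ACC:", grid.score(X_test_s, y_test))
```

Oversampling before cross-validation lets synthetic rows leak across folds, so the CV score can be optimistic. Putting the resampler inside a pipeline is the usual remedy when that matters.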

1. LogisticRegression

params = {"C" : [0.01, 0.1, 1, 5, 10]}  # best_params_: 5
# C: inverse of regularization strength (smaller values mean stronger regularization)
  • Train ACC : 0.9566719829877725
  • Test ACC : 0.9499241274658573
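To see what "smaller C means stronger regularization" does in practice, the sketch below fits `LogisticRegression` over the same C grid on synthetic data (an assumption, not the churn dataset) and prints the L2 norm of the coefficients. With the default L2 penalty the norm shrinks as C gets smaller.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for the scaled churn features (assumption).
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X = StandardScaler().fit_transform(X)

coef_norms = {}
for C in [0.01, 0.1, 1, 5, 10]:
    model = LogisticRegression(C=C, max_iter=1000).fit(X, y)
    # Stronger regularization (small C) pulls the coefficients toward zero.
    coef_norms[C] = float(np.linalg.norm(model.coef_))
    print(f"C={C:<5} ||coef||_2 = {coef_norms[C]:.3f}")
```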

Coefficient

2. RandomForestClassifier

params = {
    'max_depth':[6, 8, 10], 'n_estimators':[50, 100, 200],
    'min_samples_leaf':[8, 12, 18], 'min_samples_split':[8, 16, 20] }
# {'max_depth': 10, 'min_samples_leaf': 8, 'min_samples_split': 20, 'n_estimators': 100}
  • Train ACC : 0.9712918660287081
  • Test ACC : 0.9628224582701063

feature importances
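A minimal sketch of how such importances can be read from a fitted forest, using the best parameters reported above. The data is synthetic and the feature names are hypothetical, so the ranking itself says nothing about the churn data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data with hypothetical feature names (assumption).
X, y = make_classification(
    n_samples=1000, n_features=8, n_informative=4, random_state=0
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

rf = RandomForestClassifier(
    n_estimators=100,
    max_depth=10,
    min_samples_leaf=8,
    min_samples_split=20,
    random_state=0,
).fit(X, y)

# feature_importances_ holds impurity-based importances that sum to 1.
ranked = sorted(
    zip(feature_names, rf.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name:<12} {score:.4f}")
```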

3. DecisionTreeClassifier

params = {'max_depth': [6, 8, 10, 12, 16, 20, 24]}  # best_params_: 12
  • Train ACC : 0.9848484848484849
  • Test ACC : 0.9537177541729894
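The gap between train and test accuracy here is wider than for logistic regression, which is typical of deeper trees. The sketch below sweeps the same `max_depth` grid on synthetic data (an assumption) and records both scores, so the trend can be inspected directly.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with some label noise, standing in for the churn table (assumption).
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=5, flip_y=0.05, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

results = {}
for depth in [6, 8, 10, 12, 16, 20, 24]:
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    # Store (train accuracy, test accuracy) for each depth.
    results[depth] = (tree.score(X_train, y_train), tree.score(X_test, y_test))

for depth, (train_acc, test_acc) in results.items():
    print(f"max_depth={depth:<3} train={train_acc:.4f} test={test_acc:.4f}")
```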

feature importances

4. Support Vector Machine (SVM)

params = {"C" : [1, 100, 10, 0.1, 0.01, 0.001]} # best_params_: 100
  • Train ACC : 0.9568048910154173
  • Test ACC : 0.952959028831563
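A sketch of the same C search for an SVM, assuming scikit-learn's `SVC` with its default RBF kernel (the post does not say which kernel was used) and synthetic data. Wrapping the scaler and the model in a `Pipeline` is a design choice made here so that scaling is refit inside each CV fold.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption).
X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

pipe = Pipeline([("scale", StandardScaler()), ("svc", SVC())])
# Parameters of a pipeline step are addressed as "<step name>__<param>".
param_grid = {"svc__C": [1, 100, 10, 0.1, 0.01, 0.001]}

grid = GridSearchCV(pipe, param_grid, cv=5).fit(X_train, y_train)

print("best params:", grid.best_params_)
print("train ACC:", grid.score(X_train, y_train))
print("test ACC:", grid.score(X_test, y_test))
```

When the best value sits at the edge of the grid, as the reported 100 does, widening the grid in that direction is a common follow-up.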

5. XGBoost

params = {
    'max_depth' : [3, 5, 7, 9], 'n_estimators' : [100, 200, 300, 400],
    'learning_rate' : [0.1, 0.2] }
# {n_estimators: 100, learning_rate: 0.2, max_depth: 9}
  • Train ACC : 0.9917597022860181
  • Test ACC : 0.9582701062215478

feature importances

ROC curve
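A minimal sketch of how the points of a ROC curve and its AUC can be computed with scikit-learn. The model and the data are illustrative assumptions, and plotting `fpr` against `tpr` is left out.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption).
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# ROC needs scores, not hard labels: use the positive-class probability.
proba = model.predict_proba(X_test)[:, 1]
fpr, tpr, thresholds = roc_curve(y_test, proba)
auc = roc_auc_score(y_test, proba)

print(f"AUC = {auc:.4f}, curve points = {len(fpr)}")
```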

Model performance evaluation
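The accuracies reported above can be lined up in one place. The snippet below only rearranges the numbers already listed in this post, sorting by test accuracy and adding the train-test gap as a rough overfitting indicator.

```python
# (train ACC, test ACC) as reported in the sections above.
reported = {
    "LogisticRegression": (0.9566719829877725, 0.9499241274658573),
    "RandomForestClassifier": (0.9712918660287081, 0.9628224582701063),
    "DecisionTreeClassifier": (0.9848484848484849, 0.9537177541729894),
    "SVM": (0.9568048910154173, 0.952959028831563),
    "XGBoost": (0.9917597022860181, 0.9582701062215478),
}

# One row per model: (name, train, test, train - test), best test ACC first.
rows = sorted(
    ((name, train, test, train - test) for name, (train, test) in reported.items()),
    key=lambda row: row[2],
    reverse=True,
)

print(f"{'model':<24}{'train':>8}{'test':>8}{'gap':>8}")
for name, train, test, gap in rows:
    print(f"{name:<24}{train:>8.4f}{test:>8.4f}{gap:>8.4f}")

best_test = rows[0][0]
largest_gap = max(rows, key=lambda row: row[3])[0]
print("highest test ACC:", best_test)
print("largest train-test gap:", largest_gap)
```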
