Machine Learning - Basics

화이팅 · February 19, 2023

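Machine Learning

Goal: discover general patterns

Set X (explanatory variables) and y (target variable)
Split into training set (training data) & validation set (validation data); the code names the held-out part the test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=13)
-> train set 70%, test set 30%
-> random_state can be any integer; it only fixes the shuffle so the split is reproducible
Build a pipeline
Pass the scaler and the model as (name, estimator) tuples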

# Scaler() and Model() are placeholders for the chosen scaler and model
pipe_list = [('scaler', Scaler()), ('model', Model())]
pipe_model = Pipeline(pipe_list)
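
To make the split and pipeline steps concrete, here is a minimal sketch. The synthetic data from make_classification, StandardScaler, and LogisticRegression are my own stand-ins (the notes above do not say which scaler or model to use), so treat them as assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic data stands in for the real X (features) and y (target).
X, y = make_classification(n_samples=200, n_features=5, random_state=13)

# 70% train / 30% test; random_state only fixes the shuffle for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=13
)

# Each step is a (name, estimator) tuple; the last step is the model.
# Assumption: StandardScaler and LogisticRegression are illustrative choices.
pipe_list = [("scaler", StandardScaler()), ("model", LogisticRegression(max_iter=1000))]
pipe_model = Pipeline(pipe_list)

# Fitting the pipeline fits the scaler on the training data only,
# then trains the model on the scaled features.
pipe_model.fit(X_train, y_train)

print(X_train.shape, X_test.shape)
test_accuracy = pipe_model.score(X_test, y_test)
print(test_accuracy)
```

Putting the scaler inside the pipeline means it is never fitted on the test rows, which keeps the test score honest.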

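Hyperparameter tuning
Hyperparameters: set directly by the user -> improves algorithm performance
With GridSearchCV, cross-validation and the search for the best hyperparameters can be done at the same time

scoring = the performance metric used to compare candidates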
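
A rough sketch of the tuning step with GridSearchCV follows. The parameter grid, cv=5, and scoring="f1" are values I picked for illustration, and the data, scaler, and model are the same kind of stand-ins as before, not something the original notes specify.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic binary-classification data in place of the real dataset.
X, y = make_classification(n_samples=200, n_features=5, random_state=13)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=13
)

pipe_model = Pipeline(
    [("scaler", StandardScaler()), ("model", LogisticRegression(max_iter=1000))]
)

# Keys follow '<step name>__<parameter>' so the grid reaches into the pipeline.
# Assumption: this particular grid is only an example.
param_grid = {"model__C": [0.01, 0.1, 1.0, 10.0]}

# cv=5 runs 5-fold cross-validation for every candidate;
# scoring picks the metric used to rank them.
grid_model = GridSearchCV(pipe_model, param_grid=param_grid, cv=5, scoring="f1")
grid_model.fit(X_train, y_train)

print(grid_model.best_params_)  # best hyperparameter combination
print(grid_model.best_score_)   # its mean cross-validated score
```

The search only ever sees the training split, so the test split stays untouched for the final evaluation.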

Select best_model

best_model = grid_model.best_estimator_

Compute predictions for model evaluation

Y_train_pred = best_model.predict(X_train)
Y_test_pred = best_model.predict(X_test)

Evaluate model performance

1) Training performance
print(classification_report(y_train, Y_train_pred))

2) Generalization performance
print(classification_report(y_test, Y_test_pred))
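
The two reports are easier to compare when the numbers sit side by side. The sketch below is one way to do that, assuming a fitted pipeline as a stand-in for grid_model.best_estimator_ and synthetic data in place of the real dataset; output_dict=True is a real option of classification_report that returns the report as a dictionary.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic data and a plain fitted pipeline stand in for
# the real dataset and for grid_model.best_estimator_.
X, y = make_classification(n_samples=200, n_features=5, random_state=13)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=13
)
best_model = Pipeline(
    [("scaler", StandardScaler()), ("model", LogisticRegression(max_iter=1000))]
).fit(X_train, y_train)

y_train_pred = best_model.predict(X_train)
y_test_pred = best_model.predict(X_test)

# output_dict=True returns the report as a nested dict instead of a string.
train_report = classification_report(y_train, y_train_pred, output_dict=True)
test_report = classification_report(y_test, y_test_pred, output_dict=True)

# A large gap between the two scores hints at overfitting.
print(f"train accuracy: {train_report['accuracy']:.3f}")
print(f"test accuracy:  {test_report['accuracy']:.3f}")
```

If the training score is much higher than the test score, the model has likely memorized the training data rather than found a general pattern.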

Source: https://bigdaheta.tistory.com/54

