머신러닝 Accuracy까지의 과정

suji·2022년 11월 29일

머신러닝

목록 보기

3/14

1. 데이터 셋 얻기

from sklearn.datasets import load_iris
import pandas as pd

iris = load_iris()

2. 데이터를 훈련 / 테스트로 분리

from sklearn.model_selection import train_test_split

features = iris.data  # 네 개 컬럼 다 쓰겠다
labels = iris.target  # 정답(종 0,1,2)

X_train, X_test, y_train, y_test = train_test_split(features, labels, 
                                                    test_size=0.2, stratify=labels, random_state=17)

3. train 데이터 대상 결정나무 모델 만들기

from sklearn.tree import DecisionTreeClassifier

iris_tree = DecisionTreeClassifier(max_depth=2, random_state=17)
iris_tree.fit(X_train, y_train)  # 문제, 답

4. plot_tree 확인

import matplotlib.pyplot as plt
from sklearn.tree import plot_tree   # 트리 plot

plt.figure(figsize=(8,4))
plot_tree(iris_tree);  # 학습시킨 모델

petal 특성만 사용하였다. x[2], x[3]

5. 만들어둔 모델에 테스트 데이터 넣고 모델 테스트 & accuracy 확인

from sklearn.metrics import accuracy_score
            # 모델
y_pred_tr = iris_tree.predict(X_train)
                # 정답        예측값
accuracy_score(y_train, y_pred_tr)

6. 훈련 데이터에 대한 결정경계 확인

from mlxtend.plotting import plot_decision_regions

plt.figure(figsize=(8,5))
plot_decision_regions(X=X_train, y=Y_train, clf=iris_tree, legend=2)
plt.show()   # 복잡하지 않은게 더 좋을지도 모른다!

suji

learning Data Science

이전 포스트

머신러닝 모델링 과정중 random_state의 의미

다음 포스트

머신러닝 Accuracy까지의 과정

머신러닝

1. 데이터 셋 얻기

2. 데이터를 훈련 / 테스트로 분리

3. train 데이터 대상 결정나무 모델 만들기

4. plot_tree 확인

5. 만들어둔 모델에 테스트 데이터 넣고 모델 테스트 & accuracy 확인

6. 훈련 데이터에 대한 결정경계 확인

머신러닝 모델링 과정중 random_state의 의미

Hisplot - seaborn 데이터 시각화

0개의 댓글