0602 Perceptron

이나겸 · June 2, 2022

1. What I learned

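What is AI?

If a third party listening to a conversation between a machine and a person cannot tell which one is the machine, the machine passes the Turing test.
If a computer's responses cannot be distinguished from a human's, the computer can be said to think.
AI means teaching a machine knowledge and experience; the term covers every type of machine that behaves like a person as a result.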

Representation learning

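= an algorithm that finds features
A program is said to learn from experience E on a task T if its performance P on T improves with E.
In other words, the algorithm being trained works out the important features of the data (E) by itself during training.
People learn to recognize words and objects from flashcards, where the answer for each card is given to them: "this card is this."
Representation learning means the algorithm itself detects the important features in the data.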

Deep Learning

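Deep learning is the part of the broad field of AI that is machine learning and, at the same time, a technique for picking out what matters in the data.
One algorithm proved extremely capable in this area: the artificial neural network (ANN).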
ANN > DNN > CNN

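Artificial neural network terminology

Input Layer : the layer where data enters
Hidden Layer : the layers between input and output that pass the data along
Output Layer : the layer that emits the result; for classification it has one value per class, matching the one-hot form of the labels
Units : receive data and decide whether to pass it on to the next layer
Weights : determine the connection strength between units
Activation (Function) : the function a unit uses to decide what signal to send to the next layer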
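
To make the terms above concrete, here is a minimal sketch of data flowing through an input layer, one hidden layer, and an output layer. The function names (`unit_output`, `layer_output`) and every weight and bias value are hypothetical, chosen only for illustration; sigmoid is assumed as the activation.

```python
import math

def sigmoid(z):
    """Activation: squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def unit_output(inputs, weights, bias):
    """One unit: weighted sum of its inputs plus a bias, then the activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)

def layer_output(inputs, weight_rows, biases):
    """A layer is a list of units that all read the same inputs."""
    return [unit_output(inputs, w, b) for w, b in zip(weight_rows, biases)]

# Hypothetical numbers, for illustration only.
x = [1.0, 0.0]                                                     # input layer: 2 values
hidden = layer_output(x, [[0.5, -0.5], [0.3, 0.8]], [0.0, -0.1])   # hidden layer: 2 units
out = layer_output(hidden, [[1.0, -1.0]], [0.0])                   # output layer: 1 unit
print("hidden >> ", hidden)
print("output >> ", out)
```

Each row of `weight_rows` holds the connection strengths into one unit, so a layer with two units reading two inputs has a 2 x 2 set of weights.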

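Perceptron

Multiplies each of a node's n inputs by its own weight (n weights in total) and sums the results.
The activation function then judges the sum and outputs True or False.
There are many activation functions, such as the step function, sigmoid, and ReLU; CrossEntropy, by contrast, is a loss function.
The magnitude of a weight generally indicates how important its input is.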
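
The weighted sum and threshold described above can be written compactly. This is a sketch that assumes a bias term b and a step activation that fires only when the sum is strictly positive; the values in the worked line are illustrative.

```latex
\[
y =
\begin{cases}
1 & \text{if } \sum_{i=1}^{n} w_i x_i + b > 0 \\
0 & \text{otherwise}
\end{cases}
\]
% Worked example with illustrative values x = (1, 1), w = (0.5, 0.5), b = -0.7:
\[
0.5 \cdot 1 + 0.5 \cdot 1 - 0.7 = 0.3 > 0 \quad\Rightarrow\quad y = 1
\]
```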

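Multilayer perceptron

The network described above had hidden layers between input and output; a single-layer perceptron has none.
Single-layer and multilayer perceptrons are distinguished by whether a hidden layer exists.
Single-layer : only an input layer and an output layer
Multilayer : one or more hidden layers between the input and output layers
Harder problems call for more hidden layers in the multilayer perceptron -> deep neural network (DNN)
A DNN is an artificial neural network (ANN) made up of several hidden layers.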
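
One way to see what a hidden layer adds is to build XOR by stacking single-layer gates. The sketch below is an illustration, not a trained model: the helper `perceptron` and the lowercase gate names are hypothetical, and the weights are the ones used for the AND, NAND, and OR gates in section 2. NAND and OR act as the hidden layer, and AND acts as the output layer.

```python
def perceptron(x1, x2, w1, w2, b):
    """Single-layer perceptron with a step activation (fires when the sum is > 0)."""
    return 1 if x1 * w1 + x2 * w2 + b > 0 else 0

def nand_gate(x1, x2):
    return perceptron(x1, x2, -0.5, -0.5, 0.7)

def or_gate(x1, x2):
    return perceptron(x1, x2, 0.5, 0.5, -0.4)

def and_gate(x1, x2):
    return perceptron(x1, x2, 0.5, 0.5, -0.7)

def xor_gate(x1, x2):
    # Hidden layer: two units that both read the raw inputs.
    h1 = nand_gate(x1, x2)
    h2 = or_gate(x1, x2)
    # Output layer: one unit that reads the hidden units.
    return and_gate(h1, h2)

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, "->", xor_gate(x1, x2))
```

The four rows print 0, 1, 1, 0, which is the same target used for the PyTorch model in section 2.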

2. Key points

Implementing gates with a single-layer perceptron

"""
AND, NAND, and OR gates are easy to implement with a single-layer perceptron
1. AND
- output is 1 only when both inputs are 1, otherwise 0
input : x1, x2 / output : y
parameters : w1, w2, b (w = weight, b = bias)
"""

def AND_gate(x1, x2):
    w1 = 0.5
    w2 = 0.5
    b = -0.7
    result = x1 * w1 + x2 * w2 + b
    print("result >> ", result)
    if result <= 0:
        return 0
    else:
        return 1


"""
NAND gate: output is 0 only when both inputs are 1; output is 1 for every other pair
"""

def NAND_gate(x1, x2):
    w1 = -0.5
    w2 = -0.5
    b = 0.7
    result = x1 * w1 + x2 * w2 + b  # single-layer perceptron formula
    if result <= 0:
        return 0
    else:
        return 1

# test = NAND_gate(1, 0)
# print("test >> ", test)

# # [0,0] [0,1] [1,0] [1,1]
# test = AND_gate(0, 1)
# print("test >> ", test)


"""
OR gate: 0, 0 -> 0 / 0, 1 -> 1; output is 1 when at least one input is 1
"""

def OR_gate(x1, x2):
    w1 = 0.5
    w2 = 0.5
    b = -0.4
    result = x1 * w1 + x2 * w2 + b  # single-layer perceptron formula
    if result <= 0:
        return 0
    else:
        return 1

test = OR_gate(1, 1)
print("Test>>", test)
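
The gates above use hand-picked weights. As a rough sketch of why such weights exist for AND, NAND, and OR but not for XOR, the snippet below brute-forces a small grid of candidate values. The helper names (`perceptron`, `find_weights`) and the grid range of -1.0 to 1.0 in steps of 0.1 are assumptions made for this illustration only.

```python
from itertools import product

def perceptron(x1, x2, w1, w2, b):
    """Single-layer perceptron with a step activation (fires when the sum is > 0)."""
    return 1 if x1 * w1 + x2 * w2 + b > 0 else 0

INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
TARGETS = {
    "AND": [0, 0, 0, 1],
    "NAND": [1, 1, 1, 0],
    "OR": [0, 1, 1, 1],
    "XOR": [0, 1, 1, 0],
}

def find_weights(targets):
    """Return the first (w1, w2, b) on the grid that reproduces the targets, else None."""
    grid = [i / 10 for i in range(-10, 11)]  # -1.0, -0.9, ..., 1.0
    for w1, w2, b in product(grid, repeat=3):
        outputs = [perceptron(x1, x2, w1, w2, b) for x1, x2 in INPUTS]
        if outputs == targets:
            return (w1, w2, b)
    return None

for name, targets in TARGETS.items():
    print(name, ">>", find_weights(targets))
```

The search returns a weight triple for AND, NAND, and OR and `None` for XOR, because no single straight line separates the XOR outputs. That is the case the multilayer model in the next part is trained on.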

Implementing a multilayer perceptron with PyTorch

"""
Implementing a multilayer perceptron with PyTorch
"""
import torch
import torch.nn as nn
"""
# Test code to check whether a GPU is available -> Intel CPU + NVIDIA GPU / AMD CPU + NVIDIA GPU / MacBook M1
device = "cuda" if torch.cuda.is_available() else "cpu"
"""

# For those using an M1 Mac
device = torch.device("cpu")
# str_device = str(device)
# print("device info >> ", type(str_device))

# seed
torch.manual_seed(777)

if device.type == "cuda":  # device is a torch.device here, so compare its type
    torch.cuda.manual_seed_all(777)

# Create the data
x = [[0, 0], [0, 1], [1, 0], [1, 1]]  # training data
y = [[0], [1], [1], [0]]              # labels (XOR)

# Convert the data to tensors
x = torch.tensor(x, dtype=torch.float32).to(device)
y = torch.tensor(y, dtype=torch.float32).to(device)

"""
Design the multilayer perceptron
"""
model = nn.Sequential(
    nn.Linear(2, 10, bias=True),
    nn.Sigmoid(),
    nn.Linear(10, 10, bias=True),
    nn.Sigmoid(),
    nn.Linear(10, 10, bias=True),
    nn.Sigmoid(),
    nn.Linear(10, 1, bias=True),
    nn.Sigmoid(),
)

model.to(device)

"""
Loss function: BCELoss(), the cross-entropy function used for binary classification
"""
criterion = torch.nn.BCELoss().to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # if the loss stays near 0.69, try a larger lr such as 1

"""
Write the training loop
"""
for epoch in range(10001):

    optimizer.zero_grad()  # reset the gradients left over from the previous step

    # forward pass
    output = model(x)

    # compute the loss, backpropagate, and update the weights
    loss = criterion(output, y)
    loss.backward()
    optimizer.step()

    # print progress
    if epoch % 100 == 0:
        print(f"epoch >> {epoch} Loss >> {loss.item()}")

"""
Check the trained multilayer perceptron's predictions
"""
with torch.no_grad():
    output = model(x)
    predicted = (output > 0.5).float()
    acc = (predicted == y).float().mean()
    print("model output >> ", output.detach().cpu().numpy())
    print("model prediction >> ", predicted.detach().cpu().numpy())
    print("actual >>> ", y.cpu().numpy())
    print("accuracy (%) >>> ", acc.item()*100)
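
To show what the BCELoss value printed during training measures, here is a minimal pure-Python sketch of binary cross-entropy averaged over the samples (mean is the default reduction of `nn.BCELoss`). The function name `binary_cross_entropy` and the two prediction lists are hypothetical, chosen only for illustration.

```python
import math

def binary_cross_entropy(predictions, targets):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for p, y in zip(predictions, targets):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(predictions)

targets = [0, 1, 1, 0]                    # XOR labels
undecided = [0.5, 0.5, 0.5, 0.5]          # a model that has learned nothing yet
confident = [0.01, 0.99, 0.99, 0.01]      # a model that is confidently correct

print("undecided loss >> ", binary_cross_entropy(undecided, targets))
print("confident loss >> ", binary_cross_entropy(confident, targets))

# Same thresholding rule as the evaluation code: probability above 0.5 counts as class 1.
predicted = [1 if p > 0.5 else 0 for p in confident]
print("predicted >> ", predicted)
```

When every output is 0.5 the loss equals ln 2, about 0.693, so a training log that stays near that value means the model has not yet moved away from guessing.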

Classifying handwritten digits with a multilayer perceptron

"""
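Classifying handwritten digits with a multilayer perceptron
Uses a sample classification dataset that ships with scikit-learn
Images of the handwritten digits 0-9, loaded with load_digits()
Each image is 8 * 8 = 64 pixels
1,797 grayscale images in total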
"""

import matplotlib.pyplot as plt
from sklearn.datasets import load_digits

import torch
import torch.nn as nn
from torch import optim

digits = load_digits()
print("image matrix\n", digits.images[0])
print("target >> ", digits.target[0])
print("total samples >> ", len(digits.images))


# Check only the first 5 sample images
# zip pairs items up, e.g. image = [1, 2, 3, 4], label = [apple, grapefruit, banana, watermelon]

images_and_labels = list(zip(digits.images, digits.target))

# for index, (image, label) in enumerate(images_and_labels[:5]):
#     plt.subplot(2, 5, index+1)
#     plt.axis('off')
#     plt.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
#     plt.title("sample : %i" % label)
# plt.show()

# Create the data
x = digits.data
y = digits.target


# Build the model
model = nn.Sequential(
    nn.Linear(64, 32),  # input: 8*8 = 64 pixels; 32 is half of 64
    nn.ReLU(),
    nn.Linear(32, 16),
    nn.ReLU(),
    nn.Linear(16, 10)   # output: one score per digit 0-9, 10 classes in total
)


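# Move the model and the data (as tensors) to the same device
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)  # without this, a CUDA machine fails with a device mismatch

x = torch.tensor(x, dtype=torch.float32).to(device)
y = torch.tensor(y, dtype=torch.int64).to(device) # CrossEntropyLoss expects class indices as int64 (long)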

# loss function
loss_fn = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters())

loss_list = []
for epoch in range(101):
    optimizer.zero_grad()       # reset the gradients
    output = model(x)
    loss = loss_fn(output, y)   # compute the loss to be minimized
    loss.backward()
    optimizer.step()

    if epoch % 10 == 0:
        print("epoch {:4d}/{} loss : {:.6f}".format(epoch, 100, loss.item()))

    loss_list.append(loss.item()) # loss is a tensor, so store its plain Python value via item()

plt.title("loss")
plt.plot(loss_list)
plt.show()
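
The digit model ends in a plain `nn.Linear(16, 10)` with no softmax because `nn.CrossEntropyLoss` applies log-softmax to the raw scores itself and takes the target as a class index. The pure-Python sketch below walks through that calculation for one sample; the function names and the ten scores are hypothetical values made up for this illustration.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target_index):
    """Negative log-probability assigned to the correct class."""
    return -math.log(softmax(logits)[target_index])

# Hypothetical raw scores for the digits 0-9 from the last Linear layer.
logits = [0.1, 2.5, -1.0, 0.3, 0.0, 0.2, -0.5, 0.4, 0.1, -0.2]

probs = softmax(logits)
predicted_digit = max(range(len(logits)), key=lambda i: logits[i])

print("predicted digit >> ", predicted_digit)
print("loss if the label is 1 >> ", cross_entropy(logits, 1))
print("loss if the label is 2 >> ", cross_entropy(logits, 2))
```

The loss is small when the highest score belongs to the labelled class and large otherwise, which is what the falling curve in the plot above reflects.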