Convolutional Neural Networks (CNN) Basics

전수향 · May 15, 2023

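1. Types of images

Images handled in computer vision fall broadly into two categories: true color images and greyscale images. A true color image uses three channels, RGB (Red, Green, Blue), with 8 bits per channel for 24-bit color. In other words, each pixel can take any one of 16,777,216 (2^24) colors. A greyscale image has a single channel, and each pixel takes one of 256 brightness levels ranging from black to white.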
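The color counts above follow directly from the bit depth. The short sketch below checks the arithmetic; the helper names `color_count` and `pack_rgb` are made up for illustration and do not come from any library.

```python
def color_count(bits_per_channel, channels):
    """Number of distinct values a pixel can take for a given bit depth."""
    return 2 ** (bits_per_channel * channels)


def pack_rgb(r, g, b):
    """Pack three 8-bit channel values into one 24-bit integer."""
    return (r << 16) | (g << 8) | b


true_color = color_count(8, 3)   # three 8-bit channels
greyscale = color_count(8, 1)    # one 8-bit channel
white = pack_rgb(255, 255, 255)  # largest 24-bit value

print(true_color, greyscale, white)
```

The largest packed value is one less than the number of colors because the values start at 0.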

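2. Why use a CNN

CNNs are used in many areas, including image classification, object detection, and face recognition. This is because each layer of a CNN extracts features from the image, and the network classifies the image based on those features. Because the features are learned from data rather than designed by hand, CNNs can reach high classification accuracy across many kinds of images.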

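3. How the convolution layer works

One of the most important layers in a CNN is the convolution layer. It convolves the input image with learned filters to produce feature maps. Each filter is a small matrix of weights that covers a small region of the image; it slides across the image at a fixed interval (the stride) and computes a weighted sum at each position. The resulting feature maps are passed to the next layer.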
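To make the sliding-window idea concrete, here is a minimal pure-Python sketch of a single-channel convolution without padding. It is an illustration only: the function name `conv2d` and the hand-picked `edge_kernel` are assumptions for this example, while a real layer learns its filter weights and also adds a bias and an activation.

```python
def conv2d(image, kernel, stride=1):
    """Slide the kernel over the image and take a weighted sum at each
    position (no padding, no kernel flip, as CNN layers compute it)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(image) - kh) // stride + 1
    out_w = (len(image[0]) - kw) // stride + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(total)
        out.append(row)
    return out


# 4x4 image: dark on the left half, bright on the right half.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked 2x2 filter that responds to a dark-to-bright vertical edge.
edge_kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, edge_kernel)
for row in feature_map:
    print(row)
```

The feature map is 3x3, and only the middle column is non-zero, which is where the window straddles the edge in the image.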

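4. How the pooling layer works

A pooling layer reduces the size of the feature maps produced by a convolution layer. This cuts the amount of computation and helps reduce overfitting. The most common variant, max pooling, divides the input feature map into regions of a fixed size and keeps the maximum value from each region.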
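Max pooling can be sketched in a few lines of plain Python. The function name `max_pool2d` and the sample values are made up for this example; the stride is assumed to equal the pool size, so the regions do not overlap.

```python
def max_pool2d(feature_map, pool=2):
    """Split the feature map into pool x pool regions and keep each maximum."""
    out = []
    for i in range(0, len(feature_map) - pool + 1, pool):
        row = []
        for j in range(0, len(feature_map[0]) - pool + 1, pool):
            window = [
                feature_map[i + di][j + dj]
                for di in range(pool)
                for dj in range(pool)
            ]
            row.append(max(window))
        out.append(row)
    return out


fm = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]

pooled = max_pool2d(fm)
print(pooled)  # [[4, 2], [2, 8]]
```

A 4x4 input becomes 2x2, so each spatial dimension is halved while the strongest response in each region is kept.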

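5. What is Flatten?

Flatten converts a multi-dimensional array into a one-dimensional array. It is used to feed the feature maps into a fully connected layer. In the fully connected layer that follows Flatten, each neuron is connected to every value from the previous layer.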
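As a rough illustration, the sketch below flattens a small nested list and feeds it to a single fully connected neuron. The names `flatten` and `dense_neuron` and the all-ones weights are assumptions chosen for this example, not part of any library.

```python
def flatten(tensor):
    """Recursively unroll a nested list into a flat 1-D list."""
    if not isinstance(tensor, list):
        return [tensor]
    flat = []
    for item in tensor:
        flat.extend(flatten(item))
    return flat


def dense_neuron(inputs, weights, bias):
    """One fully connected neuron: a weighted sum over ALL inputs plus a bias."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias


# A 2 x 2 x 2 stack of values standing in for pooled feature maps.
feature_maps = [
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
]

vector = flatten(feature_maps)
print(vector)  # [1, 2, 3, 4, 5, 6, 7, 8]

# Illustrative weights: one weight per flattened value.
output = dense_neuron(vector, [1] * len(vector), 0)
print(output)  # 36
```

The neuron needs one weight for each of the 8 flattened values, which is what "connected to every value" means in practice.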

6. Writing a CNN example

from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from keras.models import Sequential

model = Sequential()

# Convolution layer
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 3)))

# Pooling layer
model.add(MaxPooling2D(pool_size=(2, 2)))

# Flatten layer
model.add(Flatten())

# Fully Connected layer
model.add(Dense(units=128, activation='relu'))
model.add(Dense(units=1, activation='sigmoid'))

# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
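The layer sizes in the small model above can be checked by hand. The sketch below is plain-Python arithmetic, not Keras code; the helper names are made up for this example. It assumes the Keras defaults for this model: no padding and stride 1 in Conv2D, and a pooling stride equal to the pool size.

```python
def conv_output_size(size, kernel, stride=1):
    """Output width/height of a convolution without padding."""
    return (size - kernel) // stride + 1


def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(inputs, units):
    """One weight per input per unit, plus one bias per unit."""
    return inputs * units + units


side = conv_output_size(64, 3)  # 64x64 input, 3x3 filter -> 62x62
pooled_side = side // 2         # 2x2 max pooling -> 31x31
flat = pooled_side * pooled_side * 32  # 32 feature maps flattened

total = (
    conv_params(3, 3, 32)
    + dense_params(flat, 128)
    + dense_params(128, 1)
)

print(side, pooled_side, flat)  # 62 31 30752
print(total)                    # 3937409
```

Almost all of the parameters sit in the first Dense layer, because Flatten hands it 30,752 values.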

7. CNN application example

CIFAR-10 image classification
CIFAR-10 is a dataset of 32x32 color images divided into 10 classes. Let's use it to build a CNN model that solves an image classification problem.

from keras.datasets import cifar10
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

# Load the dataset
(train_images, train_labels), (test_images, test_labels) = cifar10.load_data()

# Preprocess the data
train_images = train_images.astype('float32') / 255
test_images = test_images.astype('float32') / 255
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

# Define the model
model = Sequential()
model.add(Conv2D(32, (3, 3), padding='same', activation='relu', input_shape=(32, 32, 3)))
model.add(Conv2D(32, (3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))

# Compile the model
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
model.fit(train_images, train_labels, epochs=50, batch_size=64)

# Evaluate the model
test_loss, test_acc = model.evaluate(test_images, test_labels)
print('Test accuracy:', test_acc)

The code above loads the CIFAR-10 dataset and builds a model from convolution, pooling, flatten, and dense layers. The model has 4 convolution layers and 2 dense layers. Dropout is used to reduce overfitting.
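For readers curious about what Dropout does during training, here is a minimal sketch in plain Python. It is not the Keras implementation: the function name `dropout` is an assumption for this example. It zeroes each value with probability `rate` and scales the surviving values by 1 / (1 - rate), so the expected sum stays the same.

```python
import random


def dropout(values, rate, rng):
    """Zero each value with probability `rate`; scale the rest by 1 / (1 - rate)."""
    keep = 1.0 - rate
    return [v / keep if rng.random() >= rate else 0.0 for v in values]


rng = random.Random(0)  # fixed seed so the example is repeatable
activations = [1.0] * 8

dropped = dropout(activations, 0.5, rng)
print(dropped)  # each entry is either 0.0 or 2.0

# With rate 0 nothing is dropped, which mirrors inference-time behavior.
unchanged = dropout(activations, 0.0, rng)
print(unchanged)
```

Because a different random subset of values is silenced on every training step, the network cannot rely on any single activation, which is why dropout tends to reduce overfitting.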

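After training, the model is evaluated on the test set. After 50 epochs, this model reached a test accuracy of about 79%. CIFAR-10 is one of the standard benchmark datasets for image classification, and the same kind of CNN model can be adapted to other image classification tasks.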
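The preprocessing step in the CIFAR-10 listing does two things: it scales pixel values into the range 0 to 1, and it turns integer class labels into one-hot vectors to match the 10-unit softmax output. The sketch below mimics both steps in plain Python; `scale_pixels` and `one_hot` are made-up helper names, not Keras functions.

```python
def scale_pixels(pixels):
    """Map 8-bit pixel values (0..255) into the range 0.0..1.0."""
    return [p / 255 for p in pixels]


def one_hot(label, num_classes):
    """Return a vector of zeros with a single 1.0 at the label's index."""
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec


scaled = scale_pixels([0, 51, 255])
encoded = one_hot(3, 10)

print(scaled)   # [0.0, 0.2, 1.0]
print(encoded)  # 1.0 at index 3, zeros elsewhere
```

The one-hot form is what `categorical_crossentropy` expects as the target for each image.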

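Conclusion

CNNs are a powerful deep learning model used in areas such as image classification, object detection, and face recognition. This post covered the basic concepts and working principles of CNNs along with a simple code example, and then implemented a CNN model that classifies CIFAR-10 images. I look forward to seeing CNNs applied in many more areas.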
