[Deep Learning] CNN #1

SSOYEONG · April 15, 2022

Deep Learning

Convolution

Convolution f * h

Step 1. Flip the function h.
Step 2. Cross-multiply and sum the nonzero overlap terms.
Step 3. Slide h to the right by one position.
Step 4. Repeat Steps 2 and 3 until h no longer overlaps f; beyond that point every output value is zero.
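
The four steps above can be sketched in plain Python. This is only a minimal sketch: the function name `convolve_1d` and the sample sequences are illustrative assumptions, and f and h are assumed to be finite lists.

```python
def convolve_1d(f, h):
    """Full discrete convolution f * h of two finite sequences."""
    flipped = h[::-1]                    # Step 1: flip h
    n_out = len(f) + len(h) - 1          # number of shifts where f and h overlap
    out = []
    for shift in range(n_out):           # Steps 3-4: slide one position at a time
        total = 0
        for k, hv in enumerate(flipped):
            i = shift - (len(h) - 1) + k  # index of f aligned with flipped[k]
            if 0 <= i < len(f):          # Step 2: only overlapping terms contribute
                total += f[i] * hv
        out.append(total)
    return out


result_1d = convolve_1d([1, 2, 3], [1, 2])
print(result_1d)  # [1, 4, 7, 6]
```

An asymmetric h such as [1, 2] makes the effect of the flip visible: without Step 1 the result would come out differently.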

2D Convolution

The result of a 2D convolution is itself two-dimensional.

Step 1. Flip the function g along both axes (horizontally and vertically).

Step 2. Cross-multiply and sum
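
A small sketch of these two steps, assuming the kernel g is only evaluated where it fits entirely inside the input (often called "valid" mode). The names `flip_2d` and `convolve_2d` and the sample values are made up for illustration.

```python
def flip_2d(g):
    """Step 1: flip the kernel along both axes."""
    return [row[::-1] for row in g[::-1]]


def convolve_2d(f, g):
    """Step 2: slide the flipped kernel over f, multiplying and summing."""
    k = flip_2d(g)
    kh, kw = len(k), len(k[0])
    oh, ow = len(f) - kh + 1, len(f[0]) - kw + 1   # output is also 2D
    return [[sum(f[y + i][x + j] * k[i][j]
                 for i in range(kh) for j in range(kw))
             for x in range(ow)]
            for y in range(oh)]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
result_2d = convolve_2d(image, kernel)
print(result_2d)  # [[4, 4], [4, 4]]
```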

Meaning of Convolution

The result of a convolution == the result of applying a specific filter to the input image.
Convolution can be used to extract the strength of a feature.
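
To make "strength of a feature" concrete, here is a rough sketch with a hypothetical 1x2 difference filter that responds to vertical edges. The helper `apply_filter` and the toy image are assumptions for illustration, and the kernel is applied without flipping, as CNN implementations commonly do.

```python
def apply_filter(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[y + i][x + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for x in range(ow)]
            for y in range(oh)]


# Toy image: dark on the left, bright on the right (a vertical edge).
toy_image = [[0, 0, 1, 1],
             [0, 0, 1, 1],
             [0, 0, 1, 1]]
edge_filter = [[-1, 1]]   # hypothetical filter: right pixel minus left pixel

response = apply_filter(toy_image, edge_filter)
print(response)  # [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
```

The output is large only where the edge sits and zero in the flat regions, so each output value can be read as how strongly the feature is present at that location.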

Convolutional Neural Network

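A CNN requires far fewer parameters than a fully connected (FC) network because it uses Parameter Sharing.
FC requires too much computation, which makes it ill-suited to image processing.
In FC, each node is connected to every node of the adjacent layer.
In a CNN, each node is connected only to a specific region of the input, so fewer parameters are needed.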
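
A quick back-of-the-envelope comparison, as a sketch. The 3x64x64 input matches the example used later in this post; the 16 output channels, the 3x3 kernel, and the helper names are assumptions chosen only for illustration.

```python
def fc_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def conv_params(c_in, c_out, k):
    """Weights plus biases of a conv layer with c_out kernels of size c_in x k x k."""
    return c_out * (c_in * k * k) + c_out


n_in = 3 * 64 * 64       # flattened 3x64x64 input
n_out = 16 * 64 * 64     # assumed output: 16 feature maps of size 64x64

fc_count = fc_params(n_in, n_out)
conv_count = conv_params(c_in=3, c_out=16, k=3)
print(fc_count)    # 805371904
print(conv_count)  # 448
```

The conv layer's count does not depend on the image size at all, because the same kernels are reused at every position.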

What is Parameter Sharing?

Parameter Sharing means that the same weight matrix acts on all the neurons in a particular feature map. The same filter is applied in different regions of the image.
The only parameters to be trained are those of the filter.
-> The filter is moved across the image, computing the feature map (output) in each region.
-> In other words, every region shares the same filter.

Simple architecture of CNN

Conv layer

Changes the number of channels (depth). Controls the number of feature maps.

Pool layer

Keeps the number of feature maps and controls the width (and height). Makes the image smaller.

FC layer

Converts 3D to 1D.

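Why does the number of channels change after a conv layer?

Suppose the input is 3x64x64 and the kernel is 3x3.
How must the kernel be applied so that the output is 2D?
The kernel also needs 3 slices (one per input channel, i.e. 3x3x3) to match the 3x64x64 input; the per-channel products are summed together, so the output is 2D.
The number of output channels of a conv layer equals the number of kernels.
The number of output channels is independent of the number of input channels.
Try the calculation: https://cs231n.github.io/convolutional-networks/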
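
The channel bookkeeping can be checked with a tiny sketch. A 3x4x4 input is used instead of 3x64x64 to keep it readable; the function name `conv_layer`, the all-ones values, and the choice of 5 kernels are assumptions.

```python
def conv_layer(x, kernels):
    """x: [C_in][H][W]; kernels: [C_out][C_in][K][K].

    Returns [C_out][H-K+1][W-K+1] (no padding, stride 1, no flipping).
    """
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    out = []
    for kernel in kernels:               # one 2D feature map per kernel
        k = len(kernel[0])
        fmap = [[sum(x[c][row + i][col + j] * kernel[c][i][j]
                     for c in range(c_in)        # sum across input channels
                     for i in range(k) for j in range(k))
                 for col in range(w - k + 1)]
                for row in range(h - k + 1)]
        out.append(fmap)
    return out


x = [[[1] * 4 for _ in range(4)] for _ in range(3)]            # 3x4x4 input
kernels = [[[[1] * 3 for _ in range(3)] for _ in range(3)]     # 5 kernels, each 3x3x3
           for _ in range(5)]

y = conv_layer(x, kernels)
print(len(y), len(y[0]), len(y[0][0]))  # 5 2 2
```

Each kernel spans all 3 input channels and collapses them into one 2D map, so 5 kernels give 5 output channels regardless of the input depth.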

Padding

Problem: after passing through a conv layer, the output is smaller than the input.
Solution: add padding around the border.
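
A minimal sketch of zero padding, assuming the border is filled with zeros (other fill values are possible). The helper name `zero_pad` is hypothetical.

```python
def zero_pad(image, p):
    """Surround a 2D list with p rows/columns of zeros on every side."""
    width = len(image[0]) + 2 * p
    top_bottom = [[0] * width for _ in range(p)]
    middle = [[0] * p + list(row) + [0] * p for row in image]
    return top_bottom + middle + [[0] * width for _ in range(p)]


padded = zero_pad([[1, 2],
                   [3, 4]], 1)
for row in padded:
    print(row)
# [0, 0, 0, 0]
# [0, 1, 2, 0]
# [0, 3, 4, 0]
# [0, 0, 0, 0]
```

With a 3x3 kernel, padding of 1 adds back exactly the two rows and columns the convolution removes, so the output keeps the input's spatial size.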

Receptive field

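For a single conv layer, it equals the kernel (filter) size.
The spatial size of the input region that affects an output value.
Early layers of a CNN look only at a narrow range, i.e. the features of a specific region;
as the layers get deeper, the receptive field grows and a larger range is covered.
This yields higher-level, more global information.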

Two ways to increase the receptive field

Increase the kernel size -> drawback: more computation
Decrease the image (or input) size -> preferred
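
The effect of both options can be estimated with the usual receptive-field recurrence. This is a sketch assuming square kernels and a simple chain of layers; the function name and the layer lists are made up.

```python
def receptive_field(layers):
    """layers: list of (kernel_size, stride) pairs, from input to output.

    Returns how many input pixels (per axis) one final output value sees.
    """
    rf, jump = 1, 1           # jump = distance in input pixels between neighbouring units
    for k, s in layers:
        rf += (k - 1) * jump  # each layer widens the view by (k - 1) steps of size jump
        jump *= s             # downsampling makes every later step larger
    return rf


print(receptive_field([(3, 1), (3, 1)]))           # 5
print(receptive_field([(7, 1)]))                   # 7
print(receptive_field([(3, 1), (2, 2), (3, 1)]))   # 8
```

Putting a stride-2 downsampling step between two 3x3 convs gives a receptive field of 8, larger than a single 7x7 kernel, while every kernel stays small.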

Two ways to reduce the image size: Stride and Pooling

Stride

Stride is the step size of the convolution operation,
i.e. how far the kernel moves each time.
If the stride is 2, the output size is roughly halved in each spatial dimension.
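
The output size for a given stride and padding can be computed as follows. This sketch uses the standard formula (N + 2P - K) / S + 1 with floor division; the function name and the example sizes are assumptions.

```python
def conv_output_size(n, k, stride=1, padding=0):
    """Spatial output size for input size n and kernel size k."""
    return (n + 2 * padding - k) // stride + 1


print(conv_output_size(64, 3))                       # 62: shrinks without padding
print(conv_output_size(64, 3, padding=1))            # 64: padding keeps the size
print(conv_output_size(64, 3, stride=2, padding=1))  # 32: stride 2 halves it
```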

Pooling

Choose a representative value (e.g. the maximum or the average) from each region.
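
As a sketch, here is max pooling, where the representative value is assumed to be the maximum of each 2x2 window. The function name and the sample feature map are illustrative.

```python
def max_pool_2d(fmap, size=2, stride=2):
    """Keep the largest value of each size x size window."""
    oh = (len(fmap) - size) // stride + 1
    ow = (len(fmap[0]) - size) // stride + 1
    return [[max(fmap[y * stride + i][x * stride + j]
                 for i in range(size) for j in range(size))
             for x in range(ow)]
            for y in range(oh)]


fmap = [[1, 3, 2, 0],
        [4, 6, 1, 1],
        [0, 2, 9, 5],
        [1, 1, 3, 7]]
pooled = max_pool_2d(fmap)
print(pooled)  # [[6, 2], [2, 9]]
```

The 4x4 map becomes 2x2, and the number of channels is untouched because pooling is applied to each feature map separately.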

Fully connected layer

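A fully connected layer sits at the end of the network.
The layers before FC are the feature extraction layers; the FC layers are called the classification layers.
When connecting the convolutional part to FC, a flatten (or linear) operation is performed.
The output of the last conv/pool layer is converted to 1D.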
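
A rough sketch of the hand-off from feature maps to the classifier. The tiny 2x2x2 volume, the weights, and the helper names `flatten` and `linear` are all made up for illustration.

```python
def flatten(volume):
    """Turn a [C][H][W] volume into a 1D list."""
    return [v for channel in volume for row in channel for v in row]


def linear(x, weights, bias):
    """Fully connected layer: one weighted sum over all inputs per output node."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


volume = [[[1, 2], [3, 4]],
          [[5, 6], [7, 8]]]           # 2 channels of size 2x2
flat = flatten(volume)
print(flat)                           # [1, 2, 3, 4, 5, 6, 7, 8]

weights = [[1] * 8, [0] * 8]          # 2 output nodes, each connected to all 8 inputs
bias = [0, 1]
scores = linear(flat, weights, bias)
print(scores)                         # [36, 1]
```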

1x1 Convolution

A 1x1 convolution uses a filter of size 1x1.
Controls the number of channels | Reduces computation | Adds non-linearity

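It reduces computational complexity.
With less computation, more features can be trained.
Because an activation usually follows each 1x1 conv, there are more nonlinear operations, which can improve performance.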
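
The saving in computation can be estimated by counting multiplications. This is a sketch with assumed sizes (a 28x28 feature map, 192 input channels, 32 output channels, a 5x5 conv, and a 16-channel 1x1 bottleneck); none of these numbers come from this post.

```python
def conv_mults(h, w, c_in, c_out, k):
    """Multiplications for a k x k conv that produces an h x w x c_out output."""
    return h * w * c_out * c_in * k * k


# Direct 5x5 conv: 192 -> 32 channels.
direct = conv_mults(28, 28, 192, 32, 5)

# 1x1 conv first shrinks 192 -> 16 channels, then 5x5 conv: 16 -> 32 channels.
bottleneck = conv_mults(28, 28, 192, 16, 1) + conv_mults(28, 28, 16, 32, 5)

print(direct)      # 120422400
print(bottleneck)  # 12443648
```

Under these assumptions the 1x1 bottleneck needs roughly a tenth of the multiplications, since the expensive 5x5 conv now runs on 16 channels instead of 192.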

📌 Note

Why CNNs are well suited to images, and why they need fewer parameters
The role of 1x1 conv

References

https://hwiyong.tistory.com/45

https://velog.io/@dldydldy75/CNN-1x1-Convolution-%EA%B3%BC-%EC%BB%B4%ED%93%A8%ED%84%B0-%EB%B9%84%EC%A0%84