[D&A Staff Deep Learning Study] Week 3, Session 2

권유진 · January 18, 2022


CNN Architecture

ImageNet

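The most widely used dataset for benchmarking image classification models
Consists of more than 20,000 classes and about 14 million images
Used in the ILSVRC (ImageNet Large Scale Visual Recognition Challenge) competition, which works with a 1,000-class subset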

LeNet

Proposed by Professor Yann LeCun and widely regarded as the first CNN model
Has a relatively simple structure because of the computing limits of the 1990s
Takes a 32*32 input and has 2 Convolution Layers, 2 Pooling Layers, and 3 Fully Connected Layers
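
The layer counts above can be sketched in PyTorch roughly as follows. This is a minimal sketch rather than the original implementation: the class name `LeNetSketch` is hypothetical, and the filter counts (6, 16), the 5*5 kernels, the FC widths (120, 84), the Tanh activations, and the average pooling follow the commonly cited LeNet-5 configuration, which this post does not spell out.

```python
import torch
import torch.nn as nn


class LeNetSketch(nn.Module):
    """2 conv + 2 pooling + 3 FC layers for a 1x32x32 input (sizes are assumptions)."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5),   # 32x32 -> 28x28
            nn.Tanh(),
            nn.AvgPool2d(2),                  # 28x28 -> 14x14
            nn.Conv2d(6, 16, kernel_size=5),  # 14x14 -> 10x10
            nn.Tanh(),
            nn.AvgPool2d(2),                  # 10x10 -> 5x5
        )
        self.classifier = nn.Sequential(
            nn.Linear(16 * 5 * 5, 120),
            nn.Tanh(),
            nn.Linear(120, 84),
            nn.Tanh(),
            nn.Linear(84, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)  # keep the batch dimension, flatten the rest
        return self.classifier(x)


lenet_out = LeNetSketch()(torch.randn(4, 1, 32, 32))
print(lenet_out.shape)  # torch.Size([4, 10])
```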

AlexNet

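Structurally similar to LeNet
Takes a 224*224 RGB 3-channel image as input and uses ReLU as the activation function
- ReLU makes training faster than saturating activations such as tanh or sigmoid
Applies Dropout and Data Augmentation
During pooling, the stride is set smaller than the pooling window so that each element takes part in several pooling windows (overlapping pooling)
Performs Local Response Normalization (LRN)
- Normalizes across the neighbors of a strongly activated neuron so that the strong activation stands out more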
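
Overlapping pooling and LRN can be tried out with PyTorch's built-in layers. This is a small sketch under assumptions: the 96*55*55 feature-map size is chosen only for illustration, and the LRN hyperparameters (size 5, alpha 1e-4, beta 0.75, k 2) are the values reported in the AlexNet paper, not something this post specifies.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 96, 55, 55)  # assumed feature-map size, for illustration only

# Overlapping pooling: stride (2) is smaller than the window (3),
# so neighbouring windows share elements.
overlap_pool = nn.MaxPool2d(kernel_size=3, stride=2)

# LRN rescales each activation using the activations of nearby channels.
lrn = nn.LocalResponseNorm(size=5, alpha=1e-4, beta=0.75, k=2.0)

y = overlap_pool(lrn(torch.relu(x)))
print(y.shape)  # torch.Size([1, 96, 27, 27])
```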

ZFNet

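Tunes AlexNet's hyperparameters
Improves performance mainly by changing the Convolution Layers' filter size, stride, and filter counts, while keeping AlexNet's overall layer layout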

VGG

Unlike earlier models, stacks 3*3 Convolution Layers deeply
Called VGG16, VGG19, and so on depending on the depth

torchvision.models.vgg

The module above can create VGG11 through VGG19
- The number after VGG is the number of layers (Conv Layers + FC Layers)
The models are built for a 3*224*224 input (RGB*224*224)
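
The naming rule can be checked by counting layers in the convolution configurations. The sketch below assumes the standard VGG configurations (numbers are output channels of a 3*3 conv, 'M' is max pooling) plus three FC layers; `VGG_CFGS` and `count_weight_layers` are hypothetical names used only for this illustration.

```python
# Standard VGG convolution configurations (assumed; 'M' marks a max-pooling layer).
VGG_CFGS = {
    "VGG11": [64, "M", 128, "M", 256, 256, "M", 512, 512, "M", 512, 512, "M"],
    "VGG13": [64, 64, "M", 128, 128, "M", 256, 256, "M", 512, 512, "M", 512, 512, "M"],
    "VGG16": [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
              512, 512, 512, "M", 512, 512, 512, "M"],
    "VGG19": [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
              512, 512, 512, 512, "M", 512, 512, 512, 512, "M"],
}
NUM_FC_LAYERS = 3  # the classifier head has three fully connected layers


def count_weight_layers(cfg):
    """Count conv layers in the config and add the FC layers."""
    conv_layers = sum(1 for v in cfg if v != "M")
    return conv_layers + NUM_FC_LAYERS


vgg_depths = {name: count_weight_layers(cfg) for name, cfg in VGG_CFGS.items()}
print(vgg_depths)  # {'VGG11': 11, 'VGG13': 13, 'VGG16': 16, 'VGG19': 19}
```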

GoogLeNet(Google + LeNet)

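Also called the Inception model
Introduces the Inception Module into the CNN
- Within one layer, applies several different operations and then merges the resulting Feature Maps
  - Applies several convolutions to the same Feature Map
  - Learns small-scale and relatively large-scale features at the same time
Consists of 9 Inception Modules in total
Uses Global Average Pooling (GAP) in place of the large Fully Connected Layers at the end
- Averages each channel of the last Feature Map into a single value and connects those values to the output
- Greatly reduces the number of parameters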
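
The "several operations, then merge" idea and GAP can be sketched as below. This is an illustrative sketch, not GoogLeNet's full implementation: `InceptionSketch` is a hypothetical name, and the channel numbers in the usage lines are assumed from the first Inception module in the GoogLeNet paper.

```python
import torch
import torch.nn as nn


class InceptionSketch(nn.Module):
    """Four parallel branches over the same input, concatenated on the channel axis."""

    def __init__(self, in_ch, c1, c3_reduce, c3, c5_reduce, c5, pool_proj):
        super().__init__()
        self.branch1 = nn.Sequential(nn.Conv2d(in_ch, c1, 1), nn.ReLU())
        self.branch3 = nn.Sequential(
            nn.Conv2d(in_ch, c3_reduce, 1), nn.ReLU(),          # 1x1 reduces channels first
            nn.Conv2d(c3_reduce, c3, 3, padding=1), nn.ReLU(),  # small-scale features
        )
        self.branch5 = nn.Sequential(
            nn.Conv2d(in_ch, c5_reduce, 1), nn.ReLU(),
            nn.Conv2d(c5_reduce, c5, 5, padding=2), nn.ReLU(),  # larger-scale features
        )
        self.branch_pool = nn.Sequential(
            nn.MaxPool2d(3, stride=1, padding=1),
            nn.Conv2d(in_ch, pool_proj, 1), nn.ReLU(),
        )

    def forward(self, x):
        branches = [self.branch1(x), self.branch3(x), self.branch5(x), self.branch_pool(x)]
        return torch.cat(branches, dim=1)  # padding keeps H and W equal across branches


block = InceptionSketch(192, 64, 96, 128, 16, 32, 32)
inc_out = block(torch.randn(2, 192, 28, 28))
print(inc_out.shape)  # torch.Size([2, 256, 28, 28]) since 64 + 128 + 32 + 32 = 256

# Global Average Pooling: one averaged value per channel, no weights involved.
gap_out = torch.flatten(nn.AdaptiveAvgPool2d(1)(inc_out), 1)
print(gap_out.shape)  # torch.Size([2, 256])
```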

ResNet(Residual Network)

As the network gets deeper, information from the earlier layers gets diluted
Introduces the Residual Block
- Skip Connection: adds an earlier layer's Feature Map to a later layer's Feature Map
- Lets later layers keep using the earlier information
1. BasicBlock: uses Conv Layers (3*3, 64) -> (3*3, 64)
2. Bottleneck: uses Conv Layers (1*1, 64) -> (3*3, 64) -> (1*1, 256)

DenseNet

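An extension of ResNet
Applies Skip Connections across all layers: each layer receives the Feature Maps of every preceding layer
- ResNet applies a Skip Connection only between a layer and the one that follows it
Each layer adds only a fixed number of output channels (the growth rate), which keeps the output size under control and reduces the number of parameters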
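
The role of the growth rate can be seen in a small dense block. This is a minimal sketch under assumptions: `DenseLayerSketch` and `DenseBlockSketch` are hypothetical names, the BN -> ReLU -> 3*3 conv ordering and the sizes (64 input channels, growth rate 32, 4 layers) are chosen for illustration, and the bottleneck and transition layers of the real DenseNet are left out.

```python
import torch
import torch.nn as nn


class DenseLayerSketch(nn.Module):
    """BN -> ReLU -> 3x3 conv that emits `growth_rate` new channels."""

    def __init__(self, in_ch, growth_rate):
        super().__init__()
        self.bn = nn.BatchNorm2d(in_ch)
        self.conv = nn.Conv2d(in_ch, growth_rate, kernel_size=3, padding=1, bias=False)

    def forward(self, x):
        new_features = self.conv(torch.relu(self.bn(x)))
        # Concatenate (not add) so every later layer sees all earlier feature maps.
        return torch.cat([x, new_features], dim=1)


class DenseBlockSketch(nn.Sequential):
    def __init__(self, num_layers, in_ch, growth_rate):
        # Layer i receives the block input plus i * growth_rate accumulated channels.
        layers = [DenseLayerSketch(in_ch + i * growth_rate, growth_rate)
                  for i in range(num_layers)]
        super().__init__(*layers)


dense_block = DenseBlockSketch(num_layers=4, in_ch=64, growth_rate=32)
dense_out = dense_block(torch.randn(2, 64, 16, 16))
print(dense_out.shape)  # torch.Size([2, 192, 16, 16]) since 64 + 4 * 32 = 192
```

Each conv produces only 32 channels no matter how many channels it reads, which is why the parameter count stays modest even though the inputs keep growing.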

+ alpha

Lightweight models such as MobileNet, SqueezeNet, and ShuffleNet appeared for use on mobile devices

Transfer Learning

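When data is scarce, take a deep learning model that has already been trained on other data (a Pre-trained Model) and retrain it (Fine-tuning)
- Typically, the weights of the network in front of the FC Layers are reused and the Fully Connected Layers are designed anew
- Sometimes only the Output Layer is redesigned
- Weight Freezing: the reused weights are kept fixed during training
- The features they extract are used as they are
Initialization view: use the Pre-trained Model's weights as the initial weights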

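Loading only the ResNet structure

import torch.nn as nn
import torchvision.models as models

# pretrained=False loads only the architecture, without trained weights
# (torchvision 0.13+ prefers weights=None for the same effect)
model = models.resnet34(pretrained=False)
num_ftrs = model.fc.in_features # number of input features of the model's Fully-Connected Layer
model.fc = nn.Linear(num_ftrs, 10) # replace the final FC layer with a new 10-class layer
model = model.cuda() # same as model.to(device) with a CUDA device; requires a GPU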

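Loading ResNet and Fine-Tuning

import torch.nn as nn
import torchvision.models as models

model = models.resnet34(pretrained=True) # load the architecture together with the ImageNet pre-trained weights
num_ftrs = model.fc.in_features
model.fc = nn.Linear(num_ftrs, 10) # only this new layer starts from random weights
model = model.cuda()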

VGG code

import torch.nn as nn

class VGG(nn.Module):
    def __init__(self, features, num_classes=1000, init_weights=True):
        super(VGG, self).__init__()
        self.features = features # VGG Convolution Layers
        self.avgpool = nn.AdaptiveAvgPool2d((7,7)) # average-pool to a fixed 7*7 output whatever the incoming size
        
        self.classifier = nn.Sequential( # three Fully-Connected Layers (modules are passed as arguments, not as a list)
            nn.Linear(512*7*7, 4096),
            nn.ReLU(True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(True),
            nn.Dropout(),
            nn.Linear(4096, num_classes)
        )
        if init_weights:
            self._initialize_weights()
    def forward(self, x):
        x = self.features(x) # Convolution Layers
        x = self.avgpool(x) # avgpool
        x = x.view(x.size(0), -1) # flatten to (batch, 512*7*7)
        x = self.classifier(x) # F/C Layers
        return x
    def _initialize_weights(self):
        for m in self.modules(): # visit every submodule of the model
            if isinstance(m, nn.Conv2d): # convolution layer
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu') # He Initialization
                if m.bias is not None: # initialize the bias to 0 when the layer has one
                    nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)
                nn.init.constant_(m.bias, 0)
                
def make_layers(cfg, batch_norm=False):
    layers = []
    in_channels = 3
    for v in cfg:
        if v == 'M': # MaxPooling Layer
            layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
        else: # v is the number of filters of a Conv Layer
            conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1)
            if batch_norm:
                layers += [conv2d, nn.BatchNorm2d(v), nn.ReLU(inplace=True)]
            else:
                layers += [conv2d, nn.ReLU(inplace=True)]
            in_channels = v # update in_channels for the next Conv Layer
    return nn.Sequential(*layers)

Code applying the model above

cfg = {
    'A':[64, 'M', 128, 'M', 256, 256, 'M', 512, 512,'M', 512, 512, 'M']
}

model = VGG(make_layers(cfg['A']))

ResNet code (from 파이썬 딥러닝 파이토치)

import torch.nn as nn

class BasicBlock(nn.Module):
    def __init__(self, in_planes, planes, stride=1):
        super(BasicBlock, self).__init__() # run nn.Module's initializer
        self.conv1 = nn.Conv2d(in_planes, planes, # takes in_planes input channels, outputs planes filters
                               kernel_size=3, stride=stride,
                               padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes) # BatchNorm2d for data with planes channels
        self.conv2 = nn.Conv2d(planes, planes, 
                              kernel_size=3, stride=1, # stride 1: any downsampling was already done by conv1
                              padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        
        self.shortcut = nn.Sequential() # identity when shapes already match
        if stride != 1 or in_planes != planes: # project x so it matches out for the skip-connection
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_planes, planes,
                         kernel_size=1, stride=stride,
                         bias=False),
                nn.BatchNorm2d(planes)
                )
            
    def forward(self, x):
        out = nn.functional.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out += self.shortcut(x) # skip-connection
        out = nn.functional.relu(out)
        return out
        
class ResNet(nn.Module):
    def __init__(self, num_classes=10):
        super(ResNet, self).__init__()
        self.in_planes = 16 # number of channels entering the first block
        self.conv1 = nn.Conv2d(3, 16, # stem conv applied to the color input image (not the conv1 inside BasicBlock)
                              kernel_size=3, stride=1,
                              padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(16)
        self.layer1 = self._make_layer(16, 2, stride=1)
        self.layer2 = self._make_layer(32, 2, stride=2)
        self.layer3 = self._make_layer(64, 2, stride=2)
        self.linear = nn.Linear(64, num_classes)
        
    def _make_layer(self, planes, num_blocks, stride): # builds a stage made of several BasicBlocks
        strides = [stride] + [1]*(num_blocks-1) # only the first block of a stage uses the given stride
        layers = []
        for stride in strides:
            layers.append(BasicBlock(self.in_planes, planes, stride)) # only the first block takes in_planes as input
            self.in_planes = planes # later blocks take planes as input
        return nn.Sequential(*layers)
    
    def forward(self, x):
        out = nn.functional.relu(self.bn1(self.conv1(x)))
        out = self.layer1(out) # two BasicBlocks, 16 channels in and 16 channels out
        out = self.layer2(out) # one 16->32 BasicBlock followed by one 32->32 BasicBlock
        out = self.layer3(out) # one 32->64 BasicBlock followed by one 64->64 BasicBlock
        out = nn.functional.avg_pool2d(out, 8) # average pooling over the feature map
        out = out.view(out.size(0), -1)
        out = self.linear(out)
        return out
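
Why `avg_pool2d(out, 8)` leaves exactly 64 features for `nn.Linear(64, num_classes)` can be traced with the usual output-size formula. The sketch below assumes a 32*32 input (CIFAR-10 style), which this post does not state explicitly; `conv_out_size` is a hypothetical helper.

```python
def conv_out_size(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel_size) // stride + 1


size = 32                                            # assumed input height/width
size = conv_out_size(size, 3, stride=1, padding=1)   # stem conv keeps 32
after_layer1 = conv_out_size(size, 3, 1, 1)          # stage stride 1 keeps 32
after_layer2 = conv_out_size(after_layer1, 3, 2, 1)  # stage stride 2 halves to 16
after_layer3 = conv_out_size(after_layer2, 3, 2, 1)  # stage stride 2 halves to 8
after_pool = conv_out_size(after_layer3, 8, 8, 0)    # avg_pool2d(out, 8) leaves 1
print(after_layer1, after_layer2, after_layer3, after_pool)  # 32 16 8 1
```

With a 1*1 map and 64 channels, flattening gives 64 values per image. A different input resolution would change the final size, so the pooling window or the linear layer would need adjusting.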

ResNet code (from 모두를 위한 딥러닝)

import torch.nn as nn

def conv3x3(in_planes, out_planes, stride=1):
    # 3*3 conv with padding
    return nn.Conv2d(in_planes, out_planes,
                     kernel_size=3, stride=stride,
                     padding=1, bias=False)
def conv1x1(in_planes, out_planes, stride=1):
    # 1*1 conv
    return nn.Conv2d(in_planes, out_planes,
                     kernel_size=1, stride=stride,
                     bias=False)
                     
class BasicBlock(nn.Module):
    expansion = 1
    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(BasicBlock, self).__init__()
        self.conv1 = conv3x3(inplanes, planes, stride)
        self.bn1 = nn.BatchNorm2d(planes)
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = conv3x3(planes, planes)
        self.bn2 = nn.BatchNorm2d(planes)
        self.downsample = downsample
        self.stride = stride 
    
    def forward(self, x):
        identity = x
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)
        
        out = self.conv2(out)
        out = self.bn2(out)
        
        if self.downsample is not None:
            # downsample identity when its shape does not match out
            identity = self.downsample(x)
            
        out += identity
        out = self.relu(out)
        return out
        
class Bottleneck(nn.Module):
    expansion = 4
    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(Bottleneck, self).__init__()
        self.conv1 = conv1x1(inplanes, planes)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = conv3x3(planes, planes, stride)
        self.bn2 = nn.BatchNorm2d(planes)
        self.conv3 = conv1x1(planes, planes * self.expansion) # expand the channels 4x
        self.bn3 = nn.BatchNorm2d(planes * self.expansion)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample
        self.stride = stride
    def forward(self, x):
        identity = x
        
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)
        
        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)
        
        out = self.conv3(out)
        out = self.bn3(out)
        
        if self.downsample is not None:
            identity = self.downsample(x)
            
        out += identity
        out = self.relu(out)
        return out
        
class ResNet(nn.Module):
    def __init__(self, block, layers, num_classes=1000, zero_init_residual=False):
        super(ResNet, self).__init__()
        
        self.inplanes = 64
        
        self.conv1 = nn.Conv2d(3, 64, kernel_size=7,
                              stride=2, padding=3, bias=False)
        self.bn1 = nn.BatchNorm2d(64)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        
        self.layer1 = self._make_layer(block, 64, layers[0])
        self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
        self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
        self.layer4 = self._make_layer(block, 512, layers[3], stride=2)
        self.avgpool = nn.AdaptiveAvgPool2d((1,1))
        self.fc = nn.Linear(512 * block.expansion, num_classes)
        
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)
                
        if zero_init_residual:
            for m in self.modules():
                if isinstance(m, Bottleneck):
                    nn.init.constant_(m.bn3.weight, 0)
                elif isinstance(m, BasicBlock):
                    nn.init.constant_(m.bn2.weight, 0)
                    
    def _make_layer(self, block, planes, blocks, stride=1):
        downsample = None
        if stride != 1 or self.inplanes != planes * block.expansion:
            # project the identity with a 1*1 conv when the spatial size or channel count changes
            downsample = nn.Sequential(
                conv1x1(self.inplanes, planes * block.expansion, stride),
                nn.BatchNorm2d(planes * block.expansion)
            )
        layers = []
        layers.append(block(self.inplanes, planes, stride, downsample))
        self.inplanes = planes * block.expansion
        for _ in range(1, blocks):
            layers.append(block(self.inplanes, planes))
        return nn.Sequential(*layers)
    
    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)
        
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        
        x = self.avgpool(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        return x
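
The `block` and `layers` arguments decide which ResNet variant is built. The sketch below counts weight layers for the standard variants; the per-stage block counts are the well-known ones used for ResNet-18 through ResNet-152 and are not listed in this post, and `RESNET_VARIANTS` and `resnet_depth` are hypothetical names.

```python
# (convs per block, blocks per stage): BasicBlock has 2 convs, Bottleneck has 3.
RESNET_VARIANTS = {
    "resnet18": (2, [2, 2, 2, 2]),
    "resnet34": (2, [3, 4, 6, 3]),
    "resnet50": (3, [3, 4, 6, 3]),
    "resnet101": (3, [3, 4, 23, 3]),
    "resnet152": (3, [3, 8, 36, 3]),
}


def resnet_depth(convs_per_block, layers):
    """Weight layers: block convs plus the stem conv and the final FC layer."""
    return convs_per_block * sum(layers) + 2


resnet_depths = {name: resnet_depth(*spec) for name, spec in RESNET_VARIANTS.items()}
print(resnet_depths)
# {'resnet18': 18, 'resnet34': 34, 'resnet50': 50, 'resnet101': 101, 'resnet152': 152}
```

For example, `ResNet(BasicBlock, [2, 2, 2, 2])` with the classes above would correspond to ResNet-18, and `ResNet(Bottleneck, [3, 4, 6, 3])` to ResNet-50. The 1*1 downsample convolutions on the shortcut path are not counted in the name.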

References
파이썬 딥러닝 파이토치 (Python Deep Learning PyTorch), by 이경택, 방성수, 안상준
모두를 위한 딥러닝 시즌 2 (Deep Learning for Everyone, Season 2), Lab 10-5, 10-6-1, 10-6-2
