How to apply the albu (albumentations) library in mmdetection

ai_lim · May 12, 2022

This guide is based on retinanet_18 (RetinaNet with a ResNet-18 backbone).
My project detects noodles, so I gave the files noodle-related names. Rename them to suit your own project!

  1. Create mmdetection > configs > _base_ > datasets > noodles_detection.py, then set up the albumentations transforms you want.

    • Copy and paste mmdetection/configs/_base_/datasets/coco_detection.py, then add the albu configuration!
    • An albu example is available in mmdetection/configs/albu_example
    • That example is written for mask_rcnn, so I removed everything mask-related from train_pipeline
    # dataset settings
    dataset_type = 'CocoDataset'
    data_root = 'data/coco/'
    img_norm_cfg = dict(
        mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
    albu_train_transforms = [
        dict(
            type='ShiftScaleRotate',
            shift_limit=0.0625,
            scale_limit=0.0,
            rotate_limit=0,
            interpolation=1,
            p=0.5),
        dict(
            type='RandomBrightnessContrast',
            brightness_limit=[0.1, 0.3],
            contrast_limit=[0.1, 0.3],
            p=0.2),
        dict(
            type='OneOf',
            transforms=[
                dict(
                    type='RGBShift',
                    r_shift_limit=10,
                    g_shift_limit=10,
                    b_shift_limit=10,
                    p=1.0),
                dict(
                    type='HueSaturationValue',
                    hue_shift_limit=20,
                    sat_shift_limit=30,
                    val_shift_limit=20,
                    p=1.0)
            ],
            p=0.1),
        dict(type='JpegCompression', quality_lower=85, quality_upper=95, p=0.2),
        dict(type='ChannelShuffle', p=0.1),
        dict(
            type='OneOf',
            transforms=[
                dict(type='Blur', blur_limit=3, p=1.0),
                dict(type='MedianBlur', blur_limit=3, p=1.0)
            ],
            p=0.1),
    ]
    
    train_pipeline = [
        dict(type='LoadImageFromFile'),
        #dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
        dict(type='LoadAnnotations', with_bbox=True),
        dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
        dict(type='Pad', size_divisor=32),
        dict(
            type='Albu',
            transforms=albu_train_transforms,
            bbox_params=dict(
                type='BboxParams',
                format='pascal_voc',
                label_fields=['gt_labels'],
                min_visibility=0.0,
                filter_lost_elements=True),
            keymap={
                'img': 'image',
                # 'gt_masks': 'masks',
                'gt_bboxes': 'bboxes'
            },
            update_pad_shape=False,
            skip_img_without_anno=True),
        dict(type='Normalize', **img_norm_cfg),
        dict(type='DefaultFormatBundle'),
        dict(
            type='Collect',
            #keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks'],
            keys=['img', 'gt_bboxes', 'gt_labels'],
            meta_keys=('filename', 'ori_shape', 'img_shape', 'img_norm_cfg',
                       'pad_shape', 'scale_factor'))
    ]
    
    test_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(
            type='MultiScaleFlipAug',
            img_scale=(1333, 800),
            flip=False,
            transforms=[
                dict(type='Resize', keep_ratio=True),
                dict(type='RandomFlip'),
                dict(type='Normalize', **img_norm_cfg),
                dict(type='Pad', size_divisor=32),
                dict(type='ImageToTensor', keys=['img']),
                dict(type='Collect', keys=['img']),
            ])
    ]
    data = dict(
        samples_per_gpu=2,
        workers_per_gpu=2,
        train=dict(
            type=dataset_type,
            ann_file=data_root + 'annotations/instances_train2017.json',
            img_prefix=data_root + 'train2017/',
            pipeline=train_pipeline),
        val=dict(
            type=dataset_type,
            ann_file=data_root + 'annotations/instances_val2017.json',
            img_prefix=data_root + 'val2017/',
            pipeline=test_pipeline),
        test=dict(
            type=dataset_type,
            ann_file=data_root + 'annotations/instances_val2017.json',
            img_prefix=data_root + 'val2017/',
            pipeline=test_pipeline))
    evaluation = dict(interval=1, metric='bbox')
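
The mask-related edits described above (dropping with_mask, the 'gt_masks' keymap entry, and 'gt_masks' in Collect) can also be sketched as a small helper. This is only an illustration in plain Python: strip_mask_entries is a hypothetical name, not part of mmdetection, and the pipeline below is an abbreviated stand-in for the one in configs/albu_example.

```python
import copy


def strip_mask_entries(pipeline):
    """Return a copy of an mmdetection-style pipeline without mask-related settings.

    Hypothetical helper for illustration; mmdetection does not ship this function.
    """
    cleaned = copy.deepcopy(pipeline)
    for step in cleaned:
        step_type = step.get('type')
        if step_type == 'LoadAnnotations':
            # A bbox-only detector such as RetinaNet does not need masks loaded.
            step.pop('with_mask', None)
        elif step_type == 'Albu':
            # Stop mapping gt_masks into albumentations' 'masks' target.
            step.get('keymap', {}).pop('gt_masks', None)
        elif step_type == 'Collect':
            step['keys'] = [k for k in step.get('keys', []) if k != 'gt_masks']
    return cleaned


# Abbreviated stand-in for the Mask R-CNN pipeline in configs/albu_example.
mask_rcnn_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
    dict(
        type='Albu',
        transforms=[],
        keymap={'img': 'image', 'gt_masks': 'masks', 'gt_bboxes': 'bboxes'}),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
]

bbox_only_pipeline = strip_mask_entries(mask_rcnn_pipeline)
print(bbox_only_pipeline[1])          # {'type': 'LoadAnnotations', 'with_bbox': True}
print(bbox_only_pipeline[3]['keys'])  # ['img', 'gt_bboxes', 'gt_labels']
```

Editing the config file by hand, as done in this post, gives the same result; the helper just makes explicit which three places carry mask settings.
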
  2. Create mmdetection/configs/retinanet/retinanet_r18_fpn_1x_noodles.py.

    • Copy and paste the contents of mmdetection/configs/retinanet/retinanet_r18_fpn_1x_coco.py
    • Change '../_base_/datasets/coco_detection.py' to '../_base_/datasets/noodles_detection.py'
      _base_ = [
          '../_base_/models/retinanet_r50_fpn.py',
          '../_base_/datasets/noodles_detection.py',
          '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
      ]
      
      # model: switch the backbone to ResNet-18 and match the FPN input channels
      model = dict(
          backbone=dict(
              depth=18,
              init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet18')),
          neck=dict(in_channels=[64, 128, 256, 512]))
      # optimizer
      optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
      
      # NOTE: `auto_scale_lr` is for automatically scaling LR,
      # USER SHOULD NOT CHANGE ITS VALUES.
      # base_batch_size = (8 GPUs) x (2 samples per GPU)
      auto_scale_lr = dict(base_batch_size=16)
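
The config above only lists the fields that differ from retinanet_r50_fpn.py because values in a child config are merged into the dicts inherited through _base_. A minimal sketch of that merge idea in plain Python follows. It is a simplified stand-in, not mmcv's actual implementation (for example, it ignores the _delete_ key), and base_model is a hypothetical, heavily abbreviated version of the real base config.

```python
def merge_config(base, override):
    """Recursively merge `override` into a copy of `base`.

    Nested dicts are merged key by key; any other value is replaced.
    Simplified illustration only, not mmcv's real merge logic.
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical, abbreviated stand-in for ../_base_/models/retinanet_r50_fpn.py
base_model = dict(
    type='RetinaNet',
    backbone=dict(type='ResNet', depth=50),
    neck=dict(type='FPN', in_channels=[256, 512, 1024, 2048], out_channels=256))

# The overrides from the R18 config in step 2.
r18_override = dict(
    backbone=dict(depth=18),
    neck=dict(in_channels=[64, 128, 256, 512]))

model = merge_config(base_model, r18_override)
print(model['backbone'])             # {'type': 'ResNet', 'depth': 18}
print(model['neck']['in_channels'])  # [64, 128, 256, 512]
```

The dataset swap in step 2 works differently: the whole entry in the _base_ list is replaced, so every setting in noodles_detection.py takes the place of coco_detection.py.
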

Error I ran into!

AttributeError: module 'albumentations' has no attribute 'BboxParams'

Solution

!pip install -U albumentations
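
To confirm that the upgrade took effect, a quick check like the following may help. It is a minimal sketch: module_has_attr is a hypothetical helper name, and in a notebook the runtime may need a restart before the newly installed version is the one that gets imported.

```python
import importlib


def module_has_attr(module_name, attr_name):
    """Return True/False if the module imports, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return hasattr(module, attr_name)


messages = {
    None: 'albumentations could not be imported',
    False: 'albumentations is installed, but BboxParams is missing (upgrade needed)',
    True: 'albumentations.BboxParams is available',
}
result = module_has_attr('albumentations', 'BboxParams')
print(messages[result])
```
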
    

After installing it, you also need to match the OpenCV version.

!pip uninstall -y opencv-python-headless
!pip install opencv-python-headless==4.1.2.30
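
After pinning the version, it can be worth printing what is actually installed. The sketch below uses only the standard library (Python 3.8+); installed_version is a hypothetical helper name, and the package list is just the set of distributions mentioned in this post plus opencv-python, which is an assumption about what else might be present.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Distributions named in this post, plus opencv-python as a possible duplicate install.
for dist in ('albumentations', 'opencv-python-headless', 'opencv-python'):
    print(dist, '->', installed_version(dist))
```

If both opencv-python and opencv-python-headless show a version, the two can conflict, so it is usually cleaner to keep only one of them.
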
