Learning YOLOv8 by Using It

RHJ · March 9, 2023

YOLOv8 Python Docs

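I have used the YOLO series for a year and a half as an image-processing developer, but I only ever wrote things up in our company GitLab and never on my own GitHub or blog. I am writing this post partly to make up for that.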
git: https://github.com/RHEEJEONGJIN/velog_yolov8_tutorial

Installing YOLOv8

pip install ultralytics

Running pip install ultralytics also installs the other packages that YOLOv8 depends on.
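To confirm that the install worked, one option is to ask Python's packaging metadata for the installed version. This is only a minimal sketch; the helper name installed_version is my own and is not part of ultralytics.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if the pip install succeeded, otherwise None.
print("ultralytics:", installed_version("ultralytics"))
```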

Importing packages and libraries

from ultralytics import YOLO
import cv2
import os
import urllib.request

First, import cv2 and the ultralytics package, along with os and urllib for the download step.

Saving the image

# download image
os.makedirs('data', exist_ok=True)
urllib.request.urlretrieve("https://ultralytics.com/images/bus.jpg", 'data/bus.jpg')

Download the sample image used in the ultralytics GitHub repository.
The urllib package can handle the download:
urllib.request.urlretrieve("URL", 'path to save to')
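If the script is run repeatedly, it may be worth skipping the download when the file is already there. The sketch below wraps urlretrieve for that purpose; download_if_missing is a hypothetical helper name of mine, not something from urllib or ultralytics.

```python
import os
import urllib.request


def download_if_missing(url: str, dest: str) -> str:
    """Download `url` to `dest` unless the file already exists; return `dest`."""
    folder = os.path.dirname(dest)
    if folder:
        # Create the target folder (e.g. data/) if it does not exist yet.
        os.makedirs(folder, exist_ok=True)
    if not os.path.exists(dest):
        urllib.request.urlretrieve(url, dest)
    return dest


# Usage (needs network access):
# download_if_missing("https://ultralytics.com/images/bus.jpg", "data/bus.jpg")
```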

Loading the YOLOv8 model

# load model
model = YOLO("./pretrained/yolov8s.pt")

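The biggest change from YOLOv5 to YOLOv8 is that it is now distributed as a package.
You can still git clone the ultralytics repository and use it the old way,
but in keeping with that trend I will skip git clone and work only with the pip package, which is easier to maintain.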

Preparing the yaml files

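Customizing requires two files: a dataset yaml file and a cfg yaml file.
Both are available in the ultralytics GitHub repository, but they also ship inside the installed package, so I usually search for them there and copy them into a folder of my own.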
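The exact location of these files inside the package has moved between ultralytics versions, so rather than hard-coding a path, a recursive search is one way to find them. This is a rough sketch; package_dir and find_files are hypothetical helpers I made up for illustration.

```python
import importlib.util
from pathlib import Path


def package_dir(package: str):
    """Return the install directory of `package`, or None if it cannot be found."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(list(spec.submodule_search_locations)[0])


def find_files(root, filename: str):
    """Recursively collect every file called `filename` under `root`."""
    return sorted(Path(root).rglob(filename))


root = package_dir("ultralytics")
if root is not None:
    for name in ("coco128.yaml", "default.yaml"):
        for path in find_files(root, name):
            print(path)  # candidates to copy into your own dataset/ and cfg/ folders
```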

coco128.yaml

# Ultralytics YOLO 🚀, GPL-3.0 license
# COCO128 dataset https://www.kaggle.com/ultralytics/coco128 (first 128 images from COCO train2017) by Ultralytics
# Example usage: yolo train data=coco128.yaml
# parent
# ├── yolov5
# └── datasets
#     └── coco128  ← downloads here (7 MB)


# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/coco128  # dataset root dir
train: images/train2017  # train images (relative to 'path') 128 images
val: images/train2017  # val images (relative to 'path') 128 images
test:  # test images (optional)

# Classes
names:
  0: person
  1: bicycle
  2: car
  3: motorcycle
  4: airplane
  5: bus
  6: train
  7: truck
  8: boat
  9: traffic light
  10: fire hydrant
  11: stop sign
  12: parking meter
  13: bench
  14: bird
  15: cat
  16: dog
  17: horse
  18: sheep
  19: cow
  20: elephant
  21: bear
  22: zebra
  23: giraffe
  24: backpack
  25: umbrella
  26: handbag
  27: tie
  28: suitcase
  29: frisbee
  30: skis
  31: snowboard
  32: sports ball
  33: kite
  34: baseball bat
  35: baseball glove
  36: skateboard
  37: surfboard
  38: tennis racket
  39: bottle
  40: wine glass
  41: cup
  42: fork
  43: knife
  44: spoon
  45: bowl
  46: banana
  47: apple
  48: sandwich
  49: orange
  50: broccoli
  51: carrot
  52: hot dog
  53: pizza
  54: donut
  55: cake
  56: chair
  57: couch
  58: potted plant
  59: bed
  60: dining table
  61: toilet
  62: tv
  63: laptop
  64: mouse
  65: remote
  66: keyboard
  67: cell phone
  68: microwave
  69: oven
  70: toaster
  71: sink
  72: refrigerator
  73: book
  74: clock
  75: vase
  76: scissors
  77: teddy bear
  78: hair drier
  79: toothbrush


# Download script/URL (optional)
download: https://ultralytics.com/assets/coco128.zip

default.yaml

# Ultralytics YOLO 🚀, GPL-3.0 license
# Default training settings and hyperparameters for medium-augmentation COCO training

task: detect  # inference task, i.e. detect, segment, classify
mode: train  # YOLO mode, i.e. train, val, predict, export

# Train settings -------------------------------------------------------------------------------------------------------
model: ./pretrained/yolov8s.pt # path to model file, i.e. yolov8n.pt, yolov8n.yaml
data: ./dataset/coco128.yaml # path to data file, i.e. coco128.yaml
epochs: 1  # number of epochs to train for
patience: 50  # epochs to wait for no observable improvement for early stopping of training
batch: 16  # number of images per batch (-1 for AutoBatch)
imgsz: 640  # size of input images as integer or w,h
save: True  # save train checkpoints and predict results
save_period: -1 # Save checkpoint every x epochs (disabled if < 1)
cache: False  # True/ram, disk or False. Use cache for data loading
device: 0 # device to run on, i.e. cuda device=0 or device=0,1,2,3 or device=cpu
workers: 8  # number of worker threads for data loading (per RANK if DDP)
project: runs/custom # project name
name: rhee # experiment name
exist_ok: True  # whether to overwrite existing experiment
pretrained: False  # whether to use a pretrained model
optimizer: SGD  # optimizer to use, choices=['SGD', 'Adam', 'AdamW', 'RMSProp']
verbose: True  # whether to print verbose output
seed: 0  # random seed for reproducibility
deterministic: True  # whether to enable deterministic mode
single_cls: False  # train multi-class data as single-class
image_weights: False  # use weighted image selection for training
rect: False  # support rectangular training if mode='train', support rectangular evaluation if mode='val'
cos_lr: False  # use cosine learning rate scheduler
close_mosaic: 10  # disable mosaic augmentation for final 10 epochs
resume: False  # resume training from last checkpoint
min_memory: False  # minimize memory footprint loss function, choices=[False, True, <roll_out_thr>]
# Segmentation
overlap_mask: True  # masks should overlap during training (segment train only)
mask_ratio: 4  # mask downsample ratio (segment train only)
# Classification
dropout: 0.0  # use dropout regularization (classify train only)

# Val/Test settings ----------------------------------------------------------------------------------------------------
val: True  # validate/test during training
split: val  # dataset split to use for validation, i.e. 'val', 'test' or 'train'
save_json: False  # save results to JSON file
save_hybrid: False  # save hybrid version of labels (labels + additional predictions)
conf:  # object confidence threshold for detection (default 0.25 predict, 0.001 val)
iou: 0.7  # intersection over union (IoU) threshold for NMS
max_det: 300  # maximum number of detections per image
half: False  # use half precision (FP16)
dnn: False  # use OpenCV DNN for ONNX inference
plots: True  # save plots during train/val

# Prediction settings --------------------------------------------------------------------------------------------------
source:  # source directory for images or videos
show: False  # show results if possible
save_txt: False  # save results as .txt file
save_conf: False  # save results with confidence scores
save_crop: False  # save cropped images with results
hide_labels: False  # hide labels
hide_conf: False  # hide confidence scores
vid_stride: 1  # video frame-rate stride
line_thickness: 3  # bounding box thickness (pixels)
visualize: False  # visualize model features
augment: False  # apply image augmentation to prediction sources
agnostic_nms: False  # class-agnostic NMS
classes:  # filter results by class, i.e. class=0, or class=[0,2,3]
retina_masks: False  # use high-resolution segmentation masks
boxes: True  # Show boxes in segmentation predictions

# Export settings ------------------------------------------------------------------------------------------------------
format: torchscript  # format to export to
keras: False  # use Keras
optimize: False  # TorchScript: optimize for mobile
int8: False  # CoreML/TF INT8 quantization
dynamic: False  # ONNX/TF/TensorRT: dynamic axes
simplify: False  # ONNX: simplify model
opset:  # ONNX: opset version (optional)
workspace: 4  # TensorRT: workspace size (GB)
nms: False  # CoreML: add NMS

# Hyperparameters ------------------------------------------------------------------------------------------------------
lr0: 0.01  # initial learning rate (i.e. SGD=1E-2, Adam=1E-3)
lrf: 0.01  # final learning rate (lr0 * lrf)
momentum: 0.937  # SGD momentum/Adam beta1
weight_decay: 0.0005  # optimizer weight decay 5e-4
warmup_epochs: 3.0  # warmup epochs (fractions ok)
warmup_momentum: 0.8  # warmup initial momentum
warmup_bias_lr: 0.1  # warmup initial bias lr
box: 7.5  # box loss gain
cls: 0.5  # cls loss gain (scale with pixels)
dfl: 1.5  # dfl loss gain
fl_gamma: 0.0  # focal loss gamma (efficientDet default gamma=1.5)
label_smoothing: 0.0  # label smoothing (fraction)
nbs: 64  # nominal batch size
hsv_h: 0.015  # image HSV-Hue augmentation (fraction)
hsv_s: 0.7  # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4  # image HSV-Value augmentation (fraction)
degrees: 0.0  # image rotation (+/- deg)
translate: 0.1  # image translation (+/- fraction)
scale: 0.5  # image scale (+/- gain)
shear: 0.0  # image shear (+/- deg)
perspective: 0.0  # image perspective (+/- fraction), range 0-0.001
flipud: 0.0  # image flip up-down (probability)
fliplr: 0.5  # image flip left-right (probability)
mosaic: 1.0  # image mosaic (probability)
mixup: 0.0  # image mixup (probability)
copy_paste: 0.0  # segment copy-paste (probability)

# Custom config.yaml ---------------------------------------------------------------------------------------------------
cfg:  # for overriding defaults.yaml

# Debug, do not modify -------------------------------------------------------------------------------------------------
v5loader: False  # use legacy YOLOv5 dataloader

# Tracker settings ------------------------------------------------------------------------------------------------------
tracker: botsort.yaml  # tracker type, ['botsort.yaml', 'bytetrack.yaml']

Training the model

model.train(cfg="cfg/custom.yaml")

I copy default.yaml, rename it to custom.yaml, and use that file.

model: pretrained/yolov8s.pt # path to model file, i.e. yolov8n.pt, yolov8n.yaml
data: dataset/coco128.yaml # path to data file, i.e. coco128.yaml
epochs: 1  # number of epochs to train for
patience: 50  # epochs to wait for no observable improvement for early stopping of training
batch: 16  # number of images per batch (-1 for AutoBatch)
imgsz: 640  # size of input images as integer or w,h
save: True  # save train checkpoints and predict results
save_period: -1 # Save checkpoint every x epochs (disabled if < 1)
cache: False  # True/ram, disk or False. Use cache for data loading
device: 0 # device to run on, i.e. cuda device=0 or device=0,1,2,3 or device=cpu
workers: 8  # number of worker threads for data loading (per RANK if DDP)
project: runs/custom # project name
name: rhee # experiment name
exist_ok: True  # whether to overwrite existing experiment
pretrained: True  # whether to use a pretrained model
optimizer: SGD  # optimizer to use, choices=['SGD', 'Adam', 'AdamW', 'RMSProp']

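I train by adjusting these parameters.
With exist_ok set to True, a new experiment folder is not created on every run; the results are updated in place under the single name I specified. I set it to True to save disk space.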
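To make the effect of exist_ok concrete, here is a simplified illustration of how a run folder could be chosen. It reflects my understanding that, without exist_ok, a number is appended to the experiment name (rhee, rhee2, rhee3, ...); it is not the library's own code, and resolve_run_dir is a hypothetical name.

```python
from pathlib import Path


def resolve_run_dir(project: str, name: str, exist_ok: bool) -> Path:
    """Simplified illustration of run-folder naming; NOT the library's own code.

    exist_ok=True  -> always reuse project/name.
    exist_ok=False -> if project/name exists, try name2, name3, ... until unused.
    """
    base = Path(project) / name
    if exist_ok or not base.exists():
        return base
    n = 2
    while (Path(project) / f"{name}{n}").exists():
        n += 1
    return Path(project) / f"{name}{n}"


# With the settings above, every run would land in the same folder:
print(resolve_run_dir("runs/custom", "rhee", exist_ok=True))
```

The trade-off is that reusing one folder saves disk space but overwrites the previous run's weights and plots.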

Validation

I rarely use the validation code, so I will only show it here. Validation results already appear during training, so I am not sure it needs to be run separately.

# validation
metrics = model.val()
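If you do want the numbers, the object returned by model.val() can be read directly. The sketch below assumes a detection model and the attribute names metrics.box.map, map50 and map75 as I know them from the Ultralytics docs; they may differ between versions, and summarize_detection_metrics is a hypothetical helper.

```python
def summarize_detection_metrics(metrics) -> dict:
    """Collect the main box mAP values from a validation result into a plain dict.

    Assumes `metrics.box` exposes `map`, `map50` and `map75` (detection task).
    """
    box = metrics.box
    return {
        "mAP50-95": float(box.map),
        "mAP50": float(box.map50),
        "mAP75": float(box.map75),
    }


# Usage with the model from above:
# print(summarize_detection_metrics(model.val()))
```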

Inference

# load image
img = cv2.imread("data/bus.jpg", cv2.IMREAD_COLOR)

Load the image first, then run inference.

results = model(img)
print(results)

Inference is very simple.
Just pass the image to the trained model as the source.

boxes = results[0].boxes
box = boxes[0]  # returns one box
print(box.xyxy)
print(box.conf)
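Since I mostly care about people, the fields printed above can be filtered down to person detections. This is a minimal sketch that works on plain lists; keep_people is a hypothetical helper, the 0.5 threshold is an arbitrary value I picked for illustration, and class id 0 is person according to the coco128.yaml listing above.

```python
def keep_people(xyxy, conf, cls, min_conf=0.5, person_id=0):
    """Return (box, score) pairs for detections of class `person_id` scoring at least `min_conf`."""
    return [
        (box, score)
        for box, score, c in zip(xyxy, conf, cls)
        if int(c) == person_id and score >= min_conf
    ]


# Usage with the results from above (tensors converted to plain lists):
# boxes = results[0].boxes
# people = keep_people(boxes.xyxy.tolist(), boxes.conf.tolist(), boxes.cls.tolist())
```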

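From here it is a matter of working with the xyxy and conf values of each object in results, the model's output.
I currently work mainly on person detection and am in the middle of reducing false positives and missed detections for people.
One idea I am trying for reducing false positives is to look at the per-class scores in the output: if an object scores about the same for person as for some other class, it is not reported as a detection.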
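The idea can be sketched roughly as follows. Note the assumptions: as far as I know the standard results only expose the top class and its confidence per box, so class_scores here is a hypothetical per-class score list that would have to be obtained some other way, and both the function name and the 0.1 margin are made up for illustration.

```python
def is_ambiguous_person(class_scores, person_id=0, margin=0.1):
    """Return True if some other class scores within `margin` of the person score.

    `class_scores` is a hypothetical list of per-class scores for one object,
    indexed by class id.
    """
    person_score = class_scores[person_id]
    best_other = max(
        (s for i, s in enumerate(class_scores) if i != person_id), default=0.0
    )
    return abs(person_score - best_other) < margin


# A clear person is kept; a person/other-class near-tie is dropped.
candidates = [[0.90, 0.05, 0.02], [0.55, 0.50, 0.10]]
kept = [scores for scores in candidates if not is_ambiguous_person(scores)]
print(len(kept))
```

Whether a fixed margin is the right criterion is something that would need tuning on real data.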

I will cover data customizing in the next post.

RHJ

Image processing developer