[CV] YOLOv8 + MLFlow for Beginners

윰진 · June 2, 2023

🌷 These are notes I organized while studying. If you find a problem, please let me know by email (jnw__@Naver). :D

00. Overview

Reference
Baek Kyun Shin's paper review: A Close Look at YOLO (You Only Look Once)

YOLO stands for You Only Look Once.
It is a model that computes an image's bounding boxes and their probabilities in a single pass through a convolutional network, then infers the most probable bounding boxes.

It performs well at object detection while remaining fast enough for real-time use.

Ultralytics recently released YOLOv8, which delivers improved performance through a strengthened backbone and neck architecture.

The architecture below is taken from the YOLOv4 paper.

While working through the well-organized Docs, I tried some simple training management with MLFlow and am recording the process here. :D

Test environment

  • Linux 18.04
  • Ultralytics YOLOv8.0.110
  • Python-3.8.16
  • torch-2.0.1+cu117 CUDA:0
    • (Tesla V100-SXM2-32GB, 32510MiB)

Package installation
pip install ultralytics
pip install opencv-python

01. Simple Test

๐Ÿ’ ๋น ๋ฅด๊ณ  ๊ฐ„๋‹จํ•œ ์‚ฌ์šฉ์„ฑ

์ด๋ฏธ ํ•™์Šต๋œ ๋ชจ๋ธ์„ ๊ฐ€์ ธ์™€์„œ ๊ฐ„๋‹จํžˆ ํ…Œ์ŠคํŠธ ํ•ด๋ณผ ์ˆ˜ ์žˆ๋‹ค.

from ultralytics import YOLO
import cv2

model = YOLO("yolov8n.pt")
# accepts all formats - image/dir/Path/URL/video/PIL/ndarray. 0 for webcam

# from ndarray
im2 = cv2.imread("bbang2.jpg")
results = model.predict(source=im2, save=True, save_txt=True)  # save predictions as labels

The prediction result is printed as shown below, and you can check the saved image.

640x480 8 persons, 7.0ms
Speed: 2.5ms preprocess, 7.0ms inference, 1.5ms postprocess per image at shape (1, 3, 640, 640)

02. Building a Custom Dataset

💐 Let's train and predict on a custom dataset.

  • Each image needs a paired text file with the same file name.
  • Each line of the text file must contain an object's class followed by center_x, center_y, width, and height.

💐 Ultralytics locates the data from the paths stored in a yaml file.

  • If you get a data path error, edit datasets_dir in the file below.
    • .config/Ultralytics/settings.yaml

Reference
Ultralytics documentation: how to structure a dataset

Converting the data for use with the YOLO model

  • Each .txt file must have the same name as its image and be written as
    object_class center_x center_y width height.
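
As a concrete illustration of this format, the sketch below parses one label line and converts the normalized values back to pixel corners. The function name `yolo_line_to_pixels` and the sample values are hypothetical, chosen only for this example.

```python
def yolo_line_to_pixels(line, image_width, image_height):
    """Convert 'class cx cy w h' (values normalized to 0-1) into pixel corners."""
    class_id, cx, cy, w, h = line.split()
    cx, w = float(cx) * image_width, float(w) * image_width
    cy, h = float(cy) * image_height, float(h) * image_height
    # Return the class plus (x_min, y_min, x_max, y_max) in pixels.
    return class_id, (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


# Hypothetical label line for a 640x480 image.
print(yolo_line_to_pixels("0 0.5 0.5 0.25 0.5", 640, 480))
# ('0', (240.0, 120.0, 400.0, 360.0))
```

Reading a label back this way is a quick sanity check that a converted box lands where you expect it on the image.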

Conversion code

  • Dividing by image_width and image_height normalizes the bounding box position and size to the 0-1 range.
# line holds [class, x1, y1, x2, y2, x3, y3, x4, y4] (assumed from the indices used here)
xs = [float(v) for v in line[1:9:2]]  # x of the four corners
ys = [float(v) for v in line[2:9:2]]  # y of the four corners
x_min, x_max = min(xs), max(xs)
y_min, y_max = min(ys), max(ys)
x, y = (x_min + x_max) / 2 / image_width, (y_min + y_max) / 2 / image_height
w, h = (x_max - x_min) / image_width, (y_max - y_min) / image_height
yolo_labels.append(f"{class_name} {x} {y} {w} {h}")
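
To check the conversion on known numbers, the same arithmetic can be wrapped in a small function. This is a minimal sketch that assumes each source row is a class followed by four corner points (x1, y1, ..., x4, y4); the name `corners_to_yolo` and the sample row are hypothetical.

```python
def corners_to_yolo(line, image_width, image_height):
    """Turn [class, x1, y1, x2, y2, x3, y3, x4, y4] into a YOLO label string."""
    class_name = line[0]
    xs = [float(v) for v in line[1:9:2]]  # x1, x2, x3, x4
    ys = [float(v) for v in line[2:9:2]]  # y1, y2, y3, y4
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    x = (x_min + x_max) / 2 / image_width   # normalized box center
    y = (y_min + y_max) / 2 / image_height
    w = (x_max - x_min) / image_width       # normalized box size
    h = (y_max - y_min) / image_height
    return f"{class_name} {x} {y} {w} {h}"


# Hypothetical row: class 3, a 200x200 box inside a 400x500 image.
sample = "3 100 50 300 50 300 250 100 250".split()
print(corners_to_yolo(sample, 400, 500))
# 3 0.5 0.3 0.5 0.4
```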

02. Writing the Yaml File and Training

💐 The YOLO model reads the dataset location and the classes from the yaml file.

yaml file definition

yaml_data = {
    "names": classes,
    "nc": len(classes),
    "path": "data/yolo/",
    "train": "train",
    "val": "valid",
    "test": "test",
}
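
The post does not show how this dictionary reaches disk. One way to produce the file, sketched here with only the standard library, is to render the handful of keys as YAML text directly. The function name `render_dataset_yaml` and the class names are placeholders, and the `names` block uses the index-to-name mapping form shown in the Ultralytics docs rather than the plain list above.

```python
import json


def render_dataset_yaml(classes, root="data/yolo/"):
    """Render a flat Ultralytics-style dataset config as YAML text."""
    lines = [
        f"path: {json.dumps(root)}",
        "train: train",
        "val: valid",
        "test: test",
        f"nc: {len(classes)}",
        "names:",
    ]
    # JSON string literals are valid YAML double-quoted scalars,
    # so json.dumps keeps unusual class names safely quoted.
    lines += [f"  {i}: {json.dumps(name)}" for i, name in enumerate(classes)]
    return "\n".join(lines) + "\n"


classes = ["car", "truck"]  # placeholder class names
text = render_dataset_yaml(classes)
print(text)
```

Saving it is then a single `pathlib.Path("custom.yaml").write_text(text, encoding="utf-8")` call. If PyYAML is installed, `yaml.safe_dump(yaml_data)` would do the same job with less code.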

train

  • last.pt and best.pt are saved during training.
  • You can load last.pt to continue training from it.
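# To continue from a checkpoint instead, load last.pt:
# model = YOLO(f"{MODEL}/train/weights/last.pt")
model = YOLO("yolov8x.pt")

# opt is a dict of training arguments defined elsewhere (not shown)
results = model.train(**opt)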

predict

from glob import glob

from IPython.display import clear_output
from tqdm import tqdm

# MODEL, BATCH_SIZE, and get_test_image_paths are defined elsewhere (not shown)
model = YOLO("v2/train/weights/best.pt")
test_image_paths = glob("./data/yolo/test/*.png")
for i, image in tqdm(enumerate(get_test_image_paths(test_image_paths)), total=int(len(test_image_paths)/BATCH_SIZE)):
    model.predict(image, imgsz=(1024, 1024), iou=0.2, conf=0.5, save_conf=True, save=False, save_txt=True,
                  project=f"{MODEL}", name="predict", exist_ok=True, device=0, augment=True, verbose=False)
    if i % 5 == 0:
        clear_output(wait=True)  # keep the notebook output from growing
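
The `get_test_image_paths` helper is not shown in the post. One plausible shape for it, assuming it yields lists of at most `BATCH_SIZE` paths, is a simple slicing generator. The name `iter_batches` and the file names below are hypothetical.

```python
from math import ceil


def iter_batches(paths, batch_size):
    """Yield consecutive slices of `paths`, each with at most `batch_size` items."""
    for start in range(0, len(paths), batch_size):
        yield paths[start:start + batch_size]


paths = [f"img_{i}.png" for i in range(10)]  # placeholder file names
batches = list(iter_batches(paths, 4))
print(len(batches), ceil(len(paths) / 4))
# 3 3
```

With 10 images and a batch size of 4 there are three batches, while `int(10 / 4)` gives 2. Using `ceil` for the progress bar total keeps the count accurate when the last batch is only partly full.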

result

03. Saving Training Results to MLFlow (Databricks)

💐 If you know a better way, please let me know!

Note
Sign up for the free version of Databricks (Community Edition) first.

Library installation

conda install mlflow 
conda install databricks-cli

Account setup

  • Run the command below, then enter your username (email address) and password.
databricks configure --host https://community.cloud.databricks.com/

Declaring the callback function

import re

import mlflow


def on_fit_epoch_end(trainer):
    # Log each epoch's metrics to MLflow, stripping parentheses from the metric names
    metrics_dict = {re.sub("[()]", "", k): float(v) for k, v in trainer.metrics.items()}
    mlflow.log_metrics(metrics=metrics_dict, step=trainer.epoch)


mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Users/{user-id}/{project-name}")

# Create an experiment run
with mlflow.start_run():
    model.add_callback("on_fit_epoch_end", on_fit_epoch_end)
    results = model.train(**opt)
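
The dictionary comprehension in the callback can be tried on its own, without MLflow or a training run. The sketch below is only an illustration: the function name `clean_metric_names` and the sample values are hypothetical, and the key names merely imitate the style Ultralytics reports. MLflow restricts which characters a metric name may contain, which is presumably why the parentheses are removed.

```python
import re


def clean_metric_names(metrics):
    """Drop parentheses from metric names and cast the values to float."""
    return {re.sub(r"[()]", "", k): float(v) for k, v in metrics.items()}


# Hypothetical metrics using key names in the style Ultralytics reports.
raw = {"metrics/mAP50(B)": 0.61, "val/box_loss": 1.2}
print(clean_metric_names(raw))
# {'metrics/mAP50B': 0.61, 'val/box_loss': 1.2}
```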

04. Parameter Optimization

💐 Allocating at least 8 GB of RAM is recommended.

  • This step raised a lot of errors, so for practice I ran it with a reduced number of epochs.

Package installation

pip install -U ultralytics "ray[tune]"  # install and/or update
pip install wandb  # optional

from ray import tune

model = YOLO(f"{MODEL}/train/weights/last.pt")

result = model.tune(
    data="/opt/ml/yujin/DataAnalysisPractice/Dacon/03.데이콘 합성데이터 기반 객체 탐지 AI 경진대회/data/yolo/custom.yaml",
    space={"lr0": tune.uniform(1e-5, 1e-1)},
    train_args={"epochs": 10}
)
