🔸 Deep Learning Optimization

Jiwon Park · April 5, 2023

์ตœ์ ํ™” ๋ฐฉ๋ฒ•

  • Basic technique: Gradient Descent (GD)
  • Training in batches of samples: Mini-batch Gradient Descent
  • Training one sample (image) at a time: Stochastic Gradient Descent (SGD)
  • The most widely used optimizer in recent practice: Adam
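
The practical difference between the three descent variants is how many samples feed each parameter update. A minimal NumPy sketch on a toy linear-regression problem (the data, learning rate, batch sizes, and epoch count are illustrative assumptions, not from these notes):

```python
import numpy as np

# Toy linear-regression data: y ≈ 3x (entirely illustrative)
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 1))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=1000)

def grad(w, xb, yb):
    # Gradient of mean squared error for the model y_hat = w * x
    return 2.0 * np.mean((w * xb[:, 0] - yb) * xb[:, 0])

def train(batch_size, lr=0.1, epochs=20):
    # batch_size == len(X): plain Gradient Descent (whole dataset per update)
    # 1 < batch_size < len(X): Mini-batch Gradient Descent
    # batch_size == 1: Stochastic Gradient Descent (one sample per update)
    w = 0.0
    for _ in range(epochs):
        idx = rng.permutation(len(X))       # reshuffle every epoch
        for start in range(0, len(X), batch_size):
            b = idx[start:start + batch_size]
            w -= lr * grad(w, X[b], y[b])
    return w

print("GD        :", train(batch_size=len(X)))  # few, stable updates
print("Mini-batch:", train(batch_size=32))      # the usual middle ground
print("SGD       :", train(batch_size=1))       # many, noisy updates
```

Adam keeps the mini-batch scheme but adds momentum and per-parameter adaptive learning rates, which is why it has become the common default.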

The Overfitting Problem

  • More epochs are not automatically better -> the training loss keeps falling while the validation loss starts rising at some point: this is 'overfitting'

  • Cause of overfitting: the model ends up force-fitting the training data (memorizing it rather than generalizing)

  • Solutions (each is sketched in code after this list)

      1. Add a dropout layer
      • Removes neurons at random positions with a given probability
      • Each training step trains a different combination of neurons -> prevents overfitting

      2. Simplify the model
      • Overfitting grows as the number of parameters grows
        -> reduce the parameters of the hidden layers to prevent overfitting

      3. Add a regularization term to the loss function
      • Overfitting occurs when some parameters grow too large
        -> add a term to the loss function that penalizes them, preventing overfitting

      4. Use the validation loss
      • Stop training when the validation loss starts to increase (early stopping)
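
Remedies 1-3 all live in the model definition. A minimal Keras sketch, assuming a generic classifier (the layer sizes, dropout rate, and L2 factor are illustrative, not prescribed values):

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

model = tf.keras.Sequential([
    layers.Input(shape=(784,)),
    # Remedy 2: a deliberately small hidden layer keeps the parameter count down
    layers.Dense(64, activation="relu",
                 # Remedy 3: L2 regularization adds a penalty on large weights
                 # to the loss function
                 kernel_regularizer=regularizers.l2(1e-4)),
    # Remedy 1: dropout zeroes a random 30% of activations on each training step
    layers.Dropout(0.3),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```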
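
Remedy 4 is usually handled by a callback rather than by hand. A sketch using Keras's EarlyStopping (the patience value and the data names x_train/y_train are assumptions):

```python
from tensorflow.keras.callbacks import EarlyStopping

# Remedy 4: watch the validation loss and stop once it stops improving,
# keeping the weights from the best epoch.
early_stop = EarlyStopping(monitor="val_loss", patience=3,
                           restore_best_weights=True)

# x_train / y_train are placeholders for your own data:
# model.fit(x_train, y_train, epochs=100,
#           validation_split=0.2, callbacks=[early_stop])
```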

Loss Functions

  • Use a loss function suited to the problem at hand (see the sketch after this list)

  • Binary Crossentropy: binary (0/1) outcome problems

  • Categorical Crossentropy: multi-class classification problems; the labels must be one-hot encoded

  • Sparse Categorical Crossentropy: multi-class classification problems; integer class indices are used directly, with no one-hot encoding

  • Mean Absolute Error (MAE) & Mean Squared Error (MSE): regression problems; based on the difference between the actual and predicted values

  • Dice loss: image segmentation; quantifies how much the prediction overlaps the ground truth
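
A sketch of how these map onto Keras's built-in losses; Dice loss is not built in, so a common hand-rolled formulation is shown (the smoothing constant is a conventional choice, not from these notes):

```python
import tensorflow as tf

# Built-in Keras losses, matched to problem type:
bce  = tf.keras.losses.BinaryCrossentropy()            # binary (0/1) outcomes
cce  = tf.keras.losses.CategoricalCrossentropy()       # multi-class, one-hot labels, e.g. [0, 0, 1]
scce = tf.keras.losses.SparseCategoricalCrossentropy() # multi-class, integer labels, e.g. 2
mae  = tf.keras.losses.MeanAbsoluteError()             # regression
mse  = tf.keras.losses.MeanSquaredError()              # regression

def dice_loss(y_true, y_pred, smooth=1.0):
    """Dice loss for segmentation: 1 - Dice coefficient.

    The Dice coefficient measures overlap between the predicted and
    ground-truth masks: 2|A∩B| / (|A| + |B|). `smooth` guards against
    division by zero on empty masks (1.0 is a common choice).
    """
    y_true = tf.reshape(tf.cast(y_true, tf.float32), [-1])
    y_pred = tf.reshape(tf.cast(y_pred, tf.float32), [-1])
    intersection = tf.reduce_sum(y_true * y_pred)
    dice = (2.0 * intersection + smooth) / (
        tf.reduce_sum(y_true) + tf.reduce_sum(y_pred) + smooth)
    return 1.0 - dice
```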
