๐Ÿ“Œ ๋ณธ ๋‚ด์šฉ์€ Michigan University์˜ 'Deep Learning for Computer Vision' ๊ฐ•์˜๋ฅผ ๋“ฃ๊ณ  ๊ฐœ์ธ์ ์œผ๋กœ ํ•„๊ธฐํ•œ ๋‚ด์šฉ์ž…๋‹ˆ๋‹ค. ๋‚ด์šฉ์— ์˜ค๋ฅ˜๋‚˜ ํ”ผ๋“œ๋ฐฑ์ด ์žˆ์œผ๋ฉด ๋ง์”€ํ•ด์ฃผ์‹œ๋ฉด ๊ฐ์‚ฌํžˆ ๋ฐ˜์˜ํ•˜๊ฒ ์Šต๋‹ˆ๋‹ค.
(Stanford์˜ cs231n๊ณผ ๋‚ด์šฉ์ด ๊ฑฐ์˜ ์œ ์‚ฌํ•˜๋‹ˆ ์ฐธ๊ณ ํ•˜์‹œ๋ฉด ๋„์›€ ๋˜์‹ค ๊ฒƒ ๊ฐ™์Šต๋‹ˆ๋‹ค)๐Ÿ“Œ




1. ์„ ํ˜• ๋ถ„๋ฅ˜์˜ ๋ฌธ์ œ & ํ•ด๊ฒฐ

1) ๋ฌธ์ œ์ 

  • linear classifier๋กœ ๋ถ„๋ฅ˜ ์•ˆ๋ ๋•Œ ๋„ˆ๋ฌด ๋งŽ์Œ

  • ๊ธฐํ•˜ํ•™์  ๊ด€์  & ์‹œ๊ฐ์  ๊ด€์ 

    • ๊ธฐํ•˜ํ•™์  ๊ด€์ : ์„ ํ˜• ๋ถ„๋ฅ˜ ์–ด๋ ค์›€
    • ์‹œ๊ฐ์  ๊ด€์ : ํ•˜๋‚˜์˜ ํด๋ž˜์Šค์˜ ์—ฌ๋Ÿฌ ๋ชจ๋“œ ์ธ์‹๋ถˆ๊ฐ€

2) ํ•ด๊ฒฐ์ฑ…

  • Feature Transform

    • ํŠน์ง•

      • ๋ถ„๋ฅ˜์— ๋” ์ ํ•ฉํ•˜๋„๋ก ์ž…๋ ฅ๋ฐ์ดํ„ฐ์— ์ ์šฉ (์ œ๊ณฑ, tanํ•จ์ˆ˜์‚ฌ์šฉ)

        • Original space
          • ์ผ๋ฐ˜์ขŒํ‘œ (cartesian)
        • Feature space
          • ๊ทน์ขŒํ‘œ (polar)
          • Feature transform(๋น„์„ ํ˜•์„ฑ) ์ดํ›„ ์ •์˜๋˜๋Š” ๊ณต๊ฐ„
          • ์„ ํ˜•๋ถ„๋ฆฌ ๊ฐ€๋Šฅํ•˜๊ฒŒ ๋จ (์„ ํ˜• ๊ฒฝ๊ณ„๋ฅผ feature space์— ์‚ฝ์ž…๊ฐ€๋Šฅ)
      • ๋‹ค์‹œ original space๋กœ ๋ณ€๊ฒฝ

        • original space
          • ๋น„์„ ํ˜• ๋˜์–ด๋ฒ„๋ฆผ
    • ๊ฒฐ๋ก 

      • Data์˜ ํŠน์„ฑ์— ๋งž๊ฒŒ feature transform ์„ ํƒ์‹œ ์„ ํ˜•๋ถ„๋ฅ˜ ํ•œ๊ณ„ ๊ทน๋ณต๊ฐ€๋Šฅ
      • feature transform์„ ๋” ๊ด‘๋ฒ”์œ„ ํ•˜๊ฒŒ ์ ์šฉ์‹œ์— ๋” ์กฐ์‹ฌํ•˜๊ธฐ
      • ์„ ํ˜•๋ถ„๋ฅ˜ ๋ชจ๋ธ์— ์ ์šฉํ•˜์—ฌ ํŠน์ง• ๋ถ€์—ฌ ์šฉ์ด




2. Feature Representation

1) Color Histogram

  • ํŠน์ง•
    • ์š”์•ฝ: ์œ„์น˜๊ด€๊ณ„ ๋ถ„๋ฅ˜ ๋ฒ„๋ฆฌ๊ณ , ์ƒ‰ ๋ถ„๋ฅ˜๋งŒ ํ•˜์ž !
    • RGB ์ŠคํŽ™ํŠธ๋Ÿผ of color space ๋ถ„ํ•  & ์ด์‚ฐbin
    • input image์˜ ๊ฐ ํ”ฝ์…€์— ๋Œ€ํ•ด bin์˜ ํ•ด๋‹น ์œ„์น˜์— ํ• ๋‹น
    • normalize of histogram
    • ์ด๋ฏธ์ง€์— ๋Œ€ํ•œ ๋ชจ๋“  ๊ณต๊ฐ„์ •๋ณด ๋ฒ„๋ฆผ (spacially invariant) = ์ด๋ฏธ์ง€์— ์–ด๋–ค ์œ ํ˜•์˜ ์ƒ‰์ƒ์ด ์žˆ๋Š”์ง€๋งŒ ๊ด€์‹ฌ
      • ex. car image (๊ฐˆ์ƒ‰๋ฐฐ๊ฒฝ, ๋นจ๊ฐ„ ์ฐจ)
        • car๋Š” ์ด๋ฏธ์ง€์— ๊ฐ๊ธฐ ๋‹ค๋ฅธ ์œ„์น˜
        • ์„ ํ˜•๋ถ„๋ฅ˜: ์ด๋Ÿฌํ•œ ํ‘œํ˜„์ฒ˜๋ฆฌ ์–ด๋ ค์›€ but color histogram: ์ •ํ™•ํ•œ ์œ„์น˜๊ด€๊ณ„X, ํ•ญ์ƒ ๋นจ&๊ฐˆ ๋ฌถ์Œ

2) Histogram of Oriented Gradients (HoG)

  • ํŠน์ง•
    • ์š”์•ฝ: input image์˜ ๋ชจ๋“  ์œ„์น˜์—์„œ local ๋ฐฉํ–ฅ & ๊ฐ•๋„๋กœ ํ‘œํ˜„

    • color information ๋ฒ„๋ฆผ (local edge & strength์—๋งŒ ๊ด€์‹ฌ)

=โ‡’ 1,2 ์ž˜ ์•ˆ์“ฐ์ž„: ์ด 2๊ฐ€์ง€๋Š” ์‚ฌ๋žŒ์ด input data์˜ ์˜ฌ๋ฐ”๋ฅธ quality๊ฐ€ ๋ญ”์ œ ์ƒ๊ฐํ•ด์•ผ๋ผ์„œ

3) Bag of Words

  • ๊ฐœ๋…

    • ๋ฐ์ดํ„ฐ๊ธฐ๋ฐ˜ feature transform
    • Step1: Build codebook
      • input ์ด๋ฏธ์ง€์—์„œ random patch ์ถ”์ถœ (๋‹ค์–‘ํ•œ scale, size์—์„œ)(๋ชจ๋“  ์ด๋ฏธ์ง€๋ฅผ ๋ฌด์ž‘์œ„๋กœ ์ž˜๋ผ๋ƒ„)
        โ†’ codebook ๋งŒ๋“ค๊ธฐ & visual word๋กœ ํ‘œํ˜„ ๊ฐ€๋Šฅ
        (train set์— ๋‚˜ํƒ€๋‚˜๋Š” ๋งŽ์€ ๊ณตํ†ต๊ธฐ๋Šฅ์„ captureํ•˜๊ฑฐ๋‚˜, ์ธ์‹ํ•  ์ˆ˜ ์žˆ๋Š” ์ผ์ข…์˜ visual word ํ‘œํ˜„ ๋ฐฐ์›€)
    • Step2: Encode images
      • ๊ฐ input image์— ๋Œ€ํ•œ ํžˆ์Šคํ† ๊ทธ๋žจ ๊ณ„์‚ฐ
        = ๊ฐ visual word๊ฐ€ ํ•ด๋‹น input image์— ์–ผ๋งˆ๋‚˜ ๋‚˜ํƒ€๋‚ฌ๋Š”์ง€ ์•Œ์ˆ˜O
  • ํŠน์ง•

    • ์‚ฌ๋žŒ์ด ๊ธฐ๋Šฅ์  ํ˜•ํƒœ(functional form)๋ฅผ ์ง€์ •ํ•  ํ•„์š”๊ฐ€ ์—†๊ธฐ์—, ๋งค์šฐ ๊ฐ•๋ ฅํ•œ ๊ธฐ๋Šฅ ํ‘œํ˜„ ์œ ํ˜•
    • code book ๋‹จ์–ด๋“ค์˜ ํŠน์ง•์ด ํ›ˆ๋ จ data๋กœ๋ถ€ํ„ฐ ํ•™์Šต๋˜์–ด ๋ฌธ์ œ์— ๋” ์ž˜ ์ ํ•ฉ (์•ž์˜ ๋ฐฉ๋ฒ•๋“ค๋ณด๋‹ค ๋” ํŽธํ•จ)

4) ์—ฌ๋Ÿฌ๊ฐœ์˜ feature representation ์‚ฌ์šฉ (๋ชจ๋‘ ์กฐํ•ฉ)

  • color, edge๋“ฑ๊ณผ ๊ฐ™์ด input image์—์„œ ๋‹ค์–‘ํ•œ ์œ ํ˜•์˜ ์ •๋ณด ์•Œ์ˆ˜ ์žˆ์Œ
  • ์ด๋ ‡๊ฒŒ ์กฐํ•ฉํ•˜๋Š” ๋ฐฉ๋ฒ•์ด computer vision์—์„œ ๋„๋ฆฌ ์“ฐ์ž„




3. Image Features

1) ์ „์ฒด ํ๋ฆ„

raw image pixel

โ†’ Feature Extraction โ†’(f: ๋ถ„๋ฅ˜ ์„ฑ๋Šฅ ์ตœ๋Œ€ํ™” ์œ„ํ•ด ์Šค์Šค๋กœ ์กฐ์ • X)โ†’ 10 numbers giving scores for classes โ†’ (training(ํ•™์Šต๊ฐ€๋Šฅ๋ชจ๋ธ)) โ†’ ๋‹ค์‹œ ๋ฐ˜๋ณต

=โ‡’ ๊ฒฐ๋ก ) ๋” ๋‚˜์€ ๋ฐฉ๋ฒ• ์‚ฌ์šฉ ํ•„์š”

(์ด๋ฏธ์ง€ ๋ถ„๋ฅ˜์‹œ, system์˜ ๋ชจ๋“  ๋ถ€๋ถ„์„ ์ž๋™์  ์กฐ์ •ํ•˜์—ฌ ์„ฑ๋Šฅ ๋†’์ž„)

(Neural Network๊ฐ€ ํ•˜๋Š”์ผ์— ๋Œ€ํ•œ ๋™๊ธฐ)

2) Image Features vs Neural Networks

  • ์•ž์„œ ์„ค๋ช…ํ•œ feature representation๊ณผ ๋‹ค๋ฅด์ง€ X
  • ๋ถ„๋ฅ˜ ์„ฑ๋Šฅ ๋†’์ด๊ธฐ ์œ„ํ•ด train์‹œ, ์ „์ฒด๋ฅผ ๊ณต๋™์œผ๋กœ ์กฐ์ •ํ•˜๋Š”๊ฒƒ๋งŒ ๋‹ค๋ฆ„




4. Neural Networks

๐Ÿ“NN์ด ๊ฐ€์žฅ ๊ฐ•๋ ฅํ•œ 1๋ฒˆ์งธ ์ด์œ  - W์˜ ์—ญํ• 
1) ๊ฐœ๋…

  • layer 2๊ฐœ
  • layer 3๊ฐœ

2) W์˜ ์—ญํ• 

  • NN์ด ๊ฐ€์žฅ ๊ฐ•๋ ฅํ•œ 1๋ฒˆ์งธ ์ด์œ 

    a. W์˜ ์ด์ „ ๋ ˆ์ด์–ด์— ๋Œ€ํ•œ ์˜ํ–ฅ๋ ฅ ์ „๋‹ฌ ์—ญํ• 

    • ์ด์ „ ๋ ˆ์ด์–ด์˜ ๊ฐ ์š”์†Œ๊ฐ€ ๋‹ค์Œ ๋ ˆ์ด์–ด์˜ ๊ฐ ์š”์†Œ์— ์–ผ๋งˆ๋‚˜ ์˜ํ–ฅ๋ฏธ์น˜๋Š”์ง€ ์•Œ๋ ค์คŒ

      • ๋ง๊ทธ๋Œ€๋กœ ๊ฐ€์ค‘์น˜์ธ๋“ฏ. ์ด์ „ ๋ ˆ์ด์–ด์— ๋Œ€ํ•œ ๊ฐ€์ค‘์น˜

        ex. W1 = input layer x๊ฐ€ hidden layer h์˜ ๊ฐ ์š”์†Œ์— ์–ผ๋งŒํผ ์˜ํ–ฅ ๋ฏธ์น˜๋Š”์ง€

        W2 = hidden layer์˜ ๊ฐ ์š”์†Œ๊ฐ€ ์ถœ๋ ฅ score์˜ ๊ฐ ์š”์†Œ์— ์–ผ๋งŒํผ ์˜ํ–ฅ ๋ฏธ์น˜๋Š”์ง€

    • Fully connected Neural Network

      • ๋ชจ๋“  ์š”์†Œ๋“ค์ด ๋‹ค๋ฅธ ์š”์†Œ๋“ค์— ๋ชจ๋‘ ์™„์ „ํžˆ ์—ฐ๊ฒฐ๋˜์–ด ์žˆ์–ด์„œ
      • ๋งค์šฐ denseํ•˜๊ฒŒ ์—ฐ๊ฒฐ

      b. W์˜ template ์—ญํ• 

    • ๊ฐœ๋…

      • Neural net: 1๋ฒˆ์งธ layer = template ๋ชจ์Œ

        2๋ฒˆ์งธ layer = template ์žฌ์กฐํ•ฉ

      • ์ˆจ๊ฒจ์ง„ ๊ณ„์ธต์˜ ๊ฐ’ ํ•ด์„ ๋ฐฉ๋ฒ•
        : ํ•™์Šต๋œ ๊ฐ template์ด ์ž…๋ ฅ์ด๋ฏธ์ง€ x์— ์–ผ๋งŒํผ ๋ฐ˜์‘ํ•˜๋Š”์ง€
        โ†’ ๋‚ด๋ถ€์— ๋Œ€ํ•ด ์™„๋ฒฝํžˆ ํ•ด์„X,
        but 2-layer Neural Network System์—์„œ ํ™•์‹คํžˆ ์‹๋ณ„ํ•  ์ˆ˜ ์žˆ๋Š” ๊ณต๊ฐ„๊ตฌ์กฐ

      • class์˜ ์—ฌ๋Ÿฌ mode๋ณ„ ๋‹ค๋ฅธ template์‚ฌ์šฉ ๊ฐ€๋Šฅ
        : ์„ ํ˜•๋ถ„๋ฅ˜์—์„œ ๋ง์˜ ๋จธ๋ฆฌ๊ฐ€ 2๊ฐœ ๋˜๋Š” ๋ฌธ์ œ ํ•ด๊ฒฐ

      • ๋ถ„์‚ฐ ํ‘œํ˜„ (๊ณต๊ฐ„๊ตฌ์กฐ ํŒŒ์•…๊ฐ€๋Šฅ + ์„ ํ˜•์กฐํ•ฉ์œผ๋กœ ์ด๋ฏธ์ง€ ์ •๋ณด ๋Œ€์ถฉ ํŒŒ์•…๊ฐ€๋Šฅ)

        : W1์—์„œ ํ•™์Šตํ•˜๋Š” ๊ฒƒ๋“ค์€ ๊ฑฐ์˜ ์ธ๊ฐ„์ด ํ•ด์„ํ•˜๊ธฐ ์–ด๋ ค์›€
        : ๋Œ€์‹  ์–ด๋–ค ์ข…๋ฅ˜์˜ ๊ณต๊ฐ„๊ตฌ์กฐ ๊ฐ€์ง€๊ณ  ์žˆ์œผ๋ฉฐ, ์‹ ๊ฒฝ๋ง์˜ ๋ถ„์‚ฐํ‘œํ˜„์ด๋ผ๋Š” ๊ฐœ๋… ์‚ฌ์šฉํ•ด ์ด๋ฏธ์ง€ ๋‚˜ํƒ€๋ƒ„
        & ์„ ํ˜•์กฐํ•ฉ ํ†ตํ•ด ์ด๋ฏธ์ง€ ์ •๋ณด ์•Œ์ˆ˜O




5. Deep Neural Network

  • ํ•ด์„
    • Depth = number of layers
      = weight matrix ๊ฐœ์ˆ˜
    • Width = Size of each layer
      = number of units
      = ์ˆจ๊ฒจ์ง„ ํ‘œํ˜„์˜ ๋‹จ์œ„์ˆ˜, ์ฐจ์›์ˆ˜




6. Activation Functions

1) ๊ฐœ๋…

  • Neural Network์—์„œ ์ค‘์š” ์—ญํ•  (์„ ํ˜• โ†’ ๋น„์„ ํ˜•)
  • W1๊ณผ W2์‚ฌ์ด์— ๊ปด์„œ ์ž‘๋™
  • ex. relu

2) Question

  1. Q. ํ™œ์„ฑํ™”ํ•จ์ˆ˜๊ฐ€ ์—†์œผ๋ฉด ์–ด์ผ€๋ ๊นŒ?

    โ†’ A. ์—ฌ์ „ํžˆ ์„ ํ˜•๋ถ„๋ฅ˜ ํ•˜๊ฒŒ ๋œ๋‹ค.

    ex. deep linear network (์ตœ์ ํ™”์—์„œ ์ฃผ๋กœ ์“ฐ์ž„)

3) ์ข…๋ฅ˜

  • relu๊ฐ€ ์ ค ๋งŽ์ด ์‚ฌ์šฉ๋จ (0์—์„œ ๋ฏธ๋ถ„X, ๋‹ค๋ฅธ ์ ์—์„  ๋ฏธ๋ถ„ ํŽธํ•จ)




7. Space Warping

๐Ÿ“NN์ด ๊ฐ€์žฅ ๊ฐ•๋ ฅํ•œ 2๋ฒˆ์งธ ์ด์œ  - ํ™œ์„ฑํ™”ํ•จ์ˆ˜๋กœ ๋น„์„ ํ˜• ๋ถ„๋ฅ˜ ๊ฐ€๋Šฅ

  • ๊ธฐ์กด) ๋ถ„๋ฅ˜ ์ ์ˆ˜ ์˜ˆ์ธก ์ธก๋ฉด์—์„œ ์„ค๋ช…
  • ๋‹ค๋ฅธ ๊ด€์ ) input ๊ณต๊ฐ„ ์™œ๊ณก

1) ๊ฐœ๋…

  • 2์ฐจ์› ์„ ํ˜• ๊ณต๊ฐ„์—์„œ๋Š” 2์ฐจ์› ์„ ํ˜•๋ณ€ํ™˜ํ•˜๊ธฐ
    โ‡’ ์ž…๋ ฅ๊ณต๊ฐ„์— 2๊ฐœ์„  ๊ทธ์–ด์ ธ ์žˆ๊ณ , ์ด ์„ ๋“ค์€ 4๊ฐœ ์˜์—ญ์œผ๋กœ ๋‚˜๋ˆ ์ง
    โ‡’ ๋ณ€ํ™˜๋œ ์ถœ๋ ฅ ๊ณต๊ฐ„์—์„œ 4๊ฐœ ์‚ฌ๋ถ„๋ฉด์œผ๋กœ ๋ณ€ํ˜•

  • ๋ฐ์ดํ„ฐ ๋„ฃ์–ด์„œ ํ•ด์„ํ•ด๋ณด๊ธฐ
    โ‡’ ์„ ํ˜• ๋ณ€ํ™˜์‹œ, ๊ณต๊ฐ„์„ ์„ ํ˜• ์™œ๊ณก(input data๋ฅผ ์ƒˆ๋กญ๊ฒŒ ํ‘œํ˜„)
    โ‡’ ๊ทธ์น˜๋งŒ ์„ ํ˜•๋ณ€ํ™˜ํ•ด๋„ ์—ฌ์ „ํžˆ ์„ ํ˜•๋ถ„๋ฆฌ๊ฐ€ ์•ˆ๋จ
  • ํ™œ์„ฑํ™” ํ•จ์ˆ˜ ์ ์šฉ (๋น„์„ ํ˜•)

    • ์‹ค์ œ ์ ์šฉ ์˜ˆ์‹œ

      • ํ•ด์„
        • B: ex. (-2,4)๋ฉด (0,4)๋จ
        • D: ex. (4,-2)๋ฉด (4,0)๋จ
        • C: ex. (-4,-4)๋ฉด (0,0)๋จ
  • ์ ์šฉ ํ›„ ๊ฒฐ๊ณผ

    • ์„ ํ˜• ๋ถ„๋ฆฌ ๊ฐ€๋Šฅํ•ด์ง
  • ๋‹ค์‹œ original space๋กœ ๋Œ๋ฆด๋•Œ

    • original space์—์„œ ๋น„์„ ํ˜• boundary ์ค„์ˆ˜ ์žˆ์Œ(=๋‹ค์‹œ ๋Œ๋ฆด๋•Œ ๋น„์„ ํ˜• ๋ถ„๋ฆฌ ๋จ)

      =โ‡’ ๊ฒฐ๋ก 1) ์—ฌ๋Ÿฌ ์„ ํ˜•๋ถ„๋ฅ˜๊ธฐ์™€ ๊ฐ™์€ ์ข…๋ฅ˜๋กœ ํ•ด์„ํ• ๋•Œ, ๋ชจ๋“  ์ข…๋ฅ˜์˜ ๊ณต๊ฐ„์„ ์ž์ฒด์ ์œผ๋กœ ์ ‘์„์ˆ˜์žˆ์Œ

      โ†’ ๊ทธ๋ฆฌ๊ณ  ์ ‘ํžŒ ๊ณต๊ฐ„์—์„œ ์„ ํ˜•๋ถ„๋ฅ˜ ํ•จ

      โ†’ ๊ณผ์ ํ•ฉ ๋ฐœ์ƒ ๊ฐ€๋Šฅ

      =โ‡’ ๊ฒฐ๋ก 2) Neural Network์˜ ๋งˆ์ง€๋ง‰์€ linearํ•˜๊ฒŒ ๊ทธ๋ ค์ง

      โ†’ Neural Network์—ญํ•  = linearํ•˜๊ฒŒ ๊ธ‹๊ธฐ ์ „๊นŒ์ง€

      โ†’ feature๋ณ€ํ™˜ ์—ฌ๋Ÿฌ๊ฐœ ์‹œ๋„ ํ›„ ๋งˆ์ง€๋ง‰ ์„  ๊ทธ์—ˆ์„๋•Œ ์ž˜ ๋ถ„๋ฅ˜




8. Universal Approximation

1) ๊ฐœ๋…

  • Neural Network
    • ์ œํ•œ๋œ ์ž…๋ ฅ๊ณต๊ฐ„์—์„œ ์—ฐ์†ํ•จ์ˆ˜๋ฅผ ๊ทผ์‚ฌํ™”ํ•˜๋Š” ๋ฐฉ๋ฒ•์€ ํ•™์Šตํ•˜๋Š”๊ฒƒ
  • ๊ฐ bump๋“ค์„ ํ•ฉ์ณ์„œ NN ์ „๋ฐ˜์˜ ์—ฐ์‚ฐ๊ณผ์ •์„ ํ‘œํ˜„ํ•  ์ˆ˜ ์žˆ์Œ
  • ๋” ๋‚˜์€ ๊ทผ์‚ฌ์น˜ ์œ„ํ•ด
    • bump๊ฐ„ gap์„ ์ข๊ฒŒ ๋งŒ๋“ค๊ธฐ : fidelity์กฐ์ •
    • ๋” ๋งŽ์€ bump ์ถ”๊ฐ€ ๊ฐ€๋Šฅ
  • ๊ทผ๋ฐ ์ด๊ฑฐ ๊ฐ–๋Š”๋‹ค๊ณ  ๋ฌด์กฐ๊ฑด ์ข‹์€๊ฑด ์•„๋‹˜
profile
๐Ÿ–ฅ๏ธ

0๊ฐœ์˜ ๋Œ“๊ธ€