[0513] TIL Day 22

nikevapormax · May 13, 2022

😂 Machine Learning

😭 XOR Practice

  • XOR outputs 1 only when its two inputs differ: (0, 0) -> 0, (0, 1) -> 1, (1, 0) -> 1, (1, 1) -> 0.
  • Import what the exercise needs.
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam, SGD
  • I want to know whether the model can reproduce XOR, so the inputs and expected outputs are defined as below. Because the data goes into Keras, the dtype is set to float32.
# Store the XOR inputs and expected outputs
x_data = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)
y_data = np.array([[0], [1], [1], [0]], dtype=np.float32)

- Binary logistic regression

  • First, I tried building the model with binary logistic regression.
    • The upshot: binary logistic regression cannot produce a model that outputs XOR correctly. A single sigmoid unit draws one straight decision boundary, and XOR is not linearly separable.
    • The model is fitted for 1000 epochs. Logging every epoch would print 1000 lines, so verbose is set to 0 to silence the output (verbose=1 prints the training progress).
model = Sequential([ # Sequential model (layers stacked in order): easy to use and understand, but hard to apply to the more complex networks used in practice
  Dense(1, activation='sigmoid') # Dense(1) means a single output; sigmoid is the activation because this is binary logistic regression
])

# Binary logistic regression, so the loss is binary_crossentropy and the optimizer is SGD
# SGD: Stochastic Gradient Descent
# The learning rate is 0.1 (recent Keras versions take learning_rate; lr is the old, deprecated name)
model.compile(loss='binary_crossentropy', optimizer=SGD(learning_rate=0.1))

model.fit(x_data, y_data, epochs=1000, verbose=0)
  • Next, predict with the trained model.
    • The target values are 0 1 1 0. The predictions below are nowhere near them, so this model fails.
# Predict with the trained model
y_pred = model.predict(x_data)

# The target is 0 1 1 0, but every prediction sits around 0.5, far from it.
# This is why binary logistic regression cannot compute XOR.
print(y_pred)
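
To see for myself why the predictions get stuck near 0.5, here is a minimal sketch that trains a single sigmoid unit on XOR in plain Python, without Keras. The starting weights, the step count, and the learning rate are my own assumptions, not the values Keras uses internally. Whatever the starting point, the best a single linear boundary can do on XOR is to answer 0.5 for every input.

```python
import math

# XOR truth table: two inputs and the expected output.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
Y = [0.0, 1.0, 1.0, 0.0]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(steps=5000, lr=0.5):
    """Full-batch gradient descent on binary cross-entropy for one sigmoid unit."""
    w1, w2, b = 1.0, -1.0, 0.5  # arbitrary non-zero starting point (assumption)
    n = len(X)
    for _ in range(steps):
        g1 = g2 = gb = 0.0
        for (x1, x2), y in zip(X, Y):
            # For sigmoid + cross-entropy, dLoss/dz is simply (prediction - target).
            err = sigmoid(w1 * x1 + w2 * x2 + b) - y
            g1 += err * x1
            g2 += err * x2
            gb += err
        w1 -= lr * g1 / n
        w2 -= lr * g2 / n
        b -= lr * gb / n
    return w1, w2, b


w1, w2, b = train_logistic()
preds = [sigmoid(w1 * x1 + w2 * x2 + b) for x1, x2 in X]
print([round(p, 3) for p in preds])
```

All four predictions should land very close to 0.5, which matches what the Keras model printed above.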

- XOR with deep learning (MLP)

  • Everything is already imported, so go straight to building the model.
    • This model is a multilayer perceptron (MLP): unlike binary logistic regression, it is made of several layers.
    • ReLU is the activation function.
      It outputs 0 for inputs below 0 (y=0) and returns the input unchanged for inputs above 0 (y=x).
model = Sequential([ # Build a Sequential model
  Dense(8, activation='relu'), # Hidden layer: a fully connected Dense layer with 8 nodes and ReLU activation
  Dense(1, activation='sigmoid'), # One output with sigmoid, because the result is binary (0 or 1)
])

# Same as in the binary logistic regression model.
model.compile(loss='binary_crossentropy', optimizer=SGD(learning_rate=0.1))

model.fit(x_data, y_data, epochs=1000, verbose=0) 
  • With the model built, use it to predict the results.
    • The predictions come out close to the desired 0 1 1 0.
y_pred = model.predict(x_data)

# The values are now close to the target 0 1 1 0: the hidden layer is what lets the model represent XOR.
print(y_pred)
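
To convince myself that a hidden layer really is enough, here is a hand-built sketch with only two ReLU hidden units that reproduces XOR exactly. The weights are picked by hand, not learned, and the output is a plain linear sum instead of a sigmoid, so this is an illustration of the idea and not the trained Keras model above.

```python
def relu(x):
    """Return 0 for negative inputs and the input itself otherwise."""
    return max(0.0, x)


def xor_mlp(x1, x2):
    """Tiny 2-2-1 network with hand-picked weights (an assumption, not learned)."""
    h1 = relu(x1 + x2)        # counts how many inputs are on
    h2 = relu(x1 + x2 - 1.0)  # fires only when both inputs are on
    return h1 - 2.0 * h2      # linear output: cancels the (1, 1) case


inputs = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
outputs = [xor_mlp(a, b) for a, b in inputs]
print(outputs)  # [0.0, 1.0, 1.0, 0.0]
```

The second hidden unit bends the function at x1 + x2 = 1, which is exactly the non-linearity a single sigmoid unit cannot provide.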

- Keras Functional API

  • The Sequential API is convenient for models whose layers run in a straight line, but it is limiting for complex networks. The Functional API is what is mainly used in practice, so I worked through it here.
  • Import what the Functional API needs.
import numpy as np
from tensorflow.keras.models import Sequential, Model # Model is used instead of Sequential
from tensorflow.keras.layers import Dense, Input # Input is used in addition to Dense
from tensorflow.keras.optimizers import Adam, SGD
  • Build the model with the Functional API.
    • It has a hidden layer with ReLU as the activation function.
    • The input variable declared on the first line is passed in the trailing parentheses of the hidden line, and hidden is passed to the output line in the same way.
    • The argument names inputs and outputs are plural because a model can take several inputs and return several outputs. Here there is one input tensor with 2 features and one output.
    • Unlike the Sequential examples above, the model is declared with the Model class.
    • model.summary() prints an overview of the model.
input = Input(shape=(2,)) # The Functional API declares the shape of the input layer explicitly
hidden = Dense(8, activation='relu')(input) # Pass the input declared above when defining the hidden layer
output = Dense(1, activation='sigmoid')(hidden) # Pass hidden when defining the output layer

model = Model(inputs=input, outputs=output) # input and output go into the inputs and outputs arguments in the same way;
                                            # unlike the Sequential examples, the model is declared with the Model class

model.compile(loss='binary_crossentropy', optimizer=SGD(learning_rate=0.1))

# The input shape is declared up front, so model.summary() can show the full structure before any training
# (a Sequential model created without an input shape has to be built or fitted first)
model.summary()

<summary output>

  1. The input row shows 2 in Output Shape, meaning each sample has 2 input values.
  2. The dense row (the hidden layer) shows 8 because it has 8 nodes, and the output layer shows 1 for the same reason.
  3. The important part is 'None': it is the batch size.
    It is left as None because the batch size is chosen freely at training time.
  4. The parameter count of each layer and the total are listed too.
  5. Non-trainable params: parameters that training does not update, typically the moving statistics of normalization layers such as BatchNormalization or the weights of frozen layers. (Dropout has no parameters at all.)
  • Predict with the finished model.
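
The parameter counts in the summary can be checked by hand. A Dense layer has one weight per (input, unit) pair plus one bias per unit. The helper name dense_params below is my own, not a Keras function; it is only a small sketch of that arithmetic for the 2 -> 8 -> 1 network above.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


layer_sizes = [2, 8, 1]  # input features, hidden nodes, output nodes
counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(counts)
print(counts, total)  # [24, 9] 33
```

So the hidden layer should show 24 parameters, the output layer 9, and the total 33, all of them trainable.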
model.fit(x_data, y_data, epochs=1000, verbose=0)

y_pred = model.predict(x_data)

print(y_pred)
# The results come out close to the desired 0 1 1 0.


😭 English Alphabet Sign Language Dataset Practice

  • The data is made of images, so training can be slow. I changed the runtime settings as follows.
    [Runtime] - [Change runtime type] - [Hardware accelerator] select GPU - [Save]
  • The data comes from Kaggle, so enter the account information as below, then download and unzip the dataset.
import os
os.environ['KAGGLE_USERNAME'] = 'username' 
os.environ['KAGGLE_KEY'] = 'key' 

!kaggle datasets download -d datamunge/sign-language-mnist
!unzip sign-language-mnist.zip
  • Import as follows.
from tensorflow.keras.models import Model # for the Functional API
from tensorflow.keras.layers import Input, Dense # for the Functional API
from tensorflow.keras.optimizers import Adam, SGD
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import OneHotEncoder
  • This dataset ships the training set and the test set as separate files. Load the training set first.
# Load the training set
train_df = pd.read_csv('sign_mnist_train.csv')

train_df.head()

  • Load the test set.
# Load the test set
test_df = pd.read_csv('sign_mnist_test.csv')

test_df.head()

  • What the model has to predict is label, which identifies the letter, so first check how the labels are distributed.
    • 9=J and 25=Z are excluded because those signs involve motion.
      -> The alphabet has 26 letters, but this dataset contains only 24 of them!
    • The labels turn out to be fairly evenly distributed.
plt.figure(figsize=(16, 10))
# Check the distribution of each label with seaborn's countplot
sns.countplot(x=train_df['label']) # pass the column as x=; recent seaborn versions no longer accept it positionally
plt.show()

  • Split the data into features and labels.
# df.head() showed that label holds the letter, so separate it from the pixel columns as follows
train_df = train_df.astype(np.float32) # cast to float32 for Keras
x_train = train_df.drop(columns=['label']).values # .values converts the DataFrame to a NumPy array
y_train = train_df[['label']].values # the label column that was dropped above

# Split the test set the same way as the training set
test_df = test_df.astype(np.float32)
x_test = test_df.drop(columns=['label']).values
y_test = test_df[['label']].values

print(x_train.shape, y_train.shape) # (27455, 784) (27455, 1)
print(x_test.shape, y_test.shape)   # (7172, 784) (7172, 1)
# 784 input features per image; the label is still a single column at this point
  • Preview the data.
    • As the table at the top shows, each image is stored as a single row of pixel values. To display it as an image, the row has to be reshaped to 2D. An image has 784 pixels, so it becomes 28 x 28.
index = 1
plt.title(str(y_train[index])) # use the label from y_train as the title of the image previewed from x_train
plt.imshow(x_train[index].reshape((28, 28)), cmap='gray') # the pixels are laid out in one row, so reshape to 2D to view them as a picture
                                                          # 28 * 28 = 784
                                                          # cmap='gray' draws it in grayscale
plt.show()
# The sample with label 6 is displayed
# G
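
What reshape((28, 28)) does to the flat row can be sketched in plain Python. NumPy's default reshape fills the rows in order, so flat pixel i ends up at row i // 28 and column i % 28. The helper reshape_flat and the stand-in pixel values are my own assumptions for illustration, not part of the dataset.

```python
def reshape_flat(pixels, height=28, width=28):
    """Cut a flat list of pixels into `height` rows of `width` values (row-major order)."""
    if len(pixels) != height * width:
        raise ValueError("expected %d pixels, got %d" % (height * width, len(pixels)))
    return [pixels[r * width:(r + 1) * width] for r in range(height)]


flat = list(range(784))  # stand-in values so each pixel's position is easy to trace
image = reshape_flat(flat)
print(len(image), len(image[0]))  # 28 28
print(image[1][0])                # 28: the first pixel of the second row
```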

  • The countplot above shows that the labels are integers from 0 to 24 (9 is missing). One-hot encoding turns these numbers into a form that is easier for the model to work with.
    • One-hot encoding replaces each label with a vector of 0s and 1s that has a single 1 at the position of its class.
    • After this step, looking up the label of the image just above again gives the result below.
# A bare label such as 6 is hard for the model to use directly, so one-hot encode it.
encoder = OneHotEncoder()
y_train = encoder.fit_transform(y_train).toarray() # fit on the training labels and convert to a dense array
y_test = encoder.transform(y_test).toarray() # reuse the fitted encoder so test labels map to the same columns

print(y_train.shape) # (27455, 24)
# The shape was (27455, 1); after one-hot encoding the last dimension is 24.
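
A rough sketch of what the encoder does, in plain Python: collect the sorted distinct labels, then give each label a vector with a single 1 at its position in that sorted list. The helper names are mine, and the label set (0 to 24 without 9) is assumed from the dataset description. One detail worth noticing is that, because 9 is missing, every label above 9 lands one column earlier than its own value.

```python
def fit_categories(labels):
    """Sorted distinct labels, which is the column order of the one-hot vectors."""
    return sorted(set(labels))


def one_hot(label, categories):
    """Vector of zeros with a single 1.0 at the label's column."""
    vec = [0.0] * len(categories)
    vec[categories.index(label)] = 1.0
    return vec


labels_present = [l for l in range(25) if l != 9]  # assumed label set: 0-24 without 9 (J)
categories = fit_categories(labels_present)

print(len(categories))                     # 24
print(one_hot(6, categories).index(1.0))   # 6: label 6 (G) sits in column 6
print(one_hot(10, categories).index(1.0))  # 9: label 10 (K) sits in column 9
```

When turning predictions back into letters, the column index therefore has to be mapped through the category list, not read as the label directly.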
  • The pixel values range from 0 to 255, so divide by 255 to normalize them.
x_train = x_train / 255. # divide by 255 because that is the maximum pixel value
x_test = x_test / 255.   # the 0-255 values become float32 values between 0 and 1
  • Build the network.
input = Input(shape=(784,))                    # 784 input values; this tensor is passed to the next line
hidden = Dense(1024, activation='relu')(input) # the hidden layers have 1024, 512, and 256 nodes
hidden = Dense(512, activation='relu')(hidden) # and use ReLU as the activation function
hidden = Dense(256, activation='relu')(hidden) # each hidden tensor is passed in the (hidden) of the next line
output = Dense(24, activation='softmax')(hidden) # 24 outputs, one per letter except J and Z, with softmax as the activation

model = Model(inputs=input, outputs=output) # the Functional API uses the Model class; inputs/outputs are plural because a model may have several of each

# Multinomial logistic regression (multiclass classification), so use softmax with categorical_crossentropy
model.compile(loss='categorical_crossentropy', optimizer=Adam(learning_rate=0.001), metrics=['acc']) # metrics=['acc'] also reports accuracy (a value between 0 and 1)

model.summary() # overview of the model
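
To make the softmax and categorical_crossentropy pairing concrete, here is a small plain-Python sketch of both formulas. It is my own illustration, not how Keras implements them. Softmax turns the 24 raw outputs into probabilities that sum to 1, and the loss is the negative log of the probability given to the correct class.

```python
import math


def softmax(logits):
    """Exponentiate and normalize; subtracting the max keeps exp() from overflowing."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(one_hot_target, probs):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(p) for t, p in zip(one_hot_target, probs) if t > 0)


probs = softmax([0.0] * 24)  # a model with no preference among the 24 classes
target = [0.0] * 24
target[6] = 1.0              # one-hot vector for the class in column 6

loss = categorical_crossentropy(target, probs)
print(round(sum(probs), 6))
print(round(loss, 4), round(math.log(24), 4))
```

A model that guesses uniformly gets a loss of ln(24), roughly 3.18, so a first-epoch loss far above that would be a hint that something is off in the labels or the scaling.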

  • Train the model.
history = model.fit(
    x_train,
    y_train,
    validation_data=(x_test, y_test), # with validation data, the model is evaluated automatically at the end of every epoch
    epochs=20 # note the plural: epochs
)
  • Training result graphs
plt.figure(figsize=(16, 10))
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])


# x-axis: epochs / y-axis: loss
# training loss is blue
# val_loss is orange
# the loss keeps decreasing as training goes on!

plt.figure(figsize=(16, 10))
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])

# x-axis: epochs / y-axis: accuracy
# training acc is blue
# val_acc is orange
# accuracy, in contrast, keeps going up!
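
Besides eyeballing the curves, the history.history dictionary can be inspected directly, since it is just lists with one value per epoch. The sketch below finds the epoch with the lowest validation loss. The helper best_epoch is my own, and the numbers in toy_history are made up for illustration; they are not results from this training run.

```python
def best_epoch(val_losses):
    """Return (1-based epoch number, value) of the lowest validation loss."""
    best_index = min(range(len(val_losses)), key=lambda i: val_losses[i])
    return best_index + 1, val_losses[best_index]


# Made-up values shaped like history.history, for illustration only.
toy_history = {
    'loss':     [2.1, 1.2, 0.7, 0.4, 0.2],
    'val_loss': [2.0, 1.3, 0.9, 1.0, 1.2],
}

epoch, value = best_epoch(toy_history['val_loss'])
print(epoch, value)  # 3 0.9
```

In this toy case the training loss keeps falling after epoch 3 while the validation loss rises again, which is the usual sign of overfitting to look for in the real curves.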


😭 Week 3 Homework

Week 3 homework
