lstm 13. Keras SimpleRNN and LSTM

행동하는 개발자 · December 19, 2022


simple rnn

First, create an arbitrary input for testing the RNN and the LSTM.

import numpy as np

train_X = [[[0.1, 4.2, 1.5, 1.1, 2.8], [1.0, 3.1, 2.5, 0.7, 1.1], [0.3, 2.1, 1.5, 2.1, 0.1], [2.2, 1.4, 0.5, 0.9, 1.1]]]
train_X = np.array(train_X, dtype=np.float32)

print(train_X.shape)
(1, 4, 5)

This is a 3D tensor of shape (batch_size, timesteps, input_dim). The batch size is the number of samples the RNN processes at once, but here there is only one sample.
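To make the three axes concrete, here is a small sketch that unpacks the shape and indexes a single time step. It only uses NumPy and the same values as train_X above; the variable names batch_size, timesteps, input_dim and first_step are chosen here for illustration.

```python
import numpy as np

# Same values as the train_X used in this post.
train_X = np.array(
    [[[0.1, 4.2, 1.5, 1.1, 2.8],
      [1.0, 3.1, 2.5, 0.7, 1.1],
      [0.3, 2.1, 1.5, 2.1, 0.1],
      [2.2, 1.4, 0.5, 0.9, 1.1]]],
    dtype=np.float32,
)

# Axis 0: samples in the batch, axis 1: time steps, axis 2: features per step.
batch_size, timesteps, input_dim = train_X.shape
print(batch_size, timesteps, input_dim)  # 1 4 5

# train_X[b, t] is the input vector of sample b at time step t.
first_step = train_X[0, 0]
print(first_step.shape)  # (5,)
```

The recurrent layer reads one such 5-dimensional vector per time step, four times in total for this sample.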

from tensorflow.keras.layers import SimpleRNN

rnn = SimpleRNN(3)
# Same as rnn = SimpleRNN(3, return_sequences=False, return_state=False).
hidden_state = rnn(train_X)

print('hidden state : {}, shape: {}'.format(hidden_state, hidden_state.shape))

Output
hidden state : [[-0.866719    0.95010996 -0.99262357]], shape: (1, 3)

With the hidden state size set to 3, a tensor of shape (1, 3) is produced. This is the hidden state at the last time step.

rnn = SimpleRNN(3, return_sequences=True)
hidden_states = rnn(train_X)

print('hidden states : {}, shape: {}'.format(hidden_states, hidden_states.shape))

Output
hidden states : [[[ 0.92948604 -0.9985648   0.98355013]
  [ 0.89172053 -0.9984244   0.191779  ]
  [ 0.6681082  -0.96070355  0.6493537 ]
  [ 0.95280755 -0.98054564  0.7224146 ]]], shape: (1, 4, 3)

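A tensor of shape (1, 4, 3) is produced. The input has shape (1, 4, 5), where 4 is the number of time steps, so with return_sequences=True the layer outputs the hidden state at every time step, which gives a (1, 4, 3) tensor. The values do not match the previous output because this is a newly created layer with freshly initialized random weights.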
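The difference between the two return modes can be sketched in plain NumPy. This is a minimal illustration of the usual simple-RNN recurrence h_t = tanh(x_t W + h_{t-1} U + b), not Keras's actual implementation; the function name simple_rnn_forward and the random weights are assumptions made for this example.

```python
import numpy as np

def simple_rnn_forward(x, W, U, b, return_sequences=False):
    """x: (batch, timesteps, input_dim), W: (input_dim, units),
    U: (units, units), b: (units,)."""
    batch, timesteps, _ = x.shape
    units = U.shape[0]
    h = np.zeros((batch, units), dtype=x.dtype)  # initial hidden state is zero
    states = []
    for t in range(timesteps):
        # New hidden state from the current input and the previous hidden state.
        h = np.tanh(x[:, t, :] @ W + h @ U + b)
        states.append(h)
    if return_sequences:
        return np.stack(states, axis=1)  # (batch, timesteps, units)
    return h  # (batch, units): last time step only

# Hypothetical random input and weights, only for checking shapes.
rng = np.random.default_rng(0)
x = rng.normal(size=(1, 4, 5)).astype(np.float32)
W = rng.normal(size=(5, 3)).astype(np.float32)
U = rng.normal(size=(3, 3)).astype(np.float32)
b = np.zeros(3, dtype=np.float32)

last = simple_rnn_forward(x, W, U, b)
all_states = simple_rnn_forward(x, W, U, b, return_sequences=True)
print(last.shape)        # (1, 3)
print(all_states.shape)  # (1, 4, 3)
```

With the same weights, the last row of all_states equals last. Because of tanh, every value lies between -1 and 1, which matches the range of the outputs printed above.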

LSTM

from tensorflow.keras.layers import LSTM

lstm = LSTM(3, return_sequences=True, return_state=True)
hidden_states, last_hidden_state, last_cell_state = lstm(train_X)

print('hidden states : {}, shape: {}'.format(hidden_states, hidden_states.shape))
print('last hidden state : {}, shape: {}'.format(last_hidden_state, last_hidden_state.shape))
print('last cell state : {}, shape: {}'.format(last_cell_state, last_cell_state.shape))

Output
hidden states : [[[ 0.1383949   0.01107763 -0.00315794]
  [ 0.0859854   0.03685492 -0.01836833]
  [-0.02512104  0.12305924 -0.0891041 ]
  [-0.27381724  0.05733536 -0.04240693]]], shape: (1, 4, 3)
last hidden state : [[-0.27381724  0.05733536 -0.04240693]], shape: (1, 3)
last cell state : [[-0.39230722  1.5474017  -0.6344505 ]], shape: (1, 3)

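Unlike SimpleRNN, the LSTM returns three results here: the hidden states at every time step (because return_sequences=True), the hidden state at the last time step, and the cell state at the last time step. The last row of hidden states is identical to the last hidden state.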
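Where the three return values come from can be sketched with a small NumPy LSTM. This is a simplified illustration of the standard LSTM equations, not the Keras code; the function name lstm_forward, the gate ordering inside the weight matrices, and the random weights are assumptions for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_forward(x, W, U, b):
    """x: (batch, timesteps, input_dim), W: (input_dim, 4*units),
    U: (units, 4*units), b: (4*units,)."""
    batch, timesteps, _ = x.shape
    units = U.shape[0]
    h = np.zeros((batch, units))  # hidden state
    c = np.zeros((batch, units))  # cell state
    hidden_states = []
    for t in range(timesteps):
        z = x[:, t, :] @ W + h @ U + b
        # Gate order assumed here: input, forget, candidate, output.
        i, f, g, o = np.split(z, 4, axis=1)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        hidden_states.append(h)
    # All hidden states, the last hidden state, the last cell state.
    return np.stack(hidden_states, axis=1), h, c

# Hypothetical random input and weights, only for checking shapes.
rng = np.random.default_rng(0)
x = rng.normal(size=(1, 4, 5))
W = rng.normal(size=(5, 12))
U = rng.normal(size=(3, 12))
b = np.zeros(12)

hidden_states, last_h, last_c = lstm_forward(x, W, U, b)
print(hidden_states.shape)  # (1, 4, 3)
print(last_h.shape)         # (1, 3)
print(last_c.shape)         # (1, 3)
```

Since h = sigmoid(o) * tanh(c), each entry of the last hidden state is never larger in magnitude than the matching entry of the cell state, which is also what the printed output above shows.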

Understanding Bidirectional

When return_sequences is False and return_state is True

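import tensorflow as tf
from tensorflow.keras.layers import LSTM, Bidirectional

# Assumed (not defined in the post): constant initializers; the values are guesses.
k_init = tf.keras.initializers.Constant(value=0.1)
b_init = tf.keras.initializers.Constant(value=0)
r_init = tf.keras.initializers.Constant(value=0.1)

bilstm = Bidirectional(LSTM(3, return_sequences=False, return_state=True,
                            kernel_initializer=k_init, bias_initializer=b_init, recurrent_initializer=r_init))
hidden_states, forward_h, forward_c, backward_h, backward_c = bilstm(train_X)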

print('hidden states : {}, shape: {}'.format(hidden_states, hidden_states.shape))
print('forward state : {}, shape: {}'.format(forward_h, forward_h.shape))
print('backward state : {}, shape: {}'.format(backward_h, backward_h.shape))

Output
hidden states : [[0.6303139  0.6303139  0.6303139  0.70387346 0.70387346 0.70387346]], shape: (1, 6)
forward state : [[0.6303139 0.6303139 0.6303139]], shape: (1, 3)
backward state : [[0.70387346 0.70387346 0.70387346]], shape: (1, 3)
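In the output above, the (1, 6) tensor is the forward LSTM's last hidden state followed by the backward LSTM's last hidden state. The NumPy sketch below illustrates that concatenation. It is a rough model under stated assumptions, not the Keras implementation: lstm_last_state is a hypothetical helper, and the constant weights (0.1 for both weight matrices, 0 for the bias) are assumed values that are not expected to reproduce the exact numbers printed above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_last_state(x, W, U, b):
    """Run an LSTM over x and return only the last hidden and cell state."""
    units = U.shape[0]
    h = np.zeros((x.shape[0], units))
    c = np.zeros((x.shape[0], units))
    for t in range(x.shape[1]):
        i, f, g, o = np.split(x[:, t, :] @ W + h @ U + b, 4, axis=1)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h, c

# Same values as the train_X used in this post.
x = np.array([[[0.1, 4.2, 1.5, 1.1, 2.8],
               [1.0, 3.1, 2.5, 0.7, 1.1],
               [0.3, 2.1, 1.5, 2.1, 0.1],
               [2.2, 1.4, 0.5, 0.9, 1.1]]])

# Assumed constant weights: every unit then computes exactly the same value.
W = np.full((5, 12), 0.1)
U = np.full((3, 12), 0.1)
b = np.zeros(12)

# Forward direction reads t = 0..3, backward direction reads t = 3..0.
forward_h, forward_c = lstm_last_state(x, W, U, b)
backward_h, backward_c = lstm_last_state(x[:, ::-1, :], W, U, b)

# With return_sequences=False the two last hidden states are concatenated.
hidden_states = np.concatenate([forward_h, backward_h], axis=1)
print(hidden_states.shape)  # (1, 6)
```

Constant weights make the three units of each direction identical, which is why the printed output repeats one value three times per direction. The forward and backward halves differ only because the two directions read the sequence in opposite order.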
