PARALLELIZING LINEAR RECURRENT NEURAL NETS OVER SEQUENCE LENGTH

About_work · July 25, 2023

Abstract

  • We show that the training of RNNs with only linear sequential dependencies can be parallelized over the sequence length using the parallel scan algorithm (a minimal scan sketch follows this list),
    • leading to rapid training on long sequences even with small minibatch size.
  • We develop a parallel linear recurrence CUDA kernel and show that
    • it can be applied to immediately speed up training and inference of several state-of-the-art RNN architectures by up to 9x.
  • We abstract recent work on linear RNNs into a new framework of linear surrogate RNNs
  • and develop a linear surrogate model for the long short-term memory unit, the GILR-LSTM, that utilizes parallel linear recurrence (see the GILR sketch below).
  • We extend sequence learning to new extremely long sequence regimes that were previously out of reach
    • by successfully training a GILR-LSTM on a synthetic sequence classification task with a one million timestep dependency.
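
The following is not the paper's CUDA kernel, just a minimal JAX sketch of the core idea: a linear recurrence h_t = λ_t ⊙ h_{t-1} + x_t can be rewritten as a composition of affine maps, which is associative, so the whole sequence can be evaluated with a parallel scan (`jax.lax.associative_scan`). The function name `linear_recurrence_scan` is illustrative.

```python
# A minimal sketch: computing h_t = lam_t * h_{t-1} + x_t for all t with a
# parallel (associative) scan instead of a sequential loop.
import jax
import jax.numpy as jnp

def linear_recurrence_scan(lam, x, h0=0.0):
    """lam, x: arrays of shape (T, d). Returns all hidden states, shape (T, d)."""
    # Fold the initial state into the first input.
    x = x.at[0].add(lam[0] * h0)

    # Each timestep is the affine map h -> lam_t * h + x_t. Composing two such
    # maps (earlier a, then later b) gives another affine map:
    #   h -> (a_lam * b_lam) * h + (b_lam * a_x + b_x),
    # so the combine operator is associative and a parallel scan applies.
    def combine(a, b):
        a_lam, a_x = a
        b_lam, b_x = b
        return a_lam * b_lam, b_lam * a_x + b_x

    _, h = jax.lax.associative_scan(combine, (lam, x))
    return h

# Reference check against the obvious sequential loop.
T, d = 8, 3
lam = jax.random.uniform(jax.random.PRNGKey(0), (T, d))
x = jax.random.normal(jax.random.PRNGKey(1), (T, d))

h_seq, hs = jnp.zeros(d), []
for t in range(T):
    h_seq = lam[t] * h_seq + x[t]
    hs.append(h_seq)
print(jnp.allclose(jnp.stack(hs), linear_recurrence_scan(lam, x), atol=1e-5))
```

The sequential loop above does O(T) dependent steps, while the scan version exposes O(T / log T)-style parallelism over the sequence length, which is what makes long sequences with small minibatches fast on a GPU.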

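And a rough sketch of the linear-surrogate idea behind GILR: the gate and the candidate depend only on the current input (not on h_{t-1}), so the hidden-state update is a linear recurrence and the scan above applies directly. Parameter names (W, U, b_g, b_i) and the tanh candidate are illustrative assumptions, not necessarily the paper's exact parameterization.

```python
# Sketch of a GILR-style layer (gated impulse linear recurrence):
#   g_t = sigmoid(W x_t + b_g)          gate, depends only on x_t
#   i_t = tanh(U x_t + b_i)             candidate ("impulse"), only on x_t
#   h_t = g_t * h_{t-1} + (1 - g_t) * i_t   -> linear in h, scannable
def gilr_layer(params, xs):
    """xs: (T, d_in) inputs; returns (T, d_hidden) hidden states."""
    W, b_g, U, b_i = params
    g = jax.nn.sigmoid(xs @ W + b_g)
    i = jnp.tanh(xs @ U + b_i)
    return linear_recurrence_scan(lam=g, x=(1.0 - g) * i)

# Toy usage with hypothetical shapes.
d_in, d_h = 5, 4
k1, k2 = jax.random.split(jax.random.PRNGKey(2))
params = (jax.random.normal(k1, (d_in, d_h)) * 0.1, jnp.zeros(d_h),
          jax.random.normal(k2, (d_in, d_h)) * 0.1, jnp.zeros(d_h))
h = gilr_layer(params, jax.random.normal(jax.random.PRNGKey(3), (8, d_in)))
print(h.shape)  # (8, 4)
```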