Resurrecting Recurrent Neural Networks for Long Sequences

About_work · July 25, 2023

Deep Learning


Abstract

  • RNNs
    • offer fast inference on long sequences,
    • but are hard to optimize and slow to train.
  • Deep state-space models (SSMs)
    • perform remarkably well on long-sequence modeling tasks,
    • with fast, parallelizable training and fast inference.
  • We show that careful design of deep RNNs using standard signal-propagation arguments can
    • recover the impressive performance of deep SSMs on long-range reasoning tasks,
    • while also matching their training speed.
  • We analyze and ablate a series of changes to standard RNNs, including
    • linearizing and diagonalizing the recurrence,
    • using better parameterizations and initializations, and
    • ensuring proper normalization of the forward pass.
  • These results provide insight into why SSMs perform so well, and yield a new module called the Linear Recurrent Unit (LRU), which combines strong long-range performance with computational efficiency.
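To make the "linearizing and diagonalizing the recurrence" idea concrete, here is a minimal NumPy sketch of a diagonal linear recurrence in the spirit of the LRU. All sizes and the initialization range are illustrative assumptions, not the paper's exact parameterization (the paper initializes eigenvalue magnitudes on a ring close to 1 and uses an exponential parameterization); the point is that the state update is elementwise, stable when |λ| < 1, and, being linear, admits a closed form that enables parallel training.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for illustration.
d_in, d_hidden, T = 4, 8, 100

# Diagonal recurrence: complex eigenvalues sampled inside the unit disk,
# so the recurrence h_t = lam * h_{t-1} + B x_t does not blow up.
mag = rng.uniform(0.5, 0.99, d_hidden)
phase = rng.uniform(0, 2 * np.pi, d_hidden)
lam = mag * np.exp(1j * phase)              # diagonal state matrix (vector)

B = rng.normal(size=(d_hidden, d_in)) + 0j  # input projection
C = rng.normal(size=(d_in, d_hidden)) + 0j  # output projection

x = rng.normal(size=(T, d_in))              # input sequence

# Sequential (inference-style) evaluation.
h = np.zeros(d_hidden, dtype=complex)
ys = []
for t in range(T):
    h = lam * h + B @ x[t]                  # elementwise state update
    ys.append((C @ h).real)                 # real part of the readout
ys = np.array(ys)

# Because the recurrence is linear and diagonal, the final state also has
# the closed form h_T = sum_t lam^(T-1-t) * (B x_t), which is what makes
# parallel (scan/convolution-style) training possible.
powers = lam[None, :] ** np.arange(T)[::-1, None]  # lam^(T-1-t)
h_closed = (powers * (x @ B.T)).sum(axis=0)
print(np.allclose(h, h_closed))             # True
```

The sequential loop is what makes inference fast (O(1) state per step), while the closed form shows why training need not unroll the loop step by step.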
