Aiffel Yangjae Cohort 2 - Day 71 (2022.04.11)
Study Log
[Going Deeper - NLP]
- Let's ride the wave of modern NLP
- Word Embedding, context
- Transfer Learning
- Language Modeling
- Transformer
- ELMo (Embeddings from Language Models)
- character-level CNN
- bidirectional LSTM
- ELMo embedding layer
- Using ELMo (see the sketch below)
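To make the ELMo items above concrete, here is a minimal NumPy sketch (illustrative names and shapes, not the reference implementation) of how a downstream task consumes ELMo: the embedding for each token is a task-learned, softmax-weighted sum of the biLM's layer representations, scaled by a task-specific gamma.

```python
import numpy as np

def elmo_embedding(layer_reps, s_logits, gamma):
    """Combine biLM layer representations into one ELMo vector per token.

    layer_reps: (L+1, seq_len, dim) array: char-CNN layer plus L biLSTM layers
    s_logits:   (L+1,) task-learned layer weights (softmax-normalized below)
    gamma:      scalar that lets the task rescale the whole vector
    """
    s = np.exp(s_logits) / np.exp(s_logits).sum()        # softmax over layers
    return gamma * np.tensordot(s, layer_reps, axes=1)   # (seq_len, dim)

# Toy example: 3 layers (char-CNN + 2 biLSTM), 5 tokens, 8-dim vectors
reps = np.random.randn(3, 5, 8)
emb = elmo_embedding(reps, s_logits=np.zeros(3), gamma=1.0)
print(emb.shape)  # (5, 8)
```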
- GPT (Generative Pre-trained Transformer)
- Transformer Decoder Block: LM pretraining (unsupervised learning)
- Embedding
- Masked Multi-Head Attention (sketched after this list)
- Text Prediction & Text Classification: fine-tuning on downstream tasks (supervised learning)
- Input Transformation
- GPT vs. GPT-2
- GPT-3
- GPT-Neo
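The decoder-only pretraining above hinges on masked self-attention: each position may attend only to itself and earlier positions, so the model can be trained left-to-right as a language model. A single-head NumPy sketch with illustrative names:

```python
import numpy as np

def causal_self_attention(x, Wq, Wk, Wv):
    """Single-head self-attention with a causal (look-back-only) mask."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    mask = np.triu(np.ones(scores.shape, dtype=bool), k=1)  # future positions
    scores[mask] = -1e9                                     # block them
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)          # row-wise softmax
    return weights @ v

T, d = 4, 8
x = np.random.randn(T, d)
out = causal_self_attention(x, *(np.random.randn(d, d) for _ in range(3)))
print(out.shape)  # (4, 8)
```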
- BERT (Bidirectional Encoder Representations from Transformers)
- Transformer Encoder Block
- Embedding
- Token Embedding
- Segment Embedding
- Position Embedding
- Activation function: GELU
- Masked LM (MLM): masking rule sketched below
- Next Sentence Prediction (NSP)
- Fine-tuning Task
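For the Masked LM item above, BERT's published corruption rule selects about 15% of tokens as prediction targets; of those, 80% are replaced with [MASK], 10% with a random token, and 10% are left unchanged. A small sketch with illustrative names:

```python
import random

def mlm_corrupt(token_ids, vocab_size, mask_id, select_prob=0.15):
    """Apply BERT's MLM corruption rule; return corrupted ids and the
    positions the model must predict."""
    corrupted, targets = list(token_ids), []
    for i, _ in enumerate(token_ids):
        if random.random() < select_prob:
            targets.append(i)
            r = random.random()
            if r < 0.8:
                corrupted[i] = mask_id                       # 80%: [MASK]
            elif r < 0.9:
                corrupted[i] = random.randrange(vocab_size)  # 10%: random token
            # remaining 10%: keep the original token
    return corrupted, targets

ids, positions = mlm_corrupt([5, 17, 42, 9, 3], vocab_size=100, mask_id=0)
print(ids, positions)
```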
- Transformer-XL (Transformer Extra Long)
- Vanilla Transformer LMs
- Segment-level recurrence with state reuse (sketched below)
- Relative Positional Encodings
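A rough NumPy sketch of segment-level recurrence with state reuse: queries come from the current segment, while keys and values are built over the cached previous segment plus the current one, so context flows across segment boundaries. The causal mask and relative positional encodings of the real model are omitted for brevity, and all names are illustrative.

```python
import numpy as np

def segment_attention(h_curr, memory, Wq, Wk, Wv):
    """One attention step that reuses the cached hidden states of the
    previous segment as extra context."""
    ctx = h_curr if memory is None else np.concatenate([memory, h_curr], axis=0)
    q, k, v = h_curr @ Wq, ctx @ Wk, ctx @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    new_memory = h_curr.copy()  # cached for the next segment (detached in the real model)
    return w @ v, new_memory

d = 8
Wq, Wk, Wv = (np.random.randn(d, d) for _ in range(3))
mem = None
for segment in np.random.randn(3, 4, d):  # 3 consecutive segments of 4 tokens
    out, mem = segment_attention(segment, mem, Wq, Wk, Wv)
print(out.shape)  # (4, 8)
```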
- XLNet, BART
- Permutation Language Model
- AR (autoregressive)
- AE (autoencoding)
- Two-Stream Self-Attention (mask construction sketched below)
- BART
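For the permutation LM and Two-Stream Self-Attention items above, here is an illustrative sketch of how one factorization order induces XLNet-style attention masks: a token may attend to tokens that come earlier in the permuted order; the content stream additionally sees its own position, while the query stream (used to predict the token) must not.

```python
import numpy as np

def permutation_masks(order):
    """Build content-stream and query-stream attention masks for one
    factorization order (mask[i, j] == True means i may attend to j)."""
    T = len(order)
    content = np.zeros((T, T), dtype=bool)
    query = np.zeros((T, T), dtype=bool)
    for t, i in enumerate(order):
        for j in order[:t]:          # tokens earlier in the permuted order
            content[i, j] = query[i, j] = True
        content[i, i] = True         # only the content stream sees itself
    return content, query

content, query = permutation_masks(order=[2, 0, 3, 1])
print(content.astype(int))
print(query.astype(int))
```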
- ALBERT (A Lite BERT for Self-supervised Learning of Language Representations)
- Factorized embedding parameterization (parameter-count example below)
- Cross-layer parameter sharing
- Inter-sentence coherence loss
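The factorized embedding parameterization is, at heart, a parameter-count trick: instead of a V x H embedding table tied to the hidden size, ALBERT learns a small V x E table plus an E x H projection. A quick count using sizes from the paper (V=30000, H=768, E=128):

```python
def embedding_params(vocab, hidden, emb=None):
    """Input-embedding parameter count: BERT-style V*H, or ALBERT-style
    factorized V*E + E*H."""
    return vocab * hidden if emb is None else vocab * emb + emb * hidden

V, H, E = 30000, 768, 128
print(embedding_params(V, H))     # BERT-style:   23,040,000
print(embedding_params(V, H, E))  # ALBERT-style:  3,938,304
```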
- T5 (Text-to-Text Transfer Transformer)
- C4 (Colossal Clean Crawled Corpus)
- Shared Text-To-Text Framework (format examples below)
- Modified MLM
- Model architecture
- New tasks
- Closed-Book Question Answering
- fill-in-the-blank task
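A few examples of the shared text-to-text format and the modified MLM (span corruption) objective, adapted from the T5 paper; the sentinels <X>, <Y>, <Z> stand in for T5's extra-id tokens:

```python
# Every task is cast as text in, text out; a prefix names the task.
examples = [
    ("translate English to German: That is good.", "Das ist gut."),
    ("summarize: state authorities dispatched emergency crews tuesday ...",
     "six people hospitalized after a storm in attala county ."),
]

# The modified MLM corrupts contiguous spans and asks the decoder to
# reproduce them, delimited by sentinel tokens.
span_corruption = (
    "Thank you <X> me to your party <Y> week .",  # encoder input
    "<X> for inviting <Y> last <Z>",              # decoder target
)

for src, tgt in examples + [span_corruption]:
    print(f"{src!r} -> {tgt!r}")
```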
- Switch Transformer
- MoE (Mixture of Experts)
- Switch Routing (top-1 routing sketched below)
- Distributed Switch Implementation
- Differentiable Load Balancing Loss
- Selective precision
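A minimal NumPy sketch of Switch Routing: each token is sent to exactly one expert (the top-1 of a softmax gate), and the expert's output is scaled by the gate probability so the router stays differentiable. The load-balancing loss, capacity factor, and selective precision are omitted, and all names are illustrative.

```python
import numpy as np

def switch_route(x, Wg, experts):
    """Route each token (row of x) to its top-1 expert and scale the
    expert output by the gate probability."""
    logits = x @ Wg                                   # (tokens, n_experts)
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)        # softmax gate
    choice = probs.argmax(axis=-1)                    # top-1 expert per token
    out = np.empty_like(x)
    for i, e in enumerate(choice):
        out[i] = experts[e](x[i]) * probs[i, e]
    return out, choice

d, n_experts = 8, 4
experts = [lambda v, W=np.random.randn(d, d): v @ W for _ in range(n_experts)]
out, choice = switch_route(np.random.randn(6, d), np.random.randn(d, n_experts), experts)
print(choice)  # which expert each of the 6 tokens was routed to
```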
- ERNIE
- PaddlePaddle
- ERNIE v1
- Masking Strategies (entity-level masking sketched below)
- Transformer Encoder
- ERNIE v3
- ERNIE v3 (ERNIE v1 + ERNIE v2)
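To illustrate the ERNIE v1 masking strategies above: unlike BERT's token-level masking, ERNIE masks whole phrase/entity spans so the model must recover multi-token units from context. A toy sketch; span boundaries would come from an external tagger, and all names are illustrative.

```python
import random

def entity_mask(tokens, entity_spans, mask_token="[MASK]", prob=0.15):
    """Mask whole entity/phrase spans instead of isolated tokens."""
    out = list(tokens)
    for start, end in entity_spans:        # [start, end) spans from a tagger
        if random.random() < prob:
            for i in range(start, end):
                out[i] = mask_token
    return out

tokens = ["Harry", "Potter", "is", "a", "series", "by", "J.", "K.", "Rowling"]
spans = [(0, 2), (6, 9)]                   # "Harry Potter", "J. K. Rowling"
print(entity_mask(tokens, spans, prob=1.0))
```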
Retrospective
- Today I went through a massive range of Transformer-family models.
- I can't master them all right now, but I'll study them bit by bit!
- Also, after today's schedule I'm heading to the DeepLab seminar.
- The topic is Data2vec!
- It was really interesting and fun.
- Let's keep studying big models little by little!
- Keep Going!