Neural Machine Translation by Jointly Learning to Align and Translate (a.k.a. Bahdanau Attention)
Effective Approaches to Attention-based Neural Machine Translation (a.k.a. Luong Attention)
Attention Is All You Need (a.k.a. Transformer; see the attention sketch after this list)
Improving Language Understanding by Generative Pre-Training (a.k.a. GPT-1)
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (a.k.a. BERT)
Language Models are Unsupervised Multitask Learners (a.k.a. GPT-2)
Language Models are Few-Shot Learners (a.k.a. GPT-3)
Multi-Task Deep Neural Networks for Natural Language Understanding (a.k.a. MT-DNN)
MASS: Masked Sequence to Sequence Pre-training for Language Generation (a.k.a. MASS)
XLNet: Generalized Autoregressive Pretraining for Language Understanding (a.k.a. XLNet)
RoBERTa: A Robustly Optimized BERT Pretraining Approach (a.k.a. RoBERTa)
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension (a.k.a. BART)
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (a.k.a. T5)
Robust Speech Recognition via Large-Scale Weak Supervision (a.k.a. Whisper)
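
Nearly every paper above builds on the attention mechanism, so a minimal sketch may help orient first-time readers. Below is a NumPy version of the scaled dot-product attention from "Attention Is All You Need"; the function name and toy shapes are mine, chosen only for illustration, not taken from any of the papers.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V.

    Q: (seq_q, d_k), K: (seq_k, d_k), V: (seq_k, d_v)
    Returns the context vectors (seq_q, d_v) and the attention weights.
    """
    d_k = Q.shape[-1]
    # Similarity of each query with every key, scaled by sqrt(d_k)
    scores = Q @ K.T / np.sqrt(d_k)
    # Numerically stable softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# Toy usage: 3 query positions, 4 key/value positions, d_k = d_v = 8
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 8)), rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
context, weights = scaled_dot_product_attention(Q, K, V)
print(context.shape, weights.shape)  # (3, 8) (3, 4)
```

Bahdanau and Luong attention differ mainly in how the scores are computed (an additive MLP vs. dot-product variants); the Transformer keeps the dot-product form above and applies it in multiple heads.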
If you come across this post, please recommend papers to add to this list.
your recommended paper 1
your recommended paper 2
your recommended paper 3
Interspeech Conference
data2vec (Meta AI)
NVIDIA - Language Understanding Model
ERNIE
ULMFiT
ELMo
HyperCLOVA (Naver)
KoGPT (Kakao)
KorBERT (ETRI)
Learning to Identify Ambiguous and Misleading News Headlines
Why Does Unsupervised Pre-training Help Deep Learning?
PaLM: Scaling Language Modeling with Pathways
LaMDA
Gopher
GLaM
DALL-E (text-to-image, but worth studying for language work)
YOLO (computer vision, but worth studying for language work)
Chinchilla
BIG-bench
DialogBERT
A Neural Conversational Model (Google) - a generative conversational model
Meena
LUKE (Deep Contextualized Entity Representations with Entity-aware Self-attention)
RNN