# Language Model
OPT: Open Pre-trained Transformer Language Models, arXiv 2022

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension, ACL 2020

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, arXiv 2018

[GPT] Improving Language Understanding by Generative Pre-Training
Sparks of Artificial General Intelligence: Early experiments with GPT-4, arXiv 2023

TempLM: Distilling Language Models into Template-Based Generators, arXiv 2022
Mind the Gap: Assessing Temporal Generalization in Neural Language Models, NeurIPS 2021 Spotlight
LLaMA: Open and Efficient Foundation Language Models, arXiv 2023
[T0] Multitask Prompted Training Enables Zero-Shot Task Generalization, ICLR 2022
RoBERTa: A Robustly Optimized BERT Pretraining Approach, arXiv 2019 (Facebook AI)

BLEURT: Learning Robust Metrics for Text Generation, ACL 2020

Neural Text Generation with Unlikelihood Training, ICLR 2020
T0 (V. Sanh et al., 2022, ICLR)
Multitask Prompted Training Enables Zero-Shot Task Generalization (paper review)

[Paper Review] Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
SSL (Self-Supervised Learning): Understanding NLP, Part 3!

Trying Out KoGPT
KoGPT is the Korean-language GPT-3 variant that Kakao Brain released in 2021. I had a task that called for a language model, so I decided to give it a try... but how do you actually use it?
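Since the post asks exactly that question, here is a minimal loading sketch using Hugging Face `transformers`. It assumes the checkpoint is published on the Hub as `kakaobrain/kogpt` with a `KoGPT6B-ryan1.5b-float16` revision and the bracketed special tokens shown below; these identifiers should be verified against the official model card before running.

```python
# Minimal sketch: loading KoGPT via Hugging Face transformers.
# The model id, revision, and special tokens are assumptions based on the
# public KoGPT release; check the model card for the exact values.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_ID = "kakaobrain/kogpt"              # assumed Hub model id
REVISION = "KoGPT6B-ryan1.5b-float16"      # assumed fp16 revision

tokenizer = AutoTokenizer.from_pretrained(
    MODEL_ID,
    revision=REVISION,
    bos_token="[BOS]", eos_token="[EOS]",
    unk_token="[UNK]", pad_token="[PAD]", mask_token="[MASK]",
)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    revision=REVISION,
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
).eval()

# Generate a short Korean continuation from a prompt.
prompt = "대한민국의 수도는"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```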

[Review] Improving Language Understanding by Generative Pre-Training (GPT-1)
Presentation slides for the DSAIL story-generation study group: <GPT-1>

What Language Model to Train if You Have One Million GPU Hours?
Given one million A100 GPU hours, what are the best architecture and training settings for training a 100B+ parameter model?
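The question in the title can be framed with the standard C ≈ 6·N·D estimate for dense-transformer training FLOPs (compute ≈ 6 × parameters × tokens). The sketch below runs that arithmetic for a one-million-A100-hour budget; the peak-throughput and utilization figures are illustrative assumptions, not numbers taken from the paper.

```python
# Back-of-the-envelope compute budgeting with C ≈ 6 * N * D.
# A100 peak BF16 throughput and the sustained-utilization fraction are
# illustrative assumptions, not values reported in the paper.
GPU_HOURS = 1_000_000          # budget from the title
A100_PEAK_FLOPS = 312e12       # A100 peak BF16 tensor-core throughput (FLOP/s)
UTILIZATION = 0.35             # assumed fraction of peak actually sustained

total_flops = GPU_HOURS * 3600 * A100_PEAK_FLOPS * UTILIZATION

def tokens_for(params: float) -> float:
    """Tokens trainable within the budget for a model with `params` parameters."""
    return total_flops / (6 * params)

for n in (50e9, 100e9, 175e9):
    print(f"{n / 1e9:.0f}B params -> ~{tokens_for(n) / 1e9:.0f}B tokens")
```

Under these assumptions a 100B-parameter model gets on the order of a few hundred billion training tokens, which is why the paper treats model size, token count, and architecture choices as a joint budgeting problem rather than independent knobs.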