[NeurIPS 2020] Energy-based Out-of-distribution Detection

August 22, 2023

https://arxiv.org/abs/2010.03759


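This paper proposes a unified framework for OOD detection based on an energy score. In-distribution inputs tend to receive lower energy than out-of-distribution inputs, and this gap in energy separates the two effectively. The energy score also avoids a critical weakness of softmax confidence, which can take arbitrarily high values on OOD examples. The paper first explains how to use the energy of a pre-trained model and how energy relates to the softmax score, and then describes fine-tuning with an energy-based loss.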

Background: Energy-based Models

To follow the paper, one first needs to understand energy-based models. The essence of an energy-based model (EBM) is an energy function $E(\mathbf{x}):\mathbb{R}^D\rightarrow \mathbb{R}$ that maps each point $\mathbf{x}$ of the input space to a single, non-probabilistic scalar called the energy. A collection of energy values can be turned into a probability density through the Gibbs distribution. For an input $\mathbf{x}$ and a label $y$ with joint energy $E(\mathbf{x},y)$, the conditional form is:

$$
p(y|\mathbf{x})=\frac{e^{-E(\mathbf{x},y)/T}}{\int_{y^\prime}e^{-E(\mathbf{x},y^\prime)/T}}=\frac{e^{-E(\mathbf{x},y)/T}}{e^{-E(\mathbf{x})/T}}
$$

The denominator $\int_{y^\prime}e^{-E(\mathbf{x},y^\prime)/T}$ is called the partition function; it marginalizes over $y$, and $T$ is the temperature parameter. The Helmholtz free energy $E(\mathbf{x})$ of a given data point $\mathbf{x}\in\mathbb{R}^D$ can be expressed as the negative log of the partition function, scaled by $T$:

$$
E(\mathbf{x})=-T\log\int_{y^\prime}e^{-E(\mathbf{x},y^\prime)/T}
$$
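The two formulas above can be tried out on a small discrete example. The following is a minimal sketch, assuming a finite label set so that the integral over $y^\prime$ becomes a sum; the function names and the energy values are hypothetical and chosen only for illustration.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def free_energy(energies, T=1.0):
    """E(x) = -T * log sum_y' exp(-E(x, y') / T) for a discrete label set."""
    return -T * log_sum_exp([-e / T for e in energies])

def gibbs_probs(energies, T=1.0):
    """p(y|x) = exp(-E(x, y) / T) / exp(-E(x) / T)."""
    e_x = free_energy(energies, T)
    return [math.exp(-(e - e_x) / T) for e in energies]

# Hypothetical joint energies E(x, y) for three candidate labels y.
energies = [1.0, 2.0, 4.0]
probs = gibbs_probs(energies, T=1.0)
print(probs, free_energy(energies))
```

Labels with lower energy receive higher probability, and the probabilities sum to one because the free energy acts as the normalizer.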

Energy Function

Energy-based models have an inherent connection to modern machine learning, especially to discriminative models. To see this, consider a discriminative neural classifier $f(\mathbf{x}):\mathbb{R}^D\rightarrow\mathbb{R}^K$ that maps an input $\mathbf{x}\in\mathbb{R}^D$ to $K$ real-valued numbers known as logits. These logits define a categorical distribution through the softmax function:

$$
p(y|\mathbf{x})=\frac{e^{f_y(\mathbf{x})/T}}{\sum_{i=1}^{K}e^{f_i(\mathbf{x})/T}}
$$

Here $f_y(\mathbf{x})$ denotes the $y^\text{th}$ entry of $f(\mathbf{x})$, i.e., the logit for class label $y$.

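Matching the softmax with the Gibbs distribution of the energy-based model, the energy of a given input $(\mathbf{x},y)$ can be defined as $E(\mathbf{x},y)=-f_y(\mathbf{x})$. Without changing the parameterization of the network $f$, the free energy $E(\mathbf{x};f)$ of an input $\mathbf{x}$ can then be written in terms of the denominator of the softmax: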

$$
E(\mathbf{x};f)=-T\cdot\log\sum_{i=1}^{K}e^{f_i(\mathbf{x})/T}
$$
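The step from the general free energy to this expression can be written out explicitly. A short derivation, assuming a discrete label set of size $K$ so that the integral over $y^\prime$ becomes a sum:

$$
E(\mathbf{x})=-T\log\sum_{y^\prime=1}^{K}e^{-E(\mathbf{x},y^\prime)/T}
\;\overset{E(\mathbf{x},y^\prime)=-f_{y^\prime}(\mathbf{x})}{=}\;
-T\log\sum_{i=1}^{K}e^{f_i(\mathbf{x})/T}=E(\mathbf{x};f)
$$

So the free energy of a classifier is the negative log-sum-exp of its logits scaled by $T$, and no extra parameters are needed to compute it.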

Energy as Inference-time OOD Score

Out-of-distribution detection is a binary classification problem: a score is used to tell in-distribution examples apart from out-of-distribution ones. The scoring function has to separate the two, and the most natural choice would be the density function of the data, $p^\text{in}(\mathbf{x})$, with low-likelihood examples classified as OOD. However, earlier work reports that density functions estimated with deep generative models cannot be used reliably for this purpose. (Do deep generative models know what they don’t know? arxiv 2018)

To address this, the authors derive an energy function for OOD detection from a discriminative model. A model trained with the negative log-likelihood (NLL) loss pushes down the energy of in-distribution data points.

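To see this, consider the NLL loss over pairs $(\mathbf{x},y)\sim P^\text{in}$ (here the temperature is written $\tau$ instead of $T$, and $f(\mathbf{x})[y]$ denotes the logit $f_y(\mathbf{x})$):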

$$
\mathcal{L}_\text{nll}=\mathbb{E}_{(\mathbf{x},y)\sim P^\text{in}}\left(-\log\frac{\exp(f(\mathbf{x})[y]/\tau)}{\sum^K_{y^\prime=1}\exp(f(\mathbf{x})[y^\prime]/\tau)}\right)
$$

Defining $E(\mathbf{x},y)=-f(\mathbf{x})[y]$, the NLL loss can be rewritten as:

$$
\mathcal{L}_\text{nll}=\mathbb{E}_{(\mathbf{x},y)\sim P^\text{in}}\left(\frac{1}{\tau}\cdot E(\mathbf{x},y)+\log\sum^K_{y^\prime=1}\exp(-E(\mathbf{x},y^\prime)/\tau)\right)
$$
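The rewriting of the NLL loss can be checked numerically for a single sample. The sketch below assumes plain Python lists for the logits; the function names and the example logits are hypothetical, and both functions compute the per-sample loss rather than the expectation over $P^\text{in}$.

```python
import math

def nll_from_softmax(logits, y, tau=1.0):
    """-log softmax(logits / tau)[y], computed with a stable log-sum-exp."""
    scaled = [v / tau for v in logits]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(s - m) for s in scaled))
    return -(scaled[y] - log_z)

def nll_from_energy(logits, y, tau=1.0):
    """(1/tau) * E(x, y) + log sum_y' exp(-E(x, y') / tau), with E(x, y) = -f(x)[y]."""
    energies = [-v for v in logits]
    neg_scaled = [-e / tau for e in energies]
    m = max(neg_scaled)
    log_partition = m + math.log(sum(math.exp(s - m) for s in neg_scaled))
    return energies[y] / tau + log_partition

logits = [2.0, -1.0, 0.5]  # hypothetical f(x) for K = 3 classes
a = nll_from_softmax(logits, y=0, tau=2.0)
b = nll_from_energy(logits, y=0, tau=2.0)
print(a, b)  # the two forms agree up to floating-point error
```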

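The first term pushes down the energy of the ground-truth label $y$. The second, contrastive term is the log partition function, which can be interpreted as the free energy up to a factor of $-1/\tau$; it pulls up the energies of the labels that are not the ground truth.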

A short derivation then gives the free energy for $\mathbf{x}\sim P^\text{in}$:

$$
E(\mathbf{x}; f)=-\tau\log\sum^K_{y^\prime=1}\exp(f(\mathbf{x})[y^\prime]/\tau)
$$

The paper adopts the negative free energy $-E(\mathbf{x};f)$ as the OOD score for an input $\mathbf{x}$, so that in-distribution samples, which have lower energy, receive higher scores.
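Used as a detector, the score is compared against a threshold. The following is a minimal sketch, assuming logits are available as plain Python lists; `pick_threshold`, `keep_fraction`, and the example logits are hypothetical choices for illustration, not values taken from the paper.

```python
import math

def energy_score(logits, tau=1.0):
    """Return -E(x; f) = tau * log sum_y' exp(f(x)[y'] / tau); higher means more in-distribution."""
    scaled = [v / tau for v in logits]
    m = max(scaled)
    return tau * (m + math.log(sum(math.exp(s - m) for s in scaled)))

def pick_threshold(id_scores, keep_fraction=0.95):
    """Choose a threshold so that at least `keep_fraction` of in-distribution scores lie at or above it."""
    ordered = sorted(id_scores)
    cut = int(math.floor((1.0 - keep_fraction) * len(ordered)))
    return ordered[cut]

def is_in_distribution(logits, threshold, tau=1.0):
    """Flag an input as in-distribution when its energy score reaches the threshold."""
    return energy_score(logits, tau) >= threshold

# Hypothetical logits: confident predictions for ID inputs, flat small logits for an OOD input.
id_logits = [[9.0, 0.5, -1.0], [0.2, 8.0, 0.1], [-0.5, 1.0, 7.5], [10.0, -2.0, 0.0]]
ood_logits = [0.3, 0.1, -0.2]

threshold = pick_threshold([energy_score(l) for l in id_logits], keep_fraction=0.95)
print(is_in_distribution(ood_logits, threshold))
```

In practice the logits would come from a forward pass of the pre-trained classifier, and the threshold would be set on held-out in-distribution data.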

Moreover, the energy score $E(\mathbf{x};f)$ is a smooth approximation of $-f(\mathbf{x})[y]=E(\mathbf{x},y)$, which is dominated by the ground truth label $y$ among all labels. Therefore, the NLL loss overall pushes down the energy $E(\mathbf{x}; f)$ of in-distribution data.
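The "smooth approximation" statement can be made concrete with the standard log-sum-exp bounds. This is a general property of log-sum-exp for $\tau>0$, added here as an illustration rather than quoted from the paper:

$$
\max_{y^\prime} f(\mathbf{x})[y^\prime]\;\le\;-E(\mathbf{x};f)=\tau\log\sum^K_{y^\prime=1}\exp(f(\mathbf{x})[y^\prime]/\tau)\;\le\;\max_{y^\prime} f(\mathbf{x})[y^\prime]+\tau\log K
$$

When training has made the ground-truth logit $f(\mathbf{x})[y]$ the largest one, $-E(\mathbf{x};f)$ stays within $\tau\log K$ of it, so raising that logit lowers the free energy as well.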
