Interpreting the Latent Space of GANs for Semantic Face Editing (CVPR 2020)

Woo Yeong CHO · April 27, 2022

Deep Learning Face editing GAN inversion Latent space semantic

Abstract

The paper uses the latent semantics learned by a GAN for face editing.
It interprets how those latent semantics are encoded in the latent space.
It uses subspace projection to disentangle the different semantics so that each one can be controlled precisely.

Introduction

Feeding a randomly sampled latent code into the GAN decoder (the generator) produces a realistic image, but it has been unclear where the semantic information in that image comes from.

Framework

Semantics in the Latent Space

$f_{S} : X \rightarrow S$, where $X$ is the image domain and $S \subseteq \mathbb{R}^{m}$ is the semantic score space for $m$ semantics

$g : Z \rightarrow X$, where $Z \subseteq \mathbb{R}^{d}$ is the $d$-dimensional latent space

Semantic score: $s = f_{S}(g(z))$

If we can find a ${\color{Red}\textbf{boundary plane}}$ that cleanly separates a semantic, we can edit only the attribute we want to control while the semantics of the other attributes stay unchanged.

1. Single Semantic

The semantic score and the distance from a sample $z$ to the boundary plane are linearly dependent:

$f(g(z)) = \lambda d(n, z)$, where $\lambda > 0$ is a scalar that measures how fast the semantic varies as the distance changes, and $n$ is the normal vector of the boundary plane.
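As a minimal sketch of the relation above, assume the boundary plane passes through the origin and $n$ is normalised to unit length, so that $d(n, z) = n^{T}z$ as in the paper. The function names and the default value of `lam` are illustrative choices, not part of the original post.

```python
import numpy as np

def signed_distance(n, z):
    """Signed distance d(n, z) = n^T z from latent code z to the plane with normal n."""
    n = np.asarray(n, dtype=float)
    z = np.asarray(z, dtype=float)
    n = n / np.linalg.norm(n)  # normalise so the dot product is a true distance
    return float(n @ z)

def linear_semantic_score(n, z, lam=1.0):
    """Idealised score f(g(z)) = lam * d(n, z); lam is a hypothetical slope."""
    return lam * signed_distance(n, z)
```

The sign of the distance says which side of the boundary the sample lies on, and its magnitude scales the semantic score.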

2. Multiple Semantics

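The relation between the semantic scores and the distance from a sample $z$ to each plane is derived in the same way as in the single-semantic case.

However, as the passage above says, $n_{i}$ and $n_{j}$ must be orthogonal to each other for the semantics to be considered well disentangled.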
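One way to check this orthogonality condition numerically is to look at the off-diagonal entries of $N^{T}N$, which are the cosines between pairs of unit normals. The sketch below assumes the normals are supplied as rows of an array; the helper names are hypothetical.

```python
import numpy as np

def gram_matrix(normals):
    """Return N^T N for N = [n_1, ..., n_m], with the normals given as rows."""
    rows = np.asarray(normals, dtype=float)
    rows = rows / np.linalg.norm(rows, axis=1, keepdims=True)  # unit normals
    return rows @ rows.T  # entry (i, j) is n_i^T n_j

def max_off_diagonal(normals):
    """Largest |n_i^T n_j| for i != j; 0 means the normals are mutually orthogonal."""
    G = gram_matrix(normals)
    off = G - np.diag(np.diag(G))
    return float(np.abs(off).max())
```

A value near 0 suggests the two semantics are close to disentangled, while a value near 1 means the two boundaries almost coincide.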

Manipulation in the Latent Space

- Single Attribute Manipulation

Simple! $z_{edit} = z + \alpha n$
so, with $n$ a unit vector, $f(g(z_{edit})) = f(g(z)) + \lambda\alpha$
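The edit itself is a single vector addition. The sketch below assumes $n$ should be normalised before use, so that moving by $\alpha$ changes the distance to the boundary by exactly $\alpha$; `edit_latent` is an illustrative name.

```python
import numpy as np

def edit_latent(z, n, alpha):
    """Move latent code z by alpha along the unit normal n: z_edit = z + alpha * n."""
    z = np.asarray(z, dtype=float)
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)  # unit normal, so the distance shifts by alpha
    return z + alpha * n

z = np.array([1.0, 2.0, 3.0])
n = np.array([0.0, 3.0, 4.0])
z_edit = edit_latent(z, n, alpha=2.0)
unit_n = n / np.linalg.norm(n)
shift = unit_n @ z_edit - unit_n @ z  # change in distance to the boundary
```

Here `shift` equals `alpha`, which is the $\lambda\alpha$ change in score up to the factor $\lambda$.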

- Conditional Manipulation

To keep an edit from being affected by correlated semantics, make $N^{T}N$ a diagonal matrix (that is, orthogonalize $n_{i}$ and $n_{j}$ with respect to each other).

As in the figure above, to edit only semantic 1 while keeping semantic 2 unchanged, we move orthogonally to $n_2$, so we compute the projected vector $n_1 - (n_1^{T} n_2)\, n_2$.
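The two-attribute projection can be sketched as follows, assuming both normals are unit vectors. Renormalising the result and raising an error for parallel normals are choices made for this example, not something the post specifies.

```python
import numpy as np

def conditional_direction(n1, n2):
    """Direction that changes semantic 1 while staying orthogonal to n2."""
    n1 = np.asarray(n1, dtype=float)
    n2 = np.asarray(n2, dtype=float)
    n1 = n1 / np.linalg.norm(n1)
    n2 = n2 / np.linalg.norm(n2)
    d = n1 - (n1 @ n2) * n2  # remove the component of n1 along n2
    norm = np.linalg.norm(d)
    if norm < 1e-12:
        raise ValueError("n1 is parallel to n2; no conditional direction exists")
    return d / norm  # renormalised (an assumption of this sketch)
```

Moving along the returned direction leaves $n_2^{T}z$ unchanged, so under the linear model the score of semantic 2 stays fixed.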

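To condition on multiple semantics at once, subtract from the primal direction its projection onto the subspace spanned by all the conditioned directions.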
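For more than one conditioned attribute, the same idea can be sketched with a QR factorisation that builds an orthonormal basis of the conditioned normals. This assumes the conditioned normals are linearly independent; `project_out` is a hypothetical helper name.

```python
import numpy as np

def project_out(n, conditioned):
    """Remove from n its projection onto the span of the conditioned normals."""
    n = np.asarray(n, dtype=float)
    C = np.atleast_2d(np.asarray(conditioned, dtype=float))  # shape (k, d)
    Q, _ = np.linalg.qr(C.T)  # columns of Q: orthonormal basis of the span
    residual = n - Q @ (Q.T @ n)  # subtract the projection onto that span
    norm = np.linalg.norm(residual)
    if norm < 1e-12:
        raise ValueError("n lies in the span of the conditioned normals")
    return residual / norm  # renormalised (an assumption of this sketch)
```

The result is orthogonal to every conditioned normal, so an edit along it keeps all of those distances unchanged.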

Experiments

Latent space separation

Five independent linear SVMs are trained on pose, smile, age, gender, and eyeglasses.
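To illustrate how a boundary normal can be obtained from labelled latent codes, here is a minimal sketch of a linear SVM fitted by plain subgradient descent on the hinge loss. It is a stand-in for an off-the-shelf SVM solver, and the function name, hyperparameters, and label convention ($\pm 1$) are assumptions of this example rather than the paper's setup.

```python
import numpy as np

def fit_linear_boundary(Z, y, lr=0.1, reg=1e-3, epochs=200):
    """Fit w, b by subgradient descent on the L2-regularised hinge loss.

    Z: (N, d) latent codes; y: (N,) labels in {-1, +1}.
    Returns the unit normal n = w / ||w|| and the rescaled bias b / ||w||.
    """
    Z = np.asarray(Z, dtype=float)
    y = np.asarray(y, dtype=float)
    num, dim = Z.shape
    w = np.zeros(dim)
    b = 0.0
    for _ in range(epochs):
        margins = y * (Z @ w + b)
        viol = margins < 1  # samples inside the margin or misclassified
        grad_w = reg * w - (y[viol][:, None] * Z[viol]).sum(axis=0) / num
        grad_b = -y[viol].sum() / num
        w -= lr * grad_w
        b -= lr * grad_b
    norm = np.linalg.norm(w)
    return w / norm, b / norm
```

One such boundary would be fitted per attribute, and the returned unit normal then plays the role of $n$ in the editing formulas.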

Separated by Smile boundary (extreme case)

Latent space separation

tbc...
