ALL IS WELL🌻

Understanding the Mel Spectrogram

This is a translation of the following blog post. A signal is a variation in a certain quantity over time. For audio, the quantity that varies is air pressure. How can this information be captured digitally? Over time, air press

March 3, 2023 · 0 comments

MFCC

See https://brightwon.tistory.com/11 for reference!!

February 21, 2023 · 0 comments

[Paper Summary] MaskGIT: Masked Generative Image Transformer

Abstract 1. Introduction 2. Related Work 2.1 Image Synthesis 2.2. Masked Modeling with Bi-directional Transformers 3. Method 3.1. MVTM

February 10, 2023 · 0 comments

[Paper Summary] Masked Autoencoders Are Scalable Vision Learners

Abstract This paper introduces the masked autoencoder (MAE). Broadly, its two key design choices are an asymmetric encoder-decoder architecture, whose encoder handles the mask-token-free visibl

February 10, 2023 · 0 comments

[Paper Summary] A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild

Abstract In this work, we investigate the problem of lip-syncing a talking face video of an arbitrary identity to match a target speech segment. Cur

February 10, 2023 · 0 comments

torch.distributed.launch

torch.distributed.launch is a module that spawns multiple distributed training processes on each training node. .. warning:: This module is going to be deprecated in favor of :ref:torchrun <la

October 27, 2022 · 0 comments

[Paper Summary] DIGAN: Generating Videos with Dynamics-aware Implicit Generative Adversarial Networks

ABSTRACT For long video generation, this paper uses implicit neural representations (INRs) for video in a new network, the dynamics-aware implicit generative adversarial

October 2, 2022 · 0 comments

[Paper Summary] Image Generators with Conditionally-Independent Pixel Synthesis (CIPS)

Abstract 1. Introduction

September 16, 2022 · 0 comments

GAN Research Taxonomy

Source: https://ysbsb.github.io/

September 7, 2022 · 0 comments

[Paper Summary] Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization (AdaIN)

Abstract Gatys et al. recently introduced a neural algorithm that renders a content image in the style of another image, achieving so-called style tr

September 6, 2022 · 0 comments

[Paper Summary] UPST-NeRF: Universal Photorealistic Style Transfer of Neural Radiance Fields for 3D Scene

Abstract 3D scenes photorealistic stylization aims to generate photorealistic images from arbitrary novel views according to a given style image whil

August 24, 2022 · 0 comments

[Paper Summary] Neural 3D Video Synthesis from Multi-view Video

https://neural-3d-video.github.io/ Abstract We propose a novel approach for 3D video synthesis that is able to represent multi-view video recordings

August 22, 2022 · 0 comments

[Paper Summary] StyleGAN-V: A Continuous Video Generator with the Price, Image Quality and Perks of StyleGAN2

Abstract Videos show continuous events, yet most video synthesis frameworks treat them discretely in time. The paper treats videos as time-continuous signals, and continuous-ti

August 19, 2022 · 0 comments

[Paper Summary] You Only Look Once: Unified, Real-Time Object Detection

https://arxiv.org/abs/1506.02640 YOLO, a new approach to object detection. ✨ We frame object detection as a regression problem to spatially separate

August 9, 2022 · 0 comments

[Paper Summary] MAD: A Scalable Dataset for Language Grounding in Videos from Movie Audio Descriptions

Abstract As interest in video-language research has grown recently, large-scale datasets have advanced along with it. By comparison, only limited effort has gone into datasets for the video-language grounding task, and state-of-the-art techniques

August 9, 2022 · 0 comments

[Paper Summary] Video Textures

Abstract This paper introduces a new medium, the video texture. It presents techniques for analyzing a video clip to extract its structure and for synthesizing a new, similar-looking video of arbitrary length. Combining video textures with view morphing techniques

August 3, 2022 · 0 comments

Lecture 6 Training Neural Networks, Part I

Stacking only linear layers is pointless, because by linearity they can be collapsed into a single layer. So, to build more complex non-linear functions, activation functions are inserted between the linear layers, producing a hierarchically structured network of non-linear functions. Long

July 20, 2022 · 0 comments

Lecture 5 Convolutional Neural Networks

This lecture looks at Convolutional Neural Networks. The idea is the same as in the neural networks covered so far, but this time we will learn about the Convolutional Layer, which preserves 'spatial structure'

July 20, 2022 · 0 comments

Lecture 4 Introduction to Neural Networks

Any function can be represented with computational graphs. For example, below is the linear classifier we have seen so far, with inputs $x, W$. Once a function is expressed as a computational graph, backpropagation

May 21, 2022 · 0 comments

Lecture 3. Loss Functions and Optimization

Last time we did not cover how to actually use the training data to obtain the best matrix $W$. To decide which $W$ is best for a linear classifier, we need a way to quantify how good or bad the current $W$ is

May 21, 2022 · 0 comments