[Speech] Feature Extraction - Fourier Transform

누렁이 · January 17, 2024

Speech (ASR/TTS)

Reference: [rewritten from] https://ratsgo.github.io/speechbook/docs/fe

Discrete Fourier Transform

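Fourier Transform:
- An operation that links a function of time to a function of frequency
- (figure)
  - frequency domain = func_Fourier(time domain) => so going the other way gives the inverse Fourier transform!
  - func_Fourier: it breaks the signal down into simple waves and then re-expresses them in the other domain. As long as you know each domain and the shape of the simple waves, it works!
  - In speech recognition, computers are good at processing discrete signals, so the discrete Fourier transform is used a lot!
- Formulas
  - Discrete Fourier Transform: $X_k = \sum_{n=0}^{N-1} x_n \cdot e^{-i 2\pi k n / N}$
  - Inverse Discrete Fourier Transform: $x_n = \frac{1}{N}\sum_{k=0}^{N-1} X_k \cdot e^{i 2\pi k n / N}$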
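To make the two formulas concrete, here is a minimal sketch that implements both directly and checks that the inverse undoes the forward transform. The function names `dft` and `idft`, the signal length of 64, and the random seed are my own choices for illustration, not something from the reference.

```python
import numpy as np


def dft(x):
    """Naive O(N^2) DFT: X_k = sum_n x_n * exp(-2j*pi*k*n/N)."""
    x = np.asarray(x, dtype=complex)
    N = x.shape[0]
    n = np.arange(N)
    k = n.reshape((N, 1))
    return np.exp(-2j * np.pi * k * n / N) @ x


def idft(X):
    """Naive inverse DFT: x_n = (1/N) * sum_k X_k * exp(+2j*pi*k*n/N)."""
    X = np.asarray(X, dtype=complex)
    N = X.shape[0]
    k = np.arange(N)
    n = k.reshape((N, 1))
    return np.exp(2j * np.pi * n * k / N) @ X / N


rng = np.random.default_rng(0)   # fixed seed, chosen arbitrarily
x = rng.random(64)               # a short real-valued test signal

X = dft(x)
x_back = idft(X)

# The round trip should reproduce the signal, with a negligible imaginary part.
round_trip_ok = np.allclose(x_back.real, x) and np.allclose(x_back.imag, 0.0)
print(round_trip_ok)
```

Note that the only differences between the two functions are the sign of the exponent and the 1/N factor, exactly as in the formulas.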

Concepts

To understand the Fourier transform intuitively, you have to start with Euler's formula.
(Really..? Me too..? I have to know even this..? I want to cry..)
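Euler's formula says $e^{i\theta} = \cos\theta + i\sin\theta$, so the exponential in the DFT is just a cosine and a sine packed into one complex number. A small sketch to check this numerically; the sample angles and the choice of N = 8, k = 1 are arbitrary assumptions for illustration.

```python
import numpy as np

# Euler's formula: exp(i*theta) == cos(theta) + i*sin(theta)
theta = np.linspace(0.0, 2.0 * np.pi, 9)
lhs = np.exp(1j * theta)
rhs = np.cos(theta) + 1j * np.sin(theta)
euler_holds = np.allclose(lhs, rhs)

# The DFT kernel exp(-i*2*pi*k*n/N) is therefore cos(...) - i*sin(...).
N, k = 8, 1
n = np.arange(N)
angle = 2.0 * np.pi * k * n / N
kernel = np.exp(-1j * angle)
kernel_is_cos_minus_isin = np.allclose(kernel, np.cos(angle) - 1j * np.sin(angle))

# Every kernel value sits on the unit circle in the complex plane.
on_unit_circle = np.allclose(np.abs(kernel), 1.0)

print(euler_holds, kernel_is_cos_minus_isin, on_unit_circle)
```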

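I'll just look at the concept diagram.. and move on..
Anyway.. these speech signals.. oscillate in a regular way, and that's why this works, right... Mr. Fourier and Mr. Euler found that regularity..? Right..? So what they did is a linear transform... Thank you.. ㅠ_ㅠ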

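How fast it rotates
The magnitude gets decomposed by rotation speed (frequency).
Add everything up and you get the center of mass => a specific value that represents that frequency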
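The "center of mass" idea can be tried out directly: wind the signal around the unit circle at some speed, average the wound points, and see how far the average lands from the origin. This is only a sketch under my own assumptions: a 5 Hz cosine, a 100 Hz sampling rate, 2 seconds of signal, and a hypothetical helper named `center_of_mass`.

```python
import numpy as np

fs = 100                              # sampling rate in Hz (assumed)
t = np.arange(200) / fs               # 2 seconds of sample times
signal = np.cos(2 * np.pi * 5 * t)    # a pure 5 Hz tone (assumed test signal)


def center_of_mass(signal, t, freq):
    """Wind the signal around the unit circle at `freq` turns per second
    and return the average (center of mass) of the wound points."""
    wound = signal * np.exp(-2j * np.pi * freq * t)
    return wound.mean()


# Try winding speeds from 1 Hz to 10 Hz.
freqs = np.arange(1, 11)
magnitudes = np.array([abs(center_of_mass(signal, t, f)) for f in freqs])

# The center of mass moves away from the origin only when the winding
# speed matches the frequency actually present in the signal.
best = freqs[np.argmax(magnitudes)]
print(best)
```

For every winding speed other than 5 Hz the wound points cancel out and the center of mass stays at the origin; at 5 Hz it sits at distance 0.5. Up to the missing 1/N averaging factor, this is the same sum as the DFT formula.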

DFT Matrix
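Since the DFT is a linear transform, it can be written as a matrix-vector product $X = Wx$ with $W_{kn} = e^{-i 2\pi k n / N}$. Below is a minimal sketch for N = 4, assuming that standard definition; the helper name `dft_matrix` and the test vector are mine.

```python
import numpy as np


def dft_matrix(N):
    """Build the N x N DFT matrix with entries W[k, n] = exp(-2j*pi*k*n/N)."""
    n = np.arange(N)
    k = n.reshape((N, 1))
    return np.exp(-2j * np.pi * k * n / N)


W = dft_matrix(4)

# For N = 4 every entry is one of 1, -i, -1, i.
expected = np.array([
    [1,   1,  1,   1],
    [1, -1j, -1,  1j],
    [1,  -1,  1,  -1],
    [1,  1j, -1, -1j],
])
matches_expected = np.allclose(W, expected)

# W times its conjugate transpose is N * I, so the inverse is conj(W) / N,
# which is where the 1/N in the inverse DFT comes from.
is_scaled_unitary = np.allclose(W @ W.conj().T, 4 * np.eye(4))

x = np.array([1.0, 2.0, 3.0, 4.0])
X = W @ x
same_as_fft = np.allclose(X, np.fft.fft(x))

print(matches_expected, is_scaled_unitary, same_as_fft)
```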

Python Implementation

numpy.fft.fft

This function computes the one-dimensional n-point discrete Fourier Transform (DFT) with the efficient Fast Fourier Transform (FFT) algorithm [CT].

Ref: https://numpy.org/doc/stable/reference/generated/numpy.fft.fft.html

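import numpy as np


def DFT(x):
    """Compute the DFT of a 1-D real-valued signal via the N x N DFT matrix."""
    x = np.asarray(x, dtype=float)
    N = x.shape[0]
    n = np.arange(N)           # time indices, shape (N,)
    k = n.reshape((N, 1))      # frequency indices as a column, shape (N, 1)
    W = np.exp(-2j * np.pi * k * n / N)  # DFT matrix: W[k, n] = e^{-i 2 pi k n / N}
    return np.dot(W, x)


# Check the naive O(N^2) version against NumPy's FFT.
x = np.random.random(1024)
print(np.allclose(DFT(x), np.fft.fft(x)))  # True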
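As a usage example closer to audio, here is a sketch that uses `np.fft.rfft` and `np.fft.rfftfreq` to find which frequencies are present in a signal. Everything about the signal is an assumption for illustration: a 16 kHz sampling rate, a 0.1 second frame, and two sine tones at 440 Hz and 1000 Hz that happen to fall exactly on FFT bins.

```python
import numpy as np

fs = 16000                 # sampling rate in Hz (assumed)
N = 1600                   # 0.1 s frame, so bins are spaced fs / N = 10 Hz apart
t = np.arange(N) / fs

# Synthetic test signal: two sine tones with amplitudes 1.0 and 0.5.
x = 1.0 * np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1000 * t)

# rfft keeps only the non-negative frequencies, which is enough for real input.
X = np.fft.rfft(x)
freqs = np.fft.rfftfreq(N, d=1 / fs)   # frequency in Hz for each bin

# Scale the magnitudes so that a sine of amplitude A shows up as A.
amplitude = 2 * np.abs(X) / N

# Pick the two strongest bins and report their frequencies.
top2 = np.sort(freqs[np.argsort(amplitude)[-2:]])
print(top2)
```

With real speech the tones would not line up with the bins this neatly, so the energy would spread across neighboring bins instead of landing on two clean peaks.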
