[Review] The MIDAS Touch: Mixed Data Sampling Regression Models

redgreen · September 19, 2022

GDP


01 Introduction

MIDAS
- Used when the variable of interest is the lower-frequency variable but the relevant information sits in high-frequency data, e.g. stock market volatility.
- For yearly data such as GDP, instead of aggregating monthly data into a yearly or quarterly series, the relationship can be modeled directly with a MIDAS regression.
- Because the two sides are sampled at different frequencies, it is not an autoregressive model.
- Instead, MIDAS shares some features with distributed lag models and adds novel features of its own.
- Uses $Y_t$ and $X_t^{(m)}$ directly, without aggregating $X$ first --> efficiency
- equation
: simple linear MIDAS regression: $Y_t = \beta_0 + B(L^{1/m})X_t^{(m)}+\epsilon_t^{(m)}$
: $B(L^{1/m}) = \sum^{j^{max}}_{j=0}B(j)L^{j/m}$ --> a polynomial of length $j^{max}$
: $L^{j/m}X_t^{(m)} = X^{(m)}_{t-j/m}$ : $L^{j/m}$ lags $X$ by $j/m$ periods
: In other words, the yearly variable $Y_t$ is regressed on the quarterly variable $X_t^{(m)}$ with lags up to $j^{max}$
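To make the lag operator concrete, here is a minimal sketch of how $B(L^{1/m})X_t^{(m)}$ could be evaluated on a flat list of high-frequency observations. The helper names `hf_lags` and `midas_term`, the toy data, and the equal lag coefficients are all assumptions made for illustration; they do not come from the paper.

```python
def hf_lags(x, t, m, j_max):
    """Return [X_t, X_{t-1/m}, ..., X_{t-j_max/m}] for low-frequency period t.

    x is a flat list of high-frequency observations. Period t (1-based)
    ends at index t*m - 1, so a lag of j/m is simply j positions earlier.
    """
    end = t * m - 1
    if end - j_max < 0:
        raise ValueError("not enough high-frequency history for this t")
    return [x[end - j] for j in range(j_max + 1)]


def midas_term(x, t, m, coeffs):
    """Evaluate B(L^{1/m}) X_t^{(m)} = sum_j B(j) * X_{t-j/m}."""
    lags = hf_lags(x, t, m, len(coeffs) - 1)
    return sum(b * xl for b, xl in zip(coeffs, lags))


# Toy data (assumption): 3 years of quarterly observations, m = 4.
x_quarterly = list(range(1, 13))
# Hypothetical coefficients B(0..3): equal weights reproduce a yearly average.
B = [0.25, 0.25, 0.25, 0.25]
term_year3 = midas_term(x_quarterly, t=3, m=4, coeffs=B)
```

With equal coefficients the term collapses to the plain yearly average of the quarterly series, which is exactly the aggregation MIDAS is meant to generalize; unequal coefficients let recent quarters count more.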

The high-frequency variable
: past market information (at the tick-by-tick level)

The low-frequency variable
: variable of interest

stylized distributed lag model
- equation
: $Y_t = \beta_0 + B(L)X_t + \epsilon_t$
: $B(L)$ : lag polynomial operator
: $X^{(m)}$ : sampled $m$ times faster, where $m$ is the number of high-frequency observations per low-frequency period
: Uses $Y_t^{(m)}$ and $X_t^{(m)}$, both at the same frequency

Comparing distributed lag models and MIDAS regression
- feasible GLS (computed using the lagged dependent variable)

- The paper shows that, under certain conditions, the aggregation bias that arises when predicting $Y_t$ from $X_t^{(m)}$ disappears.

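What MIDAS regression focuses on
: identifying the discretization biases that arise when the regressor is sampled at a high frequency
: for both the distributed lag model and MIDAS, the discretization bias converges to 0 as the sampling interval $1/m \to 0$ (that is, as $m \to \infty$)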

02 Why MIDAS Regressions?

MIDAS is a tightly parameterized, reduced-form regression.

simple linear MIDAS regression: $Y_t = \beta_0 + \beta_1B(L^{1/m})X_{t-1}^{(m)}+\epsilon_t^{(m)}$

$B(L^{1/m}) = \sum^{j^{max}}_{j=0}B(j)L^{j/m}$ --> a polynomial of length $j^{max}$

$L^{j/m}x_t=x_{t-j/m}$

$L^{j/m}$ : the operator that lags $x_t$ by $j/m$
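As an illustration of the regression above: if the lag weights $B(j)$ are held fixed and normalized to sum to 1 (an assumption here, which is what lets $\beta_1$ be identified separately), then $\beta_0$ and $\beta_1$ reduce to a simple OLS fit on the weighted aggregate of the high-frequency lags. The sketch below uses made-up weights and a noise-free toy series; `weighted_lag` and `simple_ols` are hypothetical helper names.

```python
def weighted_lag(x, t, m, weights):
    """sum_j w_j * X_{t-1-j/m}: lags counted back from the end of period t-1."""
    end = (t - 1) * m - 1  # last high-frequency index of period t-1
    return sum(w * x[end - j] for j, w in enumerate(weights))


def simple_ols(z, y):
    """Closed-form OLS of y on a constant and a single regressor z."""
    n = len(z)
    z_bar = sum(z) / n
    y_bar = sum(y) / n
    s_zy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    beta1 = s_zy / s_zz
    return y_bar - beta1 * z_bar, beta1


m = 4
weights = [0.4, 0.3, 0.2, 0.1]                 # hypothetical B(j), sums to 1
x = [float((7 * i) % 11) for i in range(40)]   # deterministic toy series, 10 periods
periods = range(2, 11)                         # t = 2..10 so period t-1 has all lags
z = [weighted_lag(x, t, m, weights) for t in periods]
y = [2.0 + 3.0 * zi for zi in z]               # noise-free Y_t with beta0 = 2, beta1 = 3
beta0_hat, beta1_hat = simple_ols(z, y)
```

In practice the weights are not known and are estimated jointly with $\beta_0$ and $\beta_1$, typically by nonlinear least squares; the fixed-weight case is shown only because it isolates what the two coefficients mean.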

MIDAS uses many lags, so an unrestricted specification would require many parameters.

Several methods are used to reduce the number of parameters.
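The notes above do not say which methods are meant. One parameterization widely used in the MIDAS literature is the exponential Almon lag, which generates all $j^{max}+1$ weights from just two parameters. The sketch below is an assumption-laden illustration: the function name, the parameter values, and the choice to index lags from $j=0$ (to match the notation in this post) are mine.

```python
import math


def exp_almon_weights(theta1, theta2, j_max):
    """Exponential Almon lag weights for j = 0..j_max, normalized to sum to 1.

    w_j is proportional to exp(theta1 * j + theta2 * j**2).
    """
    raw = [math.exp(theta1 * j + theta2 * j * j) for j in range(j_max + 1)]
    total = sum(raw)
    return [r / total for r in raw]


# theta1 = theta2 = 0 gives equal weights (a plain average of the lags).
flat = exp_almon_weights(0.0, 0.0, 11)
# A negative theta2 (hypothetical value) gives weights that decay with the lag.
decaying = exp_almon_weights(0.0, -0.05, 11)
```

However many lags are included, only `theta1` and `theta2` need to be estimated, which is what keeps the regression tightly parameterized.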

03 MIDAS and Distributed Lag models: A Comparison

3.1 Aggregation Bias and Aliasing Revisited

When data with different sampling frequencies are used, temporal aggregation inevitably occurs.

For the aggregation issue, the paper makes the following two assumptions.
1) The underlying stochastic process evolves in continuous time.
2) The data are collected at discrete points in time.
--> Does this mean the observed data are independent of the sampling interval?

$Y_t^{(m)}$ : values sampled in discrete time at equally spaced intervals of $1/m$

$y(t)$ : continuous-time processes

discrete time distributed lag model
: $Y_{t/m}^{(m)} = \frac{1}{m}\sum_{s=-\infty}^{\infty}B^{(m)}\left(\frac{s}{m}\right)X^{(m)}_{(t-s)/m} + U^{(m)}_{t/m}$

MIDAS regression
: $Y_{t} = \frac{1}{m}\sum_{s=-\infty}^{\infty}\bar{B}^{(m)}\left(\frac{s}{m}\right)X^{(m)}_{(t-s)/m} + U_{t}$

In the distributed lag model, $Y$ and $X$ share the same frequency; in MIDAS, only $X$ is sampled at the high frequency.

Comparing $B^{(m)}$ and $\bar{B}^{(m)}$ is the paper's main interest --> OLS estimates are used

With multiple regressors sampled at different frequencies, temporal aggregation can cause cross-regressor contamination(?) (Geweke, 1975).
--> The paper therefore focuses on a single regressor.

$B^{(m)}$ in a distributed lag model

The expression that $B^{(m)}$ minimizes
--> $\int_{-{\pi}m}^{{\pi}m}| \tilde{B}^{(m)}(w) - \tilde{b}(w)|^2F_m[S_x](w)\,dw$
: can be viewed as an L2-norm weighted by $F_m[S_x]$.
: $S_x$ : spectral density of the continuously sampled process $x(t)$
: $S_x^{(m)}\equiv F_m[S_x]$ : spectral density of the discretely sampled process $x_{(t-s)/m}$
: $\tilde{B}^{(m)}$
- discrete sampling lag polynomial
- Fourier transform of $B^{(m)}$
: $\tilde{b}$
- continuous sampling convolution kernel
- Fourier transform of $b$
--> Because the error between the continuous sampling kernel and the discrete sampling polynomial is minimized, this has the effect of reducing the discretization bias

OLS estimator
: $\tilde{B}^{(m)} = F_m[S_x\tilde{b}]/F_m[S_x] = F_m[S_{yx}]/F_m[S_x]$
: $S_{yx}$ : cross-spectrum of continuously sampled $y(t)$ and $x(t)$
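The ratio above can be checked numerically. The sketch below assumes that $F_m$ is the usual folding (aliasing) operator, $F_m[g](w) = \sum_k g(w + 2\pi m k)$, truncated to a finite number of terms. The spectral density $S_x(w) = 1/(a^2 + w^2)$ and the kernel transform $\tilde{b}(w) = c/(c + iw)$ are toy choices of mine, not taken from the paper.

```python
import math


def fold(g, w, m, k_max=2000):
    """Truncated folding operator: sum of g(w + 2*pi*m*k) over |k| <= k_max."""
    return sum(g(w + 2.0 * math.pi * m * k) for k in range(-k_max, k_max + 1))


def s_x(w, a=1.0):
    """Toy spectral density of x(t), up to a constant (assumption)."""
    return 1.0 / (a * a + w * w)


def b_tilde(w, c=1.0):
    """Fourier transform of the toy kernel b(s) = c * exp(-c * s), s >= 0."""
    return c / (c + 1j * w)


def b_tilde_m(w, m):
    """OLS limit in the frequency domain: F_m[S_x * b_tilde] / F_m[S_x]."""
    num = fold(lambda v: s_x(v) * b_tilde(v), w, m)
    den = fold(s_x, w, m)
    return num / den


# Gap between the discrete-sampling and continuous-time transforms at w0,
# for increasingly fine sampling (m = 1, 4, 16).
w0 = 1.0
gaps = [abs(b_tilde_m(w0, m) - b_tilde(w0)) for m in (1, 4, 16)]
```

In this toy setup the gap shrinks as $m$ grows, because the aliased copies at $w + 2\pi m k$ move further out where $S_x$ is small. That matches the claim that the discretization bias vanishes as the sampling interval $1/m$ goes to 0.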

exogenous variable: a variable determined outside the model

endogenous variable: a variable determined inside the model
ex) the supply of and demand for money determine the interest rate contingent on the level of the money supply, so the money supply is an exogenous variable and the interest rate is an endogenous variable.
