[AAAI '19] A Deep Neural Network for Unsupervised Anomaly Detection and Diagnosis in Multivariate Time Series Data

minha · February 7, 2022

Summary

Dataset: a synthetic dataset and a power plant dataset --> unsupervised; only normal data is used for training
Reconstruction (O) / Forecasting (X)
Input: s = 3 time windows (10, 30, and 60 observations, for detecting short-, medium-, and long-duration anomalies respectively) ==> a video-like tensor of shape (T, n, n, s), i.e. T timesteps of (n, n, s) 3D tensors
Output: the reconstructed input, with the same shape (T, n, n, s)
Threshold: set MANUALLY by a domain expert
Metric: Precision, Recall, F1-score
Framework: TensorFlow
Note: the ablation experiments are of very high quality

For one timestep t...

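Generate signature matrices for several window sizes
Input: n univariate time series (together forming the multivariate series) over h timesteps
Output: for a given timestep, take the segments covering the previous 10, 30, and 60 timesteps and build a covariance-like matrix across the n univariate series (by calculating the inner product between each pair of segment vectors). Because 3 different time windows are used, this yields 3 matrices of size n x n.
Purpose: 1. the severity of an anomaly is judged differently depending on its duration / 2. to represent the inter-correlations among the n univariate series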
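
The signature-matrix step can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function name `signature_matrices`, the random data, and the division by the window length (a normalization I am assuming) are all hypothetical.

```python
import numpy as np

def signature_matrices(x, t, windows=(10, 30, 60)):
    """Build one (n, n) signature matrix per window, ending at timestep t.

    x: array of shape (n, total_timesteps), one row per univariate series.
    Returns an array of shape (n, n, len(windows)).
    """
    n = x.shape[0]
    out = np.empty((n, n, len(windows)))
    for k, w in enumerate(windows):
        if t + 1 < w:
            raise ValueError(f"timestep {t} has fewer than {w} observations")
        seg = x[:, t + 1 - w : t + 1]   # last w observations up to and including t
        out[:, :, k] = seg @ seg.T / w  # pairwise inner products, rescaled by w (assumed)
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 200))   # hypothetical data: n = 5 sensors, 200 timesteps
m = signature_matrices(x, t=100)
print(m.shape)                  # (5, 5, 3)
```

Stacking the result for every timestep gives the (T, n, n, s) video-like tensor described in the summary.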
Convolutional Encoder ==> encodes spatial patterns (inter-correlations between different sensors, i.e. the univariate series)
Each of the h sequential inputs is an image-like tensor of shape (n, n, s) (the 3 n x n matrices concatenated), so convolution can be applied.
Input: h tensors of shape (n, n, s)
Each of the 4 convolution layers produces T output feature maps (one per timestep); h consecutive feature maps from these are used as input to the Attention-based ConvLSTM!
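
To get a feel for what the 4-layer encoder does to an (n, n, s) input, the shape arithmetic can be traced by hand. The kernel sizes, strides, and filter counts below are illustrative assumptions (they are not stated in this post), and "same" padding is assumed, so each layer's spatial size is ceil(input / stride).

```python
import math

# (kernel_size, stride, filters) per layer: illustrative values, not from this post
LAYERS = [(3, 1, 32), (3, 2, 64), (2, 2, 128), (2, 2, 256)]

def encoder_shapes(n, s, layers=LAYERS):
    """Return the feature-map shape after each layer, starting from (n, n, s)."""
    shapes = [(n, n, s)]
    size = n
    for _kernel, stride, filters in layers:
        size = math.ceil(size / stride)  # "same" padding: output = ceil(input / stride)
        shapes.append((size, size, filters))
    return shapes

for shape in encoder_shapes(n=30, s=3):
    print(shape)
```

Each of the four intermediate shapes corresponds to one feature map that is passed on to its own ConvLSTM layer.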
Attention-based ConvLSTM ==> encodes temporal dependencies
(Time window: h)
For each of the T 3D tensors, temporal information is encoded using the previous h tensors
Attention is introduced because those h timesteps differ in how much they influence the current timestep
Output: of the output tensor and the hidden tensor, the hidden tensor is used as input to the decoder!
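
The attention step can be illustrated as a softmax over the similarity between each of the h hidden tensors and the current one. This is a sketch under assumptions: scaled dot-product similarity is one common choice and is my reading of the idea, and the function name `temporal_attention` and the `scale` value are hypothetical.

```python
import numpy as np

def temporal_attention(hidden, scale=5.0):
    """Weight h hidden tensors by their similarity to the current (last) one.

    hidden: array of shape (h, height, width, channels).
    Returns (refined, weights): the weighted sum of the hidden tensors,
    with shape (height, width, channels), and the h attention weights.
    """
    h = hidden.shape[0]
    flat = hidden.reshape(h, -1)
    scores = flat @ flat[-1] / scale   # similarity of every step to the current step
    scores = scores - scores.max()     # subtract the max for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    refined = np.tensordot(weights, hidden, axes=1)  # sum_i weights[i] * hidden[i]
    return refined, weights

rng = np.random.default_rng(0)
hidden = rng.normal(size=(5, 4, 4, 8))   # hypothetical: h = 5 steps of 4 x 4 x 8 maps
refined, weights = temporal_attention(hidden)
print(refined.shape, weights.sum())
```

Steps whose hidden state resembles the current one get larger weights, so dissimilar past steps contribute less to the tensor passed to the decoder.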
Convolutional Decoder ==> reconstructs the signature matrices
Deconvolution layers; the hidden tensors of the 4 ConvLSTM layers are concatenated in at intermediate stages
Output: (n, n, s)
Loss function: L2 (reconstruction) loss
Anomaly score: residual matrices = (ground truth signature matrices) - (predicted signature matrices); entries of the residual matrices whose value exceeds a threshold count as anomalous ("The number of poorly reconstructed pairwise correlations")
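
The scoring rule can be written down directly: count the residual entries that exceed a per-entry threshold, then compare that count with a detection threshold. In this sketch the names `theta` and `tau`, their values, and the use of the absolute residual are my assumptions; the post only says that entries above a threshold are counted.

```python
import numpy as np

def anomaly_score(truth, recon, theta):
    """Count poorly reconstructed entries of the (n, n, s) signature matrices."""
    residual = truth - recon
    return int(np.count_nonzero(np.abs(residual) > theta))

truth = np.zeros((3, 3, 1))
recon = np.zeros((3, 3, 1))
recon[0, 1, 0] = 0.5     # badly reconstructed
recon[1, 0, 0] = -0.5    # badly reconstructed
recon[2, 2, 0] = 0.05    # small error, below theta

score = anomaly_score(truth, recon, theta=0.1)
tau = 1                  # hypothetical detection threshold, chosen by hand
print(score, score > tau)   # 2 True
```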

Advantages

Anomaly detection <- incorporates the correlations among the multivariate series (spatiotemporal correlation)
Root cause identification <- each row/column of a signature matrix corresponds to one univariate series, so the root cause of an anomaly can be identified by finding which rows/columns of the signature matrix are poorly reconstructed
Anomaly severity (duration) interpretation <- covariance-like matrices are built with time windows of size 10 (short), 30 (medium), and 60 (long) and reconstructed separately. A separate anomaly score is assigned to each, depending on how well that matrix is reconstructed.
An anomaly detected in all 3 channels is likely to be long in duration and more severe
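
Root cause identification and the per-window severity reading both fall out of the same boolean mask of poorly reconstructed entries. The following is a minimal sketch: the function name `rank_root_causes`, the threshold `theta`, and the toy data are hypothetical.

```python
import numpy as np

def rank_root_causes(truth, recon, theta, top_k=3):
    """Rank sensors by bad entries in their row; also score each window channel."""
    bad = np.abs(truth - recon) > theta     # (n, n, s) mask of bad entries
    per_sensor = bad.sum(axis=(1, 2))       # bad entries in each sensor's row, all channels
    per_channel = bad.sum(axis=(0, 1))      # one anomaly score per window size
    order = np.argsort(-per_sensor, kind="stable")[:top_k]
    return order, per_channel

n, s = 4, 3
truth = np.zeros((n, n, s))
recon = np.zeros((n, n, s))
recon[2, :, :2] = 1.0    # sensor 2's row is off in the short and medium windows
recon[:, 2, :2] = 1.0    # and so is its column (the matrices are symmetric)

order, per_channel = rank_root_causes(truth, recon, theta=0.5)
print(order[0])                        # 2
print(per_channel)                     # [7 7 0]
print(int((per_channel > 0).sum()))    # 2 of the 3 channels flag the anomaly
```

Here sensor 2 comes out on top, and only two of the three channels react, which under the reading above would suggest an anomaly of short to medium duration.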

Disadvantages

The threshold is set manually; automated thresholding methods such as POT (Peaks-Over-Threshold) could be applied instead
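
As a rough idea of what automated thresholding with POT could look like: fit a generalized Pareto distribution to the anomaly scores that exceed a high initial quantile, then extrapolate to the score level expected to be exceeded with probability `risk`. This is a simplified sketch, not part of the paper; it swaps in a method-of-moments fit for the maximum-likelihood fit usually used, and `pot_threshold`, `risk`, and `init_quantile` are hypothetical names and values.

```python
import numpy as np

def pot_threshold(scores, risk=1e-3, init_quantile=0.98):
    """Estimate a threshold exceeded with probability ~risk (POT sketch)."""
    scores = np.asarray(scores, dtype=float)
    t = np.quantile(scores, init_quantile)   # initial high threshold
    excess = scores[scores > t] - t          # peaks over the initial threshold
    if excess.size < 2 or excess.var(ddof=1) == 0:
        raise ValueError("not enough distinct peaks above the initial threshold")
    mean, var = excess.mean(), excess.var(ddof=1)
    # Method-of-moments estimates of the generalized Pareto shape and scale
    shape = 0.5 * (1.0 - mean**2 / var)
    scale = 0.5 * mean * (1.0 + mean**2 / var)
    ratio = risk * scores.size / excess.size
    if abs(shape) < 1e-8:                    # exponential-tail limit
        return t - scale * np.log(ratio)
    return t + scale / shape * (ratio ** (-shape) - 1.0)

rng = np.random.default_rng(0)
scores = rng.exponential(size=10_000)   # hypothetical anomaly scores on normal data
z = pot_threshold(scores)
print(z > np.quantile(scores, 0.98))
```

The appeal is that only `risk` has to be chosen, rather than the threshold itself.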

https://stopspoon.tistory.com/38
https://youtu.be/NJASxGwvQXg
https://github.com/Zhang-Zhi-Jie/Pytorch-MSCRED
