Process Mining 9

JIYOUNG KIM · August 10, 2022


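The most basic and intuitive way to check conformance
$1 - \frac{\text{number of mismatched cells}}{\text{total number of cells in the table}}$
Limitations
- Ignores how frequently each behavior occurs in the event log
- Does not account for fitness, generalization, or precision
- Makes it hard to identify process patterns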
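The cell-mismatch ratio above can be sketched in a few lines. This is only an illustration, assuming the "table" is a footprint matrix built from directly-follows relations; the function names `footprint` and `footprint_conformance`, the relation symbols, and the sample traces are all made up for the example.

```python
from itertools import product


def footprint(traces):
    """Build a footprint matrix (assumed table format) from a list of traces."""
    activities = sorted({a for t in traces for a in t})
    # Every pair (x, y) where y directly follows x in some trace.
    follows = {(t[i], t[i + 1]) for t in traces for i in range(len(t) - 1)}
    matrix = {}
    for a, b in product(activities, repeat=2):
        if (a, b) in follows and (b, a) in follows:
            matrix[(a, b)] = "||"   # both orders observed
        elif (a, b) in follows:
            matrix[(a, b)] = "->"   # a is directly followed by b
        elif (b, a) in follows:
            matrix[(a, b)] = "<-"   # b is directly followed by a
        else:
            matrix[(a, b)] = "#"    # never adjacent
    return matrix


def footprint_conformance(fp_log, fp_model):
    """1 - (mismatched cells / total cells); missing cells are treated as '#'."""
    cells = set(fp_log) | set(fp_model)
    if not cells:
        return 1.0
    mismatched = sum(1 for c in cells if fp_log.get(c, "#") != fp_model.get(c, "#"))
    return 1 - mismatched / len(cells)


# Hypothetical data: traces seen in the log vs. traces the model can produce.
log_traces = [["a", "b", "c"], ["a", "c", "b"]]
model_traces = [["a", "b", "c"]]
score = footprint_conformance(footprint(log_traces), footprint(model_traces))
print(score)
```

In this made-up example 4 of the 9 cells differ, so the score is 1 - 4/9. Both traces in the log count equally no matter how often each occurs, which is the frequency limitation listed above.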

The four token counters (states)
- Produced token
  - A token created by the preceding transition and placed into the place
- Consumed token
  - A token that was in the place and is used by the next transition
- Missing token
  - A token that was not in the place but was needed by the next transition
- Remaining token
  - A token that was produced into the place but was never used by the next transition and is left over
- The fewer missing and remaining tokens, the better
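$Fitness(\sigma, N) = \frac{1}{2}(1 - \frac{m}{c}) + \frac{1}{2}(1 - \frac{r}{p})$, where $0 \leq Fitness(\sigma, N) \leq 1$ and $p$, $c$, $m$, $r$ are the numbers of produced, consumed, missing, and remaining tokens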
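As a rough sketch of how the four counters feed this formula, the snippet below replays a trace on a tiny net. It assumes uniquely labeled transitions, one initial token in a source place, and one final token consumed from a sink place. The net encoding (`activity -> (input places, output places)`) and the place names `start`, `p1`, `p2`, `end` are hypothetical.

```python
from collections import Counter


def token_replay_fitness(trace, net, source="start", sink="end"):
    """Replay one trace and return (fitness, counters).

    net maps each activity to (input_places, output_places).
    Assumes every activity in the trace is a uniquely labeled transition.
    """
    tokens = Counter({source: 1})
    p, c, m = 1, 0, 0  # the initial token counts as produced
    for activity in trace:
        inputs, outputs = net[activity]
        for place in inputs:
            if tokens[place] == 0:
                m += 1              # missing: add the token artificially
                tokens[place] += 1
            tokens[place] -= 1
            c += 1                  # consumed
        for place in outputs:
            tokens[place] += 1
            p += 1                  # produced
    # At the end, the environment consumes the token from the sink place.
    if tokens[sink] == 0:
        m += 1
        tokens[sink] += 1
    tokens[sink] -= 1
    c += 1
    r = sum(tokens.values())        # remaining: tokens left anywhere in the net
    fitness = 0.5 * (1 - m / c) + 0.5 * (1 - r / p)
    return fitness, {"p": p, "c": c, "m": m, "r": r}


# Hypothetical sequential model: a, then b, then c.
net = {
    "a": (["start"], ["p1"]),
    "b": (["p1"], ["p2"]),
    "c": (["p2"], ["end"]),
}
fit_ok, counts_ok = token_replay_fitness(["a", "b", "c"], net)
fit_skip, counts_skip = token_replay_fitness(["a", "c"], net)  # b is skipped
print(fit_ok, counts_ok)
print(fit_skip, counts_skip)
```

Skipping `b` leaves one token stranded in `p1` (remaining) and forces one token into `p2` (missing), so with p = 3 and c = 3 the fitness drops to 2/3.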
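Limitations
- Assumes every transition is unique
  - Cannot be applied when the model contains silent transitions or transitions that share a name (duplicate transitions)
- Does not account for generalization, simplicity, or precision, and tends to produce optimistic results
- Produces optimistic results for models that need to reflect local decisions
  - Local decision: an earlier activity determines the behavior of a later activity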

a	>>	d	e	g	h
a	b	d	e	g	>>

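Among the many possible alignments, the optimal one is the alignment that matches the log moves most closely
- Match rate
  - Usually, a move in log only and a move in model only each cost 1,
  - a synchronous move costs 0,
  - and the alignment with the lowest cost is defined as the one with the highest match rate
  - Calculated from the difference between the two costs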
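The cost rule above can be checked against the two-row alignment shown earlier. This is a minimal sketch under two assumptions: the top row is the log trace and the bottom row is the model run, and `>>` marks a skipped step. The helper name `alignment_cost` is made up for illustration.

```python
SKIP = ">>"


def alignment_cost(log_row, model_row):
    """Sum move costs: synchronous move = 0, log-only or model-only move = 1."""
    if len(log_row) != len(model_row):
        raise ValueError("both rows must have the same number of columns")
    cost = 0
    for log_step, model_step in zip(log_row, model_row):
        if log_step == SKIP and model_step == SKIP:
            raise ValueError("a column cannot skip on both sides")
        if log_step == SKIP or model_step == SKIP:
            cost += 1               # move in model only / move in log only
        elif log_step != model_step:
            raise ValueError("a synchronous move needs matching labels")
        # matching labels: synchronous move, cost 0
    return cost


# The alignment from the text (row roles are an assumption).
log_row = ["a", ">>", "d", "e", "g", "h"]
model_row = ["a", "b", "d", "e", "g", ">>"]
cost = alignment_cost(log_row, model_row)
print(cost)
```

Here the model-only move on `b` and the log-only move on `h` each cost 1, giving a total cost of 2. Picking the optimal alignment then amounts to taking the candidate with the smallest such cost.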
$Fitness(\sigma, N) = 1 - \frac{\delta(\lambda^N_{opt}(\sigma))}{\delta(\lambda^N_{worst}(\sigma))}$ ( $\delta$ = cost)
= $1 - \frac{\text{cost of the optimal alignment}}{\text{cost of the maximum-cost alignment}}$
= $1 - \frac{\text{cost of the optimal alignment}}{\text{len(log trace)} + \text{len(shortest path in model)}}$
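- Cost of the maximum-cost alignment
  - Find the shortest trace the model can produce, then build an alignment in which it does not overlap with the log trace at all. Every log event becomes a log-only move and every model step becomes a model-only move, so the cost is the sum of the two lengths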
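Putting the pieces together, the fitness formula reduces to simple arithmetic once the optimal cost and the two lengths are known. The sketch below assumes unit costs as described above. The function name `alignment_fitness` and the shortest-model-path length of 5 are hypothetical, since the model itself is not given here.

```python
def alignment_fitness(optimal_cost, trace_length, shortest_model_path_length):
    """1 - optimal cost / worst-case cost, under unit move costs."""
    # Worst case: every log event is a log-only move and every step of the
    # shortest model run is a model-only move, so nothing is synchronous.
    worst_cost = trace_length + shortest_model_path_length
    if worst_cost == 0:
        return 1.0  # empty trace and empty model run: nothing to deviate from
    return 1 - optimal_cost / worst_cost


# Hypothetical numbers: a 5-event trace, an optimal alignment of cost 2,
# and an assumed shortest model run of 5 steps.
fitness = alignment_fitness(optimal_cost=2, trace_length=5,
                            shortest_model_path_length=5)
print(round(fitness, 3))
```

With these assumed numbers the worst-case cost is 5 + 5 = 10, so the fitness is 1 - 2/10.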
