Need to represent the prior knowledge well
=> use a Beta distribution as the prior
MLE: maximize $P(D|\theta)$
MAP: maximize $P(\theta|D)$
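
A minimal sketch of the difference between the two objectives. It assumes the data $D$ are coin flips (Bernoulli trials) with unknown heads probability $\theta$; the notes above do not state the likelihood model, so that part is an assumption. Under it, a Beta(alpha, beta) prior gives a Beta posterior, and the MAP estimate is the posterior mode. The function names `mle_estimate` and `map_estimate` and the example counts are hypothetical, chosen for illustration only.

```python
def mle_estimate(heads, flips):
    """Maximize P(D|theta) for Bernoulli data: the observed frequency."""
    return heads / flips


def map_estimate(heads, flips, alpha, beta):
    """Maximize P(theta|D) under a Beta(alpha, beta) prior.

    Assumed model: Bernoulli likelihood, so the posterior is
    Beta(heads + alpha, tails + beta) and its mode is the MAP estimate.
    The closed form below is valid when both posterior parameters are > 1.
    """
    return (heads + alpha - 1) / (flips + alpha + beta - 2)


# Illustrative data (made up): 3 heads in 3 flips.
heads, flips = 3, 3
theta_mle = mle_estimate(heads, flips)
theta_map = map_estimate(heads, flips, alpha=2, beta=2)

print("MLE:", theta_mle)
print("MAP:", theta_map)
```

With so little data, the MLE commits fully to what was observed, while the Beta(2, 2) prior pulls the MAP estimate back toward 0.5. With a uniform Beta(1, 1) prior the two estimates coincide, which is one way to see that the prior is the only difference between them.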