conditions


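  • Given a dataset $D=(x_1,\dots,x_n)$, $x_i\in\mathbb{R}^d$
  • Assume a joint distribution $p(D,\theta)$
  • Goal: choose a good value of $\theta$ for $D$
  • MAP: $\theta_{MAP}=\arg\max_{\theta} p(\theta\mid D)$
  • MLE: $\theta_{MLE}=\arg\max_{\theta} p(D\mid\theta)$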

example


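Suppose the dataset is $D=(x_1,\dots,x_n)$ with $x_i\in\mathbb{R}$ (the scalar case $d=1$), and the prior is $\theta\sim N(\mu,1)$. The observations $x_1,\dots,x_n$ are conditionally independent given $\theta$, each with distribution $N(\theta,\sigma^2)$.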

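Then the MAP estimator is $\theta_{MAP}=\arg\max_{\theta} p(\theta\mid D)=\arg\max_{\theta}\left(\ln p(D\mid\theta)+\ln p(\theta)\right)$, where the second equality follows from Bayes' rule because $p(D)$ does not depend on $\theta$.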

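The derivative of this log objective with respect to $\theta$ is $\frac{\partial}{\partial\theta}\left(\ln p(D\mid\theta)+\ln p(\theta)\right)=\frac{1}{\sigma^2}\left(\sum_{i=1}^{n}x_i-n\theta\right)+(\mu-\theta)$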

Setting the derivative to zero and solving for $\theta$ gives $\theta=\frac{\sum_{i=1}^{n}x_i+\sigma^2\mu}{n+\sigma^2}$

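Writing $\bar{x}=\frac{1}{n}\sum_{i=1}^{n}x_i$ for the sample mean, we obtain the maximum a posteriori estimate of $\theta$: $\theta_{MAP}=\frac{n}{n+\sigma^2}\bar{x}+\frac{\sigma^2}{n+\sigma^2}\mu$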

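Because the two weights $\frac{n}{n+\sigma^2}$ and $\frac{\sigma^2}{n+\sigma^2}$ are nonnegative and sum to 1, the MAP estimator is a convex combination of $\bar{x}$ and $\mu$.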
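
The closed form can be checked numerically. The following is a minimal sketch, assuming scalar data and NumPy. The function names `map_estimate` and `log_posterior`, the random seed, and the parameter values are illustrative choices, not part of the notes. It compares the closed-form estimate with a brute-force maximization of the log objective over a grid, and also computes the sample mean, which is the MLE for this Gaussian model.

```python
import numpy as np

def map_estimate(x, mu, sigma2):
    """Closed-form MAP estimate for x_i ~ N(theta, sigma2), theta ~ N(mu, 1)."""
    n = len(x)
    w = n / (n + sigma2)              # weight on the sample mean
    return w * np.mean(x) + (1.0 - w) * mu

def log_posterior(theta, x, mu, sigma2):
    """ln p(D|theta) + ln p(theta) up to additive constants, for an array of theta values."""
    theta = np.atleast_1d(theta)
    # Broadcasting: rows index candidate theta values, columns index data points.
    log_lik = -np.sum((x[None, :] - theta[:, None]) ** 2, axis=1) / (2.0 * sigma2)
    log_prior = -0.5 * (theta - mu) ** 2
    return log_lik + log_prior

# Illustrative (assumed) settings: prior mean, noise variance, true theta, sample size.
rng = np.random.default_rng(0)
mu, sigma2, true_theta, n = 0.0, 4.0, 2.0, 20
x = rng.normal(true_theta, np.sqrt(sigma2), size=n)

theta_map = map_estimate(x, mu, sigma2)
theta_mle = np.mean(x)                # the MLE for this model is the sample mean

# Brute-force check: maximize the log objective on a fine grid (spacing 0.0005).
grid = np.linspace(-5.0, 5.0, 20001)
theta_grid = grid[np.argmax(log_posterior(grid, x, mu, sigma2))]

print("MLE (sample mean):", theta_mle)
print("MAP (closed form):", theta_map)
print("MAP (grid search):", theta_grid)
```

The grid maximizer should agree with the closed form up to the grid spacing, and the MAP value should lie between the prior mean `mu` and the sample mean, as the convex-combination form implies.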