PRML Chapter 2. Probability Distributions P68 conjugate priors In Bayesian probability theory, if the posterior distributions p (θ| x ) are in the same family as the prior probability distribution p (θ), the prior and posterior are then
PRML Chapter 2. Probability Distributions
P68
conjugate priors
In Bayesian probability theory, if the posterior distributions p(θ|x) are in the same family as the prior probability distributionp(θ), the prior and posterior are then called conjugate distributions, and the prior is called a conjugate prior for the likelihood. For example, the Gaussian family is conjugate to itself (or self-conjugate) with respect to a Gaussian likelihood function: if the likelihood function is Gaussian, choosing a Gaussian prior over the mean will ensure that the posterior distribution is also Gaussian.
exponential family
The exponential families include many of the most common distributions, including the normal, exponential, gamma, chi-squared, beta, Dirichlet, Bernoulli, binomial,multinomial, Poisson, Wishart, Inverse Wishart and many others.
2012@3@21补充:起先第二章果真没有仔细看好,现在看到狄利克雷分布了,回来再了解一下贝塔分布,发现里面这么关键的内容愣是都没看出来,汗。不过今天22点半了,明天在写咯。
2.1.1 The beta distribution
如果忘记伯努利分布和二项分布是怎么回事了,看这里。
书中引出贝塔分布的理由:P70提到,由于最大似然估计在观察数据很少时,会出现严重over-fitting(比如估计抛硬币正反面概率,只有3次抛硬币观察数据,且结果正好都是正面,则模型预测以后所有抛硬币都将是正面)。为了解决这个问题,可以考虑贝叶斯方法,即引入一个先验知识(先验分布p(μ))来控制参数μ,那么如何挑本文来源gaodai#ma#com搞@代~码^网+选这个分布呢?
考虑到伯努利分布的似然函数的形式是μx(1?μ)1?x,错!!原先这里看了个似懂非懂,完全写错了,囧死了,得到一个教训,写日志还是要多来回看看,看懂了再写,否则留下笑柄!现在重写如下:应该是,二项分布的似然函数是:μm(1?μ)n (就是二项分布除归一化参数之外的后面那部分,似然函数之所以不是pdf,是因为它不需要归一化),这个函数的形式是μ的m次方乘以1?μ的n次方,记住这个形式,下面要用到。