How to compute the inverse cumulative beta distribution function in Java

Question:

I am looking for a Java library or implementation that can compute the inverse cumulative distribution function of the beta distribution (that is, its quantiles) with reasonable precision.

Of course I have tried Apache Commons Math, but version 3 still seems to have some precision issues. The problem that led to this question is described in detail below.

Suppose I want to calculate the credible interval of a beta distribution with a lot of trials. In Apache Commons Math ...

import org.apache.commons.math3.distribution.BetaDistribution;

final int trials = 161750;
final int successes = 10007;
final double alpha = 0.05d;

// the supplied precision is the default precision according to the source code
BetaDistribution betaDist = new BetaDistribution(successes + 1, trials - successes + 1, 1e-9);

System.out.println("2.5 percentile :" + betaDist.inverseCumulativeProbability(alpha / 2d));
System.out.println("mean: " + betaDist.getNumericalMean());
System.out.println("median: " + betaDist.inverseCumulativeProbability(0.5));
System.out.println("97.5 percentile :" + betaDist.inverseCumulativeProbability(1 - alpha / 2d));

gives

2.5 percentile :0.062030402074808505
mean: 0.06187249616697166
median: 0.062030258659508855
97.5 percentile :0.06305170793994147

The issue is that the 2.5 percentile and the median are practically the same (they agree to six decimal places), and both are greater than the mean.

In contrast, the R package binom gives

binom.confint(10007+1,161750+2,methods=c("agresti-coull","exact","wilson"))
         method     x      n      mean      lower      upper
1 agresti-coull 10008 161752 0.0618725 0.06070873 0.06305707
2         exact 10008 161752 0.0618725 0.06070317 0.06305756
3        wilson 10008 161752 0.0618725 0.06070877 0.06305703

and the R package stats gives

qbeta(c(0.025,0.975),10007+1,161750-10007+1)
[1] 0.06070355 0.06305171

To second the results from R, here is what Wolfram Alpha told me:

InverseBetaRegularized[0.025,10007+1,161750-10007+1] => 0.06070354631...
InverseBetaRegularized[0.975,10007+1,161750-10007+1] => 0.06305170794...
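
With this many trials the beta distribution is nearly symmetric, so a normal approximation (mean plus or minus z standard deviations) gives a quick plausibility check on any reported quantiles. The following is only a rough sketch: the class and method names are made up for illustration, it uses nothing from Apache Commons Math, and the approximation is not a substitute for an exact quantile function.

```java
// Sketch: normal approximation to the central 95% interval of Beta(a, b).
// All names here are illustrative and not part of any library.
class BetaNormalApprox {

    /** Returns {lower, mean, upper} computed as mean -/+ z * standard deviation. */
    static double[] interval(double a, double b, double z) {
        double n = a + b;
        double mean = a / n;
        // Variance of Beta(a, b): a*b / ((a+b)^2 * (a+b+1))
        double variance = a * b / (n * n * (n + 1.0));
        double sd = Math.sqrt(variance);
        return new double[] { mean - z * sd, mean, mean + z * sd };
    }

    public static void main(String[] args) {
        int trials = 161750;
        int successes = 10007;
        double z = 1.959964; // 97.5% quantile of the standard normal distribution
        double[] r = interval(successes + 1, trials - successes + 1, z);
        System.out.println("approx 2.5 percentile : " + r[0]);
        System.out.println("mean                  : " + r[1]);
        System.out.println("approx 97.5 percentile: " + r[2]);
    }
}
```

For the parameters in this question the approximate bounds should land within roughly 1e-5 of the qbeta values, and, more importantly, the lower bound sits below the mean, which the faulty 2.5 percentile does not.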

Final notes on the requirements:

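I need to run a lot of these calculations, so any solution should not take longer than 1 s (which is still a lot compared to the 41 ms of the, albeit wrong, Apache Commons Math result).
I am aware that one can use R from within Java. For reasons I won't elaborate on here, that is the last resort if everything else (pure Java) fails.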

Update 21.08.12

It seems that the issue has been fixed, or at least improved, in 3.1-SNAPSHOT of apache-commons-math. For the use case above, the output is now

2.5 percentile :0.06070354581340706
mean: 0.06187249616697166
median: 0.06187069085946604
97.5 percentile :0.06305170793994147

Update 23.02.13

While at first glance this question and its responses may seem too localized, I think it illustrates very well that some numerical problems cannot be solved (efficiently) with a first-thing-that-comes-to-mind hacker approach. So I hope it remains open.

Answer

This issue has been fixed in Apache Commons Math 3.1.1.

The test case above yields

2.5 percentile :0.06070354581334864
mean: 0.06187249616697166
median: 0.06187069085930821
97.5 percentile :0.0630517079399996

which matches the results from the R package stats. Extensive use of 3.1-SNAPSHOT + x versions also did not cause any problems.
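
For an independent cross-check in pure Java, one option is to invert the CDF by bisection. The sketch below is not the algorithm Apache Commons Math uses. It is a swapped-in textbook approach that evaluates the regularized incomplete beta function with a continued fraction (modified Lentz method), computes log-gamma from a Stirling series, and bisects on [0, 1]. Every class and method name is hypothetical, and the tolerances and iteration limits are assumptions that have not been tuned.

```java
// Sketch: beta quantile via bisection on the regularized incomplete beta function.
// Names, tolerances, and iteration limits are illustrative assumptions.
class BetaQuantileSketch {

    /** ln Gamma(x) for x > 0: shift x up to >= 15, then apply a Stirling series. */
    static double logGamma(double x) {
        double shift = 0.0;
        while (x < 15.0) { shift -= Math.log(x); x += 1.0; } // Gamma(x) = Gamma(x+1)/x
        double inv = 1.0 / x, inv2 = inv * inv;
        double series = inv * (1.0 / 12 - inv2 * (1.0 / 360 - inv2 * (1.0 / 1260 - inv2 / 1680)));
        return (x - 0.5) * Math.log(x) - x + 0.5 * Math.log(2 * Math.PI) + series + shift;
    }

    /** Continued fraction for the incomplete beta function (modified Lentz method). */
    static double continuedFraction(double a, double b, double x) {
        final double tiny = 1e-300, eps = 1e-14;
        double qab = a + b, qap = a + 1.0, qam = a - 1.0;
        double c = 1.0, d = 1.0 - qab * x / qap;
        if (Math.abs(d) < tiny) d = tiny;
        d = 1.0 / d;
        double h = d;
        for (int m = 1; m <= 100000; m++) {
            int m2 = 2 * m;
            double aa = m * (b - m) * x / ((qam + m2) * (a + m2)); // even step
            d = 1.0 + aa * d; if (Math.abs(d) < tiny) d = tiny;
            c = 1.0 + aa / c; if (Math.abs(c) < tiny) c = tiny;
            d = 1.0 / d;
            h *= d * c;
            aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2)); // odd step
            d = 1.0 + aa * d; if (Math.abs(d) < tiny) d = tiny;
            c = 1.0 + aa / c; if (Math.abs(c) < tiny) c = tiny;
            d = 1.0 / d;
            double delta = d * c;
            h *= delta;
            if (Math.abs(delta - 1.0) < eps) break; // converged
        }
        return h;
    }

    /** Regularized incomplete beta function I_x(a, b), i.e. the Beta(a, b) CDF. */
    static double cdf(double x, double a, double b) {
        if (x <= 0.0) return 0.0;
        if (x >= 1.0) return 1.0;
        double front = Math.exp(logGamma(a + b) - logGamma(a) - logGamma(b)
                + a * Math.log(x) + b * Math.log1p(-x));
        // Use the symmetry I_x(a, b) = 1 - I_{1-x}(b, a) where the fraction converges faster.
        if (x < (a + 1.0) / (a + b + 2.0)) return front * continuedFraction(a, b, x) / a;
        return 1.0 - front * continuedFraction(b, a, 1.0 - x) / b;
    }

    /** Quantile by bisection: narrows [lo, hi] around the x with cdf(x) = p. */
    static double quantile(double p, double a, double b) {
        double lo = 0.0, hi = 1.0;
        for (int i = 0; i < 200 && hi - lo > 1e-15; i++) {
            double mid = 0.5 * (lo + hi);
            if (cdf(mid, a, b) < p) lo = mid; else hi = mid;
        }
        return 0.5 * (lo + hi);
    }

    public static void main(String[] args) {
        double a = 10007 + 1, b = 161750 - 10007 + 1;
        System.out.println("2.5 percentile : " + quantile(0.025, a, b));
        System.out.println("median         : " + quantile(0.5, a, b));
        System.out.println("97.5 percentile: " + quantile(0.975, a, b));
    }
}
```

Bisection is slower than a Newton-type solver but cannot overshoot, which makes it a reasonable choice for a cross-check. For the parameters above, the results would be expected to land close to the qbeta values quoted in the question.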
