R's t-distribution says “full precision may not have been achieved”

感情迁移 提交于 2019-12-07 05:45:13

问题


I am working with a problem that routinely needs to compute the density of the t distribution rather far in the tails in R.

For example, using R's t distribution function, dt(1.424781, 1486, -5) returns [1] 2.75818e-10. Some of my final outputs (using this density as an input) do not match a reference value from analogous computations performed in MATLAB by my colleague, which I think may be due to imprecision in the t distribution's tails in R.

If I compare to MATLAB's t distribution function, nctpdf(1.424781, 1486, -5) returns ans = 4.3932e-10, which is quite a bit different than R's output.

edit: R prints two identical warning messages

In dt(1.424781, 1486, -5) : full precision may not have been achieved
in 'pnt{final}'

This is on Mac, R version 3.3.1


回答1:


It appears that the issue comes from the algorithm that R implements for the non-central t-distribution for this case. Two factors combine to produce the result:

  1. the supplied non-centrality parameter, -5, is an "extreme value."

From the Note section of the help file, ?pt,

The code for non-zero ncp is principally intended to be used for moderate values of ncp: it will not be highly accurate, especially in the tails, for large values.

So the algorithm that calculates these values is not intended to calculate extreme values like -5. We can see this by reducing the ncp value to a more moderate level, say -1:

dt(1.424781, 1486, -1)
[1] 0.0211143
  1. The requested value is in the upper tail of the distribution:

The Source section of ?pt says

For the non-central case of pt based on a C translation of

Lenth, R. V. (1989). Algorithm AS 243 — Cumulative distribution function of the non-central t distribution, Applied Statistics 38, 185–189.

This computes the lower tail only, so the upper tail suffers from cancellation and a warning will be given when this is likely to be significant.

For example, the same ncp value, -5 with the negative of the x value returns

dt(-1.424781, 1486, -5)
[1] 0.0006719519

without a warning.




回答2:


You could use Boost (available through the BH package) via Rcpp as an alternative:

// [[Rcpp::depends(BH)]]

#include <Rcpp.h>
#include <boost/math/distributions/non_central_t.hpp> 

using namespace boost::math;

// [[Rcpp::export]]
double dnct(const double x, const double df, const double ncp) {
  non_central_t dist(df, ncp);
  return pdf(dist, x);
}


/*** R
dnct(1.424781, 1486, -5)
*/

This returns:

[1] 4.393078e-10

I don't know if Boost or Matlab is more precise here, but results are at least similar. The boost docs give some information regarding precision.



来源:https://stackoverflow.com/questions/39183938/rs-t-distribution-says-full-precision-may-not-have-been-achieved

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!