statistics

Calculate variogram of raster data with NAs in R

别来无恙 posted on 2019-12-13 13:34:11
Question: Summary: I have a raster dataset that contains NA values and want to calculate a variogram of it, ignoring the NAs. How can I do this? I have an image that I loaded into R using the readGDAL function, stored as im. To make this reproducible, the result of dput on the image is available at https://gist.github.com/2780792. I am trying to display a variogram of this data and am struggling. I'll go through what I've tried so far: I tried the gstat package, but couldn't seem to get a
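The idea of "ignoring the NAs" can be illustrated outside R with a minimal Python sketch. It is a simplification, not what gstat does: it assumes a regular grid stored as a list of rows, only pairs cells along rows and columns at integer lags, and drops any pair in which either cell is NaN. The function name `empirical_semivariogram` and the demo grid are hypothetical.

```python
import math


def empirical_semivariogram(grid, max_lag):
    """Semivariance per integer lag along rows and columns, skipping NaN cells."""
    n_rows, n_cols = len(grid), len(grid[0])
    result = {}
    for lag in range(1, max_lag + 1):
        sq_sum, n_pairs = 0.0, 0
        for r in range(n_rows):
            for c in range(n_cols):
                a = grid[r][c]
                if math.isnan(a):
                    continue  # a missing cell contributes no pairs
                # Pair with the cell `lag` steps to the right and `lag` steps down.
                for rr, cc in ((r, c + lag), (r + lag, c)):
                    if rr < n_rows and cc < n_cols:
                        b = grid[rr][cc]
                        if not math.isnan(b):
                            sq_sum += (a - b) ** 2
                            n_pairs += 1
        if n_pairs:
            # Classical estimator: half the mean squared difference.
            result[lag] = sq_sum / (2 * n_pairs)
    return result


NA = float("nan")
demo_grid = [
    [1.0, 2.0, 3.0],
    [2.0, NA, 4.0],
    [3.0, 4.0, 5.0],
]
gamma = empirical_semivariogram(demo_grid, max_lag=2)
print(gamma)
```

Lags for which every pair touches a missing cell are simply left out of the result, so the caller can tell "no data" apart from a semivariance of zero.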

Inverse of Cumulative Normal Distribution Function with parameters

不羁的心 posted on 2019-12-13 13:06:38
Question: I want to implement the equivalent of the MATLAB icdf function in C++. I have already found this useful post: https://www.johndcook.com/blog/cpp_phi_inverse/. But I want it with optional mu and sigma parameters, as in MATLAB. What am I supposed to change? Answer 1: Inspired by https://gist.github.com/kmpm/1211922/6b7fcd0155b23c3dc71e6f4969f2c48785371292: double inverse_of_normal_cdf(const double p, const double mu, const double sigma) { if (p <= 0.0 || p >= 1.0) { std::stringstream os; os << "Invalid
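The change needed for mu and sigma is a location-scale transform: if z is the standard normal quantile for p, then the quantile for N(mu, sigma²) is mu + sigma·z. Here is a minimal Python sketch of that relationship, not the C++ answer itself. It uses the standard library's `statistics.NormalDist.inv_cdf` for the standard normal quantile, and the default arguments are an assumption made for illustration.

```python
from statistics import NormalDist


def inverse_of_normal_cdf(p, mu=0.0, sigma=1.0):
    """Quantile of N(mu, sigma^2) obtained by shifting and scaling the standard quantile."""
    if not 0.0 < p < 1.0:
        raise ValueError(f"p must lie strictly between 0 and 1, got {p}")
    if sigma <= 0.0:
        raise ValueError(f"sigma must be positive, got {sigma}")
    z = NormalDist().inv_cdf(p)  # standard normal quantile
    return mu + sigma * z


print(inverse_of_normal_cdf(0.975))            # standard normal
print(inverse_of_normal_cdf(0.975, 10.0, 2.0)) # shifted and scaled
```

The same one-line transform applies to any standard-normal inverse CDF routine, including the rational approximation in the linked post.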

SQL Top 10 Sales Every Month

帅比萌擦擦* posted on 2019-12-13 12:15:34
Question: Greetings all. I have a SQL 2008 Express database with a table, let's name it tbl_Merchant, similar to the following:

Merchant | Sales | Month
Comp.1   | 100   | 1
Comp.2   | 230   | 1
Comp.3   | 120   | 1
Comp.1   | 200   | 2
Comp.2   | 130   | 2
Comp.3   | 240   | 2
Comp.1   | 250   | 3
...      | ...   | ...

I need to find the top 10 merchants by sales for every month over 12 months. It is very easy if it is just one month:

SELECT TOP 10 Merchant, Sales, Month FROM tbl_Merchant WHERE Month = 1 ORDER BY Sales DESC

But I am stuck if I want to find them over 12 months. I
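The underlying operation is "top N within each group". In SQL Server this is commonly written with ROW_NUMBER() OVER (PARTITION BY Month ORDER BY Sales DESC). The Python sketch below shows the same grouping logic on in-memory rows; the function name `top_n_per_month` is hypothetical, and the sample rows are the ones from the question.

```python
import heapq
from collections import defaultdict


def top_n_per_month(rows, n=10):
    """Return {month: [(merchant, sales), ...]} with the n largest sales per month."""
    by_month = defaultdict(list)
    for merchant, sales, month in rows:
        by_month[month].append((merchant, sales))
    # nlargest returns entries sorted from highest to lowest sales.
    return {
        month: heapq.nlargest(n, entries, key=lambda entry: entry[1])
        for month, entries in sorted(by_month.items())
    }


rows = [
    ("Comp.1", 100, 1), ("Comp.2", 230, 1), ("Comp.3", 120, 1),
    ("Comp.1", 200, 2), ("Comp.2", 130, 2), ("Comp.3", 240, 2),
]
top2 = top_n_per_month(rows, n=2)
for month, entries in top2.items():
    print(month, entries)
```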

Computing the statistical mode

余生颓废 posted on 2019-12-13 12:14:42
Question: Given an unsorted array A of length N and an integer k, I'm trying to verify whether there exists some element that occurs n/k times or more. My thinking for this problem was to compute the mode and then compare its frequency to n/k. However, I don't know how to compute the mode quickly. My final result needs to run in n log(k) time, but I have no real idea how to do this. The quickest I could find was n k... Answer 1: Use a hash table to count the frequency of each value: uint[int]
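The hash-table approach from the answer can be sketched in Python with `collections.Counter`. This is an illustration of the counting idea only: it runs in expected O(n) time with hashing, which is not the same as a worst-case n log(k) guarantee. The function name `has_frequent_element` is hypothetical.

```python
from collections import Counter


def has_frequent_element(values, k):
    """True if some value occurs at least len(values) / k times."""
    if not values or k <= 0:
        return False
    threshold = len(values) / k
    counts = Counter(values)  # one pass, one hash-table update per element
    return max(counts.values()) >= threshold


print(has_frequent_element([1, 2, 2, 3, 2, 4], k=2))
print(has_frequent_element([1, 2, 3, 4], k=2))
```

Only the largest count matters for the yes/no question, so the mode itself never needs to be reported.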

Correlation between two vectors?

岁酱吖の posted on 2019-12-13 11:52:14
Question: I have two vectors: A_1 = 10 200 7 150 A_2 = 0.001 0.450 0.0007 0.200 I would like to know if there is a correlation between these two vectors. I could subtract the mean of the vector from each value and then compute: A_1' * A_2 Are there any better ways? Answer 1: Given: A_1 = [10 200 7 150]'; A_2 = [0.001 0.450 0.007 0.200]'; (As others have already pointed out) there are tools to simply compute the correlation, most obviously corr: corr(A_1, A_2); %Returns 0.956766573975184 (Requires stats toolbox) You can
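The mean-subtracted product proposed in the question is the numerator of the Pearson correlation coefficient; dividing by the product of the two centered vector norms turns it into a value in [-1, 1]. The Python sketch below spells that out. The function name `pearson_correlation` is hypothetical, and the vectors are the ones given in the question.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation: centered dot product divided by the centered norms."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0.0:
        raise ValueError("correlation is undefined for a constant vector")
    return numerator / denominator


a_1 = [10, 200, 7, 150]
a_2 = [0.001, 0.450, 0.0007, 0.200]
print(pearson_correlation(a_1, a_2))
```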

Why is my Kurtosis function not producing the same output as scipy.stats.kurtosis?

耗尽温柔 posted on 2019-12-13 11:50:40
Question: I have a homework problem in which I'm supposed to write a function for kurtosis as described here: The theta in the denominator is the standard deviation (the square root of the variance) and the x-with-the-bar in the numerator is the mean of x. I've implemented the function as follows:

    import numpy as np
    from scipy.stats import kurtosis

    testdata = np.array([1, 2, 3, 4, 5])

    def mean(obs):
        return (1. / len(obs)) * np.sum(obs)

    def variance(obs):
        return (1. / len(obs)) * np.sum((obs - mean(obs)) *
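One common source of a mismatch with scipy.stats.kurtosis is the definition: by default it uses `fisher=True`, which reports excess kurtosis (the fourth standardized moment minus 3), while many textbook formulas give the unadjusted Pearson value. Whether that is the cause in this particular question is an assumption, since the excerpt is cut off. The pure-Python sketch below computes both variants from population moments; the function name and its `fisher` flag mirror SciPy's naming but are otherwise hypothetical.

```python
def kurtosis_from_moments(obs, fisher=True):
    """Kurtosis from population (1/n) moments; subtracts 3 when fisher is True."""
    n = len(obs)
    mean = sum(obs) / n
    m2 = sum((x - mean) ** 2 for x in obs) / n  # population variance
    m4 = sum((x - mean) ** 4 for x in obs) / n  # fourth central moment
    pearson = m4 / m2 ** 2
    return pearson - 3.0 if fisher else pearson


testdata = [1, 2, 3, 4, 5]
print(kurtosis_from_moments(testdata, fisher=False))  # about 1.7
print(kurtosis_from_moments(testdata, fisher=True))   # about -1.3
```

For the test data the two conventions differ by exactly 3, which is an easy signature to look for when two implementations disagree.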

Searching for the best fit price for multiple customers [duplicate]

廉价感情. posted on 2019-12-13 11:25:06
Question: This question already has an answer here: Comparing multiple price options for many customers algorithmically (1 answer). Closed 5 years ago. A restatement of "Comparing multiple price options for many customers algorithmically" without nearly as much cruft. We have 1,000,000 customers. The cost of goods sold for each of them can be expressed as price A or price B. Price A << Price B. Price A and Price B are not linearly related to each other. In some cases B is 2 times as expensive, in some it is 100

R extended summary numerical values including kurtosis, skew, etc? [duplicate]

怎甘沉沦 posted on 2019-12-13 10:40:55
Question: This question already has answers here: How to extend the 'summary' function to include sd, kurtosis and skew? (2 answers). Closed last year. Is there a one-liner in R that will give me the following stats for each numerical column of a dataframe? count, mean, median, q3, q1, iqr, mode, min, max, antimode, pstdev, sstdev, pvar, svar, mad, madraw, pskew, sskew, pkurt, skurt, dpo, jarque Something like an extended version of summary(dt)? Any ideas? Answer 1: The describe() method in the psych

How to generate the Weibull's parameters k and c in Matlab?

老子叫甜甜 posted on 2019-12-13 09:48:03
Question: Can anyone explain to me how to generate the Weibull distribution parameters k and c in MATLAB? I have a file of 8000 wind speed measurements, and I'd like to do the following: (1) generate the Weibull k and c parameters for that data; (2) plot the probability density function against the wind speed. I am new to MATLAB and have not yet been able to do this. Answer 1: If you have the Statistics toolbox, you can use fitdist: pd = fitdist(x,'Weibull') where x is your data. I'm guessing it should return the
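For readers without the toolbox, the fit can be sketched by hand. The Python example below estimates the shape k and scale c by maximum likelihood, solving the standard shape equation by bisection, and then evaluates the fitted density. It is a sketch under stated assumptions, not a reproduction of fitdist: all samples must be strictly positive (calm readings of 0 would have to be handled separately), the shape is assumed to lie between 0.01 and 50, and the names `fit_weibull` and `weibull_pdf` are hypothetical. The data is synthetic, drawn with `random.weibullvariate`.

```python
import math
import random


def fit_weibull(samples, lo=0.01, hi=50.0, iterations=60):
    """Maximum-likelihood estimates (k, c) of Weibull shape and scale via bisection."""
    if any(x <= 0 for x in samples):
        raise ValueError("all samples must be strictly positive")
    logs = [math.log(x) for x in samples]
    mean_log = sum(logs) / len(logs)

    def score(k):
        # Zero of this function is the MLE of the shape; it increases with k.
        powered = [x ** k for x in samples]
        weighted = sum(p * l for p, l in zip(powered, logs))
        return weighted / sum(powered) - 1.0 / k - mean_log

    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    k = (lo + hi) / 2.0
    c = (sum(x ** k for x in samples) / len(samples)) ** (1.0 / k)
    return k, c


def weibull_pdf(x, k, c):
    """Weibull density with shape k and scale c, for x > 0."""
    return (k / c) * (x / c) ** (k - 1) * math.exp(-((x / c) ** k))


rng = random.Random(0)
# Synthetic "wind speeds": scale 8.0, shape 2.0 (weibullvariate takes scale first).
wind = [rng.weibullvariate(8.0, 2.0) for _ in range(2000)]
k_hat, c_hat = fit_weibull(wind)
print(k_hat, c_hat)
print([round(weibull_pdf(v, k_hat, c_hat), 4) for v in (2.0, 8.0, 14.0)])
```

Plotting the density is then a matter of evaluating `weibull_pdf` on a grid of wind speeds and passing the pairs to any plotting library.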

Outcome of a simulated dice and coin toss in R [closed]

[亡魂溺海] posted on 2019-12-13 07:50:14
Question: This question is unlikely to help any future visitors; it is only relevant to a small geographic area, a specific moment in time, or an extraordinarily narrow situation that is not generally applicable to the worldwide audience of the internet. For help making this question more broadly applicable, visit the help center. Closed 6 years ago. The experiment involves rolling a fair die to get a value x, then tossing a fair coin x times and recording the number of tails. I need to do
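The experiment itself is easy to simulate. The sketch below is in Python rather than R and covers only the part of the task visible in the excerpt: roll a fair die, toss a fair coin that many times, record the tails. The function name `simulate_tails` and the optional `rng` argument are hypothetical. Since the die averages 3.5 and each toss is tails with probability 0.5, the long-run mean number of tails should be close to 1.75.

```python
import random


def simulate_tails(n_trials, rng=None):
    """For each trial, roll a fair die to get x, toss a fair coin x times, count tails."""
    rng = rng or random.Random()
    results = []
    for _ in range(n_trials):
        x = rng.randint(1, 6)  # fair die, both ends inclusive
        tails = sum(rng.random() < 0.5 for _ in range(x))
        results.append(tails)
    return results


outcomes = simulate_tails(10000, random.Random(42))
print(sum(outcomes) / len(outcomes))
```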