statistics | 易学教程

automatically compare two series - Dissimilarity test

Question: I have two series, Series1 and Series2. My aim is to quantify automatically, on a bin-by-bin basis, how much Series2 differs from Series1 (each bin represents a particular feature). Series1 is the expected result; Series2 is the test (incoming) series. I am providing a histogram plot in which Series2 is shown in dark brown. Note that on the x-axis, between 221 and 353, there is a significant variation
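
A bin-by-bin comparison like the one asked about can be sketched as a per-bin relative difference plus one summary distance. This is only an illustration: the bin counts are invented, and the chi-square-style distance is an assumed choice of metric, not something taken from the question.

```python
def bin_differences(expected, observed):
    """Per-bin relative difference, (observed - expected) / expected."""
    if len(expected) != len(observed):
        raise ValueError("both series must have the same number of bins")
    diffs = []
    for e, o in zip(expected, observed):
        # The relative difference is undefined for an empty expected bin.
        diffs.append((o - e) / e if e != 0 else None)
    return diffs


def chi_square_distance(expected, observed):
    """Symmetric chi-square distance; 0.0 means the series are identical."""
    total = 0.0
    for e, o in zip(expected, observed):
        if e + o > 0:
            total += (e - o) ** 2 / (e + o)
    return 0.5 * total


# Hypothetical bin counts, for illustration only.
series1 = [10, 20, 30, 40]   # expected
series2 = [10, 25, 15, 40]   # test / incoming

per_bin = bin_differences(series1, series2)
distance = chi_square_distance(series1, series2)
print(per_bin, distance)
```

The per-bin list shows which features deviate, while the single distance can be compared against a threshold chosen for the application.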

Get average based on value in another row

Question: I have values in an Excel file like this:

QR | QR AVG | val1
q1 |        | 5
q1 |        | 3
q1 |        | 4
q2 |        | 7
q2 |        | 9
q3 |        | 10
q3 |        | 11
q3 |        | 12
q3 |        | 11
q4 |        | 5
q5 |        | 5
q5 |        | 7

I would like the QR AVG field to hold the average of val1 for each distinct QR value. In other words, I'd like to have the following values after my calculation:

QR | QR AVG | val1
q1 | 4      | 5
q1 | 4      | 3
q1 | 4      | 4
q2 | 8      | 7
q2 | 8      | 9
q3 | 11     | 10
q3 | 11     | 11
q3 | 11     | 12
q3 | 11     | 11
q4 | 5      | 5
q5 | 6      | 5
q5 | 6      | 7

Where I don't know the exact number of rows that I will have, and I will be
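
The calculation described above (an average per group, repeated on every row of that group) can be sketched outside Excel as follows. This is a plain Python illustration of the arithmetic, not an Excel formula; the function name `group_averages` is hypothetical, and the rows are the ones from the question.

```python
from collections import defaultdict


def group_averages(rows):
    """rows: list of (key, value). Returns a list of (key, group_mean, value)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    # First pass: accumulate the sum and count for each key.
    for key, value in rows:
        totals[key] += value
        counts[key] += 1
    # Second pass: attach the group mean to every original row.
    return [(key, totals[key] / counts[key], value) for key, value in rows]


rows = [
    ("q1", 5), ("q1", 3), ("q1", 4),
    ("q2", 7), ("q2", 9),
    ("q3", 10), ("q3", 11), ("q3", 12), ("q3", 11),
    ("q4", 5),
    ("q5", 5), ("q5", 7),
]

result = group_averages(rows)
for key, avg, value in result:
    print(key, avg, value)
```

Because the two passes work on whatever rows are supplied, the number of rows does not need to be known in advance.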

Use of Different .NET Languages?

Question: Is there a breakdown of the popularity of the different .NET languages available? Does anyone know of any surveys that give this information, or whether it is even possible to determine this? Update: The answer is not a list of the different .NET languages. I would like to see statistics showing the relative usage/popularity of each .NET language. Thanks. Answer 1: Not sure if this is what you are after, but it's interesting nonetheless. I was surprised to see C# as far down as it was. http://langpop.com

what is the difference between scale transformation and coordinate system transformation

Question: The documentation for the coord_trans function, which is used for coordinate transformation, says that the difference between this function and scale_x_log10 is that the coordinate transformation occurs after statistics, while the scale transformation occurs before. I didn't get the point (see the documentation). How is the data plotted using each method? Answer 1: The quote from the documentation you supplied tells us that scale transformation occurs before any statistical analysis pertaining to the plot. The
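
The ordering described in the answer can be illustrated numerically. The sketch below is not ggplot2 code; it only mimics the two orderings, and both the data and the summary statistic (a mean) are assumptions chosen for illustration.

```python
import math

values = [1, 10, 100]  # hypothetical data


def mean(xs):
    return sum(xs) / len(xs)


# "Scale" ordering: transform the data first, then compute the statistic.
stat_after_scale = mean([math.log10(v) for v in values])

# "Coordinate" ordering: compute the statistic on the raw data,
# then transform the result for display.
stat_then_coord = math.log10(mean(values))

print(stat_after_scale, stat_then_coord)
```

The two numbers differ because the mean of the logarithms is not the logarithm of the mean, which is why the same data can plot differently under the two approaches whenever a statistic is involved.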

k-means: Same clusters for every execution

Question: Is it possible to get the same k-means clusters on every execution for a particular data set, just as a fixed seed makes a random value reproducible? Is it possible to stop the randomness in clustering? Answer 1: Yes. Use set.seed to set a seed for the random number generator before doing the clustering. Using the example in kmeans:

    set.seed(1)
    x <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
               matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
    colnames(x) <- c("x", "y")
    set.seed(2)
    XX <- kmeans(x, 2)
    set.seed(2)
    YY
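
The same idea, seeding the generator that picks the initial centers, can be sketched in Python. This is a minimal hand-written k-means for illustration only, not the R `kmeans` implementation; the function names and the generated data are assumptions.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, seed, iterations=20):
    """Tiny k-means; the only randomness is the seeded choice of initial centers."""
    rng = random.Random(seed)           # private generator with a fixed seed
    centers = rng.sample(points, k)     # random initial centers
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centers[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centers


# Two hypothetical Gaussian blobs, generated reproducibly.
data_rng = random.Random(1)
points = [(data_rng.gauss(0, 0.3), data_rng.gauss(0, 0.3)) for _ in range(50)]
points += [(data_rng.gauss(1, 0.3), data_rng.gauss(1, 0.3)) for _ in range(50)]

first = kmeans(points, 2, seed=2)
second = kmeans(points, 2, seed=2)
print(first == second)
```

With the same seed, both runs start from the same initial centers, so the resulting labels and centers should compare equal.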

Non-Uniform Random Number Generator Implementation?

Question: I need a random number generator that picks numbers over a specified range with a programmable mean. For example, I need to pick numbers between 2 and 14, and I need the average of the random numbers to be 5. I use random number generators a lot, but usually I just need a uniform distribution. I don't even know what to call this type of distribution. Thank you for any assistance or insight you can provide. Answer 1: You might be able to use a binomial distribution, if you're happy with the shape of
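
The binomial suggestion can be sketched as follows, assuming integer outputs are acceptable. A binomial with n = high - low trials is shifted by low, and the success probability is chosen so that the mean lands on the target. The function name `shifted_binomial` is hypothetical.

```python
import random


def shifted_binomial(low, high, mean, rng):
    """Integer in [low, high] whose expected value is `mean`."""
    n = high - low                 # number of Bernoulli trials
    p = (mean - low) / n           # chosen so that low + n * p == mean
    if not 0.0 <= p <= 1.0:
        raise ValueError("mean must lie between low and high")
    # Count successes in n trials, then shift into the target range.
    return low + sum(rng.random() < p for _ in range(n))


rng = random.Random(0)  # seeded only to make the demonstration repeatable
samples = [shifted_binomial(2, 14, 5, rng) for _ in range(20000)]
average = sum(samples) / len(samples)
print(min(samples), max(samples), average)
```

For the example in the question this gives n = 12 and p = 0.25. The shape is fixed by the binomial, so this only fits if that shape is acceptable, as the answer notes.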

harmonic mean in python

Question: The harmonic mean function in Python (scipy.stats.hmean) requires that the input be positive numbers. For example:

    from scipy import stats
    print(stats.hmean([-50.2, 100.5]))

results in:

    ValueError: Harmonic mean only defined if all elements greater than zero

I don't see mathematically why this should be the case, except for the rare instance where you would end up dividing by zero. Instead of checking for a division by zero, hmean() throws an error upon receiving any negative number,
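
To see what the bare formula does with mixed signs, the sketch below evaluates n / sum(1/x) directly. It is a hand-written helper, not `scipy.stats.hmean`, and it assumes no input is zero and that the reciprocals do not sum to zero.

```python
def harmonic_mean(values):
    """Bare formula n / sum(1/x); assumes no zeros and a nonzero reciprocal sum."""
    return len(values) / sum(1.0 / v for v in values)


positive = harmonic_mean([50.2, 100.5])
mixed = harmonic_mean([-50.2, 100.5])
print(positive, mixed)
```

With positive inputs the result lies between the smallest and largest value. With the mixed-sign input from the question the formula still returns a number, but it falls below both inputs, so it no longer behaves like an average. That is one plausible reason for restricting the function to positive values.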

Choosing specific lags in ARIMA or VAR Model

Question: I've seen this issue raised here and here, but unfortunately the answers are not satisfactory. When you specify the lags through either the p argument in VAR or the order argument in arima, R includes all lags at and below the stated value. However, what if you want specific lags only? For example, what if I wanted only lags 1, 2, and 4 in a VAR? Setting p = 4 in VAR gives me lags 1, 2, 3, and 4, but I would like to exclude the third lag. In the first link, the user provided an answer by
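
The idea of keeping only selected lags can be illustrated independently of R by building the lagged regressors by hand. The sketch below only constructs the design rows for lags 1, 2, and 4 of a single series; it does not use the VAR or arima APIs, the function name `lagged_design` is hypothetical, and estimating coefficients would still need a separate least-squares step.

```python
def lagged_design(series, lags):
    """Build regressor rows containing only the requested lags.

    Returns (rows, targets), where rows[i] holds series[t - lag] for each
    requested lag and targets[i] is series[t].
    """
    max_lag = max(lags)
    rows, targets = [], []
    # Start at max_lag so that every requested lag is available.
    for t in range(max_lag, len(series)):
        rows.append([series[t - lag] for lag in lags])
        targets.append(series[t])
    return rows, targets


series = [1, 2, 3, 4, 5, 6, 7, 8]          # hypothetical data
rows, targets = lagged_design(series, lags=[1, 2, 4])
print(rows[0], targets[0])
```

Lag 3 never enters the regressor rows, which is the effect the question is after.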

Histogram in JavaScript?

Question: I have this dataset for income:

Income | Number of people
0      | 245981
8.8    | 150444
30     | 126063
49.9   | 123519
70     | 115029
90.7   | 277149
109.1  | 355768
130    | 324246
150.3  | 353239
170.2  | 396008
190    | 396725
210    | 398640
230.1  | 401932
250    | 416079
270    | 412727
289.8  | 385192
309.7  | 343178
329.7  | 293707
349.6  | 239982
369.7  | 201557
389.3  | 165132
442.3  | 442075
543.4  | 196526
679.9  | 146784
883.9  | 48600
1555   | 44644

(As you can see, the width between income levels gets larger towards the end.) How do I make an accurate histogram of this data in
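
With unequal bin widths, a common approach is to draw each bar with height equal to count divided by width, so that bar area (not height) represents the number of people. The sketch below shows that arithmetic in Python rather than JavaScript. It assumes each listed income is the lower edge of its bin, which the question does not state, and it uses only the first three rows, with the fourth income level as the upper edge of the third bin.

```python
def bar_heights(lower_edges, counts, last_upper_edge):
    """Density-style bar heights: count / bin width for each bin."""
    edges = list(lower_edges) + [last_upper_edge]
    heights = []
    for i, count in enumerate(counts):
        width = edges[i + 1] - edges[i]
        heights.append(count / width)
    return heights


# First three rows of the question's data (assumed to be lower bin edges).
lower_edges = [0, 8.8, 30]
counts = [245981, 150444, 126063]
heights = bar_heights(lower_edges, counts, last_upper_edge=49.9)

for edge, height in zip(lower_edges, heights):
    print(edge, round(height, 1))
```

The final bin in the full dataset has no stated upper edge, so any plot would need an assumed cutoff for it.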