statistics | 易学教程

How often are SQL Server Index Usage Stats Updated and what triggers it?

Question: There are other similar questions to this one, but please do not confuse them. I know there's a function STATS_DATE() that shows when the stats were updated, which is fine, but what I want to know is what triggers an update of these stats, or a cut-off. I know there's a report for this as well. Last week I looked at the stats on a certain server and they gave me very good information, with four-digit counts for the main tables in this particular database. Right now looking in the same production

How to build a chi-square distribution table

Question: I would like to generate a chi-square distribution table in Python as a function of the probability level and the degrees of freedom. Calculating the probability from a known chi-square value and degrees of freedom works like this: In[44]: scipy.stats.chisqprob(5.991, 2) Out[44]: 0.050011615026579088 However, what I know is the probability and the degrees of freedom, so I would like to compute the corresponding chi-square value for a given probability. The end result should look similar to something like this.
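The inverse lookup asked for here is a quantile (inverse survival function) computation. With SciPy, `scipy.stats.chi2.isf(p, df)` is the usual route, and `scipy.stats.chi2.sf(x, df)` gives the upper-tail probability that `chisqprob` returned in the question. The sketch below is a standard-library-only illustration of the same idea, under stated assumptions: the helper names `chi2_sf`, `chi2_isf`, and `chi2_table` are made up for this example, the tail probability comes from a power series for the incomplete gamma function, and the inversion is plain bisection. It is meant for table-sized inputs, not as a replacement for SciPy.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, y = df / 2.0, x / 2.0
    # Power series for the regularized lower incomplete gamma function P(a, y):
    # P(a, y) = y^a * e^(-y) / Gamma(a) * sum_n y^n / (a * (a+1) * ... * (a+n))
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= y / (a + n)
        total += term
        n += 1
    lower = total * math.exp(a * math.log(y) - y - math.lgamma(a))
    return max(0.0, 1.0 - lower)

def chi2_isf(p, df):
    """Chi-square value whose upper-tail probability is p (found by bisection)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, df) > p:      # grow the bracket until it contains the answer
        hi *= 2.0
    for _ in range(200):            # halve the bracket until it is negligibly small
        mid = (lo + hi) / 2.0
        if chi2_sf(mid, df) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def chi2_table(probs, dfs):
    """Map each df to its list of critical values, one per upper-tail probability."""
    return {df: [chi2_isf(p, df) for p in probs] for df in dfs}

if __name__ == "__main__":
    probs = [0.10, 0.05, 0.01]
    table = chi2_table(probs, range(1, 6))
    print("df " + "".join(f"{p:>9.2f}" for p in probs))
    for df, row in table.items():
        print(f"{df:<3}" + "".join(f"{v:>9.3f}" for v in row))
```

Rows are degrees of freedom and columns are upper-tail probabilities, which mirrors the layout of printed chi-square tables.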

Running (one pass) calculation of covariance

Question: I have a set of 3D vectors (x, y, z), and I want to calculate the covariance matrix without storing the vectors. I will do it in C#, but eventually I will implement it in C on a microcontroller, so I need the algorithm itself, not a library. Pseudocode would also be great. Answer 1: The formula is simple if you have Matrix and Vector classes at hand: Vector mean; Matrix covariance; for (int i = 0; i < points.size(); ++i) { Vector diff = points[i] - mean; mean += diff / (i + 1); covariance +=
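The answer's loop is cut off in this excerpt. As a separate illustration, and not necessarily the answer's exact update, here is a minimal Python sketch of a one-pass (Welford-style) covariance accumulator. The class name `RunningCovariance` and its methods are hypothetical. It stores only a count, a mean vector, and a matrix of co-moments, so it should translate fairly directly to fixed-size arrays in C.

```python
class RunningCovariance:
    """One-pass covariance accumulator; never stores the input vectors."""

    def __init__(self, dim=3):
        self.dim = dim
        self.n = 0
        self.mean = [0.0] * dim
        # comoment[i][j] accumulates sum((x_i - mean_i) * (x_j - mean_j)).
        self.comoment = [[0.0] * dim for _ in range(dim)]

    def add(self, point):
        self.n += 1
        # Deviation from the mean *before* this point is included.
        delta_old = [x - m for x, m in zip(point, self.mean)]
        self.mean = [m + d / self.n for m, d in zip(self.mean, delta_old)]
        # Deviation from the mean *after* this point is included.
        delta_new = [x - m for x, m in zip(point, self.mean)]
        for i in range(self.dim):
            for j in range(self.dim):
                self.comoment[i][j] += delta_old[i] * delta_new[j]

    def covariance(self, sample=True):
        """Covariance matrix; divides by n - 1 if sample is True, else by n."""
        divisor = self.n - 1 if sample else self.n
        if divisor <= 0:
            raise ValueError("not enough points")
        return [[c / divisor for c in row] for row in self.comoment]


if __name__ == "__main__":
    acc = RunningCovariance(dim=3)
    for p in [(1.0, 2.0, 3.0), (2.0, 4.0, 6.0), (3.0, 6.0, 9.0)]:
        acc.add(p)
    for row in acc.covariance():
        print(row)
```

Mixing the old-mean and new-mean deviations in the update is what keeps the accumulation exact in one pass; it also avoids the cancellation problems of the "sum of products minus product of sums" formula.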

Understanding T-SQL stdev, stdevp, var, and varp

Question: I'm having a difficult time understanding what these statistics functions do and how they work. I'm having an even more difficult time understanding how stdev works vs. stdevp, and the var equivalents. Can someone please break these down in simple terms for me? Answer 1: In statistics, standard deviation and variance are measures of how much a metric in a population deviates from the mean (usually the average). The standard deviation is defined as the square root of the variance, and the variance is defined as
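The practical difference between the sample functions (STDEV, VAR) and the population functions (STDEVP, VARP) is the divisor: n - 1 versus n. The Python sketch below illustrates that difference on a small made-up data set. The helper `variance` is written out by hand for clarity, and the standard library's `statistics` module is used only as a cross-check.

```python
import math
import statistics

def variance(values, population=False):
    """Mean squared deviation from the mean; divisor is n (population) or n - 1 (sample)."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((v - mean) ** 2 for v in values)
    return squared_deviations / (n if population else n - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative values, not from the question

var_sample = variance(data)                        # divisor n - 1, like VAR
var_population = variance(data, population=True)   # divisor n, like VARP
stdev_sample = math.sqrt(var_sample)               # like STDEV
stdev_population = math.sqrt(var_population)       # like STDEVP

print("VAR   ", var_sample, statistics.variance(data))
print("VARP  ", var_population, statistics.pvariance(data))
print("STDEV ", stdev_sample, statistics.stdev(data))
print("STDEVP", stdev_population, statistics.pstdev(data))
```

Use the population versions when the rows are the entire group of interest, and the sample versions when the rows are a sample used to estimate a larger group; the sample results are always slightly larger.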

Generate distribution given percentile ranks

Question: I'd like to generate a distribution in R given the following scores and percentile ranks. x <- 1:10 PercRank <- c(1, 7, 12, 23, 41, 62, 73, 80, 92, 99) PercRank = 1, for example, says that 1% of the data has a value/score <= 1 (the first value of x). Similarly, PercRank = 7 says that 7% of the data has a value/score <= 2, etc. I am not aware of how one could find the underlying distribution. I'd be glad to get some guidance on how to go about obtaining the pdf of the underlying
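One way to read the question's data is as ten known points of a cumulative distribution function. The Python sketch below (Python rather than R, purely for illustration) rests on two assumptions that are not in the question: the CDF is linear between the given points, and the 1% below the first score and the 1% above the last score are clamped to the end scores. The function names `quantile` and `sample` are hypothetical. Sampling uses the inverse-transform method: draw a uniform number and map it through the inverse CDF.

```python
import random
from bisect import bisect_left

scores = list(range(1, 11))
perc_rank = [1, 7, 12, 23, 41, 62, 73, 80, 92, 99]
cdf = [p / 100 for p in perc_rank]   # P(score <= scores[k])

def quantile(u):
    """Inverse CDF by linear interpolation; values outside the known range are clamped."""
    if u <= cdf[0]:
        return float(scores[0])
    if u >= cdf[-1]:
        return float(scores[-1])
    k = bisect_left(cdf, u)          # first index with cdf[k] >= u, so cdf[k-1] < u
    fraction = (u - cdf[k - 1]) / (cdf[k] - cdf[k - 1])
    return scores[k - 1] + fraction * (scores[k] - scores[k - 1])

def sample(n, seed=0):
    """Draw n values by inverse-transform sampling."""
    rng = random.Random(seed)
    return [quantile(rng.random()) for _ in range(n)]

if __name__ == "__main__":
    draws = sample(20000)
    for s, p in zip(scores, perc_rank):
        observed = 100 * sum(d <= s for d in draws) / len(draws)
        print(f"score <= {s:2d}: target {p:3d}%  observed {observed:5.1f}%")
```

Under the linear-interpolation assumption, the density between two neighbouring scores is constant and equal to the slope of that CDF segment; a smoother density would need a different interpolation choice, such as a monotone spline.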

R : knnImputation Giving Error

Question: I am getting the error below in R. In my Brand_X.xlsx dataset there are a few NA values, which I am trying to fill in using KNN imputation, but I am getting the error below. What's wrong here? Thanks! > library(readxl) > Brand_X <- read_excel("Brand_X.xlsx") > str(Brand_X) Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 101 obs. of 8 variables: $ Rel_price_lag5: num 108 111 105 103 109 104 110 114 103 108 ... $ Rel_price_lag1: num 110 109 217 241 855 271 234 297 271 999 ... $ Rel_Price : num 122 110 109 217

How to plot normal distribution with percentage of data as label in each band/bin?

Question: When plotting a normal distribution graph of data, how can we add labels like those in the image below, showing the percentage of data in each bin, where each band has a width of 1 standard deviation, using matplotlib/seaborn or plotly? Currently, I'm plotting like this: hmean = np.mean(data) hstd = np.std(data) pdf = stats.norm.pdf(data, hmean, hstd) plt.plot(data, pdf) Answer 1: Although I've labelled the percentages between the quartiles, this bit of code may be helpful to do the same for the standard deviations.
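The answer's code is not included in this excerpt. Independently of it, the label values themselves can be computed without any plotting library. The sketch below is a minimal, standard-library-only illustration: `band_percentages` gives the theoretical share of a normal distribution in each 1-standard-deviation band, and `empirical_band_percentages` counts the share of actual data points per band. Both function names are hypothetical, and drawing the labels (for example with matplotlib's `text` or `annotate`) is left out.

```python
import math
import statistics

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def band_percentages(k_max=3):
    """Theoretical % of a normal distribution in each 1-sigma band from -k_max to +k_max."""
    edges = list(range(-k_max, k_max + 1))
    return [((a, b), 100.0 * (normal_cdf(b) - normal_cdf(a)))
            for a, b in zip(edges, edges[1:])]

def empirical_band_percentages(data, k_max=3):
    """Observed % of data points in each 1-sigma band, in the same order as band_percentages."""
    mean = statistics.fmean(data)
    std = statistics.pstdev(data)    # population std, matching np.std's default (ddof=0)
    counts = [0] * (2 * k_max)
    for x in data:
        index = math.floor((x - mean) / std) + k_max
        if 0 <= index < 2 * k_max:   # points beyond +/- k_max sigma are not counted
            counts[index] += 1
    return [100.0 * c / len(data) for c in counts]

if __name__ == "__main__":
    for (a, b), pct in band_percentages():
        print(f"{a:+d}σ to {b:+d}σ: {pct:.1f}%")
```

Each label would then be placed at the horizontal midpoint of its band, that is at mean + (a + 0.5) * std for the band from a to a + 1 standard deviations.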

How to identify the best frequency in a time series?

Question: I have database metrics grouped by day, and I need to forecast the data for the next 3 months. The data have seasonality (I believe the seasonality is by day of the week). I want to use the Holt-Winters method in R, so I need to create a time series object, which asks for a frequency (which I think is 7). But how can I be sure? Is there a function to identify the best frequency? I'm using: FID_TS <- ts(FID_DataSet$Value, frequency=7) FID_TS_Observed <- HoltWinters(FID_TS) If I
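A common way to check a suspected seasonal period is to look at the autocorrelation of the series: a weekly pattern in daily data should produce a peak at lag 7 (in R, `acf()` plots the same quantity). The Python sketch below is only an illustration of that idea on synthetic data. The function names `autocorrelation` and `dominant_period` are hypothetical, the series is made up, and a real series with a strong trend would likely need detrending first, which this sketch does not do.

```python
import random

def autocorrelation(series, lag):
    """Sample autocorrelation of the series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    variance_sum = sum((x - mean) ** 2 for x in series)
    covariance_sum = sum((series[t] - mean) * (series[t + lag] - mean)
                         for t in range(n - lag))
    return covariance_sum / variance_sum

def dominant_period(series, max_lag):
    """Lag between 2 and max_lag with the highest autocorrelation."""
    return max(range(2, max_lag + 1), key=lambda lag: autocorrelation(series, lag))

if __name__ == "__main__":
    # Synthetic daily metric: a made-up weekly shape repeated for 12 weeks, plus noise.
    weekly_shape = [10, 12, 13, 12, 11, 5, 4]
    rng = random.Random(0)
    series = [weekly_shape[day % 7] + rng.gauss(0, 0.5) for day in range(7 * 12)]
    print("candidate frequency:", dominant_period(series, max_lag=28))
```

The lag found this way is only a candidate for the `frequency` argument; it is worth confirming by inspecting the whole autocorrelation profile, since multiples of the true period (14, 21, ...) also show peaks.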