statistics | 易学教程

How to create a Rician random variable?

Question: I'm trying to model a signal detection problem using SymPy, and need two random variables: one with a Rayleigh distribution to model noise, and one with a Rician distribution to model signal plus noise. SymPy provides a Rayleigh distribution, but not a Rician, or at least not one by that name. What's the best way of creating one? Does it exist under a different name? Is there a way to manipulate existing distributions into a Rician? Following advice from @asmeurer, I've implemented my own Rice
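One way to see how the two distributions in this question relate is by sampling: a Rician variable is the magnitude of a two-dimensional Gaussian whose mean sits at distance nu from the origin, and the Rayleigh case is nu = 0. The sketch below is a minimal numerical illustration in plain Python, not the SymPy implementation the asker refers to; the names `sample_rician`, `sample_rayleigh`, `nu` and `sigma` are assumptions chosen for this example.

```python
import math
import random


def sample_rician(nu, sigma, rng):
    """Draw one Rician sample as the magnitude of a 2-D Gaussian centred at (nu, 0)."""
    x = rng.gauss(nu, sigma)   # in-phase component: signal plus noise
    y = rng.gauss(0.0, sigma)  # quadrature component: noise only
    return math.hypot(x, y)


def sample_rayleigh(sigma, rng):
    """Rayleigh is the special case of Rician with no signal (nu = 0)."""
    return sample_rician(0.0, sigma, rng)


# Hypothetical parameters, chosen only for illustration.
rng = random.Random(0)
nu, sigma = 2.0, 1.0
samples = [sample_rician(nu, sigma, rng) for _ in range(100_000)]

# For a Rician variable, E[R^2] = nu^2 + 2 * sigma^2.
mean_square = sum(r * r for r in samples) / len(samples)
expected_mean_square = nu ** 2 + 2 * sigma ** 2
```

Comparing `mean_square` with `expected_mean_square` gives a quick sanity check that the construction behaves like a Rician variable.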

How to call a function that returns multiple rows and columns in a data.table?

Question: I want to call a function inside a data.table that calculates a set of summary statistics like the following: summ.stats <- function(vec) { list( Min = min(vec), Mean = mean(vec), S.D. = sd(vec), Median = median(vec), Max = max(vec)) } and I want to call it in the j of a data.table: DT <- data.table(a=c(1,2,3,1,2,3),b=c(1,4,3,2,1,4),c=c(2,3,4,5,2,1)) DT[, summ.stats(b), by=a] This works, and I get: a Min Mean S.D. Median Max 1: 1 1 1.5 0.7071068 1.5 2 2: 2 1 2.5 2.1213203 2.5 4 3: 3 3 3.5 0

Condensed matrix function to find pairs

Question: For a set of observations [a1,a2,a3,a4,a5], their pairwise distances d=[[0,a12,a13,a14,a15] [a21,0,a23,a24,a25] [a31,a32,0,a34,a35] [a41,a42,a43,0,a45] [a51,a52,a53,a54,0]] are given in condensed matrix form (the upper triangle of the above, calculated with scipy.spatial.distance.pdist): c=[a12,a13,a14,a15,a23,a24,a25,a34,a35,a45] Given an index into the condensed matrix, is there a function f (preferably in Python) that quickly gives which two observations were used
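The mapping asked about here can be sketched by walking the rows of the upper triangle: row i holds n - 1 - i entries, so subtracting row lengths from the condensed index locates the row, and the remainder gives the column. This is a minimal sketch using 0-based indices; the function name `condensed_to_pair` is an assumption for this example, not a SciPy API.

```python
def condensed_to_pair(k, n):
    """Map index k of a condensed distance vector to the observation pair (i, j), i < j.

    n is the number of observations; indices are 0-based.
    """
    if not 0 <= k < n * (n - 1) // 2:
        raise IndexError("condensed index out of range")
    i = 0
    row_len = n - 1  # number of entries in row i of the upper triangle
    while k >= row_len:
        k -= row_len  # skip the whole of row i
        i += 1
        row_len -= 1
    return i, i + 1 + k  # k is now the offset within row i


# For 5 observations the condensed vector has 10 entries.
pairs = [condensed_to_pair(k, 5) for k in range(10)]
```

With the question's 1-based labels, index 4 of c (a23) maps to the 0-based pair (1, 2). A closed-form version using a square root avoids the loop, at the cost of some care with floating-point rounding.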

T-test for multiple rows in R

Question: I have a table with 40+ columns and 200,000+ rows. Something like this: ID GROUP-A1 GROUP-A2 GROUP A3...A20 GROUP-B1 GROUP-B2 GROUP-B3...B20 1 5 6 3 5....3 10 21 9 15 2 3 4 6 2....13 23 42 34 23 3 5 3 1 0....12 10 12 43 15 4 0 0 2 5....3 10 21 23 15 I would like to run a t-test comparing the two groups A (1..20) and B (1..20) for every measurement I have (each row); the rows are independent. If possible, I would like the resulting statistics in the table next to each row or in a separate table, so I can easily
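The row-wise comparison described here can be illustrated with a small sketch that computes one Welch t statistic per row. It is written in plain Python rather than R, uses made-up rows with four values per group instead of twenty, and stops at the t statistic; the names `welch_t`, `rows` and `group_size` are assumptions for this example. Obtaining p-values would additionally need the t distribution's CDF, which a statistics library provides.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t statistic for two independent samples with possibly unequal variances."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    std_err = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / std_err


# Hypothetical data: each row holds group A values followed by group B values.
group_size = 4
rows = [
    [5, 6, 3, 5, 10, 21, 9, 15],
    [3, 4, 6, 2, 23, 42, 34, 23],
    [5, 3, 1, 0, 10, 12, 43, 15],
]

# One statistic per row, kept in row order so it can sit next to the original table.
t_stats = [welch_t(row[:group_size], row[group_size:]) for row in rows]
```

Keeping the results in a list aligned with `rows` mirrors the asker's wish to have the statistics next to each row.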

Function to transform empirical distribution to a uniform distribution in Matlab?

Question: I know the procedure for transforming one distribution into another using the CDF. However, I would like to know whether there is an existing function in Matlab that can perform this task. My other, related question: I computed the CDF of my empirical data using the ecdf() function in Matlab for a distribution with 10,000 values. However, the output contains only 9967 values. How can I get all 10,000 values for my CDF? Thanks. Answer 1: As you say, all you need is the CDF. The CDF of a
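The idea in the answer, that the CDF alone is enough, can be sketched by evaluating the empirical CDF at every original observation, which yields exactly one output per input. The sketch below is in plain Python rather than Matlab, and the name `to_uniform` is an assumption for this example. It also hints at one possible reason for the shorter output in the question: if repeated values collapse into a single step of the empirical CDF, the list of steps is shorter than the data, which is an assumption about the asker's data rather than something the excerpt confirms.

```python
import bisect


def to_uniform(values):
    """Map each value to its empirical CDF value, keeping one output per input."""
    n = len(values)
    sorted_vals = sorted(values)
    # bisect_right counts the observations <= v, so tied values share one CDF value.
    return [bisect.bisect_right(sorted_vals, v) / n for v in values]


# Hypothetical data containing one repeated value.
data = [3.0, 1.0, 2.0, 2.0]
u = to_uniform(data)
distinct_steps = len(set(data))  # the step function has fewer points than the data
```

Here `u` has four entries, one per observation, while the step function itself has only three distinct points.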

Include values to the barplot and pie charts in R

Question: I have data in this form: proprete.freq <- table(cnData$proprete) proprete.freq.genre <-table(cnData$genre,cnData$proprete) I am using these functions (barplot and pie) to plot the data: barplot(proprete.freq.genre, col = heat.colors(length(rownames(proprete.freq.genre))) , main="Proprete", beside = TRUE) pie(proprete.freq, col=rainbow(3), names.arg=avis, main="Propreté") Here is the result: My question: how do I include the values just on top of the bars and below the categorical variables

How to extract longitudinal time-series data from a dataframe in R for time-series analysis and imputation

Question: Thanks to joran for helping me group data in my previous question, where I wanted to make a data frame in R smaller so that I could do time-series analysis on the data. Now I would like to extract further data from the data frame. The data frame is made up of 6 columns. Columns 1 to 5 each hold discrete values: district, gender, year, month and age group. The sixth column is the death count for that specific combination. An extract looks like this: District

d3 quantile or quartile scale given the quartile values

Question: The current quantile scale takes all the input values as the domain to map to the output range. But if the data set is extremely large, I want the processing to happen on the server, which gives me the quartile values. So I get: var quartiles=[5, 10, 15, 20, 25, 30, 35, 40, 45]; // 9 values with the mean (25) at the middle and standard deviations to each side var valueToMark = 37; Using d3, how do I correctly create a quantile scale and mark them all on a line given only the quantiles and value to mark? p

Inverse of the lognormal distribution

Question: I need to find the inverse of a given lognormal distribution. Since there is no built-in function in R for the inverse lognormal, I need to design my own. I have this lognormal distribution for a random variable 'x': f_lambda <- function(x,mu,sig) {dlnorm(x, meanlog = mu, sdlog = sig,log=FALSE)} Wikipedia says G(y) = 1 - F(1/y), where G(Y) is the inverse distribution to F(X) and X = 1/Y. But I am confused about how to encode F(1/y) in R and which parameter to use to define that distribution: mu or 1/mu.
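The relation quoted from Wikipedia can be checked numerically. If ln X is normal with mean mu and standard deviation sig, then ln(1/X) = -ln X is normal with mean -mu and the same sig, so the reciprocal is again lognormal with parameters (-mu, sig) rather than 1/mu. The sketch below is in plain Python rather than R; the function names and the parameter values are assumptions for this example.

```python
import math


def lognormal_cdf(x, mu, sig):
    """CDF F(x) of a lognormal variable with log-mean mu and log-sd sig."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / (sig * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def reciprocal_cdf(y, mu, sig):
    """CDF of Y = 1/X computed from the relation G(y) = 1 - F(1/y)."""
    if y <= 0:
        return 0.0
    return 1.0 - lognormal_cdf(1.0 / y, mu, sig)


# Hypothetical parameters, chosen only for illustration.
mu, sig = 0.7, 1.3
ys = [0.1, 0.5, 1.0, 2.0, 10.0]

# Largest gap between G(y) and the lognormal CDF with parameters (-mu, sig).
max_gap = max(abs(reciprocal_cdf(y, mu, sig) - lognormal_cdf(y, -mu, sig)) for y in ys)
```

A `max_gap` at the level of floating-point rounding indicates that the two expressions describe the same distribution.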

rounding series of pandas dataframes

Question: I am trying to solve one of the Coursera homework assignments for beginners. I have read the data and tried to convert it as shown in the code below. I am looking for the frequency distribution of the variables in question, and for this reason I am trying to round the values. I tried several methods, but nothing gives me what I expect (see below). import pandas as pd import numpy as np # loading the database file data = pd.read_csv('gapminder-2.csv',low_memory=False) # number of