statistics

How to create a Rician random variable?

瘦欲@ submitted on 2019-12-12 07:22:22
Question: I'm trying to model a signal detection problem using SymPy, and need two random variables: one with a Rayleigh distribution to model noise, and one with a Rician distribution to model signal+noise. SymPy provides a Rayleigh distribution, but not a Rician, or at least not one by that name. What's the best way of creating one? Does it exist under a different name? Is there a way to manipulate existing distributions into a Rician? Following advice from @asmeurer, I've implemented my own Rice
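
One standard construction, sketched here in plain Python rather than SymPy: a Rician variable with parameters nu and sigma can be drawn as the length of a 2-D Gaussian vector whose mean is (nu, 0) and whose components each have standard deviation sigma. Setting nu = 0 gives the Rayleigh (noise-only) case. The function names `rician_sample` and `rayleigh_sample` are illustrative assumptions, not part of any library.

```python
import math
import random

def rician_sample(nu, sigma, rng=random):
    """Draw one Rician(nu, sigma) sample as the norm of a shifted 2-D Gaussian."""
    x = rng.gauss(nu, sigma)   # in-phase component: signal amplitude plus noise
    y = rng.gauss(0.0, sigma)  # quadrature component: noise only
    return math.hypot(x, y)

def rayleigh_sample(sigma, rng=random):
    """Rayleigh(sigma) is the nu = 0 special case (noise only)."""
    return rician_sample(0.0, sigma, rng)

if __name__ == "__main__":
    rng = random.Random(0)
    n = 100_000
    nu, sigma = 2.0, 1.0
    second_moment = sum(rician_sample(nu, sigma, rng) ** 2 for _ in range(n)) / n
    # Theory: E[R^2] = nu^2 + 2 * sigma^2, so the two printed numbers should be close.
    print(second_moment, nu ** 2 + 2 * sigma ** 2)
```

Comparing the sample second moment with nu^2 + 2*sigma^2 is a quick sanity check on the construction.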

How to call a function that returns multiple rows and columns in a data.table?

时间秒杀一切 submitted on 2019-12-12 07:18:47
Question: I want to call a function inside a data.table that calculates a set of summary statistics like the following:

    summ.stats <- function(vec) {
      list(Min = min(vec), Mean = mean(vec), S.D. = sd(vec), Median = median(vec), Max = max(vec))
    }

and I want to call it in the j of a data.table:

    DT <- data.table(a=c(1,2,3,1,2,3), b=c(1,4,3,2,1,4), c=c(2,3,4,5,2,1))
    DT[, summ.stats(b), by=a]

This is fine and I get:

       a Min Mean      S.D. Median Max
    1: 1   1  1.5 0.7071068    1.5   2
    2: 2   1  2.5 2.1213203    2.5   4
    3: 3   3  3.5 0

Condensed matrix function to find pairs

拜拜、爱过 submitted on 2019-12-12 07:09:04
Question: For a set of observations [a1,a2,a3,a4,a5], their pairwise distances

    d = [[0,a12,a13,a14,a15]
         [a21,0,a23,a24,a25]
         [a31,a32,0,a34,a35]
         [a41,a42,a43,0,a45]
         [a51,a52,a53,a54,0]]

are given in condensed matrix form (the upper triangle of the above, calculated with scipy.spatial.distance.pdist):

    c = [a12,a13,a14,a15,a23,a24,a25,a34,a35,a45]

The question is: given an index into the condensed matrix, is there a function f (preferably in Python) that quickly gives which two observations were used
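
A minimal sketch of such a mapping, assuming the row-major upper-triangle ordering shown for c above and 0-based indices. Row i starts at offset i*(2n - i - 1)/2 in the condensed vector, so i can be recovered by solving that quadratic with integer arithmetic. The name `condensed_to_pair` is hypothetical.

```python
import math

def condensed_to_pair(k, n):
    """Map condensed index k to the observation pair (i, j), i < j, for n observations."""
    if not 0 <= k < n * (n - 1) // 2:
        raise IndexError("condensed index out of range")
    b = 2 * n - 1
    # Largest i whose row offset i*(2n - i - 1)/2 is <= k, computed without floats.
    i = (b - math.isqrt(b * b - 8 * k - 1) - 1) // 2
    row_start = i * (2 * n - i - 1) // 2
    j = k - row_start + i + 1
    return i, j

if __name__ == "__main__":
    # With n = 5, index 4 is a23 in the question's notation, i.e. observations 1 and 2 (0-based).
    print(condensed_to_pair(4, 5))
```

Using `math.isqrt` avoids floating-point rounding for large n; it requires Python 3.8 or later.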

T-test for multiple rows in R

99封情书 submitted on 2019-12-12 06:56:43
Question: I have a table with 40+ columns and 200,000+ rows. Something like this:

    ID GROUP-A1 GROUP-A2 GROUP A3...A20 GROUP-B1 GROUP-B2 GROUP-B3...B20
    1 5 6 3 5....3 10 21 9 15
    2 3 4 6 2....13 23 42 34 23
    3 5 3 1 0....12 10 12 43 15
    4 0 0 2 5....3 10 21 23 15

I would like to run a t-test comparing groups A (1..20) and B (1..20) for every measurement I have (each row); the rows are independent. Ideally, the resulting statistics would go in the table next to each row or in a separate table, so I can easily
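
The question is about R, but the per-row idea can be sketched in plain Python: split each row into its A and B values and compute Welch's t statistic with its approximate degrees of freedom. This is only an illustration under assumptions: the helper names `welch_t` and `row_tests` are hypothetical, the ID column is assumed to be stripped already, and no p-value is computed because the standard library has no t distribution.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom for two samples."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def row_tests(rows, n_a):
    """Apply welch_t to each row: the first n_a values are group A, the rest group B."""
    return [welch_t(row[:n_a], row[n_a:]) for row in rows]

if __name__ == "__main__":
    rows = [[1, 2, 3, 2, 4, 6], [5, 6, 3, 10, 21, 9]]
    for t, df in row_tests(rows, n_a=3):
        print(round(t, 4), round(df, 3))
```

The result is one (t, df) pair per row, which can be stored alongside the original table.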

Function to transform empirical distribution to a uniform distribution in Matlab?

坚强是说给别人听的谎言 submitted on 2019-12-12 06:01:02
Question: I know the procedure for transforming one distribution into another using the CDF. However, I would like to know whether there is an existing function in Matlab that performs this task. My other related question: I computed the CDF of my empirical data using the ecdf() function in Matlab for a distribution with 10,000 values, but the output contains only 9967 values. How can I get all 10,000 values for my CDF? Thanks.

Answer 1: As you say, all you need is the CDF. The CDF of a
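
The transform itself can be sketched in Python (not Matlab) by evaluating the empirical CDF at every original sample, which returns one value per input even when samples repeat. Repeated values are one plausible reason a tabulated ECDF has fewer entries than the data. Dividing by n + 1 instead of n is a convention chosen here to keep results strictly inside (0, 1); the name `to_uniform` is hypothetical.

```python
import bisect

def to_uniform(samples):
    """Map each sample to its empirical-CDF value; tied samples get the same value."""
    n = len(samples)
    ordered = sorted(samples)
    # bisect_right counts how many values are <= x, i.e. n * ECDF(x).
    return [bisect.bisect_right(ordered, x) / (n + 1) for x in samples]

if __name__ == "__main__":
    data = [3.0, 1.0, 2.0, 2.0]
    print(to_uniform(data))  # one output per input, duplicates included
```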

Include values to the barplot and pie charts in R

爱⌒轻易说出口 submitted on 2019-12-12 05:56:01
Question: I have data in this form:

    proprete.freq <- table(cnData$proprete)
    proprete.freq.genre <- table(cnData$genre, cnData$proprete)

I am using these functions (barplot and pie) to plot the data:

    barplot(proprete.freq.genre, col = heat.colors(length(rownames(proprete.freq.genre))), main="Proprete", beside = TRUE)
    pie(proprete.freq, col=rainbow(3), names.arg=avis, main="Propreté")

Here is the result: Question: How can I include the values just on top of the bars and below the categorical variables

How to extract longitudinal time-series data from a dataframe in R for time-series analysis and imputation

孤者浪人 submitted on 2019-12-12 05:47:30
Question: Thanks to joran for helping me group data in my previous question, where I wanted to make a data frame in R smaller so that I could do time-series analysis on the data. Now I would like to extract further data from the data frame. The data frame is made up of 6 columns. Columns 1 to 5 each hold discrete values: district, gender, year, month and age group. The sixth column is the death count for that specific combination. An extract looks like this: District

d3 quantile or quartile scale given the quartile values

十年热恋 submitted on 2019-12-12 05:33:28
Question: The current quantile scale takes all the input values as the domain to map to the output range. But if the data is extremely large, I want the processing to happen on the server, which then gives me the quartile values. So I get:

    var quartiles=[5, 10, 15, 20, 25, 30, 35, 40, 45]; // 9 values with the mean (25) at the middle and standard deviations to each side
    var valueToMark = 37;

Using d3, how do I correctly create a quantile scale and mark them all on a line given only the quantiles and value to mark? p
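
When only precomputed cut points are available, the lookup reduces to a binary search over them, similar in spirit to a threshold scale. The sketch below shows that logic in Python rather than d3. The helper names `bucket_index` and `position_on_line` and the 400-pixel line width are assumptions for illustration.

```python
import bisect

def bucket_index(cut_points, value):
    """Return which of the len(cut_points) + 1 buckets the value falls into."""
    return bisect.bisect_right(cut_points, value)

def position_on_line(value, lo, hi, width):
    """Linearly map a value in [lo, hi] to an offset in [0, width]."""
    return (value - lo) / (hi - lo) * width

if __name__ == "__main__":
    quartiles = [5, 10, 15, 20, 25, 30, 35, 40, 45]
    value_to_mark = 37
    # 37 lies between the cut points 35 and 40, so it lands in bucket 7.
    print(bucket_index(quartiles, value_to_mark))
    # Tick positions for the cut points and the marked value on a 400-unit line.
    print([position_on_line(q, 5, 45, 400) for q in quartiles])
    print(position_on_line(value_to_mark, 5, 45, 400))
```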

Inverse of the lognormal distribution

大憨熊 submitted on 2019-12-12 04:58:18
Question: I need to find the inverse of a given lognormal distribution. Since there is no inbuilt function in R for inverse lognormal, I need to design my own. I have this lognormal distribution for a random variable 'x':

    f_lambda <- function(x,mu,sig) {dlnorm(x, meanlog = mu, sdlog = sig,log=FALSE)}

On Wikipedia it says G(y) = 1 - F(1/y), where G(Y) is the inverse distribution to F(X) and X = 1/Y. But I am confused as to how to encode F(1/y) in R and what to use to define that distribution: mu or 1/mu.
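
The formula G(y) = 1 - F(1/y) describes the distribution of the reciprocal Y = 1/X. For a lognormal X with parameters mu and sig, log(1/X) = -log(X), so Y is again lognormal with parameters -mu and sig (not 1/mu). The Python sketch below checks this numerically, and also shows the other common reading of "inverse", the quantile function. All function names are hypothetical.

```python
import math
from statistics import NormalDist

def lognormal_cdf(x, mu, sig):
    """F(x) for X ~ Lognormal(mu, sig), with x > 0."""
    return NormalDist(mu, sig).cdf(math.log(x))

def reciprocal_cdf(y, mu, sig):
    """G(y) = P(1/X <= y) = 1 - F(1/y), with y > 0."""
    return 1.0 - lognormal_cdf(1.0 / y, mu, sig)

def lognormal_quantile(p, mu, sig):
    """Inverse CDF: the x with F(x) = p, for 0 < p < 1."""
    return math.exp(NormalDist(mu, sig).inv_cdf(p))

if __name__ == "__main__":
    mu, sig, y = 0.5, 1.2, 0.7
    # The two values should agree: 1/X is Lognormal(-mu, sig).
    print(reciprocal_cdf(y, mu, sig), lognormal_cdf(y, -mu, sig))
    # Round trip through the quantile function should recover 0.3 (up to rounding).
    print(lognormal_cdf(lognormal_quantile(0.3, mu, sig), mu, sig))
```

In R terms, the same check would compare 1 - plnorm(1/y, mu, sig) with plnorm(y, -mu, sig), and qlnorm gives the quantile function.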

rounding series of pandas dataframes

时光毁灭记忆、已成空白 submitted on 2019-12-12 04:46:41
Question: I am trying to solve one of Coursera's homework assignments for beginners. I have read the data and tried to convert it as shown in the code below. I am looking for the frequency distribution of the variables under consideration, and for this reason I am trying to round the values. I tried several methods, but nothing gives me what I am expecting (see below, please).

    import pandas as pd
    import numpy as np
    # loading the database file
    data = pd.read_csv('gapminder-2.csv',low_memory=False)
    # number of