statistics | 易学教程

What is a good statistical math package for .Net? [closed]

Question (closed as off-topic 5 years ago; not accepting answers): I am looking for a library that does advanced math, statistics, statistical distributions, etc. Currently I am looking for something that handles the binomial and Poisson distributions.

Answer 1: MathDotNet should have the functions you are looking for, although it may be a bit of overkill depending on how much functionality

Variables Overview with xtable in R

Question: I'm wondering whether it's possible to create an xtable from the output of str(x) to get an overview of the variables you use. This would be a nice way to introduce someone to the dataset, but it's tedious to create by hand. So what I tried is to make an xtable like this:

    str(cars)
    require(xtable)
    xtable(str(cars))

The cars dataset ships with R. Unfortunately, xtable doesn't produce LaTeX code for str(). Is it possible to outsmart R here? Here are the main commands that xtable will

Python implementation of the Wilson Score Interval?

Question: After reading How Not to Sort by Average Rating, I was curious whether anyone has a Python implementation of the lower bound of the Wilson score confidence interval for a Bernoulli parameter.

Answer 1: Reddit uses the Wilson score interval for comment ranking; an explanation and Python implementation can be found here:

    # Rewritten code from /r2/r2/lib/db/_sorts.pyx
    from math import sqrt

    def confidence(ups, downs):
        n = ups + downs
        if n == 0:
            return 0
        z = 1.0  # 1.44 = 85%, 1.96 = 95%
        phat = float(ups) / n
        return
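
The snippet above is cut off at its final `return`, so the following is not a reconstruction of Reddit's code. It is a minimal sketch of the standard Wilson score lower bound, written from the textbook formula. The function name `wilson_lower_bound` and the default `z=1.96` are assumptions made for illustration.

```python
from math import sqrt


def wilson_lower_bound(ups, downs, z=1.96):
    """Lower bound of the Wilson score interval for a Bernoulli parameter.

    Hypothetical helper (not taken from the quoted answer); z is the
    normal quantile for the desired confidence level, e.g. 1.96 for 95%.
    """
    n = ups + downs
    if n == 0:
        # No votes at all: there is nothing to estimate.
        return 0.0
    phat = ups / n  # observed fraction of positive votes
    z2 = z * z
    centre = phat + z2 / (2 * n)
    margin = z * sqrt((phat * (1 - phat) + z2 / (4 * n)) / n)
    return (centre - margin) / (1 + z2 / n)


if __name__ == "__main__":
    # An item with many votes should rank above one with few votes
    # even when both have a 100% positive ratio.
    print(wilson_lower_bound(1, 0))
    print(wilson_lower_bound(100, 0))
```

Sorting by this value rather than by the raw ratio penalises items with few votes, which is the point made in How Not to Sort by Average Rating.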

Count the number of unique elements in each column with dplyr in sparklyr

Question: I'm trying to count the number of unique elements in each column of the Spark dataset s. However, it seems that Spark doesn't recognize tally():

    k <- collect(s %>% group_by(grouping_type) %>% summarise_each(funs(tally(distinct(.)))))
    Error: org.apache.spark.sql.AnalysisException: undefined function TALLY

It seems that Spark doesn't recognize simple R functions either, such as "unique" or "length". I can run the code on local data, but when I run exactly the same code on a Spark table it doesn't work.

Error in chol.default(Cxx) : the leading minor of order is not positive definite

Question: I have a quite simple script in R. It loads two data frames and then performs rCCA with mixOmics:

    system('defaults write org.R-project.R force.LANG en_US.UTF-8')
    ## install.packages("mixOmics")
    library(mixOmics)
    TCIA <- read.csv("/Users/kimrants/Desktop/Data_for_R/TCIA", header=TRUE, sep=",", stringsAsFactors=FALSE)
    TCGA <- read.csv("/Users/kimrants/Desktop/Data_for_R/TCGA", header=TRUE, sep=",", stringsAsFactors=FALSE)
    # Remove first column (of ID)
    df_TCGA <- TCGA[,-1]
    df_TCIA<- TCIA[,

Understanding Markov Chain source code in R

Question: The following source code is from a book. I wrote the comments myself to understand the code better.

    #==================================================================
    # markov(init,mat,n,states) = Simulates n steps of a Markov chain
    #------------------------------------------------------------------
    # init = initial distribution
    # mat = transition matrix
    # labels = a character vector of states used as label of data-frame;
    #          default is 1, .... k
    #----------------------------------------------
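
The body of the book's function is not shown in this excerpt. To illustrate what the header describes, here is a minimal Python sketch of simulating n steps of a Markov chain from an initial distribution and a transition matrix. It is not the book's R code: the `seed` parameter and the choice to return the starting state followed by the n simulated states are assumptions made for this example.

```python
import random


def markov(init, mat, n, states=None, seed=None):
    """Simulate n steps of a Markov chain (illustrative sketch only).

    init   -- initial distribution, a list of k probabilities
    mat    -- k x k transition matrix as a list of rows
    n      -- number of steps to simulate
    states -- optional state labels; defaults to 1, ..., k
    seed   -- hypothetical parameter, added here for reproducibility
    """
    k = len(init)
    if states is None:
        states = list(range(1, k + 1))
    rng = random.Random(seed)
    # Draw the starting state index from the initial distribution.
    current = rng.choices(range(k), weights=init)[0]
    path = [states[current]]
    for _ in range(n):
        # Row `current` of the matrix is the distribution of the next state.
        current = rng.choices(range(k), weights=mat[current])[0]
        path.append(states[current])
    return path


if __name__ == "__main__":
    transition = [[0.9, 0.1],
                  [0.5, 0.5]]
    print(markov([0.5, 0.5], transition, 10, states=["sunny", "rainy"], seed=1))
```

Each step only looks at the current state's row of the matrix, which is exactly the Markov property the simulation relies on.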

How to apply Henze-Zirkler's Multivariate Normality Test in Jupyter notebook with rpy2

Question: I am interested in applying Henze-Zirkler's multivariate normality test in Python 3.x, and I was wondering whether I can do so in a Jupyter notebook. I have fitted a VAR model to my data, and I would then like to test whether the residuals from this fitted VAR model are normally distributed. How can I do so in a Jupyter notebook using Python?

Answer 1: This is another answer, since I discovered this method later. If you do not want to import the R library into Python, one may call the output

How to get a normalised slope of a trend

Question: I am analysing the distances of users to userx over 6 weeks in a social network. Note: 'No path' means the two users are not connected yet (at least by friends of friends).

           week1    week2    week3    week4    week5  week6
    user1  No path  No path  No path  No path  3      1
    user2  No path  No path  No path  5        3      1
    user3  5        4        4        4        4      3
    userN  ...

I want to see how well the users connect with userx. For that I initially thought of using the value of the regression slope for the interpretation (i.e. the low regression slope, the
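
To make the regression-slope idea concrete, here is a minimal sketch of an ordinary least-squares slope of distance against week number. It computes the raw slope only and does not attempt the normalisation the question asks about. Treating 'No path' as a missing value (`None`) and fitting only the observed weeks is an assumption made for this example, not something stated in the question; the function name `regression_slope` is likewise hypothetical.

```python
def regression_slope(distances):
    """Least-squares slope of distance versus week number (1-based).

    Assumption: 'No path' is encoded as None and simply skipped.
    Returns None when fewer than two weeks have an observed distance.
    """
    points = [(week, d) for week, d in enumerate(distances, start=1)
              if d is not None]
    if len(points) < 2:
        return None
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


if __name__ == "__main__":
    users = {
        "user1": [None, None, None, None, 3, 1],
        "user2": [None, None, None, 5, 3, 1],
        "user3": [5, 4, 4, 4, 4, 3],
    }
    for name, row in users.items():
        print(name, regression_slope(row))
```

Under this treatment user1 and user2 get the same slope even though user2 connected a week earlier, which shows why the raw slope alone may not capture how well a user connects.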

Subsetting panel data

Question: Very new, so let me know if this is asking too much. I am trying to subset panel data in R into two categories: one that has complete information for variables and one that has incomplete information for variables. My data looks like this:

    Person  Year  Income  Age  Sex
    1       2003  1500    15   1
    1       2004  1700    16   1
    1       2005  2000    17   1
    2       2003  1400    25   0
    2       2004  1900    26   0
    2       2005  2000    27   0

What I need to do is go through each column (not columns 1 and 2) and if the data is full for the variable (