summary

Skimr - can't seem to produce the histograms

主宰稳场 posted on 2019-12-01 06:56:27
I came across this seemingly new package, skimr, which looks pretty nifty. I was trying it out, and it looks like I'm missing some package installation. skim() works fine except that it doesn't print the histograms it is supposed to print for numeric variables. I am merely trying the examples given in the documentation. Link to the skimr documentation: https://github.com/ropenscilabs/skimr#skimr This is the code I'm using:
devtools::install_github("hadley/colformat")
devtools::install_github("ropenscilabs/skimr")
library(skimr)
a <- skim(mtcars)
dim(a)
View(a)
Instead of histograms being printed, I

How do you summarize columns based on unique IDs without knowing IDs in R?

一笑奈何 posted on 2019-12-01 02:01:00
I've been going through the posts regarding summarizing data, but haven't found what I'm looking for. I wish to create a summary "count table" which will allow me to see how often a certain medication was given to patients. The fact that some patients received multiple medications simultaneously doesn't matter, because I simply want a summary of all the medication given and then to calculate what percentage of all medication given each medication class represents. The issue is that I don't know the names of the possible medications given; they're "hidden" somewhere in the data.frame, thus,
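One possible approach, sketched under assumptions because the question's data is not shown: if the medication names sit in a handful of columns, pooling those columns and tabulating them gives the counts without listing any names in advance. The data frame `patients` and its columns `med1` and `med2` are hypothetical.

```r
# Hypothetical data: one row per patient, each med* column holds one medication
patients <- data.frame(
  id   = 1:5,
  med1 = c("aspirin", "statin", "aspirin", "insulin", "statin"),
  med2 = c("statin", NA, "insulin", NA, "aspirin"),
  stringsAsFactors = FALSE
)

# Pool every medication column into one vector; names are discovered, not listed
med_cols <- grep("^med", names(patients), value = TRUE)
all_meds <- unlist(patients[med_cols], use.names = FALSE)

# table() drops NA by default, so only medications actually given are counted
counts  <- table(all_meds)
percent <- round(100 * prop.table(counts), 1)

med_summary <- data.frame(
  medication = names(counts),
  n          = as.vector(counts),
  percent    = as.vector(percent)
)
med_summary
```

If the real data stores medications differently (for example, several names in one delimited string), the pooling step would need to change, but the `table()` and `prop.table()` steps should carry over.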

Summary data tables from wide data.frames

孤街浪徒 posted on 2019-12-01 01:12:57
I am trying to find lazy/easy ways of creating summary tables/data.frames from wide data.frames. Assume the following data.frame, but with many more columns, so that specifying the column names takes a long time:
set.seed(2)
x <- data.frame(Rep = rep(1:3, 4), Temp = c(rep(10,6), rep(20,6)), pH = rep(c(rep(8.1, 3), rep(7.6, 3)), 2), Var1 = rnorm(12, 5,2), Var2 = c(rnorm(6,4,1), rnorm(6,3,5)), Var3 = rt(12, 20))
x[1:3] <- lapply(x[1:3], as.factor)
Now I can calculate summary statistics with plyr:
(mu <- ddply(x, .(Temp, pH), numcolwise(mean)))
(std <- ddply(x, .
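For comparison, here is a minimal base R sketch of the same idea. It is an alternative to the plyr calls in the question, not a reconstruction of them. It picks out the numeric columns automatically, so no column names have to be typed, and it assumes the grouping columns are stored as factors.

```r
set.seed(2)
x <- data.frame(
  Rep  = factor(rep(1:3, 4)),
  Temp = factor(c(rep(10, 6), rep(20, 6))),
  pH   = factor(rep(c(rep(8.1, 3), rep(7.6, 3)), 2)),
  Var1 = rnorm(12, 5, 2),
  Var2 = c(rnorm(6, 4, 1), rnorm(6, 3, 5)),
  Var3 = rt(12, 20)
)

# Flag the numeric columns so they never have to be named by hand
num_cols <- vapply(x, is.numeric, logical(1))

# One row per Temp/pH combination, one column per numeric variable
mu  <- aggregate(x[num_cols], by = x[c("Temp", "pH")], FUN = mean)
std <- aggregate(x[num_cols], by = x[c("Temp", "pH")], FUN = sd)
mu
```

Because the grouping columns are factors, they fail the `is.numeric()` check and are left out of the summarised columns.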

How do you summarize columns based on unique IDs without knowing IDs in R?

£可爱£侵袭症+ posted on 2019-11-30 21:25:27
Question: I've been going through the posts regarding summarizing data, but haven't found what I'm looking for. I wish to create a summary "count table" which will allow me to see how often a certain medication was given to patients. The fact that some patients received multiple medications simultaneously doesn't matter, because I simply want a summary of all the medication given and then to calculate what percentage of all medication given each medication class represents. The issue is that I don

using dplyr's do() with summary()

雨燕双飞 posted on 2019-11-30 21:12:15
I would like to be able to use dplyr's split-apply-combine strategy to apply the summary() command. Take a simple data frame: df <- data.frame(class = c('A', 'A', 'B', 'B'), value = c(100, 120, 800, 880)) Ideally we would do something like this: df %>% group_by(class) %>% do(summary(.$value)) Unfortunately this does not work. Any ideas?
You can use the SE version of data_frame, that is, data_frame_, and perform: df %>% group_by(class) %>% do(data_frame_(summary(.$value))) Alternatively, you can use as.list() wrapped by data.frame() with the argument check.names = FALSE: df %>% group_by
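As a point of comparison, the same per-group summary can be sketched in base R with `split()` and `sapply()`. This is a swapped-in alternative that avoids dplyr entirely, not the answer's own method.

```r
df <- data.frame(class = c("A", "A", "B", "B"),
                 value = c(100, 120, 800, 880))

# split() makes one numeric vector per class; sapply() binds the six
# summary statistics of each group into a matrix with one column per class
by_class <- sapply(split(df$value, df$class), summary)

# Transpose so each class becomes a row; check.names = FALSE keeps
# column names such as "1st Qu." and "Max." unchanged
result <- data.frame(class = colnames(by_class), t(by_class),
                     check.names = FALSE)
result
```

The result has one row per class and the columns `Min.`, `1st Qu.`, `Median`, `Mean`, `3rd Qu.` and `Max.`.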

Create summary table of categorical variables of different lengths

扶醉桌前 posted on 2019-11-30 21:11:54
In SPSS it is fairly easy to create a summary table of categorical variables using "Custom Tables": How can I do this in R? General and expandable solutions are preferred, as are solutions using the plyr and/or reshape2 packages, because I am trying to learn those. Example data (mtcars ships with R): df <- colwise(function(x) as.factor(x))(mtcars[, 8:11]) P.S. Please note, my goal is to get everything in one table like in the picture. I have been struggling for many hours, but my attempts have been so poor that posting the code probably won't add to the comprehensibility of the
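One way to get everything into a single table, sketched in base R rather than plyr or reshape2, is to tabulate each variable and stack the results in long form. The layout of the SPSS picture is not available here, so the column names `variable`, `level`, `n` and `percent` are assumptions.

```r
# Same example data as the question, built with base R instead of plyr::colwise
df <- as.data.frame(lapply(mtcars[, 8:11], as.factor))

# Tabulate each variable, then stack the tables into one long data frame;
# variables with more levels simply contribute more rows
summary_table <- do.call(rbind, lapply(names(df), function(v) {
  counts <- table(df[[v]])
  data.frame(
    variable = v,
    level    = names(counts),
    n        = as.vector(counts),
    percent  = round(100 * as.vector(counts) / sum(counts), 1)
  )
}))
summary_table
```

The long layout sidesteps the "different lengths" problem, because no variable has to be padded to match the others.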

Grouping Over All Possible Combinations of Several Variables With dplyr

岁酱吖の posted on 2019-11-30 14:21:05
Given a situation such as the following:
library(dplyr)
myData <- tbl_df(data.frame( var1 = rnorm(100), var2 = letters[1:3] %>% sample(100, replace = TRUE) %>% factor(), var3 = LETTERS[1:3] %>% sample(100, replace = TRUE) %>% factor(), var4 = month.abb[1:3] %>% sample(100, replace = TRUE) %>% factor()))
I would like to group myData to eventually find summary data, grouping by all possible combinations of var2, var3, and var4. I can create a list with all possible combinations of variables as character values with:
groupNames <- names(myData)[2:4]
myGroups <- Map(combn, list(groupNames), seq
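The overall idea can be sketched in base R as follows: enumerate every non-empty subset of the grouping variables, then compute one summary table per subset. This is an illustration under assumptions, not the approach the truncated question goes on to describe. `my_data` is a hypothetical stand-in for `myData`, and the mean of `var1` is an arbitrary choice of summary.

```r
set.seed(1)
# Hypothetical stand-in for myData, built with base R only
my_data <- data.frame(
  var1 = rnorm(100),
  var2 = factor(sample(letters[1:3], 100, replace = TRUE)),
  var3 = factor(sample(LETTERS[1:3], 100, replace = TRUE)),
  var4 = factor(sample(month.abb[1:3], 100, replace = TRUE))
)

group_names <- names(my_data)[2:4]

# Every non-empty subset of the grouping variables: 3 singles, 3 pairs, 1 triple
combos <- unlist(
  lapply(seq_along(group_names),
         function(k) combn(group_names, k, simplify = FALSE)),
  recursive = FALSE
)

# One summary table per combination, named after the variables it groups by
summaries <- lapply(combos, function(vars) {
  aggregate(my_data["var1"], by = my_data[vars], FUN = mean)
})
names(summaries) <- vapply(combos, paste, character(1), collapse = "+")
names(summaries)
```

Each element of `summaries` can then be pulled out by name, for example `summaries[["var2+var4"]]`.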

Android preference summary: how to set 3 lines in summary?

笑着哭i posted on 2019-11-30 12:53:45
A preference summary is limited to 2 lines. How can I display 3 or more lines in the summary?
You can create your own Preference class by extending any existing preference: public class LongSummaryCheckboxPreference extends CheckBoxPreference { public LongSummaryCheckboxPreference(Context ctx, AttributeSet attrs, int defStyle) { super(ctx, attrs, defStyle); } public LongSummaryCheckboxPreference(Context ctx, AttributeSet attrs) { super(ctx, attrs); } @Override protected void onBindView(View view) { super.onBindView(view); TextView summary = (TextView) view.findViewById(android.R.id

How to get the sum of each four rows of a matrix in R

自作多情 posted on 2019-11-30 09:29:25
I have a 4n by m matrix (sums at 7.5 min intervals for a year). I would like to transform these to 30 min sums, e.g. convert a 70080 x 1 matrix to a 17520 x 1 matrix. What is the most computationally efficient way to do this? More specifics: here is an example (shortened to one day instead of one year):
library(lubridate)
start.date <- ymd_hms("2009-01-01 00:00:00")
n.seconds <- 192 # number of 7.5 min (450 s) intervals in one day
time <- start.date + c(seq(n.seconds) - 1) * seconds(450)
test.data <- data.frame(time = time, observation = sin(1:n.seconds / n.seconds * pi))
R version: 2.13; Platform: x86_64-pc-linux-gnu (64-bit)
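One common base R option is `rowsum()`, which adds up the rows that share a group label. The sketch below uses a hypothetical integer matrix `m` in place of the real observations, and makes no claim to be the fastest possible method.

```r
# Hypothetical 4n x m matrix: 192 rows (one day of 7.5 min intervals), 2 columns
m <- matrix(seq_len(192 * 2), nrow = 192, ncol = 2)

# Label rows 1-4 as block 1, rows 5-8 as block 2, and so on
block <- rep(seq_len(nrow(m) / 4), each = 4)

# rowsum() sums the rows within each block, column by column
sums_30min <- rowsum(m, block)
dim(sums_30min)
```

This assumes the row count is an exact multiple of 4. A trailing partial block would need to be padded or dropped first.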

Summary of data for each year in R

 ̄綄美尐妖づ posted on 2019-11-30 07:54:58
I have data with two columns: one holds the date and the other holds flow data. I was able to read the data in as dates and flow values using the following code:
creek <- read.csv("creek.csv")
library(ggplot2)
creek[1:10,]
colnames(creek) <- c("date","flow")
creek$date <- as.Date(creek$date, "%m/%d/%Y")
The link to my data is https://www.dropbox.com/s/eqpena3nk82x67e/creek.csv Now I want to find the summary for each year. In particular, I want to know the mean, median, maximum, etc. Thanks. Regards, Jdbaba
Base R: Here are two methods from base R. The first uses cut, split and lapply along
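A minimal sketch of a per-year summary in base R, in the spirit of the `split` approach mentioned above but not a reconstruction of the truncated answer. Since `creek.csv` is not available here, the data frame is simulated. The column names `date` and `flow` follow the question, while the values and the helper column `year` are made up.

```r
# Hypothetical stand-in for creek.csv: three years of daily flow values
set.seed(42)
creek <- data.frame(
  date = seq(as.Date("1999-01-01"), as.Date("2001-12-31"), by = "day")
)
creek$flow <- runif(nrow(creek), min = 0, max = 50)

# Extract the calendar year, split flow by year, and summarise each piece
creek$year <- format(creek$date, "%Y")
yearly <- t(sapply(split(creek$flow, creek$year), summary))
yearly
```

`yearly` is a matrix with one row per year and the usual six `summary()` columns, so a single statistic can be read off with, for example, `yearly[, "Median"]`.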