r-faq

Create an empty data.frame

久未见 submitted on 2019-12-16 20:04:18
Question: I'm trying to initialize a data.frame without any rows. Basically, I want to specify the data types for each column and name them, but not have any rows created as a result. The best I've been able to do so far is something like:
    df <- data.frame(Date=as.Date("01/01/2000", format="%m/%d/%Y"),
                     File="", User="", stringsAsFactors=FALSE)
    df <- df[-1,]
This creates a data.frame with a single row containing all of the data types and column names I wanted, but also creates a useless row which then
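One way to get the typed, zero-row result the question describes is to pass zero-length vectors of the desired types to `data.frame()`. This is a minimal sketch in base R, reusing the column names from the question; it is not necessarily the answer given in the original thread.

```r
# Each column is a zero-length vector of the wanted type, so the
# data.frame has the right names and classes but no rows.
df <- data.frame(
  Date = as.Date(character(0)),
  File = character(0),
  User = character(0),
  stringsAsFactors = FALSE
)

str(df)   # shows three columns and zero observations
```

Because no placeholder row is ever created, there is nothing to remove afterwards with `df[-1, ]`.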

Why does summarize or mutate not work with group_by when I load `plyr` after `dplyr`?

送分小仙女 submitted on 2019-12-16 19:51:38
Question: Note: The title of this question has been edited to make it the canonical question for issues when plyr functions mask their dplyr counterparts. The rest of the question remains unchanged. Suppose I have the following data:
    dfx <- data.frame(
      group = c(rep('A', 8), rep('B', 15), rep('C', 6)),
      sex = sample(c("M", "F"), size = 29, replace = TRUE),
      age = runif(n = 29, min = 18, max = 54)
    )
With the good old plyr I can create a little table summarizing my data with the following code:
    require

Error in <my code> : object of type 'closure' is not subsettable

梦想的初衷 submitted on 2019-12-16 19:44:45
Question: I was finally able to work out the code for my scraping. It seemed to be working fine and then all of a sudden when I ran it again, I got the following error message:
    Error in url[i] = paste("http://en.wikipedia.org/wiki/", gsub(" ", "_", :
      object of type 'closure' is not subsettable
I am not sure why, as I changed nothing in my code. Please advise.
    library(XML)
    library(plyr)
    names <- c("George Clooney", "Kevin Costner", "George Bush", "Amar Shanghavi")
    for(i in 1:length(names)) {
      url[i] =
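A likely cause, assuming `url` was never assigned before the loop in a fresh session: `url` then refers to the base R function `url()`, and a function (a "closure") cannot be indexed with `[`. The sketch below preallocates `url` as a character vector first. The use of `paste0()` and the shortened `names` vector are illustrative choices, since the original loop body is cut off in the excerpt.

```r
names <- c("George Clooney", "Kevin Costner")

# Create `url` as a character vector so that url[i] indexes a vector
# rather than the base function url().
url <- character(length(names))

for (i in seq_along(names)) {
  url[i] <- paste0("http://en.wikipedia.org/wiki/", gsub(" ", "_", names[i]))
}

url
```

This would also explain why the code seemed to work earlier: a `url` vector left over from a previous run masks the function until the workspace is cleared.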

Pass a data.frame column name to a function

早过忘川 submitted on 2019-12-16 19:26:13
Question: I'm trying to write a function to accept a data.frame (x) and a column from it. The function performs some calculations on x and later returns another data.frame. I'm stuck on the best-practices method to pass the column name to the function. The two minimal examples fun1 and fun2 below produce the desired result, being able to perform operations on x$column, using max() as an example. However, both rely on the seemingly (at least to me) inelegant call to substitute() and possibly eval()
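A common alternative to `substitute()` and `eval()` is to pass the column name as a character string and index with `[[`. The sketch below assumes that approach; `fun_col` and the sample data are hypothetical names used only for illustration, not the `fun1` or `fun2` from the question.

```r
# `column` is a character string; x[[column]] extracts that column,
# which avoids any non-standard evaluation.
fun_col <- function(x, column) {
  max(x[[column]])
}

df <- data.frame(A = 1:3, B = c(5, 2, 9))
fun_col(df, "B")   # largest value in column B
```

The trade-off is that callers write `"B"` instead of a bare `B`, but the function stays easy to call programmatically, for example inside a loop over column names.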

Use dynamic variable names in `dplyr`

徘徊边缘 submitted on 2019-12-16 19:16:07
Question: I want to use dplyr::mutate() to create multiple new columns in a data frame. The column names and their contents should be dynamically generated. Example data from iris:
    library(dplyr)
    iris <- tbl_df(iris)
I've created a function to mutate my new columns from the Petal.Width variable:
    multipetal <- function(df, n) {
      varname <- paste("petal", n , sep=".")
      df <- mutate(df, varname = Petal.Width * n)  ## problem arises here
      df
    }
Now I create a loop to build my columns:
    for(i in 2:5) {
      iris <-
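The problem in `multipetal()` is that `mutate()` treats `varname` as a literal column name rather than looking up its value. As a minimal sketch of the intended result, the version below swaps `mutate()` for base R `[[<-` assignment, which accepts the name as a string. It deliberately avoids dplyr, so it does not show the dplyr-specific solution; `out` is a hypothetical variable used to leave `iris` untouched.

```r
multipetal <- function(df, n) {
  varname <- paste("petal", n, sep = ".")
  # [[<- takes the column name as a string, so the generated name is used.
  df[[varname]] <- df$Petal.Width * n
  df
}

out <- iris
for (i in 2:5) {
  out <- multipetal(out, i)
}

names(out)
```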

Collapse / concatenate / aggregate a column to a single comma separated string within each group

倖福魔咒の submitted on 2019-12-16 18:13:10
Question: I want to aggregate one column in a data frame according to two grouping variables, and separate the individual values by a comma. Here is some data:
    data <- data.frame(A = c(rep(111, 3), rep(222, 3)), B = rep(1:2, 3), C = c(5:10))
    data
    #     A B  C
    # 1 111 1  5
    # 2 111 2  6
    # 3 111 1  7
    # 4 222 2  8
    # 5 222 1  9
    # 6 222 2 10
"A" and "B" are grouping variables, and "C" is the variable that I want to collapse into a comma-separated character string. I have tried:
    library(plyr)
    ddply(data, .(A,B),
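One base R way to do this kind of grouped collapse is `aggregate()` with `paste(collapse = ",")`. This is a sketch using the data from the question, offered as an alternative to the plyr attempt rather than a reconstruction of it; `res` is a hypothetical name.

```r
data <- data.frame(A = c(rep(111, 3), rep(222, 3)),
                   B = rep(1:2, 3),
                   C = c(5:10))

# For every (A, B) combination, join the values of C into one string.
res <- aggregate(C ~ A + B, data = data,
                 FUN = function(v) paste(v, collapse = ","))

res
```

The result has one row per (A, B) group, with `C` as a character column holding the joined values.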

How to join (merge) data frames (inner, outer, left, right)

心已入冬 submitted on 2019-12-13 07:29:16
Question: Given two data frames:
    df1 = data.frame(CustomerId = c(1:6), Product = c(rep("Toaster", 3), rep("Radio", 3)))
    df2 = data.frame(CustomerId = c(2, 4, 6), State = c(rep("Alabama", 2), rep("Ohio", 1)))
    df1
    #  CustomerId Product
    #           1 Toaster
    #           2 Toaster
    #           3 Toaster
    #           4   Radio
    #           5   Radio
    #           6   Radio
    df2
    #  CustomerId   State
    #           2 Alabama
    #           4 Alabama
    #           6    Ohio
How can I do database-style, i.e., SQL-style, joins? That is, how do I get:
An inner join of df1 and df2: Return only the rows in which the left table
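The four join types named in the title map onto the `all`, `all.x`, and `all.y` arguments of base R's `merge()`. The sketch below uses the two data frames from the question; the result names (`inner`, `outer`, `left`, `right`) are illustrative.

```r
df1 <- data.frame(CustomerId = c(1:6),
                  Product = c(rep("Toaster", 3), rep("Radio", 3)))
df2 <- data.frame(CustomerId = c(2, 4, 6),
                  State = c(rep("Alabama", 2), rep("Ohio", 1)))

inner <- merge(df1, df2, by = "CustomerId")                # keys present in both
outer <- merge(df1, df2, by = "CustomerId", all = TRUE)    # keys from either side
left  <- merge(df1, df2, by = "CustomerId", all.x = TRUE)  # every row of df1
right <- merge(df1, df2, by = "CustomerId", all.y = TRUE)  # every row of df2
```

Rows without a match on the other side get `NA` in the columns that come from that side, for example `State` for customers 1, 3, and 5 in `left`.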

How to write trycatch in R

与世无争的帅哥 submitted on 2019-12-13 07:09:51
Question: I want to write trycatch code to deal with errors when downloading from the web.
    url <- c(
      "http://stat.ethz.ch/R-manual/R-devel/library/base/html/connections.html",
      "http://en.wikipedia.org/wiki/Xz")
    y <- mapply(readLines, con=url)
These two statements run successfully. Below, I create a nonexistent web address:
    url <- c("xxxxx", "http://en.wikipedia.org/wiki/Xz")
url[1] does not exist. How does one write a trycatch loop (function) so that: When the URL is wrong, the output will be: "web URL is
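A minimal sketch of the pattern being asked about: wrap the `readLines()` call in `tryCatch()` with handlers for warnings and errors, and apply the wrapper to each address. The function name `safe_read`, the message wording, and the `NA` fallback are assumptions made for illustration, since the exact output requested in the question is cut off.

```r
safe_read <- function(u) {
  tryCatch(
    readLines(con = u, warn = FALSE),
    warning = function(w) {
      # Reached when opening the connection signals a warning
      message("warning for '", u, "': ", conditionMessage(w))
      NA_character_
    },
    error = function(e) {
      # Reached when the connection cannot be opened at all
      message("error for '", u, "': ", conditionMessage(e))
      NA_character_
    }
  )
}

# "xxxxx" is the nonexistent address from the question; the call returns NA
# instead of stopping, so the remaining addresses would still be processed.
y <- lapply("xxxxx", safe_read)
```

Returning a value from each handler keeps the result list aligned with the input vector, which makes it easy to see afterwards which addresses failed.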

Display exact value of a variable in R

早过忘川 submitted on 2019-12-12 04:21:06
Question:
    > x <- 1.00042589212565
    > x
    [1] 1.000426
If I wanted to print the exact value of x, how would I do it? Sorry if this is a dumb question. I tried Googling for "R" and "exact" or "round" but all I get are articles about how to round. Thank you in advance!
Answer 1: A global solution for the whole session:
    options(digits=16)
    > x
    [1] 1.00042589212565
or locally, just for x:
    sprintf("%.16f", x)
    [1] "1.0004258921256499"
Answer 2:
    print(x, digits=15)
or
    format(x, digits=15)
or
    sprintf("%.14f", x)
Source: https:/

How do I install an R package from source?

我的梦境 submitted on 2019-12-12 03:55:01
Question: A friend sent me along this great tutorial on webscraping NYtimes with R. I would really love to try it. However, the first step is to install a package called RJSONIO from source. I know R reasonably well, but I have no idea how to install a package from source. I'm running Mac OSX.
Answer 1: If you have the file locally, then use install.packages() and set repos=NULL:
    install.packages(path_to_file, repos = NULL, type="source")
Where path_to_file would represent the full path and file name