subset

How to locate a structured region of data inside of a not structured data frame in R?

余生长醉 posted on 2019-12-13 08:45:04
Question: I have a certain kind of data frame that contains a subset of interest. The problem is that this subset is not consistent between the different data frames. Nonetheless, at a more abstract level, it follows a general structure: a rectangular region inside the data frame. example1 <- data.frame(x = c("name", "129-2", NA, NA, "acc", 2, 3, 4, NA, NA), y = c(NA, NA, NA, NA, "deb", 3, 2, 5, NA, NA), z = c(NA, NA, NA, NA, "asset", 1, 1, 2, NA, NA)) print(example1) x y z 1 name <NA> <NA> 2 129-2 <NA
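One possible reading of the excerpt above, offered only as a sketch: the region of interest is the block of rows in which every column is filled in. Under that assumption (the original post is cut off, so the asker's actual criterion is unknown), `complete.cases()` is enough to pick the block out of `example1`. The name `region` is introduced here for illustration.

```r
# example1 as given in the question; stringsAsFactors = FALSE keeps the
# mixed columns as plain character vectors on older R versions too.
example1 <- data.frame(
  x = c("name", "129-2", NA, NA, "acc", 2, 3, 4, NA, NA),
  y = c(NA, NA, NA, NA, "deb", 3, 2, 5, NA, NA),
  z = c(NA, NA, NA, NA, "asset", 1, 1, 2, NA, NA),
  stringsAsFactors = FALSE
)

# Assumption: the rectangular region is exactly the set of rows with no NA.
full_rows <- complete.cases(example1)
region <- example1[full_rows, ]

print(region)
```

If other data frames contain stray complete rows outside the block, this simple filter would pick those up as well, so a stricter rule (for example, the longest run of consecutive complete rows) may be needed.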

Different results for 2 subset data methods in R

半世苍凉 posted on 2019-12-13 08:42:36
Question: I'm subsetting my data, and I'm getting different results from the following two lines: subset(df, x==1) df[df$x==1,] x's type is integer. Am I doing something wrong? Thank you in advance. Answer 1: Without example data, it is difficult to say what your problem is. However, my hunch is that the following probably explains your problem: df <- data.frame(quantity=c(1:3, NA), item=c("Coffee", "Americano", "Espresso", "Decaf")) df quantity item 1 Coffee 2 Americano 3 Espresso NA Decaf Let's subset with [ df
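The answer is cut off, but the behaviour it seems to be heading toward can be sketched with its own example data: a logical index containing `NA` makes `[` return an all-`NA` row, whereas `subset()` treats `NA` as `FALSE`. The variable names `by_bracket`, `by_subset`, and `by_which` are introduced here for illustration.

```r
df <- data.frame(
  quantity = c(1:3, NA),
  item = c("Coffee", "Americano", "Espresso", "Decaf")
)

# df$quantity == 1 evaluates to TRUE, FALSE, FALSE, NA.
# `[` keeps a row of NAs for the NA element of the index.
by_bracket <- df[df$quantity == 1, ]

# subset() drops rows where the condition is NA.
by_subset <- subset(df, quantity == 1)

# which() discards NA, so this matches subset().
by_which <- df[which(df$quantity == 1), ]

print(by_bracket)
print(by_subset)
```

So if `x` contains missing values, the two calls in the question would be expected to differ by exactly those `NA` rows.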

combination of pair subsets from a list in lisp

点点圈 posted on 2019-12-13 07:58:10
Question: How can I create all possible pair subsets from a list in Common Lisp? For example, the list A contains four elements, list A= ("A" "B" "C" "D"), and the expected output is as follows: (("A","B"),("A","C"), ("A","D"),("B","C"),("B","D"), ("C","D")) Could someone please help me generate these subsets? Thanks a lot. Answer 1: Read up on mapcar et al: (defparameter a (list 1 2 3 4)) (mapcon (lambda (tail) (mapcar (lambda (x) (cons (car tail) x)) (cdr tail))) a) ==> ((1 . 2) (1 . 3) (1 . 4) (2 . 3) (2 . 4

Outputting various subsets from one data frame based on dates

萝らか妹 posted on 2019-12-13 06:41:13
Question: I want to create numerous subsets of data based on date sequences defined in a separate data frame. For example, one data frame will have dates and daily recorded values across multiple years. I have created a hypothetical data frame below. I want to extract various subsets from this data frame based on start and end dates defined elsewhere. set.seed(24) df1 <- as.data.frame(matrix(sample(0:3000, 300*10, replace=TRUE), ncol=1)) df2 <- as.data.frame(seq(as.Date("2004/1/1"), by = "day", length
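Since the example in the question is truncated, here is a minimal base R sketch of the general idea under stated assumptions: the data frames `daily` and `periods`, their column names, and the two date ranges are all hypothetical stand-ins for the asker's data. Each row of `periods` yields one subset of `daily`.

```r
set.seed(24)

# Hypothetical daily series: 3000 consecutive days starting 2004-01-01.
daily <- data.frame(
  date = seq(as.Date("2004-01-01"), by = "day", length.out = 3000),
  value = sample(0:3000, 3000, replace = TRUE)
)

# Hypothetical start/end dates held in a separate data frame.
periods <- data.frame(
  start = as.Date(c("2004-03-01", "2005-06-15")),
  end   = as.Date(c("2004-03-31", "2005-07-15"))
)

# One subset per row of `periods`, with both endpoints included.
subsets <- lapply(seq_len(nrow(periods)), function(i) {
  daily[daily$date >= periods$start[i] & daily$date <= periods$end[i], ]
})

sapply(subsets, nrow)
```

Indexing by row number keeps the `Date` class intact on every R version, which is why the loop runs over `seq_len(nrow(periods))` rather than over the date columns directly.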

r - find same times in n number of data frames

拜拜、爱过 posted on 2019-12-13 06:22:00
Question: Consider the following example: Date1 = seq(from = as.POSIXct("2010-05-03 00:00"), to = as.POSIXct("2010-06-20 23:00"), by = 120) Dat1 <- data.frame(DateTime = Date1, x1 = rnorm(length(Date1))) Date2 <- seq(from = as.POSIXct("2010-05-01 03:30"), to = as.POSIXct("2010-07-03 22:00"), by = 120) Dat2 <- data.frame(DateTime = Date2, x1 = rnorm(length(Date2))) Date3 <- seq(from = as.POSIXct("2010-06-08 01:30"), to = as.POSIXct("2010-07-13 11:00"), by = 120) Dat3Matrix <- matrix(data = rnorm(length
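Going by the title (finding the times shared by n data frames), one way to approach this is to intersect the timestamps across all data frames and then filter each one. The sketch below uses small hypothetical data frames rather than the truncated ones above; `frames`, `common`, and `aligned` are names made up for illustration. Timestamps are compared as plain numbers because set operations on `POSIXct` may drop the class.

```r
set.seed(1)

# Hypothetical helper: a data frame sampled every 120 seconds in UTC.
make_frame <- function(from, to) {
  stamps <- seq(as.POSIXct(from, tz = "UTC"), as.POSIXct(to, tz = "UTC"), by = 120)
  data.frame(DateTime = stamps, x1 = rnorm(length(stamps)))
}

frames <- list(
  make_frame("2010-05-03 00:00", "2010-05-03 02:00"),
  make_frame("2010-05-03 01:00", "2010-05-03 03:00"),
  make_frame("2010-05-03 00:30", "2010-05-03 01:30")
)

# Timestamps (as seconds since the epoch) present in every data frame.
common <- Reduce(intersect, lapply(frames, function(d) as.numeric(d$DateTime)))

# Keep only the shared timestamps in each data frame.
aligned <- lapply(frames, function(d) d[as.numeric(d$DateTime) %in% common, ])

sapply(aligned, nrow)
```

Because `Reduce()` works on a list of any length, the same code applies whether there are three data frames or thirty.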

Plotting a data.frame from within a function with ggplot2

笑着哭i posted on 2019-12-13 06:21:47
Question: I have this function to take an object returned by the IRT package sirt and plot item response functions for a set of items that the user can specify: plotRaschIRF <- function(x,items=NULL,thl=-5,thu=5,thi=.01,D=1.7) { if (!class(x)=="rasch.mml") stop("Object must be of class rasch.mml") thetas <- seq(thl,thu,thi) N <- length(thetas) n <- length(x$item$b) tmp <- data.frame(item=rep(1:n,each=N),theta=rep(thetas,times=n),b=rep(x$item$b,each=N)) probs <- exp(D*(tmp[,2]-tmp[,3]))/(1+exp(D*(tmp[,2

R: Undefined Columns Selected Error when Subsetting DF

依然范特西╮ posted on 2019-12-13 05:56:39
Question: I have a data frame data with the following structure: Classes ‘tbl_df’ and 'data.frame': 4391 obs. of 53 variables When I try to subset it to get the top 100 rows using data100 = data[1:100,] I get this error: Error in `[.data.frame`(X[[i]], ...) : undefined columns selected What could be the reason? Answer 1: Found the answer - I needed to use as.data.frame(data) before subsetting because tbl_df is not subsettable the same way as a data frame. This was needed due to using dplyr earlier and it

R- create new dataframe variable from subset of two variables with missing data NA

痞子三分冷 posted on 2019-12-13 05:52:57
Question: I have a simple example data frame with two data columns (data1 and data2) and two grouping variables (Measure1 and Measure2). Measure1 and Measure2 contain missing values (NA). d <- data.frame(Measure1 = 1:2, Measure2 = 3:4, data1 = 1:10, data2 = 11:20) d$Measure1[4]=NA d$Measure2[8]=NA d Measure1 Measure2 data1 data2 1 1 3 1 11 2 2 4 2 12 3 1 3 3 13 4 NA 4 4 14 5 1 3 5 15 6 2 4 6 16 7 1 3 7 17 8 2 NA 8 18 9 1 3 9 19 10 2 4 10 20 I want to create a new variable ( d$new ) that contains data1, but only for rows

Selecting and plotting months in ggplot2

丶灬走出姿态 posted on 2019-12-13 04:55:04
Question: I have a time series dataset with two columns: date (e.g. Jan 1980, Feb 1980 ... Dec 2013) and its corresponding temperature. The dataset runs from 1980 to 2013. I am trying to subset and plot the time series in ggplot for each month separately (e.g. I want only the Feb values so that I can plot them using ggplot). I tried the following, but Feb1 is empty: Feb1 <- subset(temp, date ==5) The structure of my dataset is: 'data.frame': 408 obs. of 2 variables: $ date :Class 'yearmon' num [1:359]
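A likely reason `date == 5` returns nothing is that a `yearmon` value is stored as a fractional year (for example 1980.083 for Feb 1980), so it never equals 5. A minimal sketch of selecting one month, assuming the zoo package is installed and using a made-up `temp` data frame with a hypothetical `temperature` column:

```r
library(zoo)

set.seed(42)

# Hypothetical monthly series, Jan 1980 to Dec 2013 (34 years x 12 months).
temp <- data.frame(
  date = as.yearmon(1980 + (0:407) / 12),
  temperature = rnorm(408, mean = 15, sd = 5)
)

# Extract the month number as text ("01" to "12") and keep February only.
feb <- subset(temp, format(date, "%m") == "02")

head(feb)
```

The resulting `feb` data frame can then be passed to ggplot like any other; comparing on `"%m"` rather than on the month name avoids depending on the session's locale.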

Linear regression with conditional statement in R

我的未来我决定 posted on 2019-12-13 04:34:32
Question: I have a huge database and I need to run different regressions with conditional statements. I see two options for doing this: 1) include the subset argument in the regression call (industrycodes==12) and 2) I don't obtain the same results as if I cut the data to the values when furniture==12. And they should be the same. Could somebody help me with the code? I think I have a problem with it. Here is a very basic example to explain it. ID roa employees industrycodes 1 0,5 10 12 2 0,3 20 11 3 0,8 15
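For comparison, here is a small sketch of the two approaches on made-up data (the values below are hypothetical; only the column names come from the question). With the same condition applied both ways, `lm()` is expected to give identical coefficients, so a mismatch usually points to the two conditions not selecting the same rows.

```r
set.seed(1)

# Hypothetical data using the column names from the question.
d <- data.frame(
  ID = 1:20,
  roa = runif(20),
  employees = sample(5:50, 20, replace = TRUE),
  industrycodes = rep(c(11, 12), 10)
)

# Option 1: let lm() do the filtering through its subset argument.
fit_arg <- lm(roa ~ employees, data = d, subset = industrycodes == 12)

# Option 2: cut the data first, then fit on the reduced data frame.
d12 <- d[d$industrycodes == 12, ]
fit_cut <- lm(roa ~ employees, data = d12)

all.equal(coef(fit_arg), coef(fit_cut))
```

If the real file uses decimal commas such as `0,5`, it may also be worth checking that `roa` was read as numeric, for instance by importing with `read.csv2()` or `dec = ","`.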