r | 易学教程

R Data table - how to use previous row value within group [duplicate]

Question: This question already has answers here: How to create a lag variable within each group? (5 answers). Closed 5 years ago.

I wish to calculate the difference between the current row and the previous row, by group.

    x = data.table(a=c(15, 25, 10, 12), b = c(1,1,2,2))
    > x
        a b
    1: 15 1
    2: 25 1
    3: 10 2
    4: 12 2
    > x[, c:= a - c(NA, a[.I-1]), by=b]
    Warning messages:
    1: In a - c(NA, a[.I - 1]) : longer object length is not a multiple of shorter object length
    2: In `[.data.table`(x, , `:=`(c, a - c(NA, a[.I
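A likely reason for the warning: inside `by=b`, `a` holds only the current group's values, while `.I` still holds row positions in the whole table, so in the second group `a[.I-1]` indexes past the end of the group. Below is a minimal base R sketch of the within-group difference, using the question's data. The helper name `diff_from_previous` is made up for illustration and is not part of the original post.

```r
# Data from the question, as a plain data frame
x <- data.frame(a = c(15, 25, 10, 12), b = c(1, 1, 2, 2))

# Hypothetical helper: each value minus the one before it.
# The first value in a group has no predecessor, so it gets NA.
diff_from_previous <- function(v) v - c(NA, head(v, -1))

# ave() applies the helper separately within each level of b
x$c <- ave(x$a, x$b, FUN = diff_from_previous)
print(x)

# With x as a data.table, the same idea is usually written as
#   x[, c := a - shift(a), by = b]
```

For this data the new column is NA, 10, NA, 2: the first row of each group has nothing to subtract from.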

Reading multiple csv files

Question: I need to read multiple csv files and print the first six values for each of these. I tried this code, but it is obviously wrong because the value of di is overwritten on each iteration of the loop. How can I read multiple files?

    library(xlsx)
    for(i in 1:7){
      di = read.csv(file.choose(),header=T)
      print(di)
    }
    d = list(d1,d2,d3,d4,d5,d6,d7)
    lapply(d,head)

Answer 1: If you want to keep your data frames in a list rather than assigning each to a new object, Option 1:

    fs <- dir(pattern = ".csv")
    d <- list()
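The list-based approach can be sketched without a loop as follows. This is an illustration under stated assumptions, not the answer's original code: the temporary directory, the file names, and the sample data are all invented so the example runs on its own.

```r
# Create three small CSV files in a temporary folder (illustrative data only)
tmp <- file.path(tempdir(), "csv_demo")
dir.create(tmp, showWarnings = FALSE)
for (i in 1:3) {
  write.csv(data.frame(id = 1:10, value = (1:10) * i),
            file.path(tmp, sprintf("file%d.csv", i)),
            row.names = FALSE)
}

# List the files; "\\.csv$" matches only names that end in .csv
fs <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)

# Read every file into one list element, named after its file
d <- lapply(fs, read.csv, header = TRUE)
names(d) <- basename(fs)

# First six rows of each data frame
first_six <- lapply(d, head)
print(first_six)
```

Keeping the data frames in one list means nothing is overwritten between iterations, and `lapply(d, head)` works without naming `d1` to `d7` by hand.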

ggplot2 add manual legend for two data series

Question: I have this dataframe:

         Control    Stress days  sd_control  sd_stress
    X1 0.9702100 0.9343627   X1 0.001900535 0.07035645
    X2 0.9666619 0.8595523   X2 0.014946893 0.04066567
    X3 0.9165654 0.7160598   X3 0.072655343 0.07025344
    X4 0.9208237 0.6668044   X4 0.050870831 0.08736982
    X5 0.8766547 0.7660685   X5 0.073588197 0.04868614
    X6 0.9599553 0.7937444   X6 0.041559836 0.05326769
    X7 0.9736297 0.8188934   X7 0.003817743 0.06272428

Based on this data I've made a plot with the following code:

    significance <- data

Find values that are between list of numbers

Question: I have two lists of numbers like below.

    x <- c(1, 5, 10, 17, 21, 30)
    y <- c(2, 7, 19)

In my dataset, x divides 1 to 30 into different segments (1-5, 5-10, 10-17, 17-21, 21-30). Would it be possible to match these segments to the numbers in y? (In this case, I'd want to get c(1,5,17) as output, because 2 is between 1 and 5, 7 is between 5 and 10, and 19 is between 17 and 21.)

Answer 1: You can do this with sapply and a simple function:

    sapply(y, function(a) x[max(which(x<a))])
    [1]  1  5 17

Answer 2:
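As an alternative sketch (not necessarily what the truncated second answer proposed), base R's `findInterval()` does the same lookup in a vectorised way, assuming `x` is sorted in increasing order. The variable names `idx` and `lower` are chosen here for illustration.

```r
x <- c(1, 5, 10, 17, 21, 30)
y <- c(2, 7, 19)

# For each y, findInterval() returns the index i with x[i] <= y < x[i + 1]
idx <- findInterval(y, x)

# The lower bound of the segment each y falls into
lower <- x[idx]
print(lower)  # 1 5 17
```

One difference from the `sapply` version: a value of `y` that equals a breakpoint (say 5) is assigned to the segment starting at 5 here, whereas `x < a` would assign it to the segment starting at 1. Values below `min(x)` get index 0 and are silently dropped by `x[idx]`.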

specific country map with district/cities using R

Question: I am trying to draw maps of specific countries, such as Bangladesh or Bhutan, with their districts/cities in R. As an example, I can draw a US map using the following lines of code. Is there any library/package that can give me any country's map with its cities/districts/provinces? Any clue is appreciated.

    library(maps)
    states <- map_data("state")

Answer 1: You can download a shapefile of any country from the following website: https://www.diva-gis.org/gdata Then read and plot them in R using

How to check multiple values using if condition [duplicate]

Question: This question already has answers here: Idiom for ifelse-style recoding for multiple categories (12 answers). Closed 2 years ago.

I have the dataframe shown below:

    Records:
    ID Remarks Value
    1  ABC     10
    1  AAB     12
    1  ZZX     15
    2  XYZ     12
    2  ABB     14

Using this dataframe, I want to add a new column Status to it: if Remarks is ABC, AAB or ABB then Status should be TRUE, and for XYZ and ZZX it should be FALSE. I am using the method below for that, but
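A minimal sketch of one common way to do this, using `%in%` to test membership in a set of values. The data frame is rebuilt from the table in the question; the vector name `true_remarks` is invented for illustration.

```r
# Data frame rebuilt from the question
Records <- data.frame(
  ID      = c(1, 1, 1, 2, 2),
  Remarks = c("ABC", "AAB", "ZZX", "XYZ", "ABB"),
  Value   = c(10, 12, 15, 12, 14)
)

# Remarks values that should give Status = TRUE (hypothetical name)
true_remarks <- c("ABC", "AAB", "ABB")

# %in% returns one TRUE/FALSE per row, so no if/else chain is needed
Records$Status <- Records$Remarks %in% true_remarks
print(Records)
```

Because `%in%` is vectorised, it avoids the usual pitfall of `if`, which only inspects a single condition rather than one per row.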

Trouble-shooting Box Cox transformation in R ( need to use for loop or apply)

Question: Please find my data below (rows are disease groups: 0 = control, 1 = Ulcerative Colitis, 2 = Crohn's; columns are gene expression values).

    structure(c(5.54312e-05, 5.6112e-06, 9.74312e-05, 1.3612e-06, 1.29312e-05, 7.2512e-06, 0.0002159302, 3.6312e-06, 0.0001467552, 1.53312e-05, 0.0009132182, 1.9312e-06, 0.0074214952, 0.0006480372, 5.1312e-06, 6.1812e-06, 4.7612e-06, 0.0001199302, 0.0008845182, 0.0008506632, 0.0002366382, 7.3912e-06, 8.5112e-06, 2.63312e-05, 0.0013685242, 1.12312e-05, 0.0001775992

Print matrix without column names but kept aligned?

Question: I would like to print a matrix without column names and found this answer. However, this results in the columns of the output no longer being aligned when the row names are kept and have different lengths:

    m <- matrix(LETTERS[1:12],nrow = 2, ncol = 6)
    rownames(m) <- c("First Row", "Second Row")

Using print just ignores the col.names = FALSE argument (why?):

    print(m, col.names=FALSE, quote=FALSE)
    >            [,1] [,2] [,3] [,4] [,5] [,6]
    > First Row  A    C    E    G    I    K
    > Second Row B    D    F    H    J    L

Using
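One possible workaround, sketched here under the assumption that the matrix has at least two rows: pad the row names and each column to a common width with `format()`, then print the lines with `cat()`. The variable names `rn`, `cells` and `lines` are illustrative only.

```r
m <- matrix(LETTERS[1:12], nrow = 2, ncol = 6)
rownames(m) <- c("First Row", "Second Row")

# Pad the row names to a common width
rn <- format(rownames(m))

# Pad each column to its own common width (format() works per column here)
cells <- apply(m, 2, format)

# Join the cells of each row, then prepend the padded row name
lines <- paste(rn, apply(cells, 1, paste, collapse = " "))
cat(lines, sep = "\n")
```

Because the padding is done before printing, the columns stay aligned even though the row names differ in length, and no header row is emitted.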

Search PDF's extract lines with keyword and print Not available if keyword not found

Question: Link for the input PDFs: https://drive.google.com/drive/folders/1dcgDpfiVjMTGmYSRGnQA65YjZzv0AwXL?usp=sharing

The code goes through all the PDF files in the path, creates a corpus, and separates each line with a separator. Next it checks all the lines against the given search list, pulls each matching line, and tells whether the search word is present in the PDF or not (a <- sapply(unlist(Table_search), grepl, x = tablelines)).

    setwd("D:")
    tables<- list.files(pattern='pdf$')
    tablecorpus <- Corpus
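The "Not available" behaviour from the title can be sketched in base R once the lines have been extracted. Everything below is an assumption for illustration: the contents of `tablelines` and `Table_search` are invented stand-ins for the real PDF text and search list, `find_lines` is a hypothetical helper, and the PDF reading step itself is left out.

```r
# Invented lines standing in for text already extracted from one PDF
tablelines <- c("Table 1: Demographics",
                "Listing 2: Adverse events",
                "Figure 3: Survival curve")

# Invented search list
Table_search <- list("Table", "Listing", "Appendix")

# Hypothetical helper: return the matching lines, or "Not available" if none.
# fixed = TRUE treats the keyword as plain text rather than a regular expression.
find_lines <- function(keyword, lines) {
  hits <- lines[grepl(keyword, lines, fixed = TRUE)]
  if (length(hits) == 0) "Not available" else hits
}

keywords <- unlist(Table_search)
result <- lapply(setNames(keywords, keywords), find_lines, lines = tablelines)
print(result)
```

Returning a named list keeps one entry per keyword, so a keyword with no match still shows up in the output instead of disappearing.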