ff

delete rows ff package

狂风中的少年 · Submitted on 2019-12-03 03:46:13
For a while now I've been using the ff package to work with big data. The R object I've been working with has about 130,000,000 rows and 14 columns. Two of those columns, Temperature and Precipitation, contain missing values ("NA"), so I need to delete those rows before moving on with my work. I've been trying to do it the way I would with a normal R object:

    data <- data[!is.na(data$temp), ]

But I keep getting an error:

    Error: vmode(index) == "integer" is not TRUE

Has anyone been able to delete rows in an ffdf object? I'd appreciate any help. Indexing based on a logical ff_vector is not …
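The excerpt cuts off at the key point: indexing an ffdf with a logical ff vector is not supported, which is exactly what the vmode(index) error is saying. A minimal sketch of the usual ffbase workaround (assuming the column really is named temp and that the ffbase package is installed):

    library(ff)
    library(ffbase)                       # provides ffwhich() and a subset() method for ffdf

    # Turn the logical condition into an integer index, then subset by it:
    idx  <- ffwhich(data, !is.na(temp))   # integer positions of the rows to keep
    data <- data[idx, ]

    # Equivalent shortcut via ffbase's subset() method for ffdf:
    # data <- subset(data, !is.na(temp))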

Still struggling with handling large data set

你说的曾经没有我的故事 · Submitted on 2019-12-02 02:13:56
Question: I have been reading around on this website and haven't been able to find an exact answer; if it already exists, I apologize for the repost. I am working with data sets that are extremely large (600 million rows, 64 columns, on a computer with 32 GB of RAM). I really only need much smaller subsets of this data, but I am struggling to perform any operation beyond simply importing one data set with fread and selecting the 5 columns I need. After that, I try to overwrite my dataset with the …
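Since only 5 of the 64 columns are needed, one option is to never read the rest at all: fread's select argument parses just the named columns. A minimal sketch, with a hypothetical file name and column names standing in for the real ones:

    library(data.table)

    # select= makes fread read only these columns, so the other 59
    # never occupy RAM (all names here are placeholders).
    dt <- fread("bigdata.csv",
                select = c("id", "date", "temp", "precip", "site"))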

ff package write error

不羁岁月 · Submitted on 2019-12-01 23:22:14
I'm trying to work with a 1909 x 139352 dataset in R. Since my computer has only 2 GB of RAM, the dataset (500 MB) turns out to be too big for the conventional methods, so I decided to use the ff package. However, I've been having some trouble: read.table.ffdf is unable to read even the first chunk of data. It crashes with the following error:

    txtdata <- read.table.ffdf(file = "/directory/myfile.csv", FUN = "read.table",
                               header = FALSE, sep = ",",
                               colClasses = c("factor", rep("integer", 139351)),
                               first.rows = 100, next.rows = 100, VERBOSE = TRUE)
    read.table.ffdf 1..100 (100)  csv-read=77.253sec
    Error en ff …
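A detail worth knowing here: an ffdf stores every column as its own ff file on disk, so a 139,352-column table means roughly 139 thousand backing files plus per-column overhead, which can fail long before RAM runs out. For a wide, almost all-integer table like this, one alternative is to read the CSV in chunks into a single on-disk ff matrix instead. A sketch under the layout assumed by the colClasses above (first column the lone factor, the remaining 139,351 integers):

    library(ff)

    ncols <- 139351                                     # integer columns after the factor
    m   <- ff(vmode = "integer", dim = c(1909, ncols))  # one backing file, not ~139k
    con <- file("/directory/myfile.csv", open = "r")
    i   <- 0
    repeat {
      chunk <- tryCatch(
        read.table(con, header = FALSE, sep = ",", nrows = 100,
                   colClasses = c("character", rep("integer", ncols))),
        error = function(e) NULL)                       # read.table errors at EOF
      if (is.null(chunk)) break
      m[(i + 1):(i + nrow(chunk)), ] <- as.matrix(chunk[, -1])
      i <- i + nrow(chunk)
    }
    close(con)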

What is the meaning of this error, 'Error in if (any(B < 1)) stop("B too small")', while using the tabplot package

本秂侑毒 · Submitted on 2019-12-01 17:19:15
I found the tabplot package for visualizing a large database. I ran it using the code below, but I get this error on different data frames:

    Error in if (any(B < 1)) stop("B too small") : missing value where TRUE/FALSE needed
    In addition: Warning message:
    In bbatch(n, as.integer(BATCHBYTES/theobytes)) : NAs introduced by coercion

Here is an example:

    dat <- read.table(text = "
    birds wolfs snakes
    3 9 7
    3 8 4
    1 2 8
    1 2 3
    1 8 3
    6 1 2
    6 7 1
    6 1 5
    5 9 7
    3 8 7
    4 2 7
    1 2 3
    7 6 3
    6 1 1
    6 3 9
    6 1 1
    ", header = TRUE)
    install.packages("tabplot")
    package 'ff' successfully unpacked and MD5 sums checked …
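The excerpt omits the plotting call itself, which is presumably the package's main entry point, roughly:

    library(tabplot)
    tableplot(dat)   # the traceback above points into bbatch(), tabplot's
                     # batch-size computation, where B ends up NA / too small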

R could not allocate memory on ff procedure. How come?

痞子三分冷 · Submitted on 2019-12-01 11:23:11
I'm working on a 64-bit Windows Server 2008 machine with an Intel Xeon processor and 24 GB of RAM. I'm having trouble reading a particular 11 GB TSV (tab-delimited) file (>24 million rows, 20 columns). My usual companion, read.table, has failed me. I'm currently trying the ff package, through this procedure:

    > df <- read.delim.ffdf(file = "data.tsv",
    +                       header = TRUE,
    +                       VERBOSE = TRUE,
    +                       first.rows = 1e3,
    +                       next.rows = 1e6,
    +                       na.strings = c("", NA),
    +                       colClasses = c("NUMERO_PROCESSO" = "factor"))

This works fine for about 6 million records, but then I get an error, as you can see: …
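One hypothesis worth testing (speculative, not from the question): ff writes only a factor's integer codes to disk and keeps the levels themselves in RAM, so a high-cardinality column like NUMERO_PROCESSO can exhaust memory as new levels accumulate chunk after chunk. A quick check on an in-memory slice:

    # If NUMERO_PROCESSO has (near-)unique values per row, its level set
    # grows into the millions and must all be held in RAM by ff.
    x <- read.delim("data.tsv", nrows = 1e6)
    length(unique(x$NUMERO_PROCESSO))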

Functions for creating and reshaping big data in R using the FF package

那年仲夏 · Submitted on 2019-12-01 11:01:30
I'm new to R and the ff package, and am trying to better understand how ff lets users work with large datasets (>4 GB). I have spent a considerable amount of time trawling the web for tutorials, but the ones I could find generally go over my head. I learn best by doing, so as an exercise I would like to know how to create a long-format time-series dataset, similar to R's built-in "Indometh" dataset, using arbitrary values; then reshape it into wide format; then save the output as a CSV file. With small datasets this is simple, and can be achieved using the …
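For the small-data version of the exercise, a minimal base-R sketch (values arbitrary, as requested): build a long-format panel shaped like Indometh (6 subjects, 11 time points), cast it wide with reshape(), and write it out. For the ff version, as.ffdf() and write.csv.ffdf() should slot in for the data frame and the final write; reshaping an ffdf is where it gets harder.

    # Long format: 66 rows, like Indometh.
    long <- data.frame(
      Subject = rep(1:6, each = 11),
      time    = rep(c(0.25, 0.5, 0.75, 1, 1.25, 2, 3, 4, 5, 6, 8), times = 6),
      conc    = runif(66)
    )

    # Wide format: one row per subject, one conc.<time> column per time point.
    wide <- reshape(long, idvar = "Subject", timevar = "time",
                    direction = "wide")

    write.csv(wide, "indometh_wide.csv", row.names = FALSE)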

set ff=unix in Linux

生来就可爱ヽ(ⅴ<●) · Submitted on 2019-12-01 02:26:46
set ff=unix : tells the vi editor to use Unix line endings. Steps: 1. Open the file with the vi command. 2. Type :set ff=unix directly. Source: https://www.cnblogs.com/lwcode6/p/11647955.html
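A quick end-to-end illustration (Vim ex commands; :set ff? just queries the current value):

    vi script.sh
    :set ff?          " show the current fileformat, e.g. fileformat=dos
    :set ff=unix      " switch the buffer to Unix (LF) line endings
    :wq               " save and quit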

Replace NAs in an ffdf object

馋奶兔 · Submitted on 2019-11-30 14:23:55
Question: I'm working with an ffdf object which has NAs in some of the columns. The NAs are the result of a left outer merge using merge.ffdf. I would like to replace the NAs with 0s but am not managing to do it. Here is the code I am running:

    library(ffbase)
    deals <- merge(deals, rk, by.x = c("DEALID", "STICHTAG"),
                   by.y = c("ID", "STICHTAG"), all.x = TRUE)
    attributes(deals)
    $names
    [1] "virtual"   "physical"  "row.names"
    $class
    [1] "ffdf"
    vmode(deals$CREDIT_R)
    [1] "double"
    idx <- ffwhich(deals, is.na(CREDIT_R))  # CREDIT_R …
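The code cuts off right at the solution. A sketch of the usual ffbase pattern (names taken from the question; I am assuming ff recycles the scalar 0 in the assignment): locate the NA positions with ffwhich(), then assign at those integer positions.

    library(ffbase)

    idx <- ffwhich(deals, is.na(CREDIT_R))  # integer index of the NA rows
    deals$CREDIT_R[idx] <- 0                # write 0 at those positions on disk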