Removing Whitespace From a Whole Data Frame in R

孤街浪徒 2020-12-05 03:01

I've been trying to remove the white space that I have in a data frame (using R). The data frame is large (>1 GB) and has multiple columns that contain whi

10 answers
  •  执笔经年
    2020-12-05 03:39

    R is simply not the right tool for a file of this size. However, you have two options:

    Use ffdfdply with ff and ffbase

    Load the ff and ffbase packages, then process the file in chunks:

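    library(ff)
    library(ffbase)   # provides ffdfdply()
    x <- read.csv.ffdf(file=your_file, header=TRUE, VERBOSE=TRUE,
                       first.rows=1e4, next.rows=5e4)
    splits <- 10      # assumed number of chunks; tune to the available RAM
    x$split <- as.ff(rep(seq_len(splits), length.out=nrow(x)))
    # FUN must return a data frame; text columns are stored as factors in an ffdf
    res <- ffdfdply(x, x$split, BATCHBYTES=0, FUN=function(myData) {
      as.data.frame(lapply(myData, function(col)
        if (is.factor(col)) factor(gsub('\\s+', '', col)) else col))
    })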
    

    Use sed (my preference)

    # GNU sed: -i edits the file in place, -E enables extended regular expressions
    sed -i -E 's/\s+//g' your_file
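
    To see what the sed approach does before running it on the real file, here is a minimal sketch on a tiny sample. The sample rows and the variable names (sample, all_removed, edges_trimmed) are made up for illustration. It writes to new files instead of editing in place and uses the POSIX class [[:space:]] rather than the GNU-only \s. The second command is an alternative that trims whitespace only around the comma delimiters, assuming a simple CSV with no quoted fields that contain commas.

```shell
# Hypothetical sample data, for illustration only.
sample=$(mktemp)
all_removed=$(mktemp)
edges_trimmed=$(mktemp)
printf '  id , name ,city\n 1 , Ann Lee ,  New York \n' > "$sample"

# Variant 1: delete every run of whitespace on each line.
# This also removes spaces inside values ("Ann Lee" becomes "AnnLee").
sed -E 's/[[:space:]]+//g' "$sample" > "$all_removed"

# Variant 2: trim whitespace around each comma and at the line ends,
# leaving spaces inside values untouched.
sed -E 's/[[:space:]]*,[[:space:]]*/,/g; s/^[[:space:]]+//; s/[[:space:]]+$//' \
    "$sample" > "$edges_trimmed"

cat "$all_removed"
cat "$edges_trimmed"
```

    Which variant fits depends on whether spaces inside a value are meaningful. The first matches the gsub('\\s+', '', x) call in the R option; the second is closer to what trimws() does per field.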
    
