How to read only lines that fulfil a condition from a csv into R?

猫巷女王i 2020-11-27 16:03

I am trying to read a large csv file into R. I only want to read and work with some of the rows that fulfil a particular condition (e.g. Variable2 >= 3). Thi

5 answers
  •  眼角桃花
    2020-11-27 16:41

    You can open the file in read mode using the function file (e.g. fc = file("myfile.csv", open = "r")).

    You can read the file one line at a time using the function readLines with the option n = 1, e.g. l = readLines(fc, n = 1), where fc is the open connection.

    Then you have to parse the string using functions such as strsplit or regular expressions, or you can try the package stringr (available from CRAN).

    If the line meets the condition, you import it; otherwise you skip it.

    To summarize, I would do something like this:
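
    As a rough illustration of the parsing step, the sketch below splits one line with strsplit and tests a condition like the one in the question. The helper name parse_line, the comma separator, and the assumption that the value to test sits in the second field are all hypothetical. Quoted fields that contain commas would need a proper CSV parser instead.

```r
# Hypothetical helper: split one CSV line and test its second field.
# Assumes plain comma-separated values with no quoted commas.
parse_line <- function(l, threshold = 3) {
  fields <- strsplit(l, ",", fixed = TRUE)[[1]]
  value <- suppressWarnings(as.numeric(fields[2]))  # NA if the field is not numeric
  list(fields = fields, keep = isTRUE(value >= threshold))
}

parse_line("a,5")$keep   # TRUE
parse_line("b,1")$keep   # FALSE
```

    Wrapping the comparison in isTRUE means a missing or non-numeric value is treated as "do not import" rather than raising an error inside an if statement.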

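    df <- data.frame(var1 = character(), var2 = integer(), stringsAsFactors = FALSE)
    fc <- file("myfile.csv", open = "r")
    header <- readLines(fc, n = 1)  # assumes the first line is a header; drop this line if it is not
    
    i <- 0
    # readLines returns character(0) at end of file, which ends the loop
    while (length(l <- readLines(fc, n = 1)) > 0) {
    
       ## parse l here and check whether you need to import the data
       ## (assumes two plain comma-separated fields, the second one numeric)
       fields <- strsplit(l, ",", fixed = TRUE)[[1]]
       need_to_add_data <- isTRUE(as.integer(fields[2]) >= 3)  # e.g. Variable2 >= 3
    
       if (need_to_add_data) {
         i <- i + 1
         df[i, ] <- list(fields[1], as.integer(fields[2]))
       }
    
    }
    close(fc)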
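
    Reading one line at a time and growing a data frame row by row can be slow on a large file. A possible variation, not part of the approach above, is to read the file in chunks, filter each chunk, and combine the kept rows at the end. The sketch below assumes an unquoted, comma-separated header line; the function name read_filtered, its arguments, and the sample column names are hypothetical.

```r
# Hypothetical helper: read a CSV in chunks and keep only the rows
# for which keep_fn(chunk) is TRUE.
read_filtered <- function(path, keep_fn, chunk_size = 10000) {
  con <- file(path, open = "r")
  on.exit(close(con))
  # Read the header once so that every chunk gets the same column names
  # (assumes an unquoted, comma-separated header).
  col_names <- strsplit(readLines(con, n = 1), ",", fixed = TRUE)[[1]]
  kept <- list()
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0) break  # end of file
    chunk <- read.csv(text = lines, header = FALSE, col.names = col_names,
                      stringsAsFactors = FALSE)
    # which() drops rows where the condition evaluates to NA
    kept[[length(kept) + 1]] <- chunk[which(keep_fn(chunk)), , drop = FALSE]
  }
  result <- do.call(rbind, kept)
  rownames(result) <- NULL
  result
}

# Usage on a small temporary file with made-up data
tmp <- tempfile(fileext = ".csv")
writeLines(c("Variable1,Variable2", "a,1", "b,3", "c,5", "d,2"), tmp)
result <- read_filtered(tmp, function(d) d$Variable2 >= 3, chunk_size = 2)
result$Variable1   # "b" "c"
```

    Only one chunk is held in memory at a time in addition to the rows already kept, so chunk_size can be tuned to the memory available.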
    
