How to read data when some numbers contain commas as thousand separator?

情书的邮戳 2020-11-22 02:29

I have a csv file where some of the numerical values are expressed as strings with commas as thousands separators, e.g. "1,513" instead of 1513. Wh

11 answers
  •  一整个雨季
    2020-11-22 02:58

    "Preprocess" in R:

    lines <- "www, rrr, 1,234, ttt \n rrr,zzz, 1,234,567,987, rrr"
    

    You can use readLines on a textConnection to get the raw text. Then remove only the commas that sit between digits:

    gsub("([0-9]+)\\,([0-9])", "\\1\\2", lines)
    
    ## [1] "www, rrr, 1234, ttt \n rrr,zzz, 1234567987, rrr"
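To make the preprocessing step concrete, here is a minimal sketch that feeds the cleaned text to read.csv. The names cleaned and df, and the choice of header = FALSE and strip.white = TRUE, are assumptions made for this illustration and are not part of the original answer.

```r
# Sample input from the answer: two records, with thousands separators
# inside the third field of each record.
lines <- "www, rrr, 1,234, ttt \n rrr,zzz, 1,234,567,987, rrr"

# Drop every comma that sits directly between two digits.
cleaned <- gsub("([0-9]+)\\,([0-9])", "\\1\\2", lines)

# Parse the cleaned text as CSV. strip.white = TRUE trims the spaces
# that surround each field in this sample.
df <- read.csv(text = cleaned, header = FALSE,
               strip.white = TRUE, stringsAsFactors = FALSE)

str(df)
```

One caveat with this pattern: it cannot tell a thousands separator from a field separator, so two adjacent numeric fields written without a space (for example 12,34) would be merged into a single number.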
    

    It's also useful to know, though not directly relevant to this question, that commas used as decimal separators can be handled by read.csv2 (automatically) or by read.table (by setting the 'dec' parameter).
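As a small illustration of the decimal-comma case, the sketch below reads the same made-up text both ways. The sample data and the names txt, a and b are hypothetical.

```r
# Made-up sample: semicolon-separated fields with decimal commas.
txt <- "a;b\n1,5;2,25"

# read.csv2 defaults to sep = ";" and dec = ",".
df1 <- read.csv2(text = txt)

# The equivalent read.table call sets both arguments explicitly.
df2 <- read.table(text = txt, header = TRUE, sep = ";", dec = ",")

str(df1)
```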

    Edit: Later I discovered how to use colClasses by designing a new class. See:

    How to load df with 1000 separator in R as numeric class?
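The linked question is not reproduced here, so the following is only a sketch of how the colClasses approach could look, assuming the usual setClass/setAs pattern. The class name num.with.commas and the sample data are hypothetical. The approach relies on the comma-containing values being quoted in the file, as described in the question.

```r
library(methods)

# Declare a placeholder class and tell R how to coerce character
# data to it: strip the commas, then convert to numeric.
setClass("num.with.commas")
setAs("character", "num.with.commas",
      function(from) as.numeric(gsub(",", "", from)))

# Made-up sample: the amounts are quoted, so their commas are not
# treated as field separators.
txt <- 'id,amount\na,"1,513"\nb,"22,000,100"'

# colClasses names the custom class for the second column, so the
# coercion defined above is applied while reading.
df <- read.csv(text = txt,
               colClasses = c("character", "num.with.commas"))

str(df)
```

Compared with the gsub preprocessing, this keeps the conversion inside the read call and only touches the columns it is assigned to.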
