trimws bug? leading whitespace not removed

喜你入骨 submitted on 2019-11-29 14:33:30

0xa0 encodes a different kind of space, the non-breaking space (U+00A0, which is the single byte 0xa0 in Latin-1), while 0x20 is the ordinary space.
By default trimws only strips spaces, tabs, line feeds and carriage returns (the pattern [ \t\r\n]+), not non-breaking spaces, so the leading character is left in place.
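
The diagnosis above can be checked with a short sketch. The sample string here is an assumption built with a Unicode escape, since the original input is not shown; it only illustrates that the first character is code point 160 (0xa0) and that the default trimws leaves it alone.

```r
# Hypothetical sample: a non-breaking space (U+00A0) followed by the digits.
x <- "\u00a011.132592"

# Code point of the first character: 160 (0xa0) rather than 32 (0x20).
first_code_point <- utf8ToInt(x)[1]

# trimws() with its default whitespace class returns the string unchanged.
unchanged <- identical(trimws(x), x)

print(first_code_point)
print(unchanged)
```
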
You can use sub (to strip either the leading or the trailing spaces) or gsub (to strip both in one call) with \\s to remove any kind of leading or trailing space, including the one represented by 0xa0. Note that the interpretation of \\s is locale-dependent. Here x is the character string to trim:

sub("^\\s+", "", x)
[1] "11.132592"

And for removing leading and trailing spaces:

gsub("(^\\s+)|(\\s+$)", "", x)
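
Because what \\s matches can vary with the locale, a variant that does not rely on it may be useful. The sketch below assumes R 3.6.0 or later, where trimws accepts a whitespace argument; its help page suggests "[\\h\\v]" to cover all Unicode horizontal and vertical white space. The sample string is again a hypothetical stand-in for the real input.

```r
# Hypothetical sample with non-breaking spaces (U+00A0) on both ends.
x <- "\u00a011.132592\u00a0"

# \h and \v are PCRE classes for horizontal and vertical white space;
# the 'whitespace' argument is only available in R >= 3.6.0.
trimmed <- trimws(x, whitespace = "[\\h\\v]")

print(trimmed)
```

This keeps the call site as a plain trimws call, so which = "left" or which = "right" can still be used to trim one side only.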

Another possible solution is to replace the non-breaking spaces (0xa0) with ordinary spaces (0x20) at the byte level:

trimws(rawToChar(replace(x1, x1 == as.raw(0xa0), as.raw(0x20))))

which gives:

[1] "11.132592"

For conversion to numeric, just wrap the above code in as.numeric.
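
Put together, the byte-level replacement and the numeric conversion might look like the sketch below. It uses the x1 bytes from this answer; the intermediate names cleaned and value are illustrative only. It also assumes a single-byte encoding such as Latin-1: in UTF-8 the non-breaking space is the two bytes 0xc2 0xa0, so swapping 0xa0 alone would not be enough.

```r
# Raw bytes from the answer: 0xa0 followed by the ASCII text "11.132592".
x1 <- as.raw(c(0xa0, 0x31, 0x31, 0x2e, 0x31, 0x33, 0x32, 0x35, 0x39, 0x32))

# Swap every 0xa0 byte for an ordinary space, then convert to a string.
cleaned <- rawToChar(replace(x1, x1 == as.raw(0xa0), as.raw(0x20)))

# Trim the now-ordinary leading space and convert to a number.
value <- as.numeric(trimws(cleaned))

print(value)
```
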


Used data:

x1 <- as.raw(c(0xa0, 0x31, 0x31, 0x2e, 0x31, 0x33, 0x32, 0x35, 0x39, 0x32))