Question
I would like to search and replace some characters in my database, but not on every line.
Here is my data:
1. 41 R JEAN JAURES 93170
2. 42 AV DE STALINGRAD 93170
3. 51 57 R JULES FERRY 93170
4. 1 R DU HAVRE 93170
I would like to end up with:
5. 41 RUE JEAN JAURES 93170
6. 42 AVENUE DE STALINGRAD 93170
7. 51 57 RUE JULES FERRY 93170
8. 1 RUE DU HAVRE 93170
So I tried the sub() function, but in 2. it replaces the first R it finds, so I get STALINGRUEAD instead of STALINGRAD.
I also tried substr(), but as in 3., there may be a variable number of characters before the letter to replace. As I have ~600k addresses, there will be a lot of exceptions like this.
Is there a way to add some restrictions in those functions to fulfill my goal?
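For reference, the failure described above can be reproduced with a minimal sketch like the following. It uses only the first two sample addresses; the variable name `naive` is just for illustration.

```r
vec <- c("41 R JEAN JAURES 93170",
         "42 AV DE STALINGRAD 93170")

# sub() with a bare "R" replaces the first R found anywhere in each string,
# whether or not it is a standalone word.
naive <- sub("R", "RUE", vec)
print(naive)
#> [1] "41 RUE JEAN JAURES 93170"    "42 AV DE STALINGRUEAD 93170"
```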
Answer 1:
You can use \\s+ to match one or more whitespace characters and \\s* to match zero or more.
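vec <- c("41 R JEAN JAURES 93170",
         "42 AV DE STALINGRAD 93170",
         "51 57 R JULES FERRY 93170",
         "1 R DU HAVRE 93170")
library(magrittr)  # provides the %>% pipe
vec %>%
  gsub("\\s*R\\s+", " RUE ", .) %>%    # an R followed by whitespace becomes RUE
  gsub("\\s*AV\\s+", " AVENUE ", .)    # an AV followed by whitespace becomes AVENUE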
[1] "41 RUE JEAN JAURES 93170" "42 AVENUE DE STALINGRAD 93170"
[3] "51 57 RUE JULES FERRY 93170" "1 RUE DU HAVRE 93170"
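One caveat worth checking on real data: because \\s* also matches zero characters, the pattern above matches any R that is followed by whitespace, including the last letter of a longer word. The sketch below illustrates this with a hypothetical address that is not part of the original data.

```r
# Hypothetical address: a street name containing a word that ends in "R"
addr <- "12 R VICTOR HUGO 93170"

# The trailing R of "VICTOR" is followed by a space, so it matches as well
split_word <- gsub("\\s*R\\s+", " RUE ", addr)
print(split_word)
#> [1] "12 RUE VICTO RUE HUGO 93170"
```

With ~600k addresses, words ending in R followed by a space are likely to occur, so the whitespace-only pattern probably needs an extra guard on its left side.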
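Furthermore, you might consider \\b, which matches a word boundary (for example, the position between a space and a letter), so that an R or AV inside a longer word is left alone: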
vec %>%
gsub("\\bR\\s+", "RUE ", .) %>%
gsub("\\bAV\\s+", "AVENUE ", .)
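If more abbreviations need expanding, the word-boundary idea can be wrapped in a small helper driven by a lookup table. This is only a sketch: the function name `expand_abbrev` and the table `abbrev` are hypothetical, and `perl = TRUE` is an assumption made here so that \\b follows PCRE semantics.

```r
# Hypothetical lookup table: abbreviation -> full word
abbrev <- c(R = "RUE", AV = "AVENUE")

# Replace each abbreviation only where it appears as a whole word
expand_abbrev <- function(x, abbrev) {
  for (short in names(abbrev)) {
    pattern <- paste0("\\b", short, "\\b")  # word boundary on both sides
    x <- gsub(pattern, abbrev[[short]], x, perl = TRUE)
  }
  x
}

vec <- c("41 R JEAN JAURES 93170",
         "42 AV DE STALINGRAD 93170",
         "51 57 R JULES FERRY 93170",
         "1 R DU HAVRE 93170")

expanded <- expand_abbrev(vec, abbrev)
print(expanded)
#> [1] "41 RUE JEAN JAURES 93170"      "42 AVENUE DE STALINGRAD 93170"
#> [3] "51 57 RUE JULES FERRY 93170"   "1 RUE DU HAVRE 93170"
```

Note that this would also expand a standalone R used as an initial in a street name, so a spot check on a sample of the real addresses is advisable.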
Answer 2:
You can try some regular expressions with stringr. If the 'R' standing for 'RUE' is consistently the first standalone 'R' in each line, you could use stringr::str_replace, which replaces only the first match in each string. The lookarounds (?<!\\w) and (?!\\w) require that the R is not adjacent to another word character:
library(tidyverse)
data <- c(
"1. 41 R JEAN JAURES 93170",
"2. 42 AV DE STALINGRAD 93170",
"3. 51 57 R JULES FERRY 93170",
"4. 1 R DU HAVRE 93170")
data %>%
str_replace("(?<!\\w)R(?!\\w)", "RUE")
#> [1] "1. 41 RUE JEAN JAURES 93170" "2. 42 AV DE STALINGRAD 93170"
#> [3] "3. 51 57 RUE JULES FERRY 93170" "4. 1 RUE DU HAVRE 93170"
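The same lookaround pattern should also work in base R without loading the tidyverse, since sub() replaces only the first match per string, much like str_replace. The sketch below assumes `perl = TRUE` (needed for lookarounds in base R) and adds a second call for AV, which the example above leaves unchanged.

```r
data <- c(
  "1. 41 R JEAN JAURES 93170",
  "2. 42 AV DE STALINGRAD 93170",
  "3. 51 57 R JULES FERRY 93170",
  "4. 1 R DU HAVRE 93170")

# First standalone R per string, then first standalone AV per string
out <- sub("(?<!\\w)R(?!\\w)", "RUE", data, perl = TRUE)
out <- sub("(?<!\\w)AV(?!\\w)", "AVENUE", out, perl = TRUE)
print(out)
#> [1] "1. 41 RUE JEAN JAURES 93170"      "2. 42 AVENUE DE STALINGRAD 93170"
#> [3] "3. 51 57 RUE JULES FERRY 93170"   "4. 1 RUE DU HAVRE 93170"
```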
Edit: added a second reprex after the "R" per the comments
Source: https://stackoverflow.com/questions/51540961/search-and-replace-only-specific-lines-in-r