Question
Here is a sample of my function that gets codes:
df <- read.csv("secondary.csv", header = TRUE)
Answer 1:
S <- "s / O sk hungu 101 / 90 MODEL HOUSE TALAB GAGNI SHUKUL LUCKNOW UTTAR PRADESH LUCKNOW UTTAR PRADESH 226001"
I recommend generating every suffix of the string: if N is the length of your string, take the substring from position x to N for each x from 1 to N.
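# Split the string into individual characters
allchr <- unlist(strsplit(S, ""))
# Build every suffix: characters i..N for each start position i
listsubstr <- sapply(seq_along(allchr), function(i) paste0(allchr[i:length(allchr)], collapse = ""))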
# [1] "s / O sk hungu 101 / 90 MODEL HOUSE TALAB GAGNI SHUKUL LUCKNOW UTTAR PRADESH LUCKNOW UTTAR PRADESH 226001"
# [2] " / O sk hungu 101 / 90 MODEL HOUSE TALAB GAGNI SHUKUL LUCKNOW UTTAR PRADESH LUCKNOW UTTAR PRADESH 226001"
# [3] "/ O sk hungu 101 / 90 MODEL HOUSE TALAB GAGNI SHUKUL LUCKNOW UTTAR PRADESH LUCKNOW UTTAR PRADESH 226001"
# [4] " O sk hungu 101 / 90 MODEL HOUSE TALAB GAGNI SHUKUL LUCKNOW UTTAR PRADESH LUCKNOW UTTAR PRADESH 226001"
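As a side note, the same list of suffixes can probably be built more compactly with base R's vectorized `substring()`. This is only a sketch of an alternative, not the approach used above; the variable name `suffixes` is introduced here for illustration.

```r
S <- "s / O sk hungu 101 / 90 MODEL HOUSE TALAB GAGNI SHUKUL LUCKNOW UTTAR PRADESH LUCKNOW UTTAR PRADESH 226001"

# substring() is vectorized over its `first` argument, so passing every
# start position returns every suffix in a single call
suffixes <- substring(S, seq_len(nchar(S)))

# Same result as the strsplit/paste0 approach
allchr <- unlist(strsplit(S, ""))
listsubstr <- sapply(seq_along(allchr), function(i) paste0(allchr[i:length(allchr)], collapse = ""))
identical(suffixes, listsubstr)
```

Both versions produce one suffix per character, so for long strings the memory cost grows quadratically with the string length.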
You can iterate through this list to check for valid geocodes. I have to provide pseudocode since I'm not sure how to check if a string is a valid geocode.
sapply(listsubstr, is.geocode) # pseudocode: is.geocode() is a placeholder, not an existing function
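To make the pseudocode above concrete, here is a minimal sketch with a stand-in check swapped in. The function `is_geocode_stub` and its rule (the string is exactly six digits, like an Indian PIN code) are assumptions for illustration only; a real validity check would likely need a gazetteer or a geocoding service.

```r
# Hypothetical stand-in: treats a string as "valid" only if it is exactly
# six digits. This is an assumption, not a real geocode check.
is_geocode_stub <- function(x) grepl("^[0-9]{6}$", x)

S <- "s / O sk hungu 101 / 90 MODEL HOUSE TALAB GAGNI SHUKUL LUCKNOW UTTAR PRADESH LUCKNOW UTTAR PRADESH 226001"
allchr <- unlist(strsplit(S, ""))
suffixes <- sapply(seq_along(allchr), function(i) paste0(allchr[i:length(allchr)], collapse = ""))

# vapply() returns a logical vector; keep only the suffixes that pass the check
hits <- suffixes[vapply(suffixes, is_geocode_stub, logical(1))]
hits
```

With this particular stand-in, only the trailing `"226001"` passes, because every longer suffix contains non-digit characters.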
You could also do this with recursion.
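myfun <- function(x) {
  if (nchar(x) == 0) {
    return(NA_character_) # nothing left to check: no valid geocode found
  } else if (is.geocode(x)) { # pseudocode: is.geocode() is a placeholder
    return(x)
  } else {
    myfun(substr(x, 2, nchar(x))) # drop the first character and try again
  }
}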
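The recursive idea above can be sketched in runnable form by passing the validity check in as an argument. The names `first_valid_suffix`, `is_valid`, and `starts_with_city`, and the rule that a valid suffix starts with "LUCKNOW", are all hypothetical and chosen only so the example runs.

```r
# Recursive search that takes the validity check as an argument.
# `is_valid` is a hypothetical predicate supplied by the caller.
first_valid_suffix <- function(x, is_valid) {
  if (nchar(x) == 0) return(NA_character_)  # nothing left to try
  if (is_valid(x)) return(x)                # first match is the longest suffix
  first_valid_suffix(substr(x, 2, nchar(x)), is_valid)
}

S <- "s / O sk hungu 101 / 90 MODEL HOUSE TALAB GAGNI SHUKUL LUCKNOW UTTAR PRADESH LUCKNOW UTTAR PRADESH 226001"

# Stand-in rule (an assumption): the suffix must start with a known city name
starts_with_city <- function(x) startsWith(x, "LUCKNOW")

result <- first_valid_suffix(S, starts_with_city)
result
```

Unlike the `sapply()` approach, which checks every suffix, the recursion stops at the first match, so it returns only the longest suffix that passes the check. Each call strips one character, so very long strings could hit R's recursion limit; a loop would avoid that.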
Source: https://stackoverflow.com/questions/46359191/how-to-identify-locations-from-text