Question
I have a data frame resembling the extract below:
Observation  Identifier   Value
Obs001       ABC_2001     54
Obs002       ABC_2002     -2
Obs003                    1
Obs004                    1
Obs005       Def_2001/05
I would like to transform this data frame so that the portion of each string from the "_" sign onward is removed, as illustrated below:
Observation  Identifier_NoTime  Value
Obs001       ABC                54
Obs002       ABC                -2
Obs003                          1
Obs004                          1
Obs005       Def
I tried experimenting with strsplit, gsub and sub as discussed here but could not get those commands to work. I have to account for the fact that:
- The column has missing values, and I want to leave them where they are
- The "_" character appears at different positions within the strings
- I also want to leave the rest of the data frame the way it is
Answer 1:
You could try the sub command below, which removes the _ symbol and all the non-space characters that follow it.
sub("_\\S*", "", string)
Explanation:
- _ matches a literal _ symbol.
- \S* matches zero or more non-space characters.
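As a minimal sketch of how this pattern behaves, the vector `ids` below holds made-up sample values (they are assumptions, not data from the question). It suggests that missing values stay NA and that the match stops at the first space:

```r
# Hypothetical sample values; NA stands in for a missing identifier.
ids <- c("ABC_2001", "ABC_2002", NA, "Def_2001/05", "Ghi_2003 extra")

# sub() replaces only the first match in each element and leaves NA as NA.
cleaned <- sub("_\\S*", "", ids)
print(cleaned)
```

Note that the last element keeps its trailing text (" extra"), because \S* does not match across the space.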
OR
This would remove the _ symbol and all the characters after it:
sub("_.*", "", string)
Explanation:
- _ matches a literal _ symbol.
- .* matches any character zero or more times.
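Applied to a data frame, the same idea might look like the sketch below. The data frame `df` is reconstructed from the extract in the question, on the assumption that the blank cells are NA; the new column name Identifier_NoTime follows the desired output shown there.

```r
# Hypothetical data frame mirroring the extract in the question;
# blank cells are assumed to be NA.
df <- data.frame(
  Observation = c("Obs001", "Obs002", "Obs003", "Obs004", "Obs005"),
  Identifier  = c("ABC_2001", "ABC_2002", NA, NA, "Def_2001/05"),
  Value       = c(54, -2, 1, 1, NA),
  stringsAsFactors = FALSE
)

# Strip the "_" and everything after it; NA entries pass through unchanged.
df$Identifier_NoTime <- sub("_.*", "", df$Identifier)

# Drop the original column and keep the column order from the question.
df <- df[, c("Observation", "Identifier_NoTime", "Value")]
print(df)
```

Only the Identifier column is touched, so the other columns stay as they were. If the column were stored as a factor, sub() would return a character vector instead.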
Source: https://stackoverflow.com/questions/26611922/remove-everything-after-a-string-in-a-data-frame-column-with-missing-values