R gsub everything after blank
I am struggling to figure out how to use gsub to remove everything after the blank that follows the first hour value.
as.data.frame(valeur)
valeur
1 8:01 8:15
2 17:46 18:00
3 <NA>
4 <NA>
5 <NA>
6 <NA>
7 8:01 8:15
8 17:46 18:00
What I need is
valeur
1 8:01
2 17:46
3 <NA>
4 <NA>
5 <NA>
6 <NA>
7 8:01
8 17:46
Any clue?
I tried
gsub("[:blank:].*$","",valeur)
Almost
valeur = c(" 8:01 8:15 ", " 17:46 18:00 ", NA, NA, NA, NA, " 8:01 8:15 ",
" 17:46 18:00 ")
I guess the 'valeur' output has leading/trailing spaces. We can remove those with gsub . We match one or more spaces at the beginning of the string ( ^\\s+ ) or ( | ) one or more spaces at the end of the string ( \\s+$ ) and replace them with '' .
valeur1 <- gsub('^\\s+|\\s+$', '', valeur)
To keep only the first run of non-space characters, we match one or more spaces ( \\s+ ) followed by one or more non-space characters ( \\S+ ) up to the end of the string ( $ ) and replace the match with '' .
sub('\\s+\\S+$', '', valeur1)
#[1] "8:01" "17:46" NA NA NA NA "8:01" "17:46"
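The pattern tried in the question can also be made to work. POSIX classes such as [:blank:] are only valid inside a bracket expression, so the class has to be written [[:blank:]] . The sketch below assumes the strings may carry leading or trailing spaces (as in the vector above) and trims them first with base R's trimws ; the name first_value is only illustrative.

```r
# Same sample data as in the answer above.
valeur <- c(" 8:01 8:15 ", " 17:46 18:00 ", NA, NA, NA, NA,
            " 8:01 8:15 ", " 17:46 18:00 ")

# trimws() drops leading/trailing whitespace; NA values stay NA.
# [[:blank:]] matches a space or tab; ".*$" consumes the rest of the string.
first_value <- sub("[[:blank:]].*$", "", trimws(valeur))

print(first_value)
```

Without the trimws step, the leading space would be the first blank matched and the whole string would be removed.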
To get the last run of non-space characters, use sub to match one or more non-space characters ( \\S+ ) from the beginning of the string ( ^ ) followed by one or more spaces ( \\s+ ) and replace the match with '' .
sub('^\\S+\\s+', '', valeur1)
#[1] "8:15" "18:00" NA NA NA NA "8:15" "18:00"
The first result can also be obtained in a single step: we match zero or more spaces at the beginning ( ^\\s* ) or ( | ) one or more spaces ( \\s+ ) followed by one or more non-space characters ( \\S+ ) and zero or more spaces at the end ( \\s*$ ), and replace every match with '' .
gsub("^\\s*|\\s+\\S+\\s*$","",valeur)
#[1] "8:01" "17:46" NA NA NA NA "8:01" "17:46"
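Since the question holds the values in a data frame, the same single-step pattern can be applied to the column directly. This is a minimal sketch, assuming the column is called valeur and contains character strings; the data frame name df is hypothetical.

```r
# Hypothetical data frame mirroring the question's layout.
df <- data.frame(
  valeur = c(" 8:01 8:15 ", " 17:46 18:00 ", NA, NA, NA, NA,
             " 8:01 8:15 ", " 17:46 18:00 "),
  stringsAsFactors = FALSE
)

# Strip leading spaces and everything from the blank before the last value.
# NA entries are left untouched by gsub().
df$valeur <- gsub("^\\s*|\\s+\\S+\\s*$", "", df$valeur)

print(df)
```

If the column is a factor, gsub coerces it to character, so the result is a character column.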
Another option is stri_extract_first or stri_extract_last from library(stringi) , where we match the first or the last run of one or more non-space characters ( \\S+ ).
library(stringi)
stri_extract_first(valeur, regex='\\S+')
#[1] "8:01" "17:46" NA NA NA NA "8:01" "17:46"
For the last run of non-space characters
stri_extract_last(valeur, regex='\\S+')
#[1] "8:15" "18:00" NA NA NA NA "8:15" "18:00"
Just to contribute, the only thing I could think of:
substr(x = valeur, start = 2, stop = 6)
[1] "8:01 " "17:46" NA NA NA NA "8:01 " "17:46"
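The substr call depends on fixed character positions, which is why the four-character times keep a trailing space. A position-independent sketch, using only base R, splits each string on whitespace and keeps the first piece; the helper name first_token is hypothetical.

```r
valeur <- c(" 8:01 8:15 ", " 17:46 18:00 ", NA, NA, NA, NA,
            " 8:01 8:15 ", " 17:46 18:00 ")

# Split each trimmed string on runs of whitespace and keep the first piece.
# strsplit() returns NA for NA input, so missing values are preserved.
first_token <- function(x) {
  pieces <- strsplit(trimws(x), "\\s+")
  vapply(pieces, function(p) p[1], character(1))
}

print(first_token(valeur))
```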
