R gsub everything after blank

I am struggling to figure out how to gsub everything after the "blank" of the first hour value.

as.data.frame(valeur)

         valeur
1    8:01 8:15 
2  17:46 18:00 
3          <NA>
4          <NA>
5          <NA>
6          <NA>
7    8:01 8:15 
8  17:46 18:00 

What I need is

     valeur
1          8:01
2         17:46
3          <NA>
4          <NA>
5          <NA>
6          <NA>
7          8:01
8         17:46

Any clue?

I tried

 gsub("[:blank:].*$","",valeur)

Almost

valeur = c(" 8:01 8:15 ", " 17:46 18:00 ", NA, NA, NA, NA, " 8:01 8:15 ", 
" 17:46 18:00 ")

I guess you have leading/trailing spaces in the 'valeur' output. We can remove those with gsub. We match one or more spaces at the beginning of the string ( ^\\s+ ) or ( | ) one or more spaces at the end of the string ( \\s+$ ), and replace them with ''.

valeur1 <- gsub('^\\s+|\\s+$', '', valeur)

If we need the first run of non-space characters, we match one or more spaces ( \\s+ ) followed by one or more non-space characters ( \\S+ ) up to the end of the string, and replace that with ''.

sub('\\s+\\S+$', '', valeur1)
#[1] "8:01"  "17:46" NA      NA      NA      NA      "8:01"  "17:46"
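
As a cross-check of the two steps above, here is a minimal base-R sketch, assuming R 3.2.0 or later for `trimws()`. It also shows what probably went wrong in the attempt from the question: a POSIX class such as `[:blank:]` only works inside a bracket expression, i.e. `[[:blank:]]`. The names `trimmed` and `first_time` are made up for illustration.

```r
valeur <- c(" 8:01 8:15 ", " 17:46 18:00 ", NA, NA, NA, NA,
            " 8:01 8:15 ", " 17:46 18:00 ")

# trimws() strips leading and trailing whitespace; NA stays NA
trimmed <- trimws(valeur)

# [[:blank:]] (double brackets) matches a space or a tab;
# drop everything from the first blank to the end of the string
first_time <- sub("[[:blank:]].*$", "", trimmed)
print(first_time)
```

Trimming first matters here: without it, the leading space would be the first blank and the whole string would be removed.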

To get the last run of non-space characters, use sub to match one or more non-space characters ( \\S+ ) from the beginning of the string ( ^ ) followed by one or more spaces ( \\s+ ), and replace that with ''.

sub('^\\S+\\s+', '', valeur1)
#[1] "8:15"  "18:00" NA      NA      NA      NA      "8:15"  "18:00"

The above can be done in a single step, where we match zero or more spaces at the beginning ( ^\\s* ) or ( | ) one or more spaces ( \\s+ ) followed by one or more non-space characters ( \\S+ ) and zero or more spaces at the end ( \\s*$ ), and replace the match with ''.

 gsub("^\\s*|\\s+\\S+\\s*$","",valeur)
 #[1] "8:01"  "17:46" NA      NA      NA      NA      "8:01"  "17:46"
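
Since the question shows the values inside a data frame, the single-step pattern can also be applied to the column directly. This is only a sketch: the data frame name `df` is hypothetical, and the column is assumed to be a character column called `valeur`.

```r
df <- data.frame(
  valeur = c(" 8:01 8:15 ", " 17:46 18:00 ", NA, NA, NA, NA,
             " 8:01 8:15 ", " 17:46 18:00 "),
  stringsAsFactors = FALSE
)

# Remove leading whitespace, and the last whitespace-delimited token
# together with any trailing whitespace; NA entries are left as NA
df$valeur <- gsub("^\\s*|\\s+\\S+\\s*$", "", df$valeur)
print(df)
```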

Another option is stri_extract_first or stri_extract_last from library(stringi), which extract the first or the last run of one or more non-space characters ( \\S+ ).

 library(stringi)
 stri_extract_first(valeur, regex='\\S+')
 #[1] "8:01"  "17:46" NA      NA      NA      NA      "8:01"  "17:46"

For the last run of non-space characters:

 stri_extract_last(valeur, regex='\\S+')
 #[1] "8:15"  "18:00" NA      NA      NA      NA      "8:15"  "18:00"
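
If stringi is not available, both values can be pulled out in base R by splitting on whitespace. This is a rough sketch rather than part of the answer above; the names `parts`, `first_time` and `last_time` are made up for illustration, and `trimws()` assumes R 3.2.0 or later.

```r
valeur <- c(" 8:01 8:15 ", " 17:46 18:00 ", NA, NA, NA, NA,
            " 8:01 8:15 ", " 17:46 18:00 ")

# Trim first so the split does not produce an empty leading element,
# then split each string on runs of whitespace; NA yields a single NA
parts <- strsplit(trimws(valeur), "\\s+")

# Take the first and the last token of each element
first_time <- vapply(parts, function(p) p[1], character(1))
last_time  <- vapply(parts, function(p) p[length(p)], character(1))

print(first_time)
print(last_time)
```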

Just to contribute, the only thing I could think of was:

substr(x = valeur, start = 2, stop = 6)
[1] "8:01 " "17:46" NA      NA      NA      NA      "8:01 " "17:46"