Regex group capture in R with multiple capture groups
In R, is it possible to extract the captured groups from a regular expression match? As far as I can tell, none of grep, grepl, regexpr, gregexpr, sub, or gsub return the group captures.
I need to extract key-value pairs from strings that are encoded thus:
\((.*?) :: (0\.[0-9]+)\)
I can always do multiple full-match greps, or do some outside (non-R) processing, but I was hoping to do it all within R. Is there a function, or a package that provides one, to do this?
str_match(), from the stringr package, will do this. It returns a character matrix with one column for each group in the match (and one for the whole match):
> library(stringr)
> s = c("(sometext :: 0.1231313213)", "(moretext :: 0.111222)")
> str_match(s, "\\((.*?) :: (0\\.[0-9]+)\\)")
     [,1]                         [,2]       [,3]
[1,] "(sometext :: 0.1231313213)" "sometext" "0.1231313213"
[2,] "(moretext :: 0.111222)"     "moretext" "0.111222"
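If a single string can hold several pairs (an assumption; the question does not say so), str_match_all() from the same package returns one matrix per input string, with one row per match. A minimal sketch, where the input string and the names keys and values are made up for illustration:

```r
library(stringr)

# Hypothetical input: one string holding two key-value pairs.
s <- "(sometext :: 0.1231313213) (moretext :: 0.111222)"

# str_match_all() returns a list with one matrix per input string.
# Each row is one match: column 1 is the whole match, columns 2-3 the groups.
m <- str_match_all(s, "\\((.*?) :: (0\\.[0-9]+)\\)")[[1]]

keys   <- m[, 2]
values <- as.numeric(m[, 3])
print(data.frame(key = keys, value = values))
```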
gsub does this, from your example:
gsub("\\((.*?) :: (0\\.[0-9]+)\\)", "\\1 \\2", "(sometext :: 0.1231313213)")
[1] "sometext 0.1231313213"
You need to double-escape the backslashes (write \\ instead of \) inside the quotes; then they work for the regex.
Hope this helps.
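As a side note on the escaping: R 4.0.0 and later also accept raw string literals, which avoid the doubling. A small sketch comparing the two spellings, assuming an R version that supports raw strings (the variable names are illustrative):

```r
# Raw strings (r"{...}") need R >= 4.0.0; this is an assumption about your setup.
x <- "(sometext :: 0.1231313213)"

# Ordinary string: every regex backslash is written twice.
escaped <- gsub("\\((.*?) :: (0\\.[0-9]+)\\)", "\\1 \\2", x)

# Raw string: backslashes are taken literally, so the regex reads as written.
raw <- gsub(r"{\((.*?) :: (0\.[0-9]+)\)}", r"{\1 \2}", x)

print(escaped)
print(raw)
```

Both calls pass the same pattern and replacement to gsub(); only the way the strings are typed differs.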
Try regmatches() and regexec():
regmatches("(sometext :: 0.1231313213)", regexec("\\((.*?) :: (0\\.[0-9]+)\\)", "(sometext :: 0.1231313213)"))
[[1]]
[1] "(sometext :: 0.1231313213)" "sometext" "0.1231313213"
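The same base R approach works on a whole character vector, since regexec() and regmatches() are vectorized over their input. A rough sketch that collects the groups into a named numeric vector; the names pattern, hits, keys, values and result are invented for this example:

```r
s <- c("(sometext :: 0.1231313213)", "(moretext :: 0.111222)")
pattern <- "\\((.*?) :: (0\\.[0-9]+)\\)"

# regexec() records the positions of the whole match and of each group;
# regmatches() turns them into a list with one character vector per string.
hits <- regmatches(s, regexec(pattern, s))

# In each vector, element 1 is the whole match; elements 2 and 3 are the groups.
keys   <- vapply(hits, function(h) h[2], character(1))
values <- as.numeric(vapply(hits, function(h) h[3], character(1)))

result <- setNames(values, keys)
print(result)
```

A string that does not match yields an empty vector in hits, so its key and value come out as NA here rather than raising an error.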