R regex gsub separate letters and numbers
I have a string that's mixed letters and numbers:
"The sample is 22mg"
I'd like to split the string wherever a number is immediately followed by a letter, like this:
"The sample is 22 mg"
I've tried this:
gsub('[0-9]+[[aA-zZ]]', '[0-9]+ [[aA-zZ]]', 'This is a test 22mg')
but am not getting the desired results.
Any suggestions?
You need to use capturing parentheses in the regular expression and group references in the replacement. For example:
gsub('([0-9])([[:alpha:]])', '\\1 \\2', 'This is a test 22mg')
There's nothing R-specific here; the R help pages for regex and gsub (?regex and ?gsub) should be of some use.
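To see why the groups matter, here is a small sketch comparing a replacement without group references to one with them. The variable names are only for illustration, and the first pattern is a simplified stand-in for the attempt in the question, not the exact original.

```r
x <- "This is a test 22mg"

# Without capture groups, the replacement string is inserted as literal text;
# regex syntax such as [0-9] is not interpreted on the replacement side.
literal <- gsub("[0-9][[:alpha:]]", "[0-9] [[:alpha:]]", x)
print(literal)
# [1] "This is a test 2[0-9] [[:alpha:]]g"

# With capture groups, \\1 and \\2 re-emit the matched digit and letter.
grouped <- gsub("([0-9])([[:alpha:]])", "\\1 \\2", x)
print(grouped)
# [1] "This is a test 22 mg"
```

Only the last digit and the first letter need to be captured, since the space goes between exactly those two characters.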
You need backreferencing:
> test <- "The sample is 22mg"
> gsub("([0-9])([a-zA-Z])", "\\1 \\2", test)
[1] "The sample is 22 mg"
Anything in parentheses gets remembered. The captured groups are then accessed by \\1 (for the first entity in parens), \\2, etc. The first backslash escapes the backslash's interpretation in R so that it gets passed to the regular expression parser.
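As a rough sketch of how this behaves on more than one string: gsub is vectorized, so the same call works on a character vector. The sample inputs below are made up for illustration. The second call is an alternative that was not part of the answers above: with perl = TRUE, zero-width lookarounds can find the position between a digit and a letter and insert the space there, so no backreferences are needed.

```r
# Hypothetical inputs, for illustration only
samples <- c("The sample is 22mg",
             "Dose: 5ml twice, then 100g",
             "no digits here")

# Backreference approach: capture the digit and the letter, put a space between them
with_groups <- gsub("([0-9])([[:alpha:]])", "\\1 \\2", samples)
print(with_groups)
# [1] "The sample is 22 mg"          "Dose: 5 ml twice, then 100 g"
# [3] "no digits here"

# Alternative (assumes PCRE via perl = TRUE): match the empty position that is
# preceded by a digit and followed by a letter, and replace it with a space
with_lookaround <- gsub("(?<=[0-9])(?=[[:alpha:]])", " ", samples, perl = TRUE)
print(identical(with_groups, with_lookaround))
```

Strings without a digit directly followed by a letter pass through unchanged in both versions.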