R regex: using gsub to separate letters and digits

2018-06-13 10:22:54

I have a string that mixes letters and digits:

"The sample is 22mg"

I want to split the string wherever a digit is immediately followed by a letter, like this:

"The sample is 22 mg"

I tried this:

gsub('[0-9]+[[aA-zZ]]', '[0-9]+ [[aA-zZ]]', 'This is a test 22mg')

But it did not give the desired result.

Any suggestions?

You need capturing parentheses in the regular expression and group references in the replacement. For example:

gsub('([0-9])([[:alpha:]])', '\\1 \\2', 'This is a test 22mg')

Nothing here is specific to R; the R help pages for regex and gsub should be of some use.
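As a minimal sketch of the answer above, the same pattern can be wrapped in a small helper and applied to a whole character vector, since gsub is vectorized. The function name `split_num_unit` and the sample strings are hypothetical, chosen only for illustration.

```r
# Hypothetical helper: insert a space between a digit and the letter right after it.
split_num_unit <- function(x) {
  # ([0-9]) captures one digit, ([[:alpha:]]) captures the following letter;
  # "\\1 \\2" writes both captures back with a space between them.
  gsub("([0-9])([[:alpha:]])", "\\1 \\2", x)
}

samples <- c("The sample is 22mg", "dose 5ml twice", "no digits here")
result <- split_num_unit(samples)
print(result)
```

Strings without a digit-letter boundary pass through unchanged, so the helper can be applied to mixed input without checking first.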

You need backreferences:

> test <- "The sample is 22mg"
> gsub("([0-9])([a-zA-Z])", "\\1 \\2", test)
[1] "The sample is 22 mg"

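Anything in parentheses is remembered. The captured groups are then accessed as \\1 (for the first entity in parens), \\2, and so on. The first backslash escapes the second one in the R string, so that a literal backslash is passed on to the regex parser.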
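The escaping can be checked directly. The sketch below shows that the R literal "\\1" is only two characters long (a backslash and a digit), which is what the regex engine receives. It also shows a lookaround variant that avoids backreferences entirely; that variant is an alternative technique, not something proposed in the answers above, and it assumes `perl = TRUE` is acceptable.

```r
# In an R string literal, "\\1" is two characters: one backslash and "1".
n_chars <- nchar("\\1")
print(n_chars)

test <- "The sample is 22mg"

# Backreference version, as in the answer above.
spaced <- gsub("([0-9])([a-zA-Z])", "\\1 \\2", test)
print(spaced)

# Alternative: zero-width lookbehind/lookahead match the empty position
# between a digit and a letter, so the replacement is just a space.
spaced_lookaround <- gsub("(?<=[0-9])(?=[a-zA-Z])", " ", test, perl = TRUE)
print(spaced_lookaround)
```

Both calls should give the same string; the lookaround form simply trades the doubled backslashes for the PCRE engine.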
