Using a regex capture directly in expression in C++

2018-06-26 18:01:41

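I am trying to use a captured group directly in the regex expression (a back-reference). However, when I try to do this, the program hangs indefinitely.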

For example:

string input = "<Tag>blahblah</Tag>";
regex r1("<([a-zA-Z]+)>[a-z]+</\1>");
string result = regex_replace(input, r1, "");

If I add another backslash to the back-reference, "<([a-zA-Z]+)>[a-z]+</\\1>", the program compiles but throws a "regex_error(regex_constants::error_backref)" exception.

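Notes:
Compiler: Apple LLVM 5.1
I am doing this as part of a process that cleans junk out of a block of text. The document is not necessarily HTML/XML, and the desired text is not always inside tags. So, if possible, I would like to do this with a regex rather than a parser.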

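The backslash character in a string literal is an escape character. In an ordinary literal, "\1" is compiled as the single character with value 1, so the regex engine never sees a back-reference.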

Either escape the backslash, "<([a-zA-Z]+)>[a-z]+</\\1>", or use a raw string literal, R"(<([a-zA-Z]+)>[a-z]+</\1>)".

With that fix, your program works as you expect:

#include <iostream>
#include <regex>
#include <string>

int main()
{
    std::string input = "Hello<Tag>blahblah</Tag> World";
    // "\\1" in the literal becomes \1 in the pattern: a back-reference to group 1
    std::regex r1("<([a-zA-Z]+)>[a-z]+</\\1>");
    std::string result = std::regex_replace(input, r1, "");

    std::cout << "The result is '" << result << "'\n";
}

Demo: http://coliru.stacked-crooked.com/a/ae20b09d46f975e9

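The exception you get with \\1 suggests that your compiler is configured to use GNU libstdc++, where regex is not implemented (releases before GCC 4.9 shipped only an incomplete <regex>). Look up how to set it up to use LLVM libc++, or use boost.regex.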
