GLib regex for matching a whole word?

Date: 2021-09-20 04:41:03

To match a whole word, the regex \bword\b should suffice. Yet the following code always returns 0 matches:

try {
    string pattern = "\bhtml\b";
    Regex wordRegex = new Regex (pattern, RegexCompileFlags.CASELESS, RegexMatchFlags.NOTEMPTY);
    MatchInfo matchInfo;
    string lineOfText = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">";

    wordRegex.match (lineOfText, RegexMatchFlags.NOTEMPTY, out matchInfo);
    stdout.printf ("Match count is: %d\n", matchInfo.get_match_count ());
} catch (RegexError regexError) {
    stderr.printf ("Regex error: %s\n", regexError.message);
}

This should work: in online regex testers, the \bhtml\b pattern returns one match for the provided string. But in this program it returns 0 matches. Is the code wrong? What regex would be used in GLib to match a whole word?

2 answers

#1

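It looks like you have to escape the backslashes too. In an ordinary Vala string literal, "\b" is the backspace character, so the regex engine never sees the \b word-boundary assertion: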

try {
    string pattern = "\\bhtml\\b";
    Regex wordRegex = new Regex (pattern, RegexCompileFlags.CASELESS, RegexMatchFlags.NOTEMPTY);
    MatchInfo matchInfo;
    string lineOfText = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">";

    wordRegex.match (lineOfText, RegexMatchFlags.NOTEMPTY, out matchInfo);
    stdout.printf ("Match count is: %d\n", matchInfo.get_match_count ());
} catch (RegexError regexError) {
    stderr.printf ("Regex error: %s\n", regexError.message);
}

Output:

Match count is: 1
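
To make the difference between the two patterns visible, here is a minimal sketch that compares the unescaped literal, the escaped literal, and a verbatim (triple-quoted) string literal, which Vala does not process for escapes. The helper name has_match is made up for this illustration and is not part of GLib.

```vala
// Hypothetical helper: true if `pattern` matches somewhere in `text`
// (case-insensitive). Compile errors are reported and treated as "no match".
bool has_match (string pattern, string text) {
    try {
        var regex = new Regex (pattern, RegexCompileFlags.CASELESS);
        return regex.match (text);
    } catch (RegexError e) {
        stderr.printf ("Regex error: %s\n", e.message);
        return false;
    }
}

int main () {
    string text = "<!DOCTYPE html PUBLIC>";

    // "\b" is one backspace character: backspace + "html" + backspace.
    stdout.printf ("unescaped length: %d\n", "\bhtml\b".length);
    // Escaped and verbatim literals both keep the backslash and the 'b'.
    stdout.printf ("escaped length:   %d\n", "\\bhtml\\b".length);
    stdout.printf ("verbatim length:  %d\n", """\bhtml\b""".length);

    stdout.printf ("unescaped matches: %s\n", has_match ("\bhtml\b", text).to_string ());
    stdout.printf ("escaped matches:   %s\n", has_match ("\\bhtml\\b", text).to_string ());
    stdout.printf ("verbatim matches:  %s\n", has_match ("""\bhtml\b""", text).to_string ());
    return 0;
}
```

The unescaped pattern only matches text that actually contains backspace characters around "html", which is why the question's code reports 0.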

#2

You can simplify your code with regular expression literals:

Regex regex = /\bhtml\b/i;

You don't have to escape backslashes in the regular expression literal syntax. (Forward slashes would be problematic though, since they delimit the literal.)
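
When the word is only known at run time, or the pattern contains forward slashes, the string-based constructor is still an option. The following is a sketch only: whole_word_regex is a hypothetical helper name, and it assumes the word should be matched literally, which is why it passes the word through Regex.escape_string first.

```vala
// Hypothetical helper: builds a case-insensitive whole-word regex for `word`.
Regex whole_word_regex (string word) throws RegexError {
    // escape_string () quotes regex metacharacters contained in `word`.
    // The backslashes of \b are doubled because this is a string literal.
    string pattern = "\\b" + Regex.escape_string (word) + "\\b";
    return new Regex (pattern, RegexCompileFlags.CASELESS);
}

int main () {
    try {
        Regex regex = whole_word_regex ("html");
        // "html" stands alone here, so there is a boundary on both sides.
        stdout.printf ("%s\n", regex.match ("<!DOCTYPE html PUBLIC>").to_string ());
        // Here "html" sits between 'x' and '1', so no word boundary applies.
        stdout.printf ("%s\n", regex.match ("xhtml1-transitional.dtd").to_string ());
    } catch (RegexError e) {
        stderr.printf ("Regex error: %s\n", e.message);
        return 1;
    }
    return 0;
}
```

For a fixed pattern without slashes, the regular expression literal remains the shorter form.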

Full example:

void test_match (string text, Regex regex) {
    MatchInfo match_info;
    if (regex.match (text, RegexMatchFlags.NOTEMPTY, out match_info)) {
        stdout.printf ("Match count is: %d\n", match_info.get_match_count ());
    } else {
        stdout.printf ("No match\n");
    }
}

int main () {
    Regex regex = /\bhtml\b/i;
    test_match ("<!DOCTYPE html PUBLIC>", regex);

    return 0;
}
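
Note that get_match_count () reports the number of matched substrings for the current match (the whole match plus any capture groups), not how often the word occurs in the text. If the number of occurrences is what you are after, one possible approach is to step through the matches with MatchInfo.next (). This is a sketch; count_matches is a hypothetical helper name.

```vala
// Hypothetical helper: counts how many times `regex` matches in `text`.
int count_matches (Regex regex, string text) {
    int count = 0;
    try {
        MatchInfo match_info;
        regex.match (text, RegexMatchFlags.NOTEMPTY, out match_info);
        // matches () stays true while match_info points at a match.
        while (match_info.matches ()) {
            count++;
            match_info.next ();   // advance to the next occurrence
        }
    } catch (RegexError e) {
        stderr.printf ("Regex error: %s\n", e.message);
    }
    return count;
}

int main () {
    Regex regex = /\bhtml\b/i;
    // "html" and "HTML" are whole words; the "html" inside "xhtml" is not.
    stdout.printf ("Occurrences: %d\n", count_matches (regex, "html, HTML and xhtml"));
    return 0;
}
```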
