What is the best way to find the last 2 digits in a string with RegEx or LINQ in C#?

Date: 2022-09-27 19:28:25

I am trying to get the last 2 digits in a string with RegEx or LINQ. For example, I have these strings:

N43OET28W -> result should be 28
N1OET86W  -> result should be 86
S02CT55A  -> result should be 55
M4AKT99A  -> result should be 99
1W24ET39W -> result should be 39
S03KT45A  -> result should be 45
M1AKT23A  -> result should be 23
N1OET35W  -> result should be 35
N12FET42W -> result should be 42
MAKTFDAAD -> result should be null or 0
N3XUK407Q -> result should be 07
MAKT23A   -> result should be 23

So far I have tried this code:

  getIntPattern("N1WET99W");
  getIntPattern("S03KT45A");
  getIntPattern("M1AKT23A");
  getIntPattern("N1OET35W");
  getIntPattern("N1OET42W");
  getIntPattern("MAKTFDAAD");
  getIntPattern("N12FET42W");

  private int getIntPattern(string text)
  {
    int result = 0;
    string m = Regex.Matches(text, @".*?\d+.*?(\d+)")
               .Cast<Match>()
               .Select(match => match.Groups[1].Value).First();
    int.TryParse(m, out result);
    return result;
  }

Is there a better way to achieve this? The input strings vary in length and can contain more digits at the beginning. I only need the last two digits.

6 solutions

#1


2  

You can try LINQ: test each 2-character substring, starting from the end of the string, and take the first one that consists only of digits:

  string source = "N43OET28W";

  string result = Enumerable
    .Range(2, source.Length - 1)
    .Select(index => source.Substring(source.Length - index, 2))
    .Where(item => item.All(c => char.IsDigit(c)))
    .FirstOrDefault();
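
To make the windowing concrete, here is a minimal sketch that lists the 2-character substrings in the order the query above visits them. The class and method names (LinqWindowDemo, Windows) are made up for illustration, and the Math.Max guard for an empty string is an addition that is not part of the original query.

```csharp
using System;
using System.Linq;

public static class LinqWindowDemo
{
    // Hypothetical helper: the 2-character windows of source, rightmost first.
    // Math.Max keeps Enumerable.Range from throwing on an empty string.
    public static string[] Windows(string source) =>
        Enumerable.Range(2, Math.Max(0, source.Length - 1))
                  .Select(index => source.Substring(source.Length - index, 2))
                  .ToArray();

    public static void Main()
    {
        string[] windows = Windows("N43OET28W");

        // Prints: 8W 28 T2 ET OE 3O 43 N4
        Console.WriteLine(string.Join(" ", windows));

        // The first all-digit window is the answer. Prints: 28
        Console.WriteLine(windows.FirstOrDefault(w => w.All(char.IsDigit)));
    }
}
```

Every window is a fresh string allocation, which is part of why the loop version is suggested when speed matters.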

If you need speed (say, you have many items to analyze), I suggest a plain for loop:

  int result = -1;
  int last = -1;

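  for (int i = source.Length - 1; i >= 0; --i) {
    int current = source[i] - '0';

    if (current >= 0 && current <= 9) {
      if (last >= 0 && last <= 9) {
        // current and last are adjacent digits: combine them and stop
        result = current * 10 + last;

        break;
      }
      else {
        last = current;
      }
    }
    else {
      last = -1; // not a digit: forget the remembered digit
    }
  }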

#2


2  

I'd use this method:

public static int? GetLastDigits(string text, int maxDigits = int.MaxValue)
{
    var digits = new Stack<char>();  // Last-in-First-out because we iterate backwards
    for (int i = text.Length - 1; i >= 0; i--)
    {
        if (char.IsDigit(text[i]))
            digits.Push(text[i]);
        else if (digits.Count > 0)
            break;
        if (digits.Count == maxDigits)
            break;
    }

    if (digits.Count == 0)
        return null;
    return int.Parse(string.Concat(digits));
}
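
A short usage sketch may help show what this method returns for the question's inputs. The method body is repeated from above so the snippet compiles on its own; the wrapper class LastDigitsDemo and the extra input "A12B3" are assumptions added for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class LastDigitsDemo
{
    // Same logic as the method above, repeated so the snippet is self-contained.
    public static int? GetLastDigits(string text, int maxDigits = int.MaxValue)
    {
        var digits = new Stack<char>();
        for (int i = text.Length - 1; i >= 0; i--)
        {
            if (char.IsDigit(text[i]))
                digits.Push(text[i]);
            else if (digits.Count > 0)
                break;                      // the last run of digits has ended
            if (digits.Count == maxDigits)
                break;                      // collected as many digits as requested
        }
        return digits.Count == 0 ? (int?)null : int.Parse(string.Concat(digits));
    }

    public static void Main()
    {
        Console.WriteLine(GetLastDigits("N43OET28W", 2));          // 28
        Console.WriteLine(GetLastDigits("N3XUK407Q", 2));          // 7 ("07" loses its leading zero as an int)
        Console.WriteLine(GetLastDigits("MAKTFDAAD", 2) == null);  // True
        Console.WriteLine(GetLastDigits("A12B3", 2));              // 3 (a lone trailing digit wins over "12")
    }
}
```

The last call shows the design choice: the method returns the final run of digits, even when that run is shorter than two digits.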

#3


2  

Your combination of LINQ and regex can be reduced to just regex by some careful design of the regular expression:

private static int? GetIntPattern(string text) {
    var m = Regex.Match(text, @"(\d{2})\D*$");
    int res;
    // Regex.Match never returns null; a failed match is reported through Success.
    return m.Success && int.TryParse(m.Groups[1].Value, out res) ? (int?)res : null;
}

The idea is to "anchor" the two digits to the end of the string with \D*$, making it the last two available digits.
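
As a rough illustration of the anchoring, the sketch below runs the same pattern over a few inputs and keeps the match as text, so a leading zero survives. The names AnchoredRegexDemo and LastTwoDigits, and the input "A12B3", are assumptions for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchoredRegexDemo
{
    // Returns the two digits that are followed only by non-digits up to the
    // end of the string, or null if there is no such pair.
    public static string LastTwoDigits(string text)
    {
        Match m = Regex.Match(text, @"(\d{2})\D*$");
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        foreach (var s in new[] { "N43OET28W", "N3XUK407Q", "MAKTFDAAD", "A12B3" })
            Console.WriteLine($"{s} -> {LastTwoDigits(s) ?? "(none)"}");
        // N43OET28W -> 28
        // N3XUK407Q -> 07
        // MAKTFDAAD -> (none)
        // A12B3 -> (none)   the trailing "3" blocks \D*$ after "12"
    }
}
```

The last input shows the trade-off of the anchor: a single stray digit after the pair prevents a match, which does not occur in the question's sample data.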

#4


0  

Here are my two cents. This can be done with a regex, plus a bit of Parallel LINQ if performance is a key concern.

Assuming data holds the list of strings to process, compile the pattern once. Note that it accepts at most one non-digit character after the two digits, which is enough for the sample inputs:

var rx = new Regex(
    "(\\d{2})(?:(?=\\D$)|$)",
    RegexOptions.Compiled | RegexOptions.Singleline);

1) Sequentially:

foreach (var item in data)
{
    var match = rx.Match(item);
    var value = match?.Groups[1].Value;

    Console.WriteLine(string.IsNullOrEmpty(value)?"0":value);
}

2) In parallel, with PLINQ:

var result = data.AsParallel().Select(
    x =>
    {
        var match = rx.Match(x);
        var value = match?.Groups[1].Value;
        return string.IsNullOrEmpty(value) ? "0" : value;
    }).ToArray();

foreach (var s in result)
{
    Console.WriteLine(s);
}

This prints:

28
86
55
99
39
45
23
35
42
0
07
23

See full demo at ideone

#5


0  

This isn't terribly efficient, but given a string extension method Reverse:

public static string Reverse(this string s) => new String(s.ToCharArray().Reverse().ToArray());

you can pull out the last two digits as the first two digits of the reversed string with a regular expression:

private int getIntPattern(string text) => Convert.ToInt32("0"+Regex.Match(text.Reverse(), @"\d{2}").Value.Reverse());

I think the most efficient approach is to just scan backwards and find what you are looking for:

private int getIntPattern(string text) {
    for (var pos = text.Length - 1; pos > 0; --pos)
        if (Char.IsDigit(text[pos]) && Char.IsDigit(text[pos - 1]))
            return Convert.ToInt32(text.Substring(pos-1, 2));
    return 0;
}

#6


0  

You can put an end anchor, $, inside a lookahead so that the match is tied to the end of the string:

string pattern = @"[0-9]{2}(?=[A-Za-z]*$)";
string input = "N43OET28W";
var matches = Regex.Matches(input, pattern);

Explanation: match 2 digits that are followed only by letters up to the end of the string.

q(?=u): Positive lookahead, match q if it is followed by u

$: end of string anchor

[0-9]{2}: exactly 2 digits
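
Putting the three pieces together, here is a minimal sketch of how the lookahead pattern could be used to pull out a single value. The names LookaheadDemo and TrailingTwoDigits, and the input "S02CT55-A", are assumptions added for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // Returns the two digits that are followed only by ASCII letters up to the
    // end of the string, or an empty string if there is no such pair.
    public static string TrailingTwoDigits(string input)
    {
        Match m = Regex.Match(input, @"[0-9]{2}(?=[A-Za-z]*$)");
        return m.Success ? m.Value : "";
    }

    public static void Main()
    {
        Console.WriteLine(TrailingTwoDigits("N43OET28W"));        // 28
        Console.WriteLine(TrailingTwoDigits("N3XUK407Q"));        // 07
        Console.WriteLine(TrailingTwoDigits("MAKTFDAAD") == "");  // True
        Console.WriteLine(TrailingTwoDigits("S02CT55-A") == "");  // True ('-' is not in [A-Za-z])
    }
}
```

Because the lookahead does not consume characters, m.Value contains only the two digits, so no capture group is needed.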

