C#: Using Regex to match 4 digits in a long string of text

Date: 2022-09-13 08:31:27

Below is a text file, not code:

Register: 0x0090 = 0x009D  //blah blah blah blah
Register: 0x0091 = 0x03F6 //blah blah blah blah
Register: 0x0092 = 0x0048  //blah blah blah blah
Register: 0x0093 = 0x00C8  //blah blah blah blah

I need to extract the register contents, excluding the "0x". I've been going crazy trying to solve this. I have come up with two solutions, and both are close to working, I guess. I've been using Regex as it's what I've learned so far; if you explain another method, please give a good explanation.

To get the line I want, I am using StreamReader. Assuming I want the third line, I would do it like this:

stringLine1 = stringLine1 + objReader.ReadLine() + "\r\n";
stringLine2 = stringLine2 + objReader.ReadLine() + "\r\n";
stringLine3 = stringLine3 + objReader.ReadLine() + "\r\n";

Using Regex, solution 1:

stringLine3 = Regex.Match(stringLine3, @"[^Register: 0x0092 = 0x][0-9A-Z]+").Value;

The problem with this method is that if the register contains 0028, it doesn't read the 2!

Solution 2:

stringLine3 = Regex.Match(stringLine3, @"(?<=x)\d{4}").Value;

So this is a positive lookbehind which grabs 4 digits preceded by an 'x'. The problem, of course, is that it grabs the register number instead of the contents.

Any suggestions on how to fix this or do it better?

2 solutions

#1


1  

Using a lookbehind:

(?<== 0x)[0-9A-F]{4}

Or using a group:

^Register: 0x[0-9A-F]{4} = 0x([0-9A-F]{4})

In the second case, you must retrieve the first group rather than the whole match.
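As a rough illustration, the two patterns above could be applied in C# as follows. This is only a sketch: the class and method names (RegisterValueDemo, WithLookbehind, WithGroup) are made up for the example, and the sample line is taken from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegisterValueDemo
{
    // Lookbehind variant: the whole match is the four hex digits that follow "= 0x".
    public static string WithLookbehind(string line) =>
        Regex.Match(line, @"(?<== 0x)[0-9A-F]{4}").Value;

    // Group variant: the register contents are captured in group 1.
    public static string WithGroup(string line)
    {
        Match m = Regex.Match(line, @"^Register: 0x[0-9A-F]{4} = 0x([0-9A-F]{4})");
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        string line = "Register: 0x0092 = 0x0048  //blah blah blah blah";
        Console.WriteLine(WithLookbehind(line)); // 0048
        Console.WriteLine(WithGroup(line));      // 0048
    }
}
```

The lookbehind keeps the call site short because the match itself is the value. The group variant also checks the shape of the rest of the line, which may be preferable if the file can contain other text.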

#2


1  

The first regex you have is wrong. It should look like this:

Register: 0x0092 = 0x([0-9A-Z]+)

Here the content of the register is captured in group 1.

  • ([0-9A-Z]+) matches one or more digits or uppercase letters and captures them in group 1.

What is wrong with regex 1?

  • [^Register: 0x0092 = 0x] Here, [] is a character class, not a literal string. The leading ^ negates the class, so it matches any single character other than R, e, g, and so on. Without the ^, it would match any single character listed in the class.
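To see the effect of the negated class, the original pattern can be run against the sample line from the question. This is a small sketch; the class name NegatedClassDemo and the method name OriginalPattern are invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NegatedClassDemo
{
    // The pattern from the question, unchanged: one character that is NOT in the
    // class, followed by one or more digits or uppercase letters.
    public static string OriginalPattern(string line) =>
        Regex.Match(line, @"[^Register: 0x0092 = 0x][0-9A-Z]+").Value;

    public static void Main()
    {
        string line = "Register: 0x0092 = 0x0048  //blah blah blah blah";
        // '0' is listed in the class, so the leading zeros can never start the match.
        // '4' is the first character outside the class that is followed by [0-9A-Z].
        Console.WriteLine(OriginalPattern(line)); // 48
    }
}
```

Because the characters 0, 2, 9 and x all appear inside the brackets, none of them can be the first character of a match, which is why digits go missing from the result.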

Example code could be written as:

String stringLine3 = "Register: 0x0092 = 0x0048  //blah blah blah blah";
Match match = Regex.Match(stringLine3, @"Register: 0x0092 = 0x([0-9A-Z]+)");
System.Console.WriteLine(match.Groups[1]);
// 0048
  • match.Groups[1] gets the string captured by group 1, whereas match.Groups[0] holds the entire match.
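If every register in the file is needed rather than one hard-coded line, the same capture-group idea could be generalized roughly as below. This is a sketch under assumptions: the names RegisterDump, LinePattern and Parse, the group names addr and value, and the choice to store values as ushort are all illustrative and not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class RegisterDump
{
    // Named groups capture the register address and its contents (four hex digits each).
    private static readonly Regex LinePattern = new Regex(
        @"^Register: 0x(?<addr>[0-9A-F]{4}) = 0x(?<value>[0-9A-F]{4})");

    // Maps each register address to its contents; lines that do not match are skipped.
    public static Dictionary<ushort, ushort> Parse(IEnumerable<string> lines)
    {
        var registers = new Dictionary<ushort, ushort>();
        foreach (string line in lines)
        {
            Match m = LinePattern.Match(line);
            if (!m.Success) continue;
            ushort addr = ushort.Parse(m.Groups["addr"].Value,
                NumberStyles.HexNumber, CultureInfo.InvariantCulture);
            ushort value = ushort.Parse(m.Groups["value"].Value,
                NumberStyles.HexNumber, CultureInfo.InvariantCulture);
            registers[addr] = value;
        }
        return registers;
    }

    public static void Main()
    {
        string[] lines =
        {
            "Register: 0x0090 = 0x009D  //blah blah blah blah",
            "Register: 0x0091 = 0x03F6 //blah blah blah blah",
            "Register: 0x0092 = 0x0048  //blah blah blah blah",
            "Register: 0x0093 = 0x00C8  //blah blah blah blah",
        };
        Dictionary<ushort, ushort> regs = Parse(lines);
        Console.WriteLine(regs[0x0092].ToString("X4")); // 0048
    }
}
```

Since Parse accepts any IEnumerable<string>, the result of File.ReadLines on the text file could be passed in directly instead of the in-memory array used here.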
