
Time: 2022-09-13 16:45:02

I've been trying to figure this out, but I don't think I understand Regex well enough to get to where I need to.


I have strings that resemble these:


filename.txt(1)attribute, 2)attribute(s), more!)
otherfile.txt(abc, def)

Basically, each string always starts with a filename, followed by some text in parentheses. I'm trying to extract the part between the outermost parentheses, but that text can contain absolutely anything, including more parentheses (and it often does).


Originally, there was a 'hacky' expression made like this:



And it worked, until we ran into a case where the input string contained a @ and we were stuck. Obviously...


I can't change the way the strings are generated: it's always a filename, then parentheses containing something of unknown length and content.


I'm hoping for a simple regex, since I need this to work in both C# and Perl. Is such a thing possible, or does this require something more complex, like its own parsing method?


2 Solutions



You can replace the part of your regex that excludes the @ symbol with one that matches any character, followed by a quantifier that matches zero or more times. You can also simplify the regex by removing the capturing group:

\(.*\)

Here is the explanation for the regular expression:


  • \( matches the character ( literally.
  • . matches any character (except for line terminators).
  • The * quantifier matches between zero and unlimited times, as many times as possible, giving back as needed (greedy).
  • \) matches the character ) literally.
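
As a rough illustration of the greedy behaviour described above, here is a minimal C# sketch. The class name ParenExtractor and the method Extract are hypothetical, and the capture group around .* is an addition for this example (the answer itself drops the group), used only so the inner text can be read without trimming the parentheses.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ParenExtractor
{
    // Greedy pattern: from the first "(" to the last ")" on the line.
    // The capture group is an assumption added for illustration.
    private static readonly Regex Outer = new Regex(@"\((.*)\)");

    // Returns the text between the outermost parentheses, or null if none is found.
    public static string Extract(string input)
    {
        Match m = Outer.Match(input);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        // Sample strings taken from the question.
        Console.WriteLine(Extract("filename.txt(1)attribute, 2)attribute(s), more!)"));
        Console.WriteLine(Extract("otherfile.txt(abc, def)"));
    }
}
```

Because . does not match line terminators by default, input that spans several lines would likely need RegexOptions.Singleline in C# (or the /s modifier in Perl). The pattern itself uses only constructs that both engines share.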

You can use regex101 to compose and debug your regular expressions.




Regex seems like overkill to me in this case. This can be achieved more reliably using string manipulation methods.


int first = str.IndexOf('(');
int last = str.LastIndexOf(')');
string subString = null;
// Require both parentheses, with the closing one after the opening one.
if (first != -1 && last > first)
    subString = str.Substring(first + 1, last - first - 1);

I've never used Perl, but I'll venture a guess that it has equivalent methods.




