How to construct this in a regular expression

Time: 2022-03-21 13:00:34

I have file names that end in yyyymmdd, e.g. myFile.20090601, myFile20090708, etc.

I want to grep for a pattern in all files from June 08 to July 07 of 2009, i.e. 20090609 to 20090707.

How can I do a regex in one go?

I tried:

grep 'myPattern' *20090(6(09|[1-3][0-9])|70[1-7])

4 solutions

#1


20090(6(09|[1-3][0-9])|70[1-7])$

or

20090(6(0[89]|[1-3][0-9])|70[1-7])$

depending on whether you meant the 8th or the 9th of June (your question is contradictory on that point).
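
One way to convince yourself that a pattern like this covers exactly the intended days is to run it against every calendar date of the year. This is only a sketch in Python; the helper name matching_dates and the "myFile." prefix are made up for illustration, and it checks the variant that starts on June 9th.

```python
import re
from datetime import date, timedelta

# Pattern from the answer above (the variant that starts on June 9th).
DATE_RE = re.compile(r"20090(6(09|[1-3][0-9])|70[1-7])$")

def matching_dates(pattern, year=2009):
    """Return every real calendar date in `year` whose yyyymmdd form matches."""
    day = date(year, 1, 1)
    hits = []
    while day.year == year:
        # "myFile." is a hypothetical prefix standing in for a real file name.
        if pattern.search("myFile." + day.strftime("%Y%m%d")):
            hits.append(day)
        day += timedelta(days=1)
    return hits

hits = matching_dates(DATE_RE)
print(hits[0], hits[-1], len(hits))  # prints: 2009-06-09 2009-07-07 29
```

Note that [1-3][0-9] would also accept 20090631 through 20090639 as text; that only matters if such file names can exist.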

#2


grep 'myPattern' `ls | grep -E "20090(6(09|[1-3][0-9])|70[1-7])"`

This works roughly as follows. Take the list of files in the current directory (ls), filter it with the date regex (ls | grep ...), then run grep with your pattern on the resulting list of files (grep 'myPattern' ...). The back-ticks around ls | grep ... execute that part of the command and substitute its output into the surrounding command. So if it produced output like "file1 file2 file3", the result would be a command like grep 'myPattern' file1 file2 file3.
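
If you would rather not rely on the shell for this, the same two steps (filter the file names, then search the selected files) can be sketched in Python. The function names select_files, grep_files and grep_current_dir are invented for this example, and the trailing "$" on the regex is an assumption added here so that only names ending in the date are kept.

```python
import os
import re

# Date regex from the command above; the trailing "$" is an addition here
# (an assumption) so that only names *ending* in the date are kept.
DATE_RE = re.compile(r"20090(6(09|[1-3][0-9])|70[1-7])$")

def select_files(names, date_re=DATE_RE):
    """Keep only the names that match the date regex (the `ls | grep -E` step)."""
    return [name for name in names if date_re.search(name)]

def grep_files(pattern, paths):
    """Yield (path, line) for every line containing a match (the outer grep)."""
    wanted = re.compile(pattern)
    for path in paths:
        with open(path, errors="replace") as handle:
            for line in handle:
                if wanted.search(line):
                    yield path, line.rstrip("\n")

def grep_current_dir(pattern):
    """Print matches from the date-selected regular files in the current directory."""
    names = sorted(n for n in os.listdir(".") if os.path.isfile(n))
    for path, line in grep_files(pattern, select_files(names)):
        print(f"{path}:{line}")
```

Calling grep_current_dir("myPattern") would then play the role of the one-liner. One difference: back-tick substitution is split on whitespace by the shell, so file names containing spaces break the shell version, while the list-based version passes each name through unchanged.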

#3


I'd suggest a perl/python script (or any other scripting language) that takes 3 parameters:

  1. The pattern
  2. Start date as yyyymmdd
  3. End date as yyyymmdd

It would:

  1. decode the start and end dates
  2. loop through the files in a folder
  3. decode any date in each filename
  4. check whether it lies between the two dates and, if so, grep the file for the pattern
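
A minimal Python sketch of such a script might look like the following. It assumes the date is always the last 8 characters of the file name (as in the question) and that both ends of the range are inclusive; all function names are invented for the example.

```python
import re
from datetime import datetime
from pathlib import Path

def parse_date(text):
    """Decode a yyyymmdd string; raises ValueError for impossible dates."""
    return datetime.strptime(text, "%Y%m%d").date()

def file_date(name):
    """Return the date encoded in the last 8 characters of `name`, or None."""
    match = re.search(r"(\d{8})$", name)
    if match is None:
        return None
    try:
        return parse_date(match.group(1))
    except ValueError:
        return None  # eight digits that are not a real date, e.g. 20090631

def files_in_range(names, start, end):
    """Keep the names whose trailing date lies in [start, end], inclusive."""
    kept = []
    for name in names:
        day = file_date(name)
        if day is not None and start <= day <= end:
            kept.append(name)
    return kept

def main(pattern, start_text, end_text, folder="."):
    """Print every line matching `pattern` in the date-selected files."""
    wanted = re.compile(pattern)
    start, end = parse_date(start_text), parse_date(end_text)
    names = sorted(p.name for p in Path(folder).iterdir() if p.is_file())
    for name in files_in_range(names, start, end):
        with open(Path(folder) / name, errors="replace") as handle:
            for line in handle:
                if wanted.search(line):
                    print(f"{name}:{line}", end="")
```

For the question's case this would be called as main("myPattern", "20090609", "20090707"). Comparing real dates instead of crafting a regex means a different range needs no new pattern, at the cost of a slightly longer script.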

#4


The range of valid days is 09-30 for June and 01-07 for July. Because the day ranges differ, we should use a separate regex for each month. These are

/2009 06 (09 | [12][0-9] | 30)/x

(Notice how the day range is split into cases by the tens digit, because which units digits are valid depends on the tens digit.)

And

/2009 07 0[1-7]/x

and then we can join them into

/(2009 06 (09 | [12][0-9] | 30)) | (2009 07 0[1-7])/x

and then factor out the common prefix (which may not be the best for readability) and add the end-of-line assertion:

/2009 0 (6 (09 | [12][0-9] | 30) | 7 0[1-7]) $/x
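
The /x style carries over to Python's re.VERBOSE flag, which likewise ignores whitespace and allows comments inside the pattern. The sketch below is an illustration rather than the answerer's own code, and the helper name in_range is made up. Since alternation binds more loosely than concatenation, it keeps both month alternatives inside a single pair of parentheses so that the shared prefix and the trailing $ apply to June and July alike.

```python
import re

# Both month alternatives sit inside one group, so the "2009 0" prefix
# and the trailing "$" apply to June and July alike.
DATE_RE = re.compile(
    r"""
    2009 0
    (
        6 (09 | [12][0-9] | 30)   # June 09 through June 30
      | 7 0[1-7]                  # July 01 through July 07
    )
    $
    """,
    re.VERBOSE,
)

def in_range(name):
    """True if `name` ends in a date from 20090609 to 20090707."""
    return DATE_RE.search(name) is not None

for name in ["myFile.20090608", "myFile.20090609", "myFile20090707", "myFile.20090708"]:
    print(name, in_range(name))
```

With the sample names above, only myFile.20090609 and myFile20090707 are reported as in range.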
