Bash / DOS / PowerShell script to list the latest versions of files?

Date: 2021-04-24 07:37:49

We have a list of (let's say 50) reports that get dumped into various folders depending on certain conditions. All the reports have standard names, e.g. D099C.LIS, D18A0.LIS, etc.

Sometimes a report can exist in up to 5 different locations, and I need to generate a list of all the locations of the most recent version of each report.

I can do it easily using code, or by redirecting "dir" or "ls" output into a text file and then manipulating it in Excel, but I'd prefer a simpler (hopefully one-liner) solution using DOS, bash, or PowerShell.

The best I've come up with so far in PowerShell (I've done something similar using bash) is:

ls -r -fi *.lis | sort @{expression={$_.Name}}, @{expression={$_.LastWriteTime};Descending=$true} | select Directory, Name, lastwritetime

That will recursively list all files with the *.lis extension, sort them by name (asc) and date (desc), and then display the directory, name, and date.

This gives this sort of output:

C:\reports\LESE            D057A.LIS                  28/01/2009 09:00:43
C:\reports\JCSW            D057A.LIS                  27/01/2009 10:50:21
C:\reports\ALID            D075A.LIS                  04/02/2009 12:34:12
C:\reports\JCSW            D075B.LIS                  05/02/2009 10:07:15
C:\reports\ALID            D075B.LIS                  30/01/2009 09:14:57
C:\reports\BMA3            D081A.LIS                  01/09/2008 14:51:36

What I obviously need to do now is remove the files that aren't the most recent versions, so that the output looks like this (not too worried about formatting yet):

C:\reports\LESE            D057A.LIS                  28/01/2009 09:00:43
C:\reports\JCSW            D075B.LIS                  05/02/2009 10:07:15
C:\reports\BMA3            D081A.LIS                  01/09/2008 14:51:36

Anyone have any ideas?

[edit] Some good ideas and answers to this question. Unfortunately I can't mark all as accepted, but EBGreen's (edited) answer worked without modification. I'll add working solutions here as I verify them.

bash:

 ls -lR --time-style=long-iso | awk 'BEGIN{OFS="\t"}{print $5,$6,$7,$8}' | grep ".LIS" | sort -k4 -k2r -k3r | uniq -f3
 ls -lR --time-style=long-iso | awk 'BEGIN{OFS="\t"}{print $5,$6,$7,$8}' | grep ".LIS" | sort -k4 -k2r -k3r | awk '!x[$4]++'

PowerShell:

  ls -r -fi *.lis | sort @{expression={$_.Name}}, @{expression={$_.LastWriteTime};Descending=$true} | select Directory, Name, lastwritetime | Group-Object Name | %{$_.Group | Select -first 1}
  ls -r . *.lis | sort -desc LastWriteTime | group Name | %{$_.Group[0]} | ft Directory,Name,LastWriteTime
  ls -r -fi *.lis | sort @{expression={$_.Name}}, @{expression={$_.LastWriteTime};Descending=$true} | unique | ft Directory,Name,LastWriteTime

8 solutions

#1


ls -r -fi *.lis | sort @{expression={$_.Name}}, @{expression={$_.LastWriteTime};Descending=$true} | select Directory, Name, lastwritetime | Group-Object Name | %{$_.Group | Select -first 1}

#2


In bash you could pipe your answers through uniq. I'm not sure of the exact structure of your bash one-liner's results, but the right arguments to -w N and -s N ought to do it.
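
As a toy illustration of that idea (hypothetical data, GNU uniq): if the report name is a fixed-width prefix and the lines are pre-sorted by name ascending and then timestamp descending, -w limits the comparison to the name, so only the newest line per report survives:

  # hypothetical pre-sorted listing: name, date, time, directory
  printf '%s\n' \
    'D057A 2009-01-28 09:00 C:/reports/LESE' \
    'D057A 2009-01-27 10:50 C:/reports/JCSW' \
    'D075B 2009-02-05 10:07 C:/reports/JCSW' \
    'D075B 2009-01-30 09:14 C:/reports/ALID' |
    uniq -w 5     # compare only the first 5 chars (the report name)
  # D057A 2009-01-28 09:00 C:/reports/LESE
  # D075B 2009-02-05 10:07 C:/reports/JCSW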

#3


Another alternative in PowerShell, more "script"-like:

ls -r . *.lis | sort LastWriteTime | %{$f=@{}} {$f[$_.Name]=$_} {$f.Values} | ft Directory,Name,LastWriteTime
  1. get the files recursively
  2. sort them ascending by last write time
  3. initialize a hashmap (associative array)
  4. for each file, assign it using the name as the key; later entries overwrite previous ones
  5. get the Values of the hashmap (excluding keys)
  6. format as a table

Note that the FileInfo objects are retained throughout the pipeline. You can still access any property/method of the objects or format them any way you like.

#4


The problem boils down to finding unique entries based on a particular field, and awk can be used to solve it. I saw a blog entry with one approach. For example, in bash one could do:

find . -name "*.lis" -print | xargs ls -tr | awk -F/ '!x[$NF]++'

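To see how the awk part deduplicates (toy paths, not real output): -F/ splits on "/", $NF is the last field (the file name), and x[$NF]++ is zero only the first time a name is seen, so only the first line per file name is printed. One caveat: ls -tr lists oldest first, so as written the oldest copy of each report survives; plain ls -t (newest first) would keep the most recent one.

  # first occurrence of each file name wins
  printf '%s\n' './LESE/D057A.LIS' './JCSW/D057A.LIS' './ALID/D075A.LIS' |
    awk -F/ '!x[$NF]++'
  # ./LESE/D057A.LIS
  # ./ALID/D075A.LIS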

#5


PowerShell:

ls -r . *.lis | sort -desc LastWriteTime | sort -u Name | ft Directory,Name,LastWriteTime

Explanation:

  1. get the files recursively
  2. sort the files descending by LastWriteTime
  3. sort the files by Name, selecting unique files (only the first)
  4. format the resulting FileInfo objects in a table with Directory, Name and Time

Alternative which does not rely on sort being stable:

ls -r . *.lis | sort -desc LastWriteTime | group Name | %{$_.Group[0]} | ft Directory,Name,LastWriteTime
  1. get the files recursively
  2. sort the files descending by LastWriteTime
  3. group the files by name
  4. for each group, select the first (index zero) item of the group
  5. format the resulting FileInfo objects in a table with Directory, Name and Time

#6


Can you use perl? Something like:

your command | perl -e 'while (<STDIN>) { ($dir,$name,$date) = split; $hash{$name} = [$dir,$date]; } foreach (keys %hash) { print "$hash{$_}[0] $_ $hash{$_}[1]\n"; }'

This could be wrong in the details (it's been too long since I used perl in anger), but the basic idea is to keep a hash of results keyed on filename, always overwriting the previous entry when a new one is encountered. That way, as long as the lines come in the right order, only the most recently touched files come out.
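
The same overwrite-a-hash idea as a hedged awk sketch (assuming each input line is "directory name date time", as in the sample output above, and arrives oldest first; "your command" is whatever produces that listing):

  your command |
    awk '{ latest[$2] = $0 } END { for (n in latest) print latest[n] }'

The last line seen per name wins, so with oldest-first input the newest version of each report remains; like the perl keys loop, the END loop prints in arbitrary hash order.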

#7


ls -ARFlrt | awk '{print $6,$7,$8}' | grep 2010 | sort -n

I was looking for something similar, and the above helped me get the listing I was after in bash. The grep is optional, of course. Thanks!

#8


$f = ls -r -fi *.lis | sort name,lastWriteTime -desc

# Remove -whatIf to delete the files

$f[1..$f.length] | Remove-Item -whatIf
