Text specification for a file tree?

Time: 2022-09-02 00:21:58

I'm looking for examples of specifying files in a tree structure, for example, for specifying the set of files to search in a grep tool. I'd like to be able to include and exclude files and directories by name matches. I'm sure there are examples out there, but I'm having a hard time finding them.

Here's an example of a possible syntax:

*.py *.html
*.txt *.js
-*.pyc
-.svn/
-*combo_*.js

(This would mean: include files with the extensions .py, .html, .txt, and .js; exclude .pyc files, anything under a .svn directory, and any file matching *combo_*.js.)
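
One possible reading of these rules, as a minimal sketch in POSIX shell: a file is selected when its base name matches at least one include pattern and no exclude rule applies. The function name spec_selects, the SPEC variable, and the exact semantics (an exclude always wins, and a trailing "/" means "any directory with this name") are assumptions made for illustration, not the behaviour of any existing tool.

```shell
# Hypothetical rule list, taken from the example syntax above.
SPEC='*.py *.html *.txt *.js -*.pyc -.svn/ -*combo_*.js'

# spec_selects PATH: succeed if PATH is selected by the rules in $SPEC.
# Assumed semantics: any matching exclude rule wins over every include rule.
spec_selects() {
    path=$1
    name=${path##*/}          # base name, e.g. "app.py"
    result=1                  # not selected until an include rule matches
    set -f                    # keep the shell from expanding the patterns
    for rule in $SPEC; do
        case $rule in
            -*/)              # "-dir/": exclude everything under that directory
                dir=${rule#-}
                dir=${dir%/}
                case "/$path" in
                    */"$dir"/*) set +f; return 1 ;;
                esac ;;
            -*)               # "-pattern": exclude matching base names
                pattern=${rule#-}
                case $name in
                    $pattern) set +f; return 1 ;;
                esac ;;
            *)                # plain pattern: include matching base names
                case $name in
                    $rule) result=0 ;;
                esac ;;
        esac
    done
    set +f
    return $result
}

for f in src/app.py src/app.pyc .svn/entries.txt js/combo_all.js js/site.js; do
    if spec_selects "$f"; then echo "keep $f"; else echo "skip $f"; fi
done
```

The sketch does not answer the escaping question (how to include a file whose name starts with a dash); a real format would need a rule for that.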

I know I've seen these sorts of specifications in other tools before. Is this ringing any bells for anyone?

7 answers

#1


There is no single standard format for this kind of thing, but if you want to copy something that is widely recognized, have a look at the rsync documentation. Look at the chapter on "INCLUDE/EXCLUDE PATTERN RULES."
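
For comparison, here is a rough sketch of how the rules from the question might look as rsync filter rules. rsync applies the first rule that matches, so the excludes come first and a final "- *" drops everything else; "+ */" keeps directories so that rsync still descends into them. The file names and scratch directories are made up for the demonstration, and the rsync call is skipped if rsync is not installed.

```shell
# Scratch directories and sample files (hypothetical names).
work=$(mktemp -d)
mkdir -p "$work/src/.svn" "$work/src/static" "$work/dst"
touch "$work/src/app.py" "$work/src/app.pyc" "$work/src/notes.md" \
      "$work/src/.svn/entries" \
      "$work/src/static/site.js" "$work/src/static/combo_all.js"

# rsync filter rules: the first matching rule decides.
cat > "$work/rules.txt" <<'EOF'
- .svn/
- *.pyc
- *combo_*.js
+ */
+ *.py
+ *.html
+ *.txt
+ *.js
- *
EOF

if command -v rsync >/dev/null 2>&1; then
    # --prune-empty-dirs drops directories that end up with no selected files.
    rsync -a --prune-empty-dirs --filter="merge $work/rules.txt" \
          "$work/src/" "$work/dst/"
    # Show what was copied.
    (cd "$work/dst" && find . -type f | sort)
fi
```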

#2


Apache Ant provides "Ant globs", or patterns, where:

**/foo/**/*.java

means "any file ending in '.java' in a directory which includes a directory named 'foo' in its path" -- including ./foo/X.java
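
Outside of Ant, a similar selection can be approximated with find. This is only a sketch on a made-up directory layout: -path '*/foo/*' requires a directory named foo somewhere above the file, and -name '*.java' restricts the base name.

```shell
# Made-up layout for illustration.
tree=$(mktemp -d)
mkdir -p "$tree/foo" "$tree/a/foo/b" "$tree/bar"
touch "$tree/foo/X.java" "$tree/a/foo/b/Y.java" "$tree/bar/Z.java" "$tree/foo/notes.txt"

# Roughly equivalent to the Ant pattern **/foo/**/*.java
java_under_foo=$(cd "$tree" && find . -type f -path '*/foo/*' -name '*.java' | sort)
printf '%s\n' "$java_under_foo"
```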

#3


In your example syntax, is it implicitly understood that there's an escaping character so that you can explicitly include a file that begins with a dash? (The same question goes for any other wildcard characters, but I suppose I'd expect to see more files with dashes in their names than asterisks.)

Various command shells use * (and possibly ? to match a single char), as in your example, but they generally only match against a string of characters that doesn't include a path component separator (i.e. '\' on Windows systems, '/' elsewhere). I've also seen such source control apps as Perforce use additional patterns that can match against path component separators. For instance, with Perforce the pattern "foo/...ext" (without quotes) will match all files under the foo/ directory structure that end with "ext", regardless of whether they are in foo/ itself or in one of its descendant directories. This seems to be a useful pattern.
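
The difference is easy to see in the shell itself. When a pattern is used for file name expansion, * stops at "/", but in a case statement the same * matches across slashes, which gives roughly the behaviour of the Perforce "..." wildcard. The sketch below uses a hypothetical helper named under_foo_ending_ext.

```shell
# Hypothetical helper: succeed if the path lies under foo/ and ends in "ext",
# similar in spirit to the Perforce pattern foo/...ext
under_foo_ending_ext() {
    case $1 in
        foo/*ext) return 0 ;;   # in case patterns, * also matches "/"
        *)        return 1 ;;
    esac
}

for p in foo/a.ext foo/sub/dir/b.ext bar/c.ext foo/d.txt; do
    if under_foo_ending_ext "$p"; then echo "match    $p"; else echo "no match $p"; fi
done
```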

#4


If you're using bash, you can use the extglob extension to get some nice globbing functions. Enable it as follows:

shopt -s extglob

Then you can do things like the following:

# everything but .html, .jpg or .gif files
ls -d !(*.html|*.gif|*.jpg)
# list file9, file22 but not fileit
ls file+([0-9])
# begins with apl or un only
ls -d +(apl*|un*)

See also this page.

#5


How about find in unixish environments?

Find can, of course, do more than build a list of files, but that is one of the common ways it is used. From the man page:

NAME
     find -- walk a file hierarchy

SYNOPSIS
     find [-H | -L | -P] [-EXdsx] [-f pathname] pathname ... expression
     find [-H | -L | -P] [-EXdsx] -f pathname [pathname ...] expression

DESCRIPTION
     The find utility recursively descends the directory tree for each pathname listed, evaluating an expression (composed of the "primaries" and "operands" listed below) in terms of each file in the tree.

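To achieve your goal I would write something like this (formatted for readability):

# Prune .svn directories, keep the wanted extensions, drop combo files.
# (.pyc files never match '*.py', so they need no rule of their own.)
find . -type d -name .svn -prune -o \
       -type f \
       \( -name '*.py' -o -name '*.html' -o -name '*.txt' -o -name '*.js' \) \
       ! -name '*combo_*.js' \
       -print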

Moreover, there is an idiomatic pattern using xargs that makes find suitable for sending the whole list so constructed to an arbitrary command, as in:

find /path -type f -print0 | xargs -0 rm
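
For the grep use case from the question, the file list can also be handed to the command by find itself. The following sketch assumes a made-up directory and the search string TODO; -exec ... {} + batches the file names much like xargs does.

```shell
# Made-up files for illustration.
proj=$(mktemp -d)
mkdir -p "$proj/.svn" "$proj/lib"
printf 'TODO: fix\n' > "$proj/lib/a.py"
printf 'nothing here\n' > "$proj/lib/b.py"
printf 'TODO: stale copy\n' > "$proj/.svn/a.py"

# Skip .svn, then list the .py files that contain the string TODO.
hits=$(cd "$proj" && find . -type d -name .svn -prune -o \
           -type f -name '*.py' -exec grep -l TODO {} +)
printf '%s\n' "$hits"
```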

#6


find(1) is a fine tool, as described in the previous answer, but if things get more complicated you should consider either writing your own script in one of the usual suspects (Ruby, Perl, Python et al.) or using one of the more powerful shells such as zsh, which has a ** globbing operator and lets you specify things to exclude. The latter is probably more complicated, though.

#7


You might want to check out ack, which allows you to specify file types to search in with options like --perl, etc.

It also ignores .svn directories by default, as well as core dumps, editor cruft, binary files, and so on.
