Passing Bash array elements into an Awk regex expression

Time: 2022-10-09 15:41:54

I've found several questions on how to pass variables from bash into awk, most notably with the -v option, but I can't quite seem to get them to do what I want.

Outside the script, the command I'm running is

awk '$2 ~ /^\/var$/ { print $1 }' /etc/fstab

Which searches /etc/fstab for JUST the /var partition, and should either print out the physical mount point, or if there isn't one, nothing at all.

Now inside the script I have an array that contains numerous partitions, and what I want to do is iterate through that array to search fstab for each physical mount point. The problem comes in at the fact that the elements in the array have a / in them.

So what I want to do (In horrifically incorrect awk) is:

PARTITIONS=(/usr /home /var /tmp);
for ((n=0; n<${#PARTITIONS[@]}; n++)); do
    cat /etc/fstab | awk '$2 ~ /^\${PARTITIONS[$n]}$/ { print $1 }';
done

But I know that that's not correct. The closest I have right now is:

PARTITIONS=(/usr /home /var /tmp);
for ((n=0; n<${#PARTITIONS[@]}; n++)); do
    cat /etc/fstab | awk -v partition="${PARTITIONS[$n]}" '$2 ~ /^\/var$/ { print $1," ",partition }';
done

Which at LEAST gets the partition variable into awk, but doesn't help me at all with matching it.

So basically, I need to feed the array in, and get the physical partitions back out. Eventually the results will be assigned to another array, but once I get the output I can go from there.

I also understand awk can remove the need for the cat at the beginning, but I don't know enough about awk to do that yet. :)

Thanks for any help.

EDIT

cat /etc/fstab | awk -v partition="${PARTITIONS[$n]}" '$2 ~ partition { print $1 }'

Approximates what I needed enough to be useful. I was focusing far too much on including the regex apparently. If anyone else could clean this up, it would be much appreciated :)

6 answers

#1


2  

awk -v partition="${partitions[$n]}" '$2 ~ "^/" partition "$" { print $1 }' /etc/fstab

You can build the regex by concatenation: place the anchors (^ for the beginning of the string and $ for the end), the slash that is part of the partition name, and the variable holding the partition name next to each other. You don't need the slashes that delimit hard-coded regexes. Note that if the array elements already start with a slash, as they do in the question, the literal "/" should be dropped, giving "^" partition "$".

AWK will accept the filename as an argument without using cat to pipe it or using < to redirect it.
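
To make the concatenation idea concrete, here is a minimal sketch of the whole loop, including collecting the results into a second array as the question eventually wants. It assumes the array elements carry their leading slash (so the pattern is "^" p "$"), and it runs against a small made-up sample file rather than the real /etc/fstab; the device names and the variable names fstab, p, and devices are illustrative only.

```shell
#!/usr/bin/env bash
# Hypothetical sample data standing in for /etc/fstab.
fstab=$(mktemp)
cat > "$fstab" <<'EOF'
/dev/sda1 /         ext4 defaults 0 1
/dev/sda2 /var      ext4 defaults 0 2
/dev/sda3 /var/log  ext4 defaults 0 2
/dev/sda4 /home     ext4 defaults 0 2
EOF

partitions=(/var /home /tmp)
devices=()
for p in "${partitions[@]}"; do
    # "^" p "$" is plain string concatenation; awk uses the result as a regex.
    dev=$(awk -v p="$p" '$2 ~ "^" p "$" { print $1 }' "$fstab")
    # Keep only the partitions that actually have an entry.
    [[ -n $dev ]] && devices+=("$dev")
done
rm -f "$fstab"

printf '%s\n' "${devices[@]}"
```

With this sample the anchors keep /var from also matching /var/log, and /tmp contributes nothing because it has no entry. A name containing regex metacharacters (a dot, for instance) would still be interpreted as a regex, so a plain $2 == p comparison may be the safer choice when an exact match is all that is needed.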

I recommend using mixed or lowercase variable names in the shell as a habit to avoid potential name collisions with shell or environment variables.

#2


2  

You could also pass the whole array into awk through -v. This assumes that the directory names do not contain spaces:

PARTITIONS=(/usr /home /var /tmp)
awk -v partition="${PARTITIONS[*]}" \
  '$2 != "" && partition ~ $2"\\>" { print $1 }' /etc/fstab

This avoids the need for a for loop.

Explanation

  • partition="${PARTITIONS[*]}" passes in the whole array as a single space-separated string.
  • $2 != "" means no empty lines are matched.
  • partition ~ $2"\\>" matches $2 against the passed-in string; \\> requires the match to end at the end of a word.
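
As far as I know, \> is a GNU awk regex operator rather than a POSIX one, so the one-liner above may not behave the same in other awks. A possible portable variant, sketched here under the same no-spaces assumption, pads the flattened list and the field with spaces and uses index() so that only whole names match. The sample file and the names fstab, list, and result are made up for the illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sample data standing in for /etc/fstab.
fstab=$(mktemp)
cat > "$fstab" <<'EOF'
/dev/sda1 /         ext4 defaults 0 1
/dev/sda2 /var      ext4 defaults 0 2
/dev/sda3 /var/log  ext4 defaults 0 2
/dev/sda4 /home     ext4 defaults 0 2
EOF

partitions=(/usr /home /var /tmp)
result=$(awk -v list="${partitions[*]}" '
    # Pad both sides with spaces so that only complete names can match.
    $2 != "" && index(" " list " ", " " $2 " ") { print $1 }
' "$fstab")
rm -f "$fstab"

printf '%s\n' "$result"
```

Because index() does a plain substring search, nothing in the partition names is treated as a regex, and /var/log does not slip through on the strength of /var.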

#3


1  

First, to get the most annoying thing out of the way (GUoC): awk can read a file directly, just like cat, so pass the file name as an argument. You can't pass whole arrays via -v unflattened, but since you're iterating over the items, it doesn't matter. If you want to avoid -v, you can pass bash variables by embedding them directly in the awk script; you just have to be careful about the quoting (whitespace and awk's own $variable usage). Examples:

awk '$2 ~ "'"${PARTITIONS[$n]}"'" { print $1 }' /etc/fstab

Or the more complicated version with soft quotes:

awk "\$2 ~ /${PARTITIONS[$n]//\//\\/}/ { print \$1 }" /etc/fstab
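
The soft-quoted version is easier to follow once you see what the shell hands to awk. The sketch below prints the escaped name and the resulting awk program before running it; the variable names p, escaped, and awk_prog, the added ^ and $ anchors, and the two sample lines are my own additions for illustration.

```shell
#!/usr/bin/env bash
p=/var/log

# ${p//\//\\/} replaces every "/" with "\/" so the name can sit inside /.../.
escaped="${p//\//\\/}"
printf '%s\n' "$escaped"      # \/var\/log

# Inside double quotes, awk's own $2 and $1 must be protected as \$2 and \$1.
awk_prog="\$2 ~ /^${escaped}\$/ { print \$1 }"
printf '%s\n' "$awk_prog"     # $2 ~ /^\/var\/log$/ { print $1 }

# Two made-up fstab-style lines to run the generated program against.
result=$(printf '%s\n' '/dev/sda2 /var ext4' '/dev/sda3 /var/log ext4' | awk "$awk_prog")
printf '%s\n' "$result"       # /dev/sda3
```

Seeing the generated program also shows why this approach is fragile: anything in the variable becomes awk source code, which is the main argument for preferring -v.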

#4


1  

You can pass the array on as an additional file to awk, using bash's process substitution.

partitions=( /usr /home /var /tmp )
awk '
    FNR==NR { partitions[$0]=""; next } 
    $1 !~ /^#/ && ($2 in partitions) { print $1 }
' <(printf '%s\n' "${partitions[@]}") /etc/fstab

NR holds the current number of records (lines) read, and FNR holds the current number of records read in the current file, so FNR==NR is only true when reading the first file, which is the process substitution in this case. So you fill up the partitions array for the first file.

Then, for the second file, you just check if the second field is in the array...
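
If the FNR==NR idiom is new to you, a small trace may help. This sketch feeds awk two made-up temporary files (named list and fstab here purely for illustration) and prints NR, FNR, and which branch the test would take for every line.

```shell
#!/usr/bin/env bash
list=$(mktemp)
fstab=$(mktemp)
# First file: the names to look for. Second file: hypothetical fstab-style lines.
printf '%s\n' /var /home > "$list"
printf '%s\n' '/dev/sda2 /var ext4' '# a comment' '/dev/sda4 /home ext4' '/dev/sda5 /srv ext4' > "$fstab"

# NR keeps counting across files; FNR restarts at 1 for each new file.
trace=$(awk '{ print NR, FNR, (FNR == NR ? "first file" : "later file") }' "$list" "$fstab")
rm -f "$list" "$fstab"

printf '%s\n' "$trace"
```

The first two lines of the trace report "first file" and the remaining four report "later file". One caveat worth knowing: if the first file is empty, FNR==NR is also true while reading the second file, so the idiom quietly misbehaves for an empty array.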

In this case though, I'd just use bash (version >= 4.0), since /etc/fstab is typically fairly small.

declare -A 'partitions=([/usr]= [/home]= [/var]= [/tmp]=)'
while read -r spec file vfstype mntops freq passno; do
    [[ $spec != \#* && ${partitions[$file]+set} ]] && echo "$spec"
done < /etc/fstab

Or depending on the actual goal, you could parse df, which will tell you what filesystem the directory is on.

dirs=( /usr /home /var /tmp )
for dir in "${dirs[@]}"; do
    { read -r; read -r part _; } < <(df -P "$dir")
    echo "$part"
done

#5


0  

Here are some observations that may help.

If the input is just a single file, it isn't necessary to cat it to anything. That is:

$ cat file | program # would normally just be ...
$ program < file

If you need to feed something complicated to awk(1), then maybe you do have a use case for cat x | y ... you could do something like ...

(echo StartFlag ${PARTITIONS[*]}; cat /etc/fstab) | awk ...
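
The awk side of that pipeline is left open above. One way it might be filled in, purely as a guess at the intent, is to treat the first line as the list of wanted mount points and everything after it as fstab content. The array name want, the sample file, and the exact-match test are assumptions of this sketch, not something the answer specifies.

```shell
#!/usr/bin/env bash
# Hypothetical sample data standing in for /etc/fstab.
fstab=$(mktemp)
cat > "$fstab" <<'EOF'
/dev/sda1 /         ext4 defaults 0 1
/dev/sda2 /var      ext4 defaults 0 2
/dev/sda3 /var/log  ext4 defaults 0 2
/dev/sda4 /home     ext4 defaults 0 2
EOF

partitions=(/var /home)
result=$( { echo "StartFlag ${partitions[*]}"; cat "$fstab"; } | awk '
    # Line 1 carries the flag followed by the wanted mount points.
    NR == 1 && $1 == "StartFlag" { for (i = 2; i <= NF; i++) want[$i] = 1; next }
    # Every later line is fstab content; print the device on an exact match.
    ($2 in want) { print $1 }
')
rm -f "$fstab"

printf '%s\n' "$result"
```

This keeps everything on a single input stream, at the cost of reserving the first line for control data.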

And finally, for the best results on SO ask in a format like ...

  1. My PARTITIONS bash variable contains simplified example contents
  2. Suppose my /etc/fstab contains simplified example fstab
  3. How do I get the following output: exact desired output based on simplified input
  4. This is what I've tried: some people won't just provide code, so it helps to ask for assistance on a specific programming problem where you have already tried to reach a solution

#6


0  

You can do it like this:

awk -v partitions="${PARTITIONS[*]}" '
    BEGIN { split(partitions,a," ") }
    { for (e in a) { if ($2 ~ a[e]) { print $1 } } }' /etc/fstab 

So you don't need a for loop outside awk, and this means fewer processes.
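
One thing to watch with $2 ~ a[e]: the array element is used as an unanchored regex, so /var also matches /var/log. The sketch below runs the same split-based approach twice on a made-up sample, once with ~ and once with ==, to show the difference; the sample file and the names fstab, result_regex, and result_exact are illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical sample data standing in for /etc/fstab.
fstab=$(mktemp)
cat > "$fstab" <<'EOF'
/dev/sda2 /var      ext4 defaults 0 2
/dev/sda3 /var/log  ext4 defaults 0 2
/dev/sda4 /home     ext4 defaults 0 2
EOF

partitions=(/var)

# Unanchored regex match, as in the answer above.
result_regex=$(awk -v partitions="${partitions[*]}" '
    BEGIN { split(partitions, a, " ") }
    { for (e in a) if ($2 ~ a[e]) print $1 }' "$fstab")

# Exact string comparison instead.
result_exact=$(awk -v partitions="${partitions[*]}" '
    BEGIN { split(partitions, a, " ") }
    { for (e in a) if ($2 == a[e]) print $1 }' "$fstab")
rm -f "$fstab"

printf 'regex: %s\n' "$result_regex"
printf 'exact: %s\n' "$result_exact"
```

Here the regex version reports both /dev/sda2 and /dev/sda3, while the exact version reports only /dev/sda2. Which behaviour you want depends on whether sub-mounts should count.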
