Running ssh in parallel in a shell script and setting remote variables

Time: 2021-06-09 23:17:01

I'm writing a script that reads an input file containing about 1000 lines of host information. For each host, the script connects over ssh, changes to the remote log directory, and cats the latest daily log file. The output is piped back to the local machine, where I do some pattern matching and statistics.

The simplified structure of my program is a while loop that looks like this:

while read host
do
    ssh -n name@"$host" "cd TO LOG DIR AND cat THE LATEST LOGFILE" | matchPattern
done << EOA
    $(awk -F, '{print $7}' "$FILEIN")
EOA

where matchPattern is a function that matches patterns and gathers statistics.

I have two questions about this:

1) How do I find the latest daily log file remotely? The latest log file has a name like xxxx2012-05-02.log and is the most recently created file. Is it possible to run ls remotely and find the file matching xxxx2012-05-02.log? (I can do this locally, but I get stuck when I append it to the ssh command.) Another way I came up with is:

cat `ls -t | head -1`  or
cat $(ls -t | head -1)

However, if I append this to the ssh command, it picks up the newest file name on my local machine. Can this be set in a remote variable so that cat finds the correct file?

2) As there are nearly 1000 hosts, I'm wondering whether I can do this in parallel (say, run 20 ssh sessions at a time and start the next 20 after the first 20 finish). Appending & to each ssh command does not seem sufficient to accomplish this.

Any ideas would be greatly appreciated!


Follow-up: Hi everyone, I finally found a crude way to solve the first problem:

ssh -n name@$host "cd $logDir; cat *$logName" | matchPattern

where $logName is today's date followed by .log (for example, 2012-05-02.log). The limitation is that only local variables are expanded inside the double quotes. Since my log file name ends with 2012-05-02.log and no other file has that suffix, I just blindly run cat *2012-05-02.log on the remote machine, and it cats the desired file for me.

3 Answers

#1


1  

For your first question,

ssh -n name@$host 'cat $(ls -t /path/to/log/dir/*.log | head -n 1)'

should work. Note the single quotes around the remote command: they stop your local shell from expanding $(...), so the command substitution runs on the remote host instead.
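To see the quoting rules in action without a real host, here is a small sketch that uses `sh -c` in a different directory as a stand-in for the remote shell. The `fake_remote` function, the temporary directories, and the `suffix` variable are assumptions made purely for illustration; with real ssh the same expansion rules apply to the command string.

```shell
#!/usr/bin/env bash
# Stand-in for "ssh host CMD": run CMD with a fresh shell inside another
# directory. fake_remote and both directories are hypothetical.
local_dir=$(mktemp -d)
remote_dir=$(mktemp -d)
touch "$local_dir/local.log" "$remote_dir/remote.log"

fake_remote() { (cd "$remote_dir" && sh -c "$1"); }

cd "$local_dir" || exit 1

# Double quotes: $(ls) is expanded here, before the command is sent.
double_quoted=$(fake_remote "echo $(ls)")

# Single quotes: $(ls) is sent literally and runs on the "remote" side.
single_quoted=$(fake_remote 'echo $(ls)')

# Mixed: the local variable expands here, while the escaped \$(...) is
# passed through and evaluated on the "remote" side.
suffix=".log"
mixed=$(fake_remote "echo \$(ls *$suffix)")

echo "double quotes -> $double_quoted"
echo "single quotes -> $single_quoted"
echo "mixed         -> $mixed"
```

The mixed form is the one to reach for when the remote command needs both a local value (such as a date suffix) and a command substitution that must run remotely.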

For your second question, wrap all the ssh | matchPattern | analyse steps into their own function, then iterate over it like this:

outstanding=0
while read host
do
    sshMatchPatternStuff "$host" &    # run this host's job in the background
    outstanding=$((outstanding + 1))
    if [ $outstanding -ge 20 ] ; then
        wait                          # with no arguments, waits for ALL background jobs
        outstanding=0                 # the whole batch of 20 has finished
    fi
done << EOA
    $(awk -F, '{print $7}' "$FILEIN")
EOA
wait                                  # wait for the final, partial batch

(I assume you're using bash.)
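If you would rather start a new job as soon as any single one finishes, instead of waiting for a whole batch, bash 4.3 or later offers `wait -n`. The following is only a sketch of that variant: `fetch_one` is a hypothetical stand-in for the ssh function, and the host names and the limit of 3 are made up.

```shell
#!/usr/bin/env bash
# Requires bash >= 4.3 for "wait -n".
max_jobs=3
results=$(mktemp)

# Hypothetical stand-in for the real "ssh ... | matchPattern" function.
fetch_one() {
    sleep 0.1
    echo "processed $1" >> "$results"
}

running=0
for host in host1 host2 host3 host4 host5 host6 host7; do
    if [ "$running" -ge "$max_jobs" ]; then
        wait -n                  # block until any one background job exits
        running=$((running - 1))
    fi
    fetch_one "$host" &
    running=$((running + 1))
done
wait                             # collect the jobs that are still running

processed=$(wc -l < "$results" | tr -d ' ')
echo "$processed hosts processed"
```

This keeps the pipeline full when some hosts respond much more slowly than others, at the cost of depending on a newer bash.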

It may be better to move the ssh | matchPattern | analyse steps into their own script and then call it with a parallel variant of xargs.
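As a rough sketch of the xargs approach: both GNU and BSD xargs accept `-P` to cap the number of simultaneous processes (the option is not part of POSIX). Here an inline `sh -c` snippet stands in for the separate per-host script, and the host list is invented for the example.

```shell
#!/usr/bin/env bash
# Invented host list, one host per line.
hosts=$(mktemp)
printf '%s\n' host1 host2 host3 host4 host5 > "$hosts"

# -P 3: at most three processes at once; -n 1: one host per invocation.
# The sh -c snippet is a placeholder for the real per-host script.
output=$(xargs -P 3 -n 1 sh -c 'echo "processed $1"' _ < "$hosts" | sort)

echo "$output"
```

With the real input file, the host list would come from the awk command instead, along the lines of `awk -F, '{print $7}' "$FILEIN" | xargs -P 20 -n 1 ./fetchLog.sh`, where `fetchLog.sh` is a hypothetical name for the script that runs ssh and the matching for one host.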

#2


1  

For your second question, take a look at pdsh (Parallel Distributed Shell):

http://sourceforge.net/projects/pdsh/

#3


0  

If you have GNU Parallel (http://www.gnu.org/software/parallel/) installed, you can do this:

parallel -j0 --nonall --slf <(awk -F, '{print $7}' servers.txt) 'cd logdir; cat `ls -t | head -1` | grep pattern'

This way the matching is done on the remote server. If you prefer to transfer the full log file and do the matching locally, simply move the grep outside the quotes:

parallel -j0 --nonall --slf <(awk -F, '{print $7}' servers.txt) 'cd logdir; cat `ls -t | head -1`' | grep pattern

You can install GNU Parallel simply by running:

wget http://git.savannah.gnu.org/cgit/parallel.git/plain/src/parallel
chmod 755 parallel
cp parallel sem

Watch the intro videos for GNU Parallel to learn more: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
