How can I scan multiple log files to find which ones contain a specific IP address?

Time: 2021-12-20 07:09:51

Recently there have been a few attackers trying malicious things on my server so I've decided to somewhat "track" them even though I know they won't get very far.

Now, I have an entire directory of server logs, and I need a way to search every file in it and return the filename whenever a string is found. So I thought to myself, what better language for text and file operations than Perl? A friend is helping me with a script that scans all the files for a certain IP and returns the names of the files that contain it, so I don't have to search for the attacker through every log manually (I have hundreds).

#!/usr/bin/perl

$dir = ".";

opendir(DIR, "$dir");
@files = grep(/\.*$/,readdir(DIR));
closedir(DIR);

foreach $file(@files) {
    open FILE, "$file" or die "Unable to open files";

    while(<FILE>) {
        print if /12.211.23.200/;
    }

}

However, it is giving me directory read errors. Any assistance is greatly appreciated.

EDIT: Code edited; it still says permission denied and that it cannot open the directory on line 10. In case you are wondering about the directory change to ".": I am just going to run the script from within the logs directory.

Mike.

14 answers

#1


Can you use grep instead?

#2


To get all the lines containing the IP, I would use grep directly. There is no need to list the files; it's a simple command:

grep '12\.211\.23\.200' *

I like to redirect the output to another file and then open that file in an editor...

If you insist on wanting the filenames, that's also easy:

grep -l '12\.211\.23\.200' *

grep is available on every Unix/Linux system with the GNU tools, and on Windows through one of the many ports (unxutils, Cygwin, etc.).
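
For anyone who wants to try this safely first, here is a minimal sketch of the grep -l approach run against a throwaway directory. The file names and log lines are made up for illustration, and the -F flag (match a fixed string instead of a regex) is an addition rather than part of the commands above; with it the dots need no escaping.

```shell
#!/bin/sh
# Build a throwaway directory with hypothetical log files.
logdir=$(mktemp -d)
printf '12.211.23.200 - - "GET /admin"\n' > "$logdir/access1.log"
printf '10.0.0.5 - - "GET /index.html"\n' > "$logdir/access2.log"
printf '12.211.23.200 - - "POST /login"\n' > "$logdir/access3.log"

# -l prints only the names of files with at least one match;
# -F treats the pattern as a literal string, so "." is not a wildcard.
matches=$(cd "$logdir" && grep -lF '12.211.23.200' *)
echo "$matches"

# Clean up the sample data.
rm -rf "$logdir"
```

On the sample data this lists access1.log and access3.log. Note that -F still matches substrings, so 12.211.23.200 would also be found inside a longer address such as 112.211.23.200.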

#3


You have to concatenate $dirname with $filename when using files found through readdir. Remember that you haven't chdir'ed into the directory where those files reside.

open FH, "<", "$dirname/$filename" or die "Cannot open $filename: $!";

Incidentally, why not just use grep -r to recursively search all subdirectories under your log dir for your string?

EDIT: I see your edits, and have two comments. First, this line:

@files = grep(/\.*$/,readdir(DIR));

is not effective, because it searches for zero or more . characters at the end of the string. Since that includes zero, it matches everything in the directory. If you're trying to exclude files ending in ., try this:

@files = grep(!/\.$/,readdir(DIR));

Note the ! sign for negation if you're trying to exclude those files. Otherwise (if you only want those files and I'm misunderstanding your intent), leave the ! out.

In any case, if you're getting your die message on line 10, most likely you're hitting a file that has permissions such that you can't read it. Try putting the filename in the die output so you can see which file it's failing on:

open FILE, "<", $file or die "Unable to open file $file: $!";

But as with other answers, and to reiterate: Why not use grep? The unix command, not the Perl function.

#4


This will get the file names you are looking for from within Perl, and it will probably be much faster than matching every line with a Perl regex.

@files = `find ~/ServerLogs -name "*.log" | xargs grep -l "<ip address>"`;
chomp @files;  # strip the trailing newline from each filename

Although, this will require a *nix compliant system, or Cygwin on Windows.
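
A variation on the find approach, sketched here as plain shell under a few assumptions: the directory layout and file names are invented sample data, -exec ... {} + replaces the xargs pipe so that file names containing spaces survive, and -F makes grep match the address literally.

```shell
#!/bin/sh
# Hypothetical log tree, including a subdirectory whose name contains a space.
logdir=$(mktemp -d)
mkdir -p "$logdir/2021/old logs"
printf '12.211.23.200 - - "GET /admin"\n' > "$logdir/today.log"
printf '12.211.23.200 - - "GET /wp-login.php"\n' > "$logdir/2021/old logs/jan.log"
printf '10.0.0.5 - - "GET /"\n' > "$logdir/2021/feb.log"
printf '12.211.23.200 in a non-log file\n' > "$logdir/notes.txt"

# find walks the tree recursively and hands the *.log files to grep in batches;
# grep -l prints the name of each file that contains the address.
found=$(cd "$logdir" && find . -type f -name '*.log' \
    -exec grep -lF '12.211.23.200' {} + | LC_ALL=C sort)
echo "$found"

# Clean up the sample data.
rm -rf "$logdir"
```

Only the two matching .log files are reported; notes.txt is skipped because of the -name filter.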

#5


First, get a list of the files in your source directory:

opendir(DIR, $dir) or die "Cannot open directory $dir: $!";
@files = grep(/\.log$/, readdir(DIR));
closedir(DIR);

And then loop through those files:

foreach $file (@files)
{
  # file processing code
}

#6


My first suggestion would be to use grep instead. The right tool for the job, they say...

But to answer your question:

readdir just returns the filenames from the directory. You'll need to concatenate the directory name and filename together.

$path = "$dirname/$filename";
open FH, $path or die ...

Then you should ignore files that are actually directories, such as "." and "..". After getting the $path, check to see if it's a file.

if (-f $path) {
    open FH, $path or die ...
    while (<FH>)

#7


BTW, I thought I would throw in a mention for File::Next. To iterate over all files in a directory (recursively):

use feature 'say'; # enables say(), used below
use Path::Class; # always useful.
use File::Next;

my $files = File::Next::files( dir(qw/path to files/) ); # look in path/to/files
while( defined ( my $file = $files->() ) ){
    $file = file( $file );
    say "Examining $file";
    say "found foo" if $file->slurp =~ /foo/;
}

File::Next is taint-safe.

#8


~ doesn't auto-expand in Perl.

opendir my $fh,  '~/' or die("Doin It Wrong");  # Doing It Wrong. 

opendir my $fh, glob('~/') and die( "Thats right!" );

#9


Also, if you must use readdir(), make sure you guard the expression thus:

while (defined(my $filename = readdir(DH))) {
    ...
}

If you don't do the defined() test, the loop will terminate if it finds a file called '0'.

#10


Have you looked on CPAN for log parsers? I searched with 'log parse' and it yielded over 200 hits. Some (probably many) won't be relevant - some may be. It depends, in part, on which web server you are using.

#11


Am I reading this right? Your line 10 that gives you the error is

open FILE, "$file" or die "Unable to open files";

And the $file you are trying to read, according to line 6,

@files = grep(/\.*$/,readdir(DIR));

is a file that ends with zero or more dots. Is this what you really wanted? This basically matches every file in the directory, including "." and "..". Maybe you don't have enough permission to open the parent directory for reading?

EDIT: if you simply want to read all files (including hidden ones), you might use something like the following:

opendir(DIR, ".");
@files = readdir(DIR);
closedir(DIR);

foreach $file (@files) {
  if ($file ne "." and $file ne "..") {
    open FILE, "$file" or die "cannot open $file\n";
    # do stuff with FILE
  }
}

Note that this doesn't take care of sub directories.

#12


I know I am way late to this discussion (ran across it while searching for grep related posts) but I am going to answer anyway:

It isn't specified clearly if these are web server logs (Apache, IIS, W3SVC, etc.) but the best tool for mining those for data is the LogParser tool from Microsoft. See logparser.com for more info.

LogParser will allow you to write SQL-like statements against the log files. It is very flexible and very fast.

#13


Use perl from the command line, like a better grep:

perl -wnl -e '/12\.211\.23\.200/ and print;' *.log > output.txt

The benefit here is that you can chain logic far more easily:

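perl -wnl -e '(/12\.211\.23\.20[1-11]/ or /denied/i ) and print;' *.log

Note that [1-11] is a character class that matches only the single character 1, so the first pattern only finds addresses beginning with 12.211.23.201, not a range of final octets.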

If you are feeling wacky, you can also use more advanced command-line options to feed the result of one Perl one-liner into other Perl one-liners.

You really need to read "Minimal Perl: For UNIX and Linux People", an awesome book on this very sort of thing.
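
If the intent of the second one-liner was to match a range of final octets such as 201 through 211 (an assumption; the answer does not say), a bracket expression cannot express that, because it matches a single character. A hedged sketch of the difference, using grep -E on made-up sample addresses:

```shell
#!/bin/sh
# Hypothetical sample: one address per line.
sample=$(mktemp)
printf '%s\n' 12.211.23.200 12.211.23.201 12.211.23.205 \
              12.211.23.211 12.211.23.212 > "$sample"

# [1-11] is just the character class "1", so only the .201 line matches.
class_hits=$(grep -c '12\.211\.23\.20[1-11]' "$sample")

# An alternation spells out the numeric range 201-211 explicitly;
# the trailing $ stops it from matching longer numbers.
range_hits=$(grep -Ec '12\.211\.23\.(20[1-9]|21[01])$' "$sample")

echo "character class: $class_hits, alternation: $range_hits"

# Clean up the sample data.
rm -f "$sample"
```

The same alternation can be used inside the Perl one-liner, since Perl regexes support the same grouping syntax.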

#14


First, use grep.

But if you don't want to, here are two small improvements you can make that I haven't seen mentioned yet:

1) Change:

@files = grep(/\.*$/,readdir(DIR));

to

@files = grep({ !-d "$dir/$_" } readdir(DIR));

This way you will exclude not just "." and ".." but also any other subdirectories that may exist in the server log directory (which the open downstream would otherwise choke on).

2) Change:

print if /12.211.23.200/;

to

print if /12\.211\.23\.200/;

"." is a regex wildcard meaning "any character". Changing it to "\." will reduce the number of false positives (unlikely to change your results in practice but it's more correct anyway).

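To see the effect of escaping the dots, here is a small sketch using grep on a made-up sample file; the second log line is contrived purely to trigger the false positive.

```shell
#!/bin/sh
# Hypothetical sample: one real hit and one line that only looks similar.
sample=$(mktemp)
printf '12.211.23.200 - - "GET /admin"\n' > "$sample"
printf 'request id 12a211b23c200 accepted\n' >> "$sample"

# Unescaped dots match any character, so both lines count as hits.
unescaped=$(grep -c '12.211.23.200' "$sample")

# Escaped dots match only literal dots, so only the real address counts.
escaped=$(grep -c '12\.211\.23\.200' "$sample")

echo "unescaped: $unescaped, escaped: $escaped"

# Clean up the sample data.
rm -f "$sample"
```

The unescaped pattern counts two matching lines and the escaped one counts one.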