How do I print a count of unique matches with grep?

Time: 2021-11-26 20:36:59

Let's say I have millions of packets to look through, and I want to see how many times a packet was sent to a certain port number.

Here are some of the packets:

10:27:46.227407 IP 85.130.236.26.54156 > 139.91.133.120.60679: tcp 0
10:27:46.337038 IP 211.142.173.14.80 > 139.91.138.125.56163: tcp 0
10:27:46.511241 IP 211.49.224.217.3389 > 139.91.131.47.6973: tcp 0

I want to look at the second port number on each line, so:

60679, 56163, 6973, etc.

So I can use:

grep -c '\.80:' output.txt

That counts every time port 80 was used. But is there a way to display all the ports that were used and how many times each one appears in the file? Something like this, preferably sorted so I can see which ports were used most often:

.80: - 54513
.110: - 12334
.445: - 412

1 solution

#1


33  

See uniq -c. Extract the part you want, sort it, pipe it through uniq -c, then sort the resulting counts. Something like this, maybe:

grep -E '\.[0-9]+:' output.txt | sort | uniq -c | sort -nr

Clarification: I used grep here because it wasn't clear what your output.txt format looks like. As written, that pipeline counts whole matching lines, so you'll want to extract just the port number first, perhaps with cut or awk.
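
For the awk route, here is a minimal sketch. It assumes every line looks like the samples in the question (IPv4, with the destination "address.port:" as the fifth whitespace-separated field); the function name count_dst_ports is only illustrative.

```shell
# count_dst_ports FILE
# Tally destination ports in lines shaped like:
#   10:27:46.337038 IP 211.142.173.14.80 > 139.91.138.125.56163: tcp 0
# Assumption: the destination "a.b.c.d.port:" is always field 5.
count_dst_ports() {
    awk '{
        n = split($5, parts, ".")   # "139.91.138.125.56163:" -> 5 pieces
        port = parts[n]             # last piece, e.g. "56163:"
        sub(/:$/, "", port)         # strip the trailing colon
        count[port]++
    }
    END { for (p in count) print count[p], p }' "$1" |
        sort -k1,1nr -k2,2n         # highest count first, then by port number
}
```

Calling `count_dst_ports output.txt` prints one "count port" pair per line. Counting inside awk avoids the first sort over millions of lines; only the much shorter list of distinct ports gets sorted.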

Edit: To get the port, you can cut once on a period and then again on a colon:

cut -d. -f10 < output.txt | cut -d: -f1

(Or any one of a dozen other ways to accomplish the same thing.) That will give you an unsorted list of ports. Then:

cut -d. -f10 < output.txt | cut -d: -f1 | sort | uniq -c | sort -nr
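
If you want the output shaped like the ".80: - 54513" lines in the question, one possible variant is sketched below. It relies on grep's -o option, which GNU and BSD grep support but POSIX does not specify, and it assumes the only ".digits:" sequence on each line is the destination port, as in the sample packets. The function name port_report is made up for the example.

```shell
# port_report FILE
# Print ".PORT: - COUNT" lines, most frequent port first.
# Assumption: the destination port is the only ".digits:" text on each line.
port_report() {
    grep -oE '\.[0-9]+:' "$1" |     # -o prints only the matched text, e.g. ".80:"
        sort |                      # uniq needs identical lines to be adjacent
        uniq -c |                   # prefix each distinct match with its count
        sort -nr |                  # largest count first
        awk '{ print $2 " - " $1 }' # reorder to ".80: - 54513" style
}
```

Run it as `port_report output.txt`.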
