005_awk practical case studies

Time: 2023-03-10 00:36:31

I. Lessons from work experience

(1) Log example:

10.100.194.39	10.100.194.39	1019-03-16T11:01:04+08:00	www.uuwatch.com^^3FF91DE01BCB49B8BD11198DB98394F8|1993969314640	GET	499	/agent-config?appId=biz.marketing&host=10.101.181.194&hostName=wg1-biz-marketing-139-1161719-11193146.c.elenet.me	httphttp1.1	0.000-	146	0	0	10.100.38.36:1891	-	-	-	www.uuwatch.com	user	-	wg1	wg1-channel-stable-1Nginx	1.6.1	wg1-bjdev-pub-nginxms-1	Python-urllib/1.7	-	-	-	www.uuwatch.com
10.100.194.39 10.100.194.39 1019-03-16T11:07:16+08:00 www.uuwatch.com^^C9398ACF399743669E71A19613D9F498|1993969646084 GET 499 /collector/tcp/cluster?cluster=wg1-collector-esm http http1.1 0.001 - 191 0 0 10.100.38.49:1891 application/json - - www.uuwatch.com user - wg1 wg1-channel-stable-1 Nginx 1.6.1wg1-bjdev-pub-nginxms-1 Go-http-client/1.1 - - - www.uuwatch.com
10.100.194.39 10.100.194.39 1019-03-16T11:11:46+08:00 www.uuwatch.com^^1319B1E9EA1A4B9A9E498317606B74B7|1993969906741 GET 499 /agent-config?appId=arch.waf_collector&host=10.101.130.196 http http1.1 0.000 - 169 0 0 10.100.40.17:1891 -- - www.uuwatch.com user - wg1 wg1-channel-stable-1 Nginx 1.6.1 wg1-bjdev-pub-nginxms-1 Go-http-client/1.1 -- - www.uuwatch.com

Print only the third column through the last column of each log line:

<1> cut -f 3- demo       Reference: https://*.com/questions/1602035/how-to-print-third-column-to-last-column

1019-03-16T11:01:04+08:00	www.uuwatch.com^^3FF91DE01BCB49B8BD11198DB98394F8|1993969314640	GET	499	/agent-config?appId=biz.marketing&host=10.101.181.194&hostName=wg1-biz-marketing-139-1161719-11193146.c.elenet.me	httphttp1.1	0.000	-	146	0	0	10.100.38.36:1891	-	-	-	www.uuwatch.com	user	-	wg1	wg1-channel-stable-1	Nginx	1.6.1	wg1-bjdev-pub-nginxms-1	Python-urllib/1.7	-	-	-	www.uuwatch.com
1019-03-16T11:07:16+08:00 www.uuwatch.com^^C9398ACF399743669E71A19613D9F498|1993969646084 GET 499 /collector/tcp/cluster?cluster=wg1-collector-esm http http1.1 0.001 - 191 0 0 10.100.38.49:1891 application/json - - www.uuwatch.com user - wg1 wg1-channel-stable-1 Nginx 1.6.1wg1-bjdev-pub-nginxms-1 Go-http-client/1.1 - - -www.uuwatch.com
1019-03-16T11:11:46+08:00 www.uuwatch.com^^1319B1E9EA1A4B9A9E498317606B74B7|1993969906741 GET 499 /agent-config?appId=arch.waf_collector&host=10.101.130.196 http http1.1 0.000 - 169 0 0 10.100.40.17:1891 - - - www.uuwatch.com user - wg1 wg1-channel-stable-1 Nginx 1.6.1 wg1-bjdev-pub-nginxms-1 Go-http-client/1.1 - - -www.uuwatch.com
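
Note that `cut -f` splits on tab characters only and, by default, passes through unchanged any line that contains no tab. For logs whose columns are separated by spaces, one possible awk alternative is sketched below. The function name `from_third` and the sample lines are made up for illustration, and runs of whitespace are collapsed into single spaces in the output.

```shell
# from_third: print fields 3..NF of every input line, joined by single spaces.
# (Hypothetical helper name; awk's default splitting treats any run of
# blanks or tabs as one separator.)
from_third() {
    awk '{
        out = $3
        for (i = 4; i <= NF; i++) out = out " " $i
        print out
    }' "$@"
}

# Made-up sample input, not taken from the real log.
printf '%s\n' 'a b c d e' 'f g h' | from_third
```

Lines with fewer than three fields come out as empty lines.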

<2> Second method (prints each column, from the third onward, as a single column):

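awk '
{
    # Append field i of this record to rec[i], one value per line.
    # (i in rec) is tested instead of rec[i], so a field "0" is kept.
    for (i = 3; i <= NF; i++)
        rec[i] = ((i in rec) ? rec[i] RS $i : $i)
}
END {
    # In POSIX awk, NF here is the field count of the last record read.
    for (i = 3; i <= NF; i++) print rec[i]
}' splict
2029-03-26T22:02:04+08:00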
www.uuwatch.com^^3FF92DE02BCB49B8BD22298DB98394F8|2993969324640
GET
499
/agent-config?appId=biz.marketing&host=20.202.282.294&hostName=wg2-biz-marketing-239-2262729-22293246.c.elenet.me
httphttp2.2
0.000
-
246
0
0
20.200.38.36:2892
-
-
-
www.uuwatch.com
user
-
wg2
wg2-channel-stable-2
Nginx
2.6.2
wg2-bjdev-pub-nginxms-2
Python-urllib/2.7
-
-
-
www.uuwatch.com

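<3> Third method: same as above, except that it records the largest NF seen across all lines instead of relying on NF in END. For the difference see: https://*.com/questions/23644184/using-awk-to-take-a-range-of-columns-and-print-them-as-a-single-column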

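awk '
{
    for (i = 3; i <= NF; i++) {
        # (i in rec) rather than rec[i], so a field "0" is not overwritten
        rec[i] = ((i in rec) ? rec[i] RS $i : $i)
    }
    num = (num > NF ? num : NF)    # widest record seen so far
}
END {
    for (i = 3; i <= num; i++) print rec[i]
}' splict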

(2) Print lines within a given time range

cut -f 3- demo | sed -n '/2029-03-26T22:02:04+08:00/,/2029-03-26T22:07:26+08:00/p'

2029-03-26T22:02:04+08:00	www.uuwatch.com^^3FF92DE02BCB49B8BD22298DB98394F8|2993969324640	GET	499	/agent-config?appId=biz.marketing&host=20.202.282.294&hostName=wg2-biz-marketing-239-2262729-22293246.c.elenet.me	httphttp2.2	0.000	-	246	0	0	20.200.38.36:2892	-	-	-	www.uuwatch.com	user	-	wg2	wg2-channel-stable-2	Nginx	2.6.2	wg2-bjdev-pub-nginxms-2	Python-urllib/2.7	-	-	-	www.uuwatch.com
2029-03-26T22:07:26+08:00 www.uuwatch.com^^C9398ACF399743669E72A29623D9F498|2993969646084 GET 499 /collector/tcp/cluster?cluster=wg2-collector-esm http http2.2 0.002 - 292 0 0 20.200.38.49:2892 application/json - - www.uuwatch.com user - wg2 wg2-channel-stable-2 Nginx 2.6.2wg2-bjdev-pub-nginxms-2 Go-http-client/2.2 - - -www.uuwatch.com
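
The sed range starts at the first line matching the start pattern and stops at the next line matching the end pattern, so if the end timestamp never appears in the log it prints through the end of the file. Because ISO 8601 timestamps with the same UTC offset sort lexicographically, a string comparison in awk is one possible alternative. This is only a sketch: the function name `between` and the sample rows are hypothetical, and it assumes the timestamp is the first whitespace-separated field.

```shell
# between START END [FILE...]: print lines whose first field lies in [START, END].
# Plain string comparison, valid only when every timestamp uses the same
# format and the same UTC offset.
between() {
    start=$1 end=$2
    shift 2
    # Concatenating "" forces string (not numeric) comparison.
    awk -v s="$start" -v e="$end" '($1 "") >= (s "") && ($1 "") <= (e "")' "$@"
}

# Made-up sample rows.
printf '%s\n' \
    '2019-03-28T18:30:03+08:00 499' \
    '2019-03-28T20:43:13+08:00 404' \
    '2019-03-28T20:43:37+08:00 404' |
between '2019-03-28T20:00:00+08:00' '2019-03-28T20:43:30+08:00'
```

Unlike the sed range, the boundaries do not have to occur in the file.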

(3)

(4) Count occurrences of each timestamp (column 3) and status code (column 6) pair

awk -F "\t" '{print $3,$6}' new20190329.log | sort | uniq -c
2 2019-03-28T18:30:03+08:00 499
1 2019-03-28T20:43:13+08:00 404
1 2019-03-28T20:43:19+08:00 404
14 2019-03-28T20:43:34+08:00 404
30 2019-03-28T20:43:35+08:00 404
22 2019-03-28T20:43:36+08:00 404
32 2019-03-28T20:43:37+08:00 404
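
The same tally can also be done inside a single awk process with an associative array, so the data does not need to be sorted before counting. A minimal sketch, assuming tab-separated input with the timestamp in column 3 and the status code in column 6 as in the command above. The function name `count_pairs` and the sample data are made up. The final `sort` only makes the output order deterministic, since `for (k in count)` iterates in unspecified order.

```shell
# count_pairs: count how often each (column 3, column 6) pair occurs.
# (Hypothetical helper name.)
count_pairs() {
    awk -F '\t' '
        { count[$3 " " $6]++ }
        END { for (k in count) print count[k], k }
    ' "$@" | sort -k2
}

# Made-up sample: six tab-separated columns per line.
printf 'a\tb\t%s\tc\td\t%s\n' \
    '2019-03-28T20:43:34+08:00' 404 \
    '2019-03-28T18:30:03+08:00' 499 \
    '2019-03-28T20:43:34+08:00' 404 |
count_pairs
```

Unlike `uniq -c`, the counts are not left-padded.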

II.

cat file.txt
groups=001(group1),
002(group2),
003(group3)
groups=004(group4),
005(group5)

Desired output:

group1
group2
group3
group4
group5

(1) awk 'BEGIN{FS="[()]"} {if($0~/^.*[0-9][0-9][0-9]\(group[0-9]+\).*$/) print $2}' file.txt
Step-by-step breakdown:
➜ 011_cmdb_op awk 'BEGIN{FS="[()]"} {if($0~/^.*[0-9][0-9][0-9]\(group[0-9]+\).*$/) print $1}' file.txt
groups=001
002
003
groups=004
005
➜ 011_cmdb_op awk 'BEGIN{FS="[()]"} {if($0~/^.*[0-9][0-9][0-9]\(group[0-9]+\).*$/) print $2}' file.txt
group1
group2
group3
group4
group5
➜ 011_cmdb_op awk 'BEGIN{FS="[()]"} {if($0~/^.*[0-9][0-9][0-9]\(group[0-9]+\).*$/) print $3}' file.txt
,
,

,

As the outputs above show, each line is split into fields at ( and ).
Or:
(2) awk '{sub(/^.*[0-9][0-9][0-9]\(/,""); sub(/\).*$/,""); print}' file.txt
➜ 011_cmdb_op awk '{sub(/^.*[0-9][0-9][0-9]\(/,"");print}' file.txt # delete the part matched by the regex (everything through the opening parenthesis)
group1),
group2),
group3)
group4),
group5)
➜ 011_cmdb_op awk '{sub(/^.*[0-9][0-9][0-9]\(/,""); sub(/\).*$/,""); print}' file.txt # then delete the closing parenthesis and everything after it
group1
group2
group3
group4
group5
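
A third possible variant uses awk's `match()` together with `RSTART` and `RLENGTH` to cut out the text between the parentheses directly. This is a sketch rather than one of the methods used above. The function name `inside_parens` is made up, and it assumes each line holds at most one parenthesized group.

```shell
# inside_parens: print the text between the first "(" and the next ")".
# (Hypothetical helper name.)
inside_parens() {
    awk 'match($0, /\([^)]*\)/) {
        # RSTART points at "(", RLENGTH covers "(...)"; skip both brackets.
        print substr($0, RSTART + 1, RLENGTH - 2)
    }' "$@"
}

# First three lines of file.txt from above.
printf '%s\n' 'groups=001(group1),' '002(group2),' '003(group3)' | inside_parens
```

Lines without parentheses are skipped because `match()` returns 0 for them.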
(3) In practice
ls al-arch-soa-zk-1-al-arch-soa-zk-1

ls al-arch-soa-zk-1-al-arch-soa-zk-1|awk '{sub(/^.*[0-9]-/,"");print}'
   al-arch-soa-zk-1
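
The substitution works because `.*` is greedy: it extends to the last position where a digit is followed by `-`, so everything up to and including the middle `1-` is removed. A quick check with `echo` in place of `ls`, so that no such file needs to exist (the function name `strip_prefix` is made up for illustration):

```shell
# strip_prefix: remove everything up to the last "<digit>-" on each line.
# (Hypothetical helper name.)
strip_prefix() {
    awk '{ sub(/^.*[0-9]-/, ""); print }' "$@"
}

echo 'al-arch-soa-zk-1-al-arch-soa-zk-1' | strip_prefix
```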