Processing log files with scripts

Time: 2022-12-20 12:55:12

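On Linux you often need to process log files. The work mostly consists of matching a pattern in the log, extracting some data, and then doing calculations on that data.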

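Suppose a log file test.log has the contents below, and we want to extract the xxxxx from each Uin=xxxxx. Because the lines are not laid out in uniform columns (some start with a timestamp, others do not), awk's field splitting is awkward here. Filtering with sed is enough.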
ssh -p36000 -ltest 34.34.34.34 'cd /home/oicq/log; grep NoLoginSconnUinSendMsg talk_srv.log'
        NoLoginSconnUinSendMsg Uin=957995 ShmCmd=2 shSeq=50207
        NoLoginSconnUinSendMsg Uin=957995 ShmCmd=2 shSeq=61326
        NoLoginSconnUinSendMsg Uin=957995 ShmCmd=2 shSeq=3490
        NoLoginSconnUinSendMsg Uin=957995 ShmCmd=2 shSeq=29543
        NoLoginSconnUinSendMsg Uin=957995 ShmCmd=2 shSeq=25852
        NoLoginSconnUinSendMsg Uin=957995 ShmCmd=2 shSeq=25035
        NoLoginSconnUinSendMsg Uin=957995 ShmCmd=2 shSeq=7185
        NoLoginSconnUinSendMsg Uin=957995 ShmCmd=2 shSeq=51694
        NoLoginSconnUinSendMsg Uin=957995 ShmCmd=2 shSeq=16633
[2010-08-24 14:24:56] (DealMsgFromCrm)NoLoginSconnUinSendMsg Uin=1597748713 ShmCmd=4 shSeq=1698
[2010-08-24 14:26:00] (DealMsgFromCrm)NoLoginSconnUinSendMsg Uin=284815483 ShmCmd=4 shSeq=3166
[2010-08-24 14:27:26] (DealMsgFromCrm)NoLoginSconnUinSendMsg Uin=284815483 ShmCmd=4 shSeq=44776
[2010-08-24 14:27:27] (DealMsgFromCrm)NoLoginSconnUinSendMsg Uin=284815483 ShmCmd=4 shSeq=41575
[2010-08-24 14:29:45] (DealMsgFromCrm)NoLoginSconnUinSendMsg Uin=402300830 ShmCmd=4 shSeq=51399

 

Command:
sed -rn 's/.*Uin=([0-9]+).*/\1/p' test.log | uniq
Output (uniq only collapses adjacent duplicate lines, which is sufficient for this sample):
957995
1597748713
284815483
402300830

The output can then be processed further.
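
As one possible form of further processing, the sketch below counts how many log lines mention each Uin. It is only an illustration: the function name count_uins and the temporary file sample_log are made up for this example, and the sample lines are a subset of the excerpt above. It uses sed -E, which GNU sed treats the same as -r.

```shell
#!/bin/sh
# Hypothetical sample file holding a few lines from the log excerpt above.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
        NoLoginSconnUinSendMsg Uin=957995 ShmCmd=2 shSeq=50207
        NoLoginSconnUinSendMsg Uin=957995 ShmCmd=2 shSeq=61326
[2010-08-24 14:24:56] (DealMsgFromCrm)NoLoginSconnUinSendMsg Uin=1597748713 ShmCmd=4 shSeq=1698
[2010-08-24 14:26:00] (DealMsgFromCrm)NoLoginSconnUinSendMsg Uin=284815483 ShmCmd=4 shSeq=3166
[2010-08-24 14:27:26] (DealMsgFromCrm)NoLoginSconnUinSendMsg Uin=284815483 ShmCmd=4 shSeq=44776
[2010-08-24 14:29:45] (DealMsgFromCrm)NoLoginSconnUinSendMsg Uin=402300830 ShmCmd=4 shSeq=51399
EOF

# Print "uin count" pairs, most frequent first, ties ordered by Uin.
# count_uins is an illustrative helper, not part of the original post.
count_uins() {
    sed -En 's/.*Uin=([0-9]+).*/\1/p' "$1" |
        awk '{ n[$1]++ } END { for (u in n) print u, n[u] }' |
        sort -k2,2nr -k1,1n
}

count_uins "$sample_log"
```

Because the counting is done in an awk array, duplicates do not have to be adjacent, unlike with plain uniq.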

 


If the log format is regular, awk can compute the statistics directly. Performance logs, for example, often look like this:
[2009-09-20 15:50:48] PID[18883] PkgRecv[10438] PkgSent[10437] ErrPkgSent[500]
[2009-09-20 15:51:48] PID[18883] PkgRecv[10735] PkgSent[10734] ErrPkgSent[523]
[2009-09-20 15:52:48] PID[18883] PkgRecv[10937] PkgSent[10934] ErrPkgSent[508]

To compute the running total of PkgRecv, assuming the log file is named perform.log, you can use the following:

sed -e 's/\[/ /g' -e 's/]/ /g' perform.log | awk '{sum+=$6} END{print sum}'

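How it works: sed replaces every '[' and ']' with a space, so the PkgRecv value becomes the sixth whitespace-separated field. awk then adds $6 to sum on every line and prints the total in its END block.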
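
A variation on the same idea, sketched here under the assumption that every counter is written as Name[value]: instead of relying on the field position, look the counter up by name, so the command keeps working if fields are added or reordered. The helper name sum_counter and the temporary file perf_log are hypothetical; the sample lines are the three shown above.

```shell
#!/bin/sh
# Hypothetical sample file holding the three performance-log lines above.
perf_log=$(mktemp)
cat > "$perf_log" <<'EOF'
[2009-09-20 15:50:48] PID[18883] PkgRecv[10438] PkgSent[10437] ErrPkgSent[500]
[2009-09-20 15:51:48] PID[18883] PkgRecv[10735] PkgSent[10734] ErrPkgSent[523]
[2009-09-20 15:52:48] PID[18883] PkgRecv[10937] PkgSent[10934] ErrPkgSent[508]
EOF

# sum_counter NAME FILE: add up the values of all fields shaped like NAME[value].
# Illustrative helper, not part of the original post.
sum_counter() {
    awk -v name="$1" '
        {
            prefix = name "["
            for (i = 1; i <= NF; i++) {
                # Match only fields that START with the prefix, so that
                # "PkgSent" does not also pick up "ErrPkgSent[...]".
                if (index($i, prefix) == 1) {
                    value = substr($i, length(prefix) + 1)
                    sum += value + 0   # "+ 0" keeps the leading digits and ignores the trailing "]"
                }
            }
        }
        END { print sum + 0 }
    ' "$2"
}

sum_counter PkgRecv "$perf_log"
```

For the three sample lines this prints 32110, the same total the sed and awk pipeline above produces for $6.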