awk 截取某段时间的日志

时间:2022-12-06 13:35:38

好久没有截取nginx/haproxy 中 的日志了,竟有点不熟悉了。

记录一下,以免以后忘记。

NGINX 日志格式:

192.168.1.26 - - [14/Sep/2017:16:48:42 +0800] "GET /ui/favicons/favicon-16x16.png HTTP/1.1" 304 0
"http://192.168.1.124/app/kibana" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36" "-"

截取方法:

cat access.log | awk '$4 >="[14/Sep/2017:16:30:50" && $4 <="[14/Sep/2017:16:35:50"' >>/tmp/nginx-access.com

HAPROXY 日志格式:

Sep 28 10:17:45 localhost haproxy[12443]: 192.168.1.36:52935 [28/Sep/2017:10:17:45.950] Dianda server_new/192.168.0.19 0/0/0/1/1 304 237 - - ---- 604/96/0/0/0 0/0 "GET /images/logo.png HTTP/1.1"

截取方法:

cat  haproxy.log |egrep "Sep 28"| awk '$3 >="10:08:00" && $3 <="10:18:00"' >/tmp/haproxy.txt

如果有些要过滤的可以考虑用 egrep

比如碰到这种格式

[2017-09-12 00:00:33.866] [INFO] access - 192.168.1.10  get  api_server-121000-1505145633854  /v2/promotion/announcement  200  12ms
[2017-09-12 00:00:18.720] [INFO] access - 192.168.0.100  579  get  api_server-579-1504195218720  /v3/user/coupon/list  {}  {}  null

cat access.log-2017-08-31.log |awk '{print $9  , $NF}' |egrep  -v 'null|\{|\}|N'|awk -F'ms' '{print $1}' |awk  '$NF >= 1500'

执行结果, 访问哪个api 用时多少

/v2/order/query/type/count 2644
/v2/order/query/type/count 1901
/v2/order/query/type/count 2837
/v1/message/client/device 3673
/v2/order/place/new 1640
/v2/order/place 1890
/v1/message/client/device 1635
/v1/message/client/device 1667
/v1/message/client/device 1747

在段落前面加一行自定义数据

假设: a="11111111111111"

echo $a |awk '{for(i=1;i<=NF;i++){printf "--exclude="$i" "}{print ""}}'

结果变成:

--exclude=11111111111111