ELK Log Collection

Date: 2021-06-15 00:03:44

Current logging pain points

  1. Ops staff constantly have to log in to servers to fetch logs for developers and testers
  2. Logs are only consulted after something breaks; they are never used to predict problems in advance
  3. With clustered services, logs have to be pulled from multiple machines
  4. Developers produce non-standard logs with no conventions: inconsistent log directories and unclear log types (system, error, access, runtime, device, debug)

ELK addresses all of these pain points.

For logs to be useful, four stages are needed:

  1. Collection
  2. Storage
  3. Search and presentation
  4. Analysis, enabling failure early-warning and business insight

elasticsearch, logstash, and kibana cover the first three stages:

es: storage and search

logstash: collection

kibana: presentation

Elasticsearch is written in Java and Logstash is written in JRuby; both run on the JVM, so the hosts need a JDK installed (OpenJDK works; Android is reportedly moving from the Sun/Oracle JDK to OpenJDK to slim the system down).
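A minimal sketch of preparing the runtime on CentOS/RHEL (assumes yum and the stock OpenJDK 8 package, which the 6.x stack targets):

# install OpenJDK 8 and verify it is on the PATH
yum install -y java-1.8.0-openjdk
java -version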

Installing and configuring es

The recommended way to install es is via yum (a tarball install also works: download the tar, unpack, run; its advantage is easy version upgrades)

https://www.elastic.co/guide/en/elasticsearch/reference/current/rpm.html

1. Download and install the public signing key:

rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch

2. Create a file called elasticsearch.repo in the /etc/yum.repos.d/ directory (RedHat-based distributions):

[elasticsearch-6.x]
name=Elasticsearch repository for 6.x packages
baseurl=https://artifacts.elastic.co/packages/6.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md

3. Your repository is now ready for use. Install Elasticsearch with:

sudo yum install elasticsearch

Configuration:

There is not much to configure: the cluster name (important), node name (important), whether to lock memory, the data path, the log path, the IP to listen on, and the HTTP port:

grep "^[a-z]" /etc/elasticsearch/elasticsearch.yml

cluster.name: oldgirl
node.name: linux-node-1
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
bootstrap.memory_lock: true
network.host: 0.0.0.0
http.port: 9200

bootstrap.memory_lock: true locks the heap in memory. With it enabled, startup can fail because limits.conf does not grant the memlock permission; add the lines the error message in the log suggests:

[2018-07-01T14:15:44,143][WARN ][o.e.b.JNANatives ] Increase RLIMIT_MEMLOCK, soft limit: 65536, hard limit: 65536
[2018-07-01T14:15:44,144][WARN ][o.e.b.JNANatives ] These can be adjusted by modifying /etc/security/limits.conf, for example:
    # allow user 'elasticsearch' mlockall
    elasticsearch soft memlock unlimited
    elasticsearch hard memlock unlimited
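On an RPM install Elasticsearch runs under systemd, which enforces its own limit regardless of limits.conf; a sketch of the standard drop-in fix (this step is not in the original notes):

# systemctl edit elasticsearch   -> opens an override file; add:
[Service]
LimitMEMLOCK=infinity
# then reload and restart:
systemctl daemon-reload
systemctl restart elasticsearch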

That completes a single-node es install; test it at http://IP:9200

{
  "name" : "linux-node-1",
  "cluster_name" : "oldgirl",
  "cluster_uuid" : "5hmMNxc5QxG6q-2t2VNqrg",
  "version" : {
    "number" : "6.3.0",
    "build_flavor" : "default",
    "build_type" : "rpm",
    "build_hash" : "424e937",
    "build_date" : "2018-06-11T23:38:03.357887Z",
    "build_snapshot" : false,
    "lucene_version" : "7.3.1",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}

If you see this output, the es node is up; next comes getting data into it.

How do you talk to es? Two broad approaches:

a Java API, and a RESTful API

We will use the RESTful API, exchanging JSON with es.

For example, in a shell:

curl -H 'Content-Type: application/json' -i -X GET 'http://127.0.0.1:9200/_count?pretty' -d '
{
  "query": {
    "match_all": {}
  }
}'

The response:

HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 114

{
  "count" : 0,
  "_shards" : {
    "total" : 0,
    "successful" : 0,
    "skipped" : 0,
    "failed" : 0
  }
}

-X GET sets the request method

-i prints the response headers

-H 'Content-Type: application/json' tells the server to parse the request body as JSON; without it you get this error:

HTTP/1.1 406 Not Acceptable
content-type: application/json; charset=UTF-8
content-length: 109

{
  "error" : "Content-Type header [application/x-www-form-urlencoded] is not supported",
  "status" : 406
}
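The same pattern works for other everyday checks, for example cluster health (a standard endpoint, added here for reference):

curl -s 'http://127.0.0.1:9200/_cluster/health?pretty'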

Driving the es RESTful API from the shell with curl works, but it is clumsy. es has many plugins that provide a web UI for the RESTful API;

the officially recommended one is no longer supported as of elasticsearch 6.x, so we use the open-source elasticsearch-head instead. GitHub: https://github.com/mobz/elasticsearch-head

Installation:

Running with built in server

git clone git://github.com/mobz/elasticsearch-head.git
cd elasticsearch-head
npm install
npm run start
open http://localhost:9100/

Then edit the elasticsearch config file:

vim /etc/elasticsearch/elasticsearch.yml

Append these two lines at the end:

http.cors.enabled: true
http.cors.allow-origin: "*"

Then visit:

http://localhost:9100/

and add http://localhost:9200

Now we can interact with the elasticsearch RESTful API through a web UI.
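To have something to look at in elasticsearch-head, you can create a throwaway index first (the index name is illustrative):

curl -X PUT 'http://127.0.0.1:9200/test-index?pretty'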

Next, build an elasticsearch cluster.

Installation is identical on every node; just set the same cluster name in each config file.

After startup, es announces which cluster it belongs to via multicast or unicast discovery. Note that multicast is no longer usable in 6.x, so use unicast. The unicast config:

discovery.zen.ping.unicast.hosts: ["host1", "host2"] (IPs are preferred here)

You do not need to list every node, only one or two, because membership propagates through discovery.
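Putting it together, a sketch of a second node's elasticsearch.yml (IPs and names are illustrative):

cluster.name: oldgirl                                             # must match on every node
node.name: linux-node-2                                           # unique per node
network.host: 0.0.0.0
discovery.zen.ping.unicast.hosts: ["10.211.55.8", "10.211.55.9"]
discovery.zen.minimum_master_nodes: 2                             # (master-eligible nodes / 2) + 1, guards against split brain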

How do you tell whether a node has joined the cluster? Two ways: it appears in the elasticsearch-head overview,

or check the elasticsearch log; the log file is named after the cluster.

There is also the bigdesk monitoring plugin, which unfortunately stopped being supported after 2.0, and kopf, likewise unsupported after 3.0. es is moving toward platform products, so treat these plugins as background knowledge; in production prefer a platform offering, it saves a lot of operational cost.

Those are the three common plugins, and two of them no longer work.

With the es cluster installed and configured and the basic concepts covered, we turn to logstash. There is a lot to learn about es itself, but for ops the most important job is collecting logs, so from here the focus is on logstash.

Installing logstash

Does logstash have to be installed on every server? Not necessarily: if logs arrive over the network, no; if you are collecting local text files, yes.

https://www.elastic.co/guide/en/logstash/current/installing-logstash.html

YUM

Download and install the public signing key:

rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch

Add the following in your /etc/yum.repos.d/ directory in a file with a .repo suffix, for example logstash.repo

vim /etc/yum.repos.d/logstash.repo

[logstash-6.x]
name=Elastic repository for 6.x packages
baseurl=https://artifacts.elastic.co/packages/6.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md

And your repository is ready for use. You can install it with:

sudo yum install logstash

logstash is written in JRuby, so it is slow to start.

/usr/share/logstash/bin/logstash -e 'input { stdin{} } output { stdout{} }'

-e means execute the config string that follows

one input, one output

stdin{} and stdout{} are two plugins

startup takes about a minute

[root@node2 elasticsearch]# /usr/share/logstash/bin/logstash -e 'input { stdin{} } output { stdout{} }'
WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults
Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
[WARN ] 2018-07-01 15:03:59.682 [LogStash::Runner] multilocal - Ignoring the 'pipelines.yml' file because modules or command line options are specified
[INFO ] 2018-07-01 15:04:00.629 [LogStash::Runner] runner - Starting Logstash {"logstash.version"=>"6.3.0"}
[INFO ] 2018-07-01 15:04:03.885 [Converge PipelineAction::Create] pipeline - Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50}
The stdin plugin is now waiting for input:
[INFO ] 2018-07-01 15:04:04.098 [Converge PipelineAction::Create] pipeline - Pipeline started successfully {:pipeline_id=>"main", :thread=>"#<Thread:0x1b16cf42 run>"}
[INFO ] 2018-07-01 15:04:04.225 [Ruby-0-Thread-1: /usr/share/logstash/lib/bootstrap/environment.rb:6] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[INFO ] 2018-07-01 15:04:04.547 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9600}
hello world
{
  "@version" => "1",
  "@timestamp" => 2018-07-01T07:04:13.785Z,
  "message" => "hello world",
  "host" => "node2.shared"
}
hehehe
{
  "@version" => "1",
  "@timestamp" => 2018-07-01T07:04:20.411Z,
  "message" => "hehehe",
  "host" => "node2.shared"
}

That is the basic stdin/stdout example. The same thing with the rubydebug codec, which pretty-prints each event:

/usr/share/logstash/bin/logstash -e 'input { stdin{} } output { stdout{ codec => rubydebug } }'
...
hello
{
  "message" => "hello",
  "@version" => "1",
  "@timestamp" => 2018-07-01T07:08:02.456Z,
  "host" => "node2.shared"
}

Each piece of data logstash takes in is called an event, not a line: several lines may make up one event; an error report, for example, is certainly more than one line.

Writing events to es

Keep stdin as the input and change the output:

/usr/share/logstash/bin/logstash -e 'input { stdin{} } output { elasticsearch { hosts => ["10.211.55.8:9200"] } }'

Official docs: https://www.elastic.co/guide/en/logstash/current/index.html

Writing to es is that simple.

Can we output to es and to the console at the same time? Yes: it is fan-out, not load balancing; one input can feed multiple outputs.

/usr/share/logstash/bin/logstash -e 'input { stdin{} } output { elasticsearch { hosts => ["10.211.55.8:9200"] } stdout { codec => rubydebug } }'

What is this good for? In production you write to es and to a text file at the same time. Keeping the text copy is best practice, for three reasons: 1. it is the simplest format, 2. it can be reprocessed later, 3. it compresses best. What format should logs be kept in? Text.
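The same idea in config-file form, a minimal sketch using the stock file output plugin (the path here is illustrative):

output {
  elasticsearch { hosts => ["10.211.55.8:9200"] }                # index into es
  file { path => "/var/log/logstash/raw-%{+YYYY.MM.dd}.log" }    # and keep a plain-text copy
}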

Next we learn to write logstash config files; you cannot keep working on the command line, and a config file is far more convenient.

The simplest config file:

vim /etc/logstash/conf.d/logstash-simple.conf

input { stdin { } }

output {
  elasticsearch { hosts => ["10.211.55.8:9200"] }
  stdout { codec => rubydebug }
}

Then start it:

/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/logstash-simple.conf

The main thing to learn is the logstash config syntax:

# This is a comment. You should use comments to describe
# parts of your configuration.
input {
  ...
}

filter {
  ...
}

output {
  ...
}

input{} and output{} are required; filter{} is optional. Multiple inputs of the same plugin are fine, for example two file inputs:

input {
  file {
    path => "/var/log/messages"
    type => "syslog"
  }

  file {
    path => "/var/log/apache/access.log"
    type => "apache"
  }
}

Case 1

The most common input is a file:

vim /etc/logstash/conf.d/file.conf

input {
  file {
    path => "/var/log/messages"
    type => "system"
    start_position => "beginning"
  }
}

output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => ["10.211.55.8:9200"]
    index => "system-%{+YYYY.MM.dd}"
  }
}

Next we collect not only the system log but also a Java log.

Case 2

vim /etc/logstash/conf.d/file.conf

input {
  file {
    path => "/var/log/messages"
    type => "system"
    start_position => "beginning"
  }

  file {
    path => "/var/log/elasticsearch/oldgirl.log"
    type => "es-error"
    start_position => "beginning"
  }
}

output {
  if [type] == "system" {
    elasticsearch {
      hosts => ["10.211.55.8:9200"]
      index => "system-%{+YYYY.MM.dd}"
    }
  }

  if [type] == "es-error" {
    elasticsearch {
      hosts => ["10.211.55.8:9200"]
      index => "es-error-%{+YYYY.MM.dd}"
    }
  }
}

Routing is done with if-tests on the type field.

The 6.x file plugin docs no longer list the type option, but it still works, and nothing else can substitute for it here.

Note that we have not yet parsed the message content into fields; if the parsed fields include their own type attribute, a type set in the file input will be overridden and the if-test will stop matching.

You could also run several logstash processes on one server, one per service log, but that costs CPU and memory.

Detected a 6.x and above cluster: the type event field won't be used to determine the document _type {:es_version=>6}

This startup message tells us that the type set in the file input is not the _type you see when browsing data in es.

Viewing these logs in elasticsearch has a problem: one error message should be one event and should display as one event, but reading the file line by line splits it across many events, which is very inconvenient. How do we gather it into a single event? Time to introduce a codec.

Case 3

input {
  stdin {
    codec => multiline {
      pattern => "pattern, a regexp"
      negate => "true" or "false"
      what => "previous" or "next"
    }
  }
}

The three parameters explained:

pattern: the regex that decides where merging happens

negate: if "true", the merge rule applies to lines that do NOT match the pattern

what: whether the merged line is appended to the previous event or attached to the next one

input {
  stdin {
    codec => multiline {
      pattern => "^\["
      negate => "true"
      what => "previous"
    }
  }
}

output {
  stdout {
    codec => rubydebug
  }
}

A line starting with [ opens a new event; any line that does not start with [ is merged into the previous event.
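For intuition, here is how an illustrative Java-style log would be split (the log lines below are invented for the walkthrough):

[2018-07-01T15:00:00,000][ERROR][o.e.x.Example] request failed    <- starts with [, opens event 1
java.lang.NullPointerException                                    <- no leading [, appended to event 1
    at com.example.Foo.bar(Foo.java:42)                           <- appended to event 1
[2018-07-01T15:00:01,000][INFO ][o.e.x.Example] next entry        <- starts with [, event 1 is emitted, event 2 opens

Now fold the multiline codec into the combined config: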

vim /etc/logstash/conf.d/all.conf

input {
  file {
    path => "/var/log/messages"
    type => "system"
    start_position => "beginning"
  }

  file {
    path => "/var/log/elasticsearch/oldgirl.log"
    type => "es-error"
    start_position => "beginning"
    codec => multiline {
      pattern => "^\["
      negate => "true"
      what => "previous"
    }
  }
}

output {
  if [type] == "system" {
    elasticsearch {
      hosts => ["10.211.55.8:9200"]
      index => "system-%{+YYYY.MM.dd}"
    }
  }

  if [type] == "es-error" {
    elasticsearch {
      hosts => ["10.211.55.8:9200"]
      index => "es-error-%{+YYYY.MM.dd}"
    }
  }
}

/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/all.conf

Browsing results in elasticsearch-head is inconvenient, which brings in the kibana service.

kibana is the visualization platform for elasticsearch.

https://www.elastic.co/guide/en/kibana/current/index.html

kibana started out in PHP, was rewritten in Ruby, then JRuby, and is now Node.js.

wget https://artifacts.elastic.co/downloads/kibana/kibana-6.3.0-linux-x86_64.tar.gz
shasum -a 512 kibana-6.3.0-linux-x86_64.tar.gz
tar -xzf kibana-6.3.0-linux-x86_64.tar.gz
mv kibana-6.3.0-linux-x86_64/ /usr/local/
cd /usr/local/
ln -s kibana-6.3.0-linux-x86_64/ kibana

Edit the kibana config file:

cd /usr/local/kibana/config
vim kibana.yml

Four settings to change:

server.port: 5601
server.host: "0.0.0.0"
elasticsearch.url: "http://10.211.55.8:9200"
kibana.index: ".kibana"

kibana.index deserves a note: kibana has no database of its own, but its data has to live somewhere. Since kibana and es are joined at the hip, it simply has es create a .kibana index and stores its data there.

With the config done, start kibana.
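The tarball has no service unit, so a minimal way to start it in the background (the log path is illustrative):

cd /usr/local/kibana
nohup ./bin/kibana > /var/log/kibana.log 2>&1 &
# then browse to http://IP:5601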

So far we have collected the system log and a Java log (the es runtime log); next up is the nginx log.

es has the concept of fields. Roughly: an index is a database instance, _type is a table, and a field is a column; the goal is to turn the message content into key:value pairs.

By configuring nginx.conf, nginx can emit its access log uniformly in JSON format. logstash hands it to es, and es parses that JSON straight into k:v form, which makes later searching in kibana with ELK much more efficient.

Configuring nginx to log JSON, per nginx.org:

http://nginx.org/en/docs/http/ngx_http_log_module.html documents the nginx log module, including:

Syntax: log_format name [escape=default|json|none] string ...;
Default: log_format combined "...";
Context: http

We only need to add the following inside the http block of nginx.conf:

log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                '$status $body_bytes_sent "$http_referer" '
                '"$http_user_agent" "$http_x_forwarded_for"';

log_format json '{"@timestamp":"$time_iso8601",'
                '"@version":"1",'
                '"url":"$uri",'
                '"status":"$status",'
                '"domain":"$host",'
                '"host":"$server_addr",'
                '"size":$body_bytes_sent,'
                '"responsetime":$request_time,'
                '"referer": "$http_referer",'
                '"ua": "$http_user_agent"'
                '}';

access_log /var/log/nginx/access_json.log json;
access_log /var/log/nginx/access.log main;

Start nginx, send it some requests to generate log lines, and confirm they are JSON.
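A quick check (assumes nginx answers on localhost and python is installed; json.tool pretty-prints valid JSON and errors out otherwise):

curl -s http://127.0.0.1/ >/dev/null
tail -1 /var/log/nginx/access_json.log | python -m json.tool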

Now write a json.conf:

vim /etc/logstash/conf.d/json.conf

input {
  file {
    path => "/var/log/nginx/access_json.log"
    codec => json
  }
}

output {
  stdout {
    codec => rubydebug
  }
}

Run it; the output looks like this:

/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/json.conf

[INFO ] 2018-07-01 22:22:36.797 [Ruby-0-Thread-1: /usr/share/logstash/lib/bootstrap/environment.rb:6] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[INFO ] 2018-07-01 22:22:37.539 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9600}
{
  "domain" => "10.211.55.8",
  "@version" => "1",
  "host" => "10.211.55.8",
  "responsetime" => 0.0,
  "@timestamp" => 2018-07-01T14:23:24.000Z,
  "size" => 0,
  "status" => "304",
  "path" => "/var/log/nginx/access_json.log",
  "ua" => "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36",
  "url" => "/index.html",
  "referer" => "-"
}

Now we can add this to all.conf:

input {
  file {
    path => "/var/log/messages"
    type => "system"
    start_position => "beginning"
  }

  file {
    path => "/var/log/nginx/access_json.log"
    type => "nginx-log"
    start_position => "beginning"
    codec => json
  }

  file {
    path => "/var/log/elasticsearch/oldgirl.log"
    type => "es-error"
    start_position => "beginning"
    codec => multiline {
      pattern => "^\["
      negate => "true"
      what => "previous"
    }
  }
}

output {
  if [type] == "system" {
    elasticsearch {
      hosts => ["10.211.55.8:9200"]
      index => "system-%{+YYYY.MM.dd}"
    }
  }

  if [type] == "es-error" {
    elasticsearch {
      hosts => ["10.211.55.8:9200"]
      index => "es-error-%{+YYYY.MM.dd}"
    }
  }

  if [type] == "nginx-log" {
    elasticsearch {
      hosts => ["10.211.55.8:9200"]
      index => "nginx-log-%{+YYYY.MM.dd}"
    }
  }
}

The new index now shows up in elasticsearch-head.

Add the new index pattern in kibana and you can start querying.

Collecting the messages log via syslog

We collected the messages log earlier, but with the file plugin.

The system log is produced by syslog, and syslog can forward logs to a remote host,

so logstash should instead listen on a port and let syslog write straight to it.

Ideally, everything in production would log via syslog: then no machine needs logstash installed to scrape log files; a single logstash listening port is enough.

nginx can also write to syslog: stock nginx supports it natively since 1.7.1; older builds needed Taobao's open-source fork (Tengine) or the nginx Lua module.
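For reference, the native form (nginx >= 1.7.1; the address is illustrative, and transport is UDP by default):

access_log syslog:server=10.211.55.8:514 json;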

syslog is in the logstash input plugin list:

https://www.elastic.co/guide/en/logstash/current/plugins-inputs-syslog.html

vim /etc/logstash/conf.d/syslog.conf

input {
  syslog {
    type => "system-syslog"
    host => "10.211.55.8"
    port => "514"
  }
}

output {
  stdout {
    codec => "rubydebug"
  }
}

/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/syslog.conf

After starting, confirm that port 514 is listening.
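One way to check (the syslog input listens on both TCP and UDP):

ss -lntu | grep 514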

Next, edit the system's rsyslog config:

vim /etc/rsyslog.conf

Find the line:

#*.* @@remote-host:514

uncomment it and change it to:

*.* @@10.211.55.8:514

Then restart the rsyslog service:

systemctl restart rsyslog

On restart you will immediately see log events:

{
  "pid" => "20915",
  "severity" => 5,
  "logsource" => "node2",
  "facility_label" => "security/authorization",
  "timestamp" => "Jul 2 20:56:43",
  "type" => "system-syslog",
  "program" => "polkitd",
  "@timestamp" => 2018-07-02T12:56:43.000Z,
  "facility" => 10,
  "host" => "10.211.55.8",
  "@version" => "1",
  "message" => "Unregistered Authentication Agent for unix-process:1927:9050003 (system bus name :1.1149, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale zh_CN.UTF-8) (disconnected from bus)\n",
  "priority" => 85,
  "severity_label" => "Notice"
}

Now we can fold the syslog.conf settings into all.conf:

input {
  file {
    path => "/var/log/messages"
    type => "system"
    start_position => "beginning"
  }

  file {
    path => "/var/log/nginx/access_json.log"
    type => "nginx-log"
    start_position => "beginning"
    codec => json
  }

  file {
    path => "/var/log/elasticsearch/oldgirl.log"
    type => "es-error"
    start_position => "beginning"
    codec => multiline {
      pattern => "^\["
      negate => "true"
      what => "previous"
    }
  }

  syslog {
    type => "system-syslog"
    host => "10.211.55.8"
    port => "514"
  }
}

output {
  if [type] == "system" {
    elasticsearch {
      hosts => ["10.211.55.8:9200"]
      index => "system-%{+YYYY.MM.dd}"
    }
  }

  if [type] == "es-error" {
    elasticsearch {
      hosts => ["10.211.55.8:9200"]
      index => "es-error-%{+YYYY.MM.dd}"
    }
  }

  if [type] == "nginx-log" {
    elasticsearch {
      hosts => ["10.211.55.8:9200"]
      index => "nginx-log-%{+YYYY.MM.dd}"
    }
  }

  if [type] == "system-syslog" {
    elasticsearch {
      hosts => ["10.211.55.8:9200"]
      index => "system-syslog-%{+YYYY.MM.dd}"
    }
  }
}

After starting it, test with logger:

logger "hallo 1"
logger "hallo 1"
logger "hallo 1"
logger "hallo 1"
logger "hallo 1"
logger "hallo 1"

The config above can serve as a production template.
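On an RPM install you would normally leave running it to systemd; the stock /etc/logstash/pipelines.yml already points the main pipeline at /etc/logstash/conf.d/*.conf, so starting the service picks up all.conf and the others:

systemctl start logstash
systemctl enable logstash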

One more common logstash plugin: tcp.

The syslog input handles syslog traffic; if an application does not want to write its logs to a file at all, logstash can likewise open a raw TCP listening port,

and the program writes its logs straight to that port.

It looks like this:

vim /etc/logstash/conf.d/tcp.conf

input {
  tcp {
    host => "10.211.55.8"
    port => "6666"
  }
}

output {
  stdout {
    codec => "rubydebug"
  }
}

Start it: /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/tcp.conf

Then test with nc:

nc 10.211.55.8 6666 < /etc/resolv.conf

{
  "host" => "node2.shared",
  "message" => "# Generated by NetworkManager",
  "@timestamp" => 2018-07-02T13:20:27.921Z,
  "port" => 44257,
  "@version" => "1"
}
{
  "host" => "node2.shared",
  "message" => "search localdomain shared",
  "@timestamp" => 2018-07-02T13:20:27.943Z,
  "port" => 44257,
  "@version" => "1"
}
{
  "host" => "node2.shared",
  "message" => "nameserver 10.211.55.1",
  "@timestamp" => 2018-07-02T13:20:27.944Z,
  "port" => 44257,
  "@version" => "1"
}

echo "hehe" | nc 10.211.55.8 6666

{

"host" => "node2.shared",

"message" => "hehe",

"@timestamp" => 2018-07-02T13:21:39.242Z,

"port" => 44259,

"@version" => "1"

}

echo "oldgirl" > /dev/tcp/10.211.55.8/6666

{

"host" => "node2.shared",

"message" => "oldgirl",

"@timestamp" => 2018-07-02T13:23:23.936Z,

"port" => 44260,

"@version" => "1"

}

