使用一个 mysqld exporter 监控所有的MySQL实例

时间:2023-02-01 12:16:15

说明

之前文章有聊过使用Prometheus监控自建Mysql,感兴趣的同学可以看下。本次我们简单聊下结合consulmanager使用一个 mysqld exporter 监控所有的MySQL实例的具体实现方法。

主要步骤:

  • 数据库授权

  • 部署 mysql exporter

  • 对接 Prometheus 监控

  • 对接 consulmanager 服务发现

  • 对接 grafana 数据看板

  • 监控报警规则

  • 对接钉钉告警机器人

数据库授权

创建 mysql 监控专用账号:

mysql> CREATE USER 'monitoring'@'%'  IDENTIFIED BY 'Test#123';
Query OK, 0 rows affected (0.09 sec)

mysql> GRANT SELECT, PROCESS, SUPER, REPLICATION CLIENT, RELOAD ON *.* TO 'monitoring'@'%';
Query OK, 0 rows affected (0.01 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.02 sec)

登入测试:

mysql -h 192.168.1.1 -umonitoring -pTest#123

部署 mysql exporter

这里提供两种部署方式,具体文件详见下文。

docker-compose 部署 exporter

docker-compose.yml文件内容如下

version: "3.2"
services:
  mysqld_exporter:
    image: swr.cn-south-1.myhuaweicloud.com/starsl.cn/mysqld_exporter:latest
    container_name: mysqld_exporter
    hostname: mysqld_exporter
    restart: always
    ports:
      - "9104:9104"
    volumes:
      - /usr/share/zoneinfo/PRC:/etc/localtime
    environment:
      MYSQLD_EXPORTER_PASSWORD: Test#123
    entrypoint:
      - /bin/mysqld_exporter
      - --collect.info_schema.innodb_metrics
      - --collect.info_schema.tables
      - --collect.info_schema.processlist
      - --collect.info_schema.tables.databases=*
      - --mysqld.username=monitoring

说明:

  • docker-compose中有2个变量:监控专用的mysql账号和密码,注意修改掉后再启动。

  • docker-compose配置方式是所有的mysql实例都配置了一样的mysql监控账号和密码。

  • 如果你有不同mysql实例需要配置不同监控账号密码的需求,请参考官方readme使用配置文件的方式启动。

启动 mysql exporter 服务:

docker-compose up -d

k8s 部署 exporter

mysql-exporter.yaml文件内容如下:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: mysql-exporter
  name: mysql-exporter
  namespace: kubesphere-monitoring-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mysql-exporter
  template:
    metadata:
      labels:
        app: mysql-exporter
    spec:
      containers:
      - name: mysql-exporter
        image: swr.cn-south-1.myhuaweicloud.com/starsl.cn/mysqld_exporter:latest
        command:
          - /bin/mysqld_exporter
        args:
          - --collect.info_schema.innodb_metrics
          - --collect.info_schema.tables
          - --collect.info_schema.processlist
          - --collect.info_schema.tables.databases=*
          - --mysqld.username=monitoring
        env:
        - name: TZ
          value: "Asia/Shanghai"
        - name: MYSQLD_EXPORTER_PASSWORD
          value: "Test#123"
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
        ports:
        - name: http-metrics
          containerPort: 9104
          protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: mysql-exporter
  name: mysql-exporter
  namespace: kubesphere-monitoring-system
spec:
  ports:
  - name: http-metirc
    protocol: TCP
    port: 9104
    targetPort: 9104
  selector:
    app: mysql-exporter

启动 mysql exporter 服务:

kubectl apply -f mysql-exporter.yaml

对接 Prometheus 监控

prometheus 自动发现 mysql 服务配置如下:

- job_name: multi_mysqld_exporter
  scrape_interval: 15s
  scrape_timeout: 5s
  metrics_path: /probe
  consul_sd_configs:
      ## consul 服务地址
    - server: '192.168.10.60:8500'
      token: 'fe48c9a4-364e-af23-81df-9f28303012af'
      refresh_interval: 30s
      services: ['selfrds_exporter']
  relabel_configs:
    - source_labels: [__meta_consul_tags]
      regex: .*OFF.*
      action: drop
    - source_labels: [__meta_consul_service_address,__meta_consul_service_port]
      regex: ([^:]+)(?::\d+)?;(\d+)
      target_label: __param_target
      replacement: $1:$2
    - source_labels: [__param_target]
      target_label: instance
    - target_label: __address__
      ## mysql exporter 服务地址和端口
      replacement: mysql-exporter.kubesphere-monitoring-system.svc:9104
    - source_labels: ['__meta_consul_service_metadata_vendor']
      target_label: vendor
    - source_labels: ['__meta_consul_service_metadata_region']
      target_label: region
    - source_labels: ['__meta_consul_service_metadata_group']
      target_label: group
    - source_labels: ['__meta_consul_service_metadata_account']
      target_label: account
    - source_labels: ['__meta_consul_service_metadata_name']
      target_label: name
    - source_labels: ['__meta_consul_service_metadata_iid']
      target_label: iid
    - source_labels: ['__meta_consul_service_metadata_exp']
      target_label: exp
    - source_labels: ['__meta_consul_service_metadata_cpu']
      target_label: cpu
    - source_labels: ['__meta_consul_service_metadata_mem']
      target_label: mem
    - source_labels: ['__meta_consul_service_metadata_disk']
      target_label: disk
    - source_labels: ['__meta_consul_service_metadata_itype']
      target_label: itype

对接 consulmanager 服务发现

登入consulmanager点击“ 云资源管理 ”,“ MySQL管理 ”,“ 自建MySQL管理 ”,点击新增或导入来添加待监控mysql服务信息。

使用一个 mysqld exporter 监控所有的MySQL实例

添加成功后,登入 prometheus Targets 控制台,可以看到multi_mysqld_exporter下新增的 mysql 服务信息。

对接 grafana 数据看板

Grafana 看板详情, Grafana 看板ID:17320:

使用一个 mysqld exporter 监控所有的MySQL实例

监控报警规则

consul-mysql-exporter-rules.yaml文件内容如下:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: consul-mysql-exporter
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 1.3.1
    prometheus: k8s
    role: alert-rules
  name: consul-mysql-exporter-rules
  namespace: kubesphere-monitoring-system
spec:
  groups:
  - name: MySQL-Alert
    rules:
    - alert: MySQL_is_down
      expr: mysql_up == 0
      for: 3m
      labels:
        severity: 紧急
      annotations:
        description: "{{ $labels.group }}_{{ $labels.name }}:MySQL database is down. \n> {{ $labels.instance }}\n> {{ $labels.iid }}"

    - alert: MySQL_慢查询过多
      expr: delta(mysql_global_status_slow_queries[1m]) > 60
      for: 1m
      labels:
        severity: 警告
      annotations:
        description: "{{ $labels.group }}_{{ $labels.name }}:每分钟慢查询:{{ $value }} \n> {{ $labels.instance }}\n> {{ $labels.iid }}"

    - alert: MySQL_当前活跃的连接数过多
      expr: mysql_global_status_threads_running > 100
      for: 1m
      labels:
        severity: 紧急
      annotations:
        description: "{{ $labels.group }}_{{ $labels.name }}:当前活跃的连接数:{{ $value }} \n> {{ $labels.instance }}\n> {{ $labels.iid }}"

    - alert: MySQL_当前updating状态的线程过多
      expr: mysql_info_schema_processlist_threads{state=~"updating"} > 100
      for: 1m
      labels:
        severity: 警告
      annotations:
        description: "{{ $labels.group }}_{{ $labels.name }}:当前updating状态的线程:{{ $value }} \n> {{ $labels.instance }}\n> {{ $labels.iid }}"

    - alert: MySQL_High_QPS
      expr: irate(mysql_global_status_questions[3m]) > 30000
      for: 2m
      labels:
        severity: 警告
      annotations:
        description: "{{ $labels.group }}_{{ $labels.name }}:Mysql QPS:{{ $value | humanize }} \n> {{ $labels.instance }}\n> {{ $labels.iid }}"

    - alert: MySQL_Too_Many_Connections
      expr: irate(mysql_global_status_threads_connected[3m]) > 1000
      for: 2m
      labels:
        severity: 警告
      annotations:
        description: "{{ $labels.group }}_{{ $labels.name }}:Mysql Connections:{{ $value | humanize }} \n> {{ $labels.instance }}\n> {{ $labels.iid }}"

    - alert: MySQL_主从IO线程运行状态异常
      expr: mysql_slave_status_master_server_id > 0 and ON (instance) mysql_slave_status_slave_io_running == 0
      for: 1m
      labels:
        severity: 紧急
      annotations:
        description: "{{ $labels.group }}_{{ $labels.name }}:MySQL Slave IO thread not running \n> {{ $labels.instance }}\n> {{ $labels.iid }}"
    
    - alert: MySQL_主从SQL线程运行状态异常
      expr: mysql_slave_status_master_server_id > 0 and ON (instance) mysql_slave_status_slave_sql_running == 0
      for: 1m
      labels:
        severity: 紧急
      annotations:
        description: "{{ $labels.group }}_{{ $labels.name }}:MySQL Slave SQL thread not running \n> {{ $labels.instance }}\n> {{ $labels.iid }}"

    - alert: MySQL_主从复制延迟过高
      expr: mysql_slave_status_seconds_behind_master > 3
      for: 1m
      labels:
        severity: 紧急
      annotations:
        description: "{{ $labels.group }}_{{ $labels.name }}:主从复制延迟当前:{{ $value | humanize }}s \n> {{ $labels.instance }}\n> {{ $labels.iid }}"

    - alert: MySQL_is_Restart
      expr: mysql_global_status_uptime <600
      for: 2m
      labels:
        severity: 紧急
      annotations:
        description: "{{ $labels.group }}_{{ $labels.name }}:MySQL database is Restart. \n> {{ $labels.instance }}\n> {{ $labels.iid }}"

部署监控报警规则:

kubectl apply -f consul-mysql-exporter-rules.yaml

对接钉钉告警机器人

这里简单写下路由:

- "match_re":
    "region": "mysql.*"
  "receiver": "dd-mysql-prod"

参考文档

使用一个 mysqld exporter 监控所有的MySQL实例.md