A note on Kafka failing to start properly on Tencent Cloud

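Time: 2024-04-17 13:56:04
Problem description: I had just installed ZooKeeper and Kafka on three new Tencent Cloud nodes and edited the relevant configuration files. I then started ZooKeeper first and Kafka after it.
Running jpsall afterwards showed both processes up (there was a catch here: Kafka had not really come up, it died on its own a few seconds later, but I assumed it had started normally).
I then tried to list the Kafka topics: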
bin/kafka-topics.sh --bootstrap-server hadoop102:9092 --list
[Error printed to the console]
Connection to node -1 (hadoop102/172.17.0.16:9092) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
It says my node is unavailable.
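The catch described above (a process that shows up in the process list and then dies a few seconds later) can be caught by looking again after a short wait. This is only a minimal sketch: `still_alive_after` is a hypothetical helper name, and the commented usage assumes `jps` is on the PATH and that the broker is listed under the main class name `Kafka`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the process with PID $1
# is still running after waiting $2 seconds (default 10).
still_alive_after() {
    local pid=$1 wait_secs=${2:-10}
    sleep "$wait_secs"
    # kill -0 sends no signal; it only tests whether the PID exists.
    kill -0 "$pid" 2>/dev/null
}

# Possible usage on one node (assumption: jps lists the broker as "Kafka"):
#   pid=$(jps | awk '$2 == "Kafka" {print $1}')
#   if still_alive_after "$pid" 10; then echo "Kafka still up"; else echo "Kafka died"; fi
```

Checking once right after startup is what made the problem easy to miss here; a second look a few seconds later would have shown it immediately.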

I ran jpsall again, and sure enough Kafka had died on all three nodes. So I went to look at the logs in Kafka's log directory:
[Error in Kafka's log directory]
[2024-04-13 08:42:36,154] WARN Session 0x0 for server hadoop102/172.17.0.16:2181, Closing socket connection. Attempting reconnect except it is a SessionExpiredException. (org.apache.zookeeper.ClientCnxn)
EndOfStreamException: Unable to read additional data from server sessionid 0x0, likely server has closed socket
        at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:77)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1289)
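The console error shown earlier means the client tried to reach the Kafka node (broker) at hostname hadoop102, IP 172.17.0.16, on port 9092, and could not establish a connection. Possible causes include network problems, the broker process not running, misconfiguration, or firewall settings. The broker log above, for its part, shows Kafka itself failing to keep a connection to ZooKeeper on port 2181.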
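Since both errors come down to a TCP connection that could not be made or kept, one quick thing to test is whether the two ports accept connections at all. The sketch below is an assumption-laden example, not part of the original troubleshooting: `port_open` is a hypothetical helper, and it relies on `bash` (for its `/dev/tcp` feature) and the coreutils `timeout` command being installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a TCP connection to host $1, port $2
# can be opened within 3 seconds.
port_open() {
    local host=$1 port=$2
    # bash treats /dev/tcp/HOST/PORT as "open a TCP connection".
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Possible usage with the host and ports from the logs:
#   port_open hadoop102 2181 && echo "ZooKeeper port open" || echo "ZooKeeper port closed"
#   port_open hadoop102 9092 && echo "Kafka port open"     || echo "Kafka port closed"
```

An open port only shows that something is listening; it does not prove the service behind it is healthy.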

Tracing the cause: Kafka did not come up because ZooKeeper had not started successfully.
[Error in ZooKeeper's log directory]

2024-04-13 09:15:24,561 [myid:2] - WARN  [NIOWorkerThread-3:NIOServerCnxn@371] - Unexpected exception
EndOfStreamException: Unable to read additional data from client, it probably closed the socket: address = /127.0.0.1:34296, session = 0x2000040fcc40002
        at org.apache.zookeeper.server.NIOServerCnxn.handleFailedRead(NIOServerCnxn.java:170)
        at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:333)
        at org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:508)
        at org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
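A Baidu search turned up these possible causes: firewall not turned off? Abnormal network connection? Client and server version mismatch? A parameter configuration problem?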

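Fine, let's try them one by one:
1. First, stop the firewall on all three machines:
    systemctl stop firewalld (ran it on every node)
2. Next, check that the three nodes can reach each other over the network:
    ping hadoop103 worked fine
3. Versions:
    Unlikely. Other people run this same pair of versions without hitting this error, so this is probably not the problem.
4. Configuration:
    Checked Kafka's server.properties file, specifically the three parameters broker.id, advertised.listeners, and zookeeper.connect. All of them looked correct.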
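
The configuration check in step 4 can be done in one command per node instead of scrolling through the file. A minimal sketch: `show_kafka_params` is a hypothetical helper name, and the path in the usage comment assumes the command is run from the Kafka installation directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the active (uncommented) settings for
# broker.id, advertised.listeners and zookeeper.connect from file $1.
show_kafka_params() {
    local file=$1
    grep -E '^[[:space:]]*(broker\.id|advertised\.listeners|zookeeper\.connect)[[:space:]]*=' "$file"
}

# Possible usage (assumption: current directory is the Kafka install dir):
#   show_kafka_params config/server.properties
```

A parameter that is missing from the output is either commented out or absent, which is worth knowing for advertised.listeners in particular.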

All four checks came back clean. I restarted ZooKeeper and Kafka, and they still would not come up. Damn it.
5. Could it be a Tencent Cloud thing?

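    Opened /etc/hosts, commented out the pile of messy entries at the top, and distributed the file to the other nodes.
    Restarted ZooKeeper and Kafka on all three machines.
    Success!!!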
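
The note does not record which /etc/hosts entries were at fault. One thing that may be worth looking for in a similar situation (an assumption on my part, not something confirmed above) is a cluster hostname that is mapped to more than one address, or to a loopback address. The sketch below uses a hypothetical helper, `hosts_entries_for`, that lists every active mapping for a given hostname.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "IP hostname" for every active (non-comment)
# line of a hosts file that maps hostname $1. File defaults to /etc/hosts.
hosts_entries_for() {
    local name=$1 file=${2:-/etc/hosts}
    awk -v name="$name" '
        { sub(/#.*/, "") }    # strip comments, whole-line or trailing
        NF >= 2 {
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $1, name; break }
        }
    ' "$file"
}

# Possible usage on each node:
#   hosts_entries_for hadoop102
# More than one output line means the name has more than one mapping.
```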