【甘道夫】Hadoop 2.2.0 NN HA: Detailed Configuration + Client Transparency Experiment (Complete Edition)

Posted: 2021-10-05 00:47:16
Introduction:
I previously reposted an NN HA experiment log written by my teammate 【伊利丹】, and on his environment I also verified that NN HA is transparent to clients.
This post records the full process of configuring NN HA myself, plus a complete test of how transparent HA is to client access. I hope it helps.

Experiment environment
A 4-node Hadoop 2.2.0 cluster with 3 ZooKeeper nodes (an odd number of ZK nodes is best). The hosts file and the per-node role assignment are as follows:

hosts
192.168.66.91 master
192.168.66.92 slave1
192.168.66.93 slave2
192.168.66.94 slave3

Role assignment
        Active NN  Standby NN  DN  JournalNode  Zookeeper  FailoverController
master  V                          V            V          V
slave1             V           V   V            V          V
slave2                         V   V            V
slave3                         V




Experiment steps:


1. Download a stable ZooKeeper release

and unpack it into a directory on the Hadoop cluster; I put it under /home/yarn/.

2. Edit the configuration file
The config file lives in the conf directory. Rename zoo_sample.cfg to zoo.cfg and make the changes below; this is the modified zoo.cfg:
# The number of milliseconds of each tick (heartbeat interval between ZK servers, and between clients and ZK)
tickTime=2000

# The number of ticks that the initial 
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5

# the directory where the snapshot is stored. 
# do not use /tmp for storage, /tmp here is just 
# example sakes.  Directory holding ZK data; create it yourself and set it here
dataDir=/home/yarn/Zookeeper/zoodata

# the port at which the clients will connect (the port clients use to reach the ZK servers)
clientPort=2181

# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the 
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1

# Directory holding ZK logs; create it yourself and set it here
dataLogDir=/home/yarn/Zookeeper/zoolog

#****** Everything below is for a distributed ZK ensemble ******
# During ensemble initialization, followers must contact the leader; initLimit is
# the maximum number of ticks the leader waits for that initial sync
# (note: this and the syncLimit below override the values set earlier in this file)
initLimit=5

# Maximum number of ticks to wait between sending a message/request and
# receiving the reply, for leader/follower traffic
syncLimit=2

#server.A=B:C:D
# A is a number: this server's id
# B is this server's IP address or hostname
# C is the port followers use to exchange information with the leader
# D is the port used to elect a new leader when the current leader dies
server.1=192.168.66.91:2888:3888 
server.2=192.168.66.92:2888:3888
server.3=192.168.66.93:2888:3888
# IMPORTANT: you must now create a myid file under dataDir on each node, containing that node's A value; in other words, each ZK node's myid file has different content!!!
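Since the myid step is the one that trips people up, here is a small sketch of how each node's myid follows directly from the server.N lines above (IPs and the dataDir path are the ones used in this article; the loop only prints the command you would run on each node, it does not ssh anywhere):

```shell
# Derive each node's myid from the server.N lines of zoo.cfg.
# The N in "server.N" is exactly what goes into $dataDir/myid on that node.
cat > /tmp/zoo-servers.txt <<'EOF'
server.1=192.168.66.91:2888:3888
server.2=192.168.66.92:2888:3888
server.3=192.168.66.93:2888:3888
EOF

while IFS='=' read -r key value; do
  id=${key#server.}      # the myid value for this node
  host=${value%%:*}      # strip the :2888:3888 port pair
  echo "on $host: echo $id > /home/yarn/Zookeeper/zoodata/myid"
done < /tmp/zoo-servers.txt
```

If a node's myid disagrees with its server.N entry, that node cannot join the quorum, so this mapping is worth double-checking before starting ZK.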

3. Edit the environment variables on every node
Add to /etc/profile:
export ZOOKEEPER_HOME=/home/yarn/Zookeeper/zookeeper-3.4.6
and add to PATH:
$ZOOKEEPER_HOME/bin
Note: the export ZOOKEEPER_HOME line must appear above the PATH line.
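In script form, the additions look like this (the ZooKeeper path is the one used in this article; adjust it to your own install location):

```shell
# Appended to /etc/profile on every node.
# ZOOKEEPER_HOME must be defined before PATH references it.
export ZOOKEEPER_HOME=/home/yarn/Zookeeper/zookeeper-3.4.6
export PATH=$ZOOKEEPER_HOME/bin:$PATH
```

After `source /etc/profile`, `zkServer.sh` is runnable from any directory on that node.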

Now on to the Hadoop configuration files:

4. Edit core-site.xml
<configuration>
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://myhadoop</value>
  <description>Note: myhadoop is the cluster's logical name and must match dfs.nameservices in hdfs-site.xml!</description>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/yarn/Hadoop/hdfs2.0/tmp</value>
</property>
<property>
  <name>ha.zookeeper.quorum</name>
  <value>master:2181,slave1:2181,slave2:2181</value>
  <description>IP/host of every ZK node plus the client port, which must match clientPort in zoo.cfg.</description>
</property>
</configuration>

5. Edit hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
  <name>dfs.nameservices</name>
  <value>myhadoop</value>
  <description>
    Comma-separated list of nameservices.
    as same as fs.defaultFS in core-site.xml.
  </description>
</property>

<property>
  <name>dfs.ha.namenodes.myhadoop</name>
  <value>nn1,nn2</value>
  <description>
    The prefix for a given nameservice, contains a comma-separated
    list of namenodes for a given nameservice (eg EXAMPLENAMESERVICE).
  </description>
</property>

<property>
  <name>dfs.namenode.rpc-address.myhadoop.nn1</name>
  <value>master:8020</value>
  <description>
    RPC address for namenode1 of myhadoop
  </description>
</property>

<property>
  <name>dfs.namenode.rpc-address.myhadoop.nn2</name>
  <value>slave1:8020</value>
  <description>
    RPC address for namenode2 of myhadoop
  </description>
</property>

<property>
  <name>dfs.namenode.http-address.myhadoop.nn1</name>
  <value>master:50070</value>
  <description>
    The address and the base port where the dfs namenode1 web ui will listen on.
  </description>
</property>

<property>
  <name>dfs.namenode.http-address.myhadoop.nn2</name>
  <value>slave1:50070</value>
  <description>
    The address and the base port where the dfs namenode2 web ui will listen on.
  </description>
</property>


<property>
  <name>dfs.namenode.servicerpc-address.myhadoop.nn1</name>
  <value>master:53310</value>
</property>
<property>
  <name>dfs.namenode.servicerpc-address.myhadoop.nn2</name>
  <value>slave1:53310</value>
</property>



<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///home/yarn/Hadoop/hdfs2.0/name</value>
  <description>Determines where on the local filesystem the DFS name node
      should store the name table(fsimage).  If this is a comma-delimited list
      of directories then the name table is replicated in all of the
      directories, for redundancy. </description>
</property>

<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://slave1:8485;slave2:8485;slave3:8485/hadoop-journal</value>
  <description>A directory on shared storage between the multiple namenodes
  in an HA cluster. This directory will be written by the active and read
  by the standby in order to keep the namespaces synchronized. This directory
  does not need to be listed in dfs.namenode.edits.dir above. It should be
  left empty in a non-HA cluster.
  </description>
</property>

<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///home/yarn/Hadoop/hdfs2.0/data</value>
  <description>Determines where on the local filesystem an DFS data node
  should store its blocks.  If this is a comma-delimited
  list of directories, then data will be stored in all named
  directories, typically on different devices.
  Directories that do not exist are ignored.
  </description>
</property>

<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
  <description>
    Whether automatic failover is enabled. See the HDFS High
    Availability documentation for details on automatic HA
    configuration.
  </description>
</property>

<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/home/yarn/Hadoop/hdfs2.0/journal/</value>
</property>

<property>
  <name>dfs.client.failover.proxy.provider.myhadoop</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  <description>Configure the name of the Java class which will be used by
  the DFS Client to determine which NameNode is the current Active, and
  therefore which NameNode is currently serving client requests.
  This class is the client's access proxy, and it is the key to HA being transparent to clients!
  </description>
</property>
      
<property>      
  <name>dfs.ha.fencing.methods</name>      
  <value>sshfence</value>  
  <description>the fencing method used during a failover</description>
</property>  
    
<property>      
  <name>dfs.ha.fencing.ssh.private-key-files</name>      
  <value>/home/yarn/.ssh/id_rsa</value>
  <description>location of the SSH private key used by the sshfence method</description>
</property>  
  
<property>  
  <name>dfs.ha.fencing.ssh.connect-timeout</name>  
  <value>1000</value>  
</property>  
  
<property>  
  <name>dfs.namenode.handler.count</name>  
  <value>8</value>  
</property> 

</configuration>

6. Copy the modified core-site.xml and hdfs-site.xml to every Hadoop node.
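Before copying the files around, it is worth checking that fs.defaultFS and dfs.nameservices really agree, since that mismatch is easy to introduce. A rough sketch (it parses small stand-in fragments written to /tmp whose values are copied from this article; on a real cluster, point the grep/sed at the files under your $HADOOP_CONF_DIR instead):

```shell
# Stand-in config fragments (values copied from this article's configs).
cat > /tmp/core-site.xml <<'EOF'
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://myhadoop</value>
</property>
EOF
cat > /tmp/hdfs-site.xml <<'EOF'
<property>
  <name>dfs.nameservices</name>
  <value>myhadoop</value>
</property>
EOF

# Pull out the <value> that follows each property name.
fsdef=$(grep -A1 'fs.defaultFS' /tmp/core-site.xml \
        | sed -n 's|.*<value>hdfs://\([^<]*\)</value>.*|\1|p')
ns=$(grep -A1 'dfs.nameservices' /tmp/hdfs-site.xml \
     | sed -n 's|.*<value>\([^<]*\)</value>.*|\1|p')

if [ "$fsdef" = "$ns" ]; then
  echo "OK: nameservice '$ns' is consistent"
else
  echo "MISMATCH: fs.defaultFS=$fsdef dfs.nameservices=$ns"
fi
```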


7. Startup
(1) Start ZK
On every ZK node, run:
zkServer.sh start

Check each ZK node's role:
yarn@master:~$ zkServer.sh status
JMX enabled by default
Using config: /home/yarn/Zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower

yarn@slave1:~$ zkServer.sh status
JMX enabled by default
Using config: /home/yarn/Zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower

yarn@slave2:~$ zkServer.sh status
JMX enabled by default
Using config: /home/yarn/Zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: leader

Note:
Which ZK node becomes the leader is random; slave2 was elected leader in the first run, and slave1 in the second!

At this point the ZK process is visible on every node:
yarn@master:~$ jps
3084 QuorumPeerMain
3212 Jps

(2) Format ZK (needed on the first run only)
On any ZK node, run:
hdfs zkfc -formatZK
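To confirm the format actually registered the nameservice in ZooKeeper, you can list the HA parent znode with the zkCli.sh client that ships with ZooKeeper (/hadoop-ha is the default parent znode; the else-branch is only a stub so the sketch runs outside the cluster):

```shell
# Check that hdfs zkfc -formatZK created the /hadoop-ha/<nameservice> znode.
if command -v zkCli.sh >/dev/null 2>&1; then
  zkCli.sh -server master:2181 ls /hadoop-ha
else
  echo "off-cluster stub: expected output is [myhadoop] under /hadoop-ha"
fi
```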

(3) Start the ZKFC
The ZookeeperFailoverController monitors NN state and coordinates the active/standby switch, so it only needs to be started on the two NN nodes:
hadoop-daemon.sh start zkfc

Once started, the ZKFC process is visible:
yarn@master:~$ jps
3084 QuorumPeerMain
3292 Jps
3247 DFSZKFailoverController

(4) Start the JournalNodes, the shared storage that keeps metadata in sync between the active and standby NN
Per the role-assignment table, run on each JN node:
hadoop-daemon.sh start journalnode

After startup, the JournalNode process is visible on each JN node:
yarn@master:~$ jps
3084 QuorumPeerMain
3358 Jps
3325 JournalNode
3247 DFSZKFailoverController

(5) Format and start the primary NN
Format it:
hdfs namenode -format

Note: format only on the very first startup of the system; never format again!

Start the NN on the primary NN node:
hadoop-daemon.sh start namenode

The NN process then appears:
yarn@master:~$ jps
3084 QuorumPeerMain
3480 Jps
3325 JournalNode
3411 NameNode
3247 DFSZKFailoverController


(6) On the standby NN, pull in the primary NN's metadata
hdfs namenode -bootstrapStandby

The tail of the log from a successful run:
Re-format filesystem in Storage Directory /home/yarn/Hadoop/hdfs2.0/name ? (Y or N) Y
14/06/15 10:09:08 INFO common.Storage: Storage directory /home/yarn/Hadoop/hdfs2.0/name has been successfully formatted.
14/06/15 10:09:09 INFO namenode.TransferFsImage: Opening connection to http://master:50070/getimage?getimage=1&txid=935&storageInfo=-47:564636372:0:CID-d899b10e-10c9-4851-b60d-3e158e322a62
14/06/15 10:09:09 INFO namenode.TransferFsImage: Transfer took 0.11s at 63.64 KB/s
14/06/15 10:09:09 INFO namenode.TransferFsImage: Downloaded file fsimage.ckpt_0000000000000000935 size 7545 bytes.
14/06/15 10:09:09 INFO util.ExitUtil: Exiting with status 0
14/06/15 10:09:09 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at slave1/192.168.66.92
************************************************************/

(7) Start the standby NN
On the standby NN, run:
hadoop-daemon.sh start namenode

(8) Set the active NN (this step can be skipped; it belongs to the manual-failover procedure, and ZKFC has already elected one node as the active NN automatically)
Up to this point HDFS does not actually know which NN is active, which you can see on the monitoring pages: both NNs show Standby state.
To activate the primary NN, run on it:
hdfs haadmin -transitionToActive nn1

(9) Start the DataNodes from the active NN
On [nn1], start all DataNodes:
hadoop-daemons.sh start datanode

8. Verification 1: automatic active/standby failover
The current active NN is 192.168.66.91, and the standby NN is 192.168.66.92.
[Screenshots of the two NN web UIs omitted: master shows "active", slave1 shows "standby".]

I kill the NameNode process on the active NN:
yarn@master:~$ jps
5161 NameNode
5085 JournalNode
5438 Jps
4987 DFSZKFailoverController
4904 QuorumPeerMain
yarn@master:~$ kill 5161
yarn@master:~$ jps
5451 Jps
5085 JournalNode
4987 DFSZKFailoverController
4904 QuorumPeerMain
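A small polling sketch makes the takeover easy to watch after the kill. It uses the standard `hdfs haadmin -getServiceState` query with the nn1/nn2 IDs from hdfs-site.xml; the stub branch only exists so the loop itself can be tried outside the cluster:

```shell
# Poll both NameNodes until one reports "active" again after the kill.
get_state() {
  if command -v hdfs >/dev/null 2>&1; then
    hdfs haadmin -getServiceState "$1" 2>/dev/null
  else
    echo active    # off-cluster stub value
  fi
}

for i in 1 2 3 4 5; do
  s1=$(get_state nn1)
  s2=$(get_state nn2)
  echo "try $i: nn1=$s1 nn2=$s2"
  if [ "$s1" = active ] || [ "$s2" = active ]; then
    break
  fi
  sleep 2
done
```

On this cluster, a few seconds after killing the process on master you should see nn2 flip from standby to active.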

At this point the former active NN's monitoring page is unreachable, and the standby NN has automatically switched to active.
[Screenshots omitted: master's web UI no longer responds; slave1's web UI now shows "active".]


9. Verification 2: HA transparency to the shell
Access the logical name myhadoop and list the directory tree; nothing is affected:
yarn@slave3:~$ hadoop dfs -ls hdfs://myhadoop/
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

Found 3 items
drwxr-xr-x   - yarn supergroup          0 2014-03-20 00:10 hdfs://myhadoop/home
drwxrwx---   - yarn supergroup          0 2014-03-17 20:11 hdfs://myhadoop/tmp
drwxr-xr-x   - yarn supergroup          0 2014-03-17 20:15 hdfs://myhadoop/workspace

10. Verification 3: HA transparency to client programs
Tested with my own HdfsDAO.java, with the program's HDFS path set to:
private static final String HDFS = "hdfs://myhadoop/";
First ping myhadoop to confirm it is not an entry in hosts (so resolution has to go through the HA client proxy), then run the program; everything works:
yarn@master:~$ ping myhadoop
ping: unknown host myhadoop
yarn@master:~$ hadoop jar Desktop/hatest.jar HdfsDAO 
ls: /
==========================================================
name: hdfs://myhadoop/home, folder: true, size: 0
name: hdfs://myhadoop/tmp, folder: true, size: 0
name: hdfs://myhadoop/workspace, folder: true, size: 0
==========================================================
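To stress the transparency claim further, you can run a listing loop against the logical URI while killing the active NN from another terminal. A sketch (the stub branch lets it run where hadoop is not installed; on the cluster, every attempt should succeed, with at most a brief retry pause during the failover):

```shell
# Repeatedly list the logical nameservice URI. With the
# ConfiguredFailoverProxyProvider configured above, the client
# retries against the other NN, so no attempt should hard-fail
# while the standby takes over.
for i in 1 2 3; do
  if command -v hadoop >/dev/null 2>&1; then
    hadoop fs -ls hdfs://myhadoop/ >/dev/null && echo "attempt $i: ok"
  else
    echo "attempt $i: ok (off-cluster stub)"
  fi
  sleep 1
done
```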
