Build a High Availability (HA) cluster using the Quorum Journal Manager (QJM)

Keywords: Big Data Hadoop xml Zookeeper ssh

I. Manual HA

1. If you are creating a brand-new HA cluster, first complete the first 18 steps of the earlier cluster-setup blog post.

2. Modify core-site.xml:

<property>
	<name>fs.defaultFS</name>
	<value>hdfs://mycluster</value>
</property>
<property>
	<name>hadoop.tmp.dir</name>
	<value>/home/sweeney/soft/tmp</value>
</property>
<property>
	<name>dfs.journalnode.edits.dir</name>
	<value>/home/hyxy/soft/tmp/journalnode</value>
</property>

3. Modify hdfs-site.xml:

<property>
	<name>dfs.nameservices</name>
	<value>mycluster</value>
</property>
<property>
	<name>dfs.ha.namenodes.mycluster</name>
	<value>nn1,nn2</value>
</property>
<property>
	<name>dfs.namenode.rpc-address.mycluster.nn1</name>
	<value>master:9000</value>
</property>
<property>
	<name>dfs.namenode.rpc-address.mycluster.nn2</name>
	<value>slave1:9000</value>
</property>
<property>
	<name>dfs.namenode.http-address.mycluster.nn1</name>
	<value>master:50070</value>
</property>
<property>
	<name>dfs.namenode.http-address.mycluster.nn2</name>
	<value>slave1:50070</value>
</property>
<property>
	<name>dfs.namenode.shared.edits.dir</name>
	<value>qjournal://master:8485;slave1:8485;slave2:8485/mycluster</value>
</property>
<property>
	<name>dfs.client.failover.proxy.provider.mycluster</name>
	<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
	<name>dfs.ha.fencing.methods</name>
	<value>sshfence</value>
</property>
<property>
	<name>dfs.ha.fencing.ssh.private-key-files</name>
	<value>/home/hyxy/.ssh/id_rsa</value>
</property>
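
After copying the edited files to every node, a quick sanity check confirms the values are actually being picked up. A minimal sketch using hdfs getconf (expected output shown in the comments):

hdfs getconf -confKey dfs.nameservices              # should print: mycluster
hdfs getconf -confKey dfs.ha.namenodes.mycluster    # should print: nn1,nn2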

4. (If you are setting up a brand-new cluster, skip to step 6 below.) If you have already formatted the NameNode, or you are converting a non-HA cluster to HA, run the command "hdfs namenode -bootstrapStandby" on the unformatted NameNode to copy the contents of the other NameNode's metadata directories to it.

hadoop-daemon.sh start namenode
hdfs namenode -bootstrapStandby    (before executing this command, make sure the first NameNode is running)

5. If you are converting a non-HA NameNode to HA, run the command "hdfs namenode -initializeSharedEdits", which initializes the JournalNodes with the edits data from the local NameNode's edits directories.

hadoop-daemon.sh stop namenode
hdfs namenode -initializeSharedEdits    (before executing this command, make sure the NameNode is stopped)

6. If this is a new cluster, ignore steps 4 and 5 and continue from step 21 of the earlier blog post; the remaining steps are basically the same. The only difference is at step 25: before formatting, start a JournalNode on each node (a sketch of the full sequence follows the command below).

hadoop-daemon.sh start journalnode
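
For orientation, the usual first-start order for a brand-new QJM cluster looks roughly like this; treat it as a sketch of the sequence, with host names taken from the configuration above:

# on master, slave1 and slave2: start the JournalNodes first
hadoop-daemon.sh start journalnode
# on master (nn1): format and start the first NameNode
hdfs namenode -format
hadoop-daemon.sh start namenode
# on slave1 (nn2): copy nn1's metadata, then start the second NameNode
hdfs namenode -bootstrapStandby
hadoop-daemon.sh start namenode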

7. Whether this is a new cluster or a conversion from non-HA to HA, after the above steps the manual HA setup is complete. Now test it.

Switch one of the NameNodes to active (check the status of both NameNodes in the web UI first):
hdfs haadmin -transitionToActive nn1
Then test that uploading a file succeeds.
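
A few commands that are handy for this test; the file path here is only an example:

hdfs haadmin -getServiceState nn1    # expect: active
hdfs haadmin -getServiceState nn2    # expect: standby
hdfs dfs -put /etc/hosts /           # the upload should go through the active NameNode
hdfs dfs -ls /                       # the file should be listed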

II. Automatic HA

1. Complete the manual HA part above first.

2. Install ZooKeeper on all nodes first (see the earlier ZooKeeper installation blog post).

3. In ZooKeeper's conf directory, edit the zoo.cfg configuration file (if the file does not exist, copy zoo_sample.cfg to zoo.cfg; see the command after the listing):

		tickTime=2000
		dataDir=/home/hyxy/soft/tmp/zookeeper
		clientPort=2181
		initLimit=5
		syncLimit=2
		server.1=master:2888:3888
		server.2=slave1:2888:3888
		server.3=slave2:2888:3888
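
If zoo.cfg does not exist yet, it can be created from the sample first (the path assumes the same layout as the scp command in step 4):

cp ~/soft/zoo/conf/zoo_sample.cfg ~/soft/zoo/conf/zoo.cfg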

4. Send the zoo.cfg file to the other nodes.

scp ~/soft/zoo/conf/zoo.cfg sweeney@slave1:~/soft/zoo/conf/
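
The file has to reach every ZooKeeper node, so the same goes for slave2 (assuming the same directory layout there):

scp ~/soft/zoo/conf/zoo.cfg sweeney@slave2:~/soft/zoo/conf/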

5. Run zkServer.sh start once on each machine. If it fails to start, execute the following command in the dataDir directory configured on each node.

echo "1" > myid    (the 1 is the server ID from the server.1=master:2888:3888 line in the configuration file)
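
The myid value must match the server.N entry for that host, so across the three nodes it would look like this (paths assume the dataDir configured in zoo.cfg above):

# on master (server.1)
echo 1 > /home/hyxy/soft/tmp/zookeeper/myid
# on slave1 (server.2)
echo 2 > /home/hyxy/soft/tmp/zookeeper/myid
# on slave2 (server.3)
echo 3 > /home/hyxy/soft/tmp/zookeeper/myid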

6. Run zkServer.sh start on each machine again, then verify that the zkCli.sh command succeeds:

zkCli.sh
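
Besides zkCli.sh, checking each server's role is a quick way to confirm that the quorum actually formed; across the three nodes you should see one leader and two followers:

zkServer.sh status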

7. Append the following to the core-site.xml from the manual HA setup above:

<property>
	<name>ha.zookeeper.quorum</name>
	<value>master:2181,slave1:2181,slave2:2181</value>
</property>

And add the following to hdfs-site.xml:

<property>
	<name>dfs.ha.automatic-failover.enabled</name>
	<value>true</value>
</property>
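
Note that automatic failover also needs its state initialized in ZooKeeper before the first start. In the standard QJM/ZKFC setup this is done once, from one of the NameNode hosts, while ZooKeeper is running:

hdfs zkfc -formatZK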

8. Start the cluster:

start-dfs.sh

If the cluster was already running before this configuration, start the ZKFC daemon separately with the following command:

hadoop-daemon.sh --script $HADOOP_PREFIX/bin/hdfs start zkfc
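
Either way, jps on both NameNode hosts should now show a DFSZKFailoverController process alongside the NameNode:

jps    # look for NameNode and DFSZKFailoverController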

9. To test, kill the active NameNode process with the following command, then check the web UI to confirm that the other NameNode's status becomes active.

kill -9 <NameNode process id>    (find the pid with jps)
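
If you prefer a one-liner, something like the following sketch finds and kills the process (it assumes jps lists the process under the name NameNode):

kill -9 $(jps | grep -w NameNode | awk '{print $1}')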


Posted by dragongamer on Tue, 22 Jan 2019 19:24:12 -0800