Configuring Hadoop in a Linux Environment

Keywords: Hadoop, SSH, sudo, Java

Configuring Hadoop 2.8.5 on Ubuntu 16.04

1. Set up passwordless SSH login

$ sudo apt-get install openssh-server   # Install the SSH server
$ ssh localhost                         # Log in over SSH; answer "yes" at the first prompt
$ exit                                  # Log out of the ssh localhost session
$ cd ~/.ssh/                            # If this directory does not exist, run ssh localhost once first
$ ssh-keygen -t rsa                     # Generate an RSA key pair (press Enter at every prompt)
$ cat ./id_rsa.pub >> ./authorized_keys # Authorize the key
$ ssh localhost                         # Logging in to localhost should now need no password; if it still fails, search for "SSH passwordless login"
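The mechanism behind passwordless login is simply that your public key ends up in authorized_keys with strict permissions. A minimal sketch of those steps, replayed against a scratch directory (a stand-in, not your real ~/.ssh):

```shell
# Sketch: what the commands above do, using a temporary directory.
tmp=$(mktemp -d)
ssh-keygen -t rsa -N "" -f "$tmp/id_rsa" -q       # non-interactive key pair
cat "$tmp/id_rsa.pub" >> "$tmp/authorized_keys"   # authorize our own public key
chmod 700 "$tmp"                                  # sshd rejects group/world-writable dirs
chmod 600 "$tmp/authorized_keys"                  # ...and loose key-file permissions
```

If ssh localhost still asks for a password, wrong permissions on ~/.ssh or authorized_keys are the usual cause.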

2. Create a Hadoop user

$ sudo useradd -m hadoop -s /bin/bash  # Create a hadoop user with /bin/bash as its shell
$ sudo passwd hadoop                   # Set the hadoop user's password (entered twice)
$ sudo adduser hadoop sudo             # Grant the hadoop user administrator rights
$ su - hadoop                          # Switch to the hadoop user
$ sudo apt-get update                  # Update apt so later installs find their packages

3. Install the JDK
Download the JDK from the Oracle website (note that you need the tar.gz archive). After downloading, create the Java directory and unpack the archive:

$ sudo mkdir /usr/lib/jvm                                    # Create the jvm folder
$ sudo tar -zxvf jdk-7u80-linux-x64.tar.gz -C /usr/lib/jvm   # Unpack into /usr/lib/jvm
$ cd /usr/lib/jvm                                            # Enter the directory
$ sudo mv jdk1.7.0_80 java                                   # Rename the JDK folder to java
$ vi ~/.bashrc                                               # Configure environment variables for the JDK

Add the following Java environment variables at the bottom of the ~/.bashrc file:

export JAVA_HOME=/usr/lib/jvm/java
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH

After saving the file, run:

$ source ~/.bashrc                     # Make the new environment variables take effect
$ java -version                        # Check the Java version

If the Java version is printed, the JDK is configured correctly.
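To see why the PATH line above matters, here is a small self-contained sketch (using a fake /tmp/fakejvm stand-in rather than a real JDK) showing that prepending ${JAVA_HOME}/bin makes that java the one the shell finds first:

```shell
# Hypothetical stand-in for /usr/lib/jvm/java, just to demonstrate PATH resolution.
mkdir -p /tmp/fakejvm/bin
printf '#!/bin/sh\necho fake-java\n' > /tmp/fakejvm/bin/java
chmod +x /tmp/fakejvm/bin/java

export JAVA_HOME=/tmp/fakejvm
export PATH=${JAVA_HOME}/bin:$PATH       # prepend, so it wins over any system java
command -v java                          # resolves to /tmp/fakejvm/bin/java
```

Because PATH is searched left to right, putting ${JAVA_HOME}/bin first guarantees the freshly installed JDK is used even if another java is on the system.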
4. Install Hadoop
Download hadoop-2.8.5.tar.gz first. The link is as follows:
http://mirrors.hust.edu.cn/apache/hadoop/common/
Install as follows:

$ sudo tar -zxvf hadoop-2.8.5.tar.gz -C /usr/local     # Unpack into /usr/local
$ cd /usr/local
$ sudo mv hadoop-2.8.5 hadoop                          # Rename to hadoop
$ sudo chown -R hadoop ./hadoop                        # Give the hadoop user ownership of the files

After installation, configure the environment variables in ~/.bashrc as you did for the JDK:

export HADOOP_HOME=/usr/local/hadoop
export CLASSPATH=$($HADOOP_HOME/bin/hadoop classpath):$CLASSPATH
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

As before, run source ~/.bashrc to make the settings take effect, then check whether Hadoop is installed correctly by printing its version:

$ hadoop version

5. Hadoop pseudo-distributed configuration
Edit hadoop-env.sh under the installation's etc/hadoop directory, make the following changes, and review the rest of the file for anything else your setup needs:

export JAVA_HOME=/usr/lib/jvm/java
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/usr/local/hadoop/etc/hadoop"}
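The ${HADOOP_CONF_DIR:-...} syntax is plain shell parameter expansion: the value after :- is used only when the variable is unset or empty. A quick sketch (the /tmp/myconf path is just an example):

```shell
# The default applies when the variable is unset or empty...
unset HADOOP_CONF_DIR
echo "${HADOOP_CONF_DIR:-/usr/local/hadoop/etc/hadoop}"   # prints the default

# ...but an explicit value always wins.
HADOOP_CONF_DIR=/tmp/myconf
echo "${HADOOP_CONF_DIR:-/usr/local/hadoop/etc/hadoop}"   # prints /tmp/myconf
```

This is why the line is safe to keep in hadoop-env.sh: it only fills in the config directory when you have not already exported one.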
Next, change the configuration in core-site.xml. The file is in /usr/local/hadoop/etc/hadoop/; edit it with vim or any other editor and add the following entries:

<configuration>
        <property>
             <name>hadoop.tmp.dir</name>
             <value>file:/usr/local/hadoop/tmp</value>
             <description>Abase for other temporary directories.</description>
        </property>
        <property>
             <name>fs.defaultFS</name>
             <value>hdfs://localhost:9000</value>
        </property>
</configuration>

Then change the configuration in hdfs-site.xml, in the same directory:

<configuration>
        <property>
             <name>dfs.replication</name>
             <value>1</value>
        </property>
        <property>
             <name>dfs.namenode.name.dir</name>
             <value>file:/usr/local/hadoop/tmp/dfs/name</value>
        </property>
        <property>
             <name>dfs.datanode.data.dir</name>
             <value>file:/usr/local/hadoop/tmp/dfs/data</value>
        </property>
</configuration>
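A quick way to sanity-check either config file is to pull a property's value back out with sed. A sketch, assuming the one-value-per-line layout shown above (the temporary file here stands in for the real core-site.xml):

```shell
# Write a minimal core-site.xml-style fragment to a temp file.
conf=$(mktemp)
cat > "$conf" <<'EOF'
<configuration>
        <property>
             <name>fs.defaultFS</name>
             <value>hdfs://localhost:9000</value>
        </property>
</configuration>
EOF
# Extract whatever sits between <value> and </value>.
val=$(sed -n 's/.*<value>\(.*\)<\/value>.*/\1/p' "$conf")
echo "$val"                              # hdfs://localhost:9000
```

If the printed URI does not match what you intended, HDFS clients will connect to the wrong address, so this kind of check catches typos before the first format.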

After configuring, format the NameNode:

$ hdfs namenode -format

Then start the HDFS daemons:

$ /usr/local/hadoop/sbin/start-dfs.sh

After startup, run jps and check that the NameNode, DataNode, and SecondaryNameNode processes are all running.
6. Enter the Hadoop management interface
After a successful startup, you can visit the web interface at http://localhost:50070 to view NameNode and DataNode information and browse files in HDFS online.

Posted by accu on Sat, 23 Nov 2019 12:30:15 -0800