HDFS Client Operations (win10)

1. Setting up the development environment: 1) Copy the compiled Hadoop jar package to a path that contains no Chinese characters. 2) Configure the HADOOP_HOME environment variable. 3) Configure the Path environment variable. 4) Create a Maven project named HDFSClientDemo. 5) Import the corresponding dependency coordinates and add logging ...

Posted by becoolufull on Sun, 06 Oct 2019 10:42:44 -0700

Xtrabackup backup script

Install MySQL 5.7 (via yum): curl -sSL https://dwz.cn/gfcnHqGS -o install-mysql.sh chmod +x install-mysql.sh ./install-mysql.sh --get-version # View software versions in the repository # Example ./install-mysql.sh --active install --data-dir /home/hadoop/mysql --version 5.7.23 --root-pass 123abc@DEF --help # View Help Inform ...

Posted by jtbaker on Sun, 06 Oct 2019 01:57:33 -0700

Various daemon States

Article directory: 1. Pseudo-distributed 2. Fully distributed 2.1 Without YARN 2.2 With YARN (MRAppMaster and YarnChild appear only while a MapReduce task is running) 3. Fully distributed (high availability) 3.1 Manual disaster recovery 3.1.1 Without YARN 3 ...

Posted by palpie on Sat, 05 Oct 2019 17:26:13 -0700

Hadoop learning-hadoop installation

Stand-alone configuration. Upload the installation package to /bigdata, then unpack it into the /apps directory: tar -zxvf /bigdata/hadoop-2.7.1.tar.gz -C /apps Configure the environment variables: vi /etc/profile # append at the end: export HADOOP_HOME=/apps/hadoop-2.7.1 export PAT ...

Posted by mimilaw123 on Tue, 01 Oct 2019 09:29:04 -0700

Hive custom UDF function

In Hive, you sometimes need to define custom functions to meet business requirements. The steps are as follows. 1. Create a new Maven project and add the dependencies to the project's pom file: <dependency> <groupId>org.apache.hive</groupId> ...

Posted by ded on Mon, 30 Sep 2019 03:59:50 -0700

Fully Distributed Cluster of Hadoop

Cluster environment: CentOS 6.8 (hadoop102, hadoop103, hadoop104). JDK version: jdk1.8.0_144. Hadoop version: Hadoop 2.7.2. First, prepare three machines (hadoop102, hadoop103, hadoop104), turn off the firewall on each, and configure static IP addresses and IP address mappings. Configuring the cluster. Writing the cluster distribution script. Create a remote synchro ...

Posted by mcbeckel on Wed, 25 Sep 2019 04:17:21 -0700

Four deployment modes and basic operations of HBase

This article has two parts. The first covers four ways to install HBase: (1) stand-alone mode, (2) pseudo-distributed mode, (3) a distributed cluster that uses the ZooKeeper bundled with HBase, and (4) a distributed cluster that uses an independently installed ZooKeeper. The second part shows t ...

Posted by simpli on Sat, 21 Sep 2019 07:50:49 -0700

HiveQL table and table query

HiveQL data operations. 1. Loading data into tables: load data local inpath '/data/employees' overwrite into table employees partition (country='US', state='CA') If the partition directory does not exist, this command automatically creates the part ...

Posted by scross on Wed, 18 Sep 2019 04:24:15 -0700
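
A note on the HiveQL entry above: Hive stores each partition as a nested `column=value` subdirectory under the table's directory, which is the directory the load command creates when it is missing. The following is a minimal Python sketch of that naming scheme only. The function name `partition_dir` and the warehouse path are assumptions for illustration, not part of the original post, and the sketch does not reproduce Hive's escaping of special characters in partition values.

```python
from pathlib import PurePosixPath


def partition_dir(table_dir, partition_spec):
    """Build the directory path for one partition.

    table_dir is the table's base directory (hypothetical here);
    partition_spec is an ordered list of (column, value) pairs.
    """
    path = PurePosixPath(table_dir)
    for column, value in partition_spec:
        # Each partition column adds one "column=value" directory level.
        path = path / f"{column}={value}"
    return str(path)


# Assumed warehouse location, used only for this example.
example = partition_dir(
    "/user/hive/warehouse/employees",
    [("country", "US"), ("state", "CA")],
)
print(example)
```

The order of the pairs matters: it follows the order of the partition columns in the table definition, so `country` comes before `state` here.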

Distributed Log Collection Framework Flume

1. Requirements analysis: Web servers and application servers are scattered across many machines, but we still want to run statistical analysis on the Hadoop platform. How do we get the logs onto the Hadoop platform? Is it as simple as this? Use shell cp to copy them to a machine in the Hadoop cluster, then hadoop fs -put ... / Obviously, this method faces a series of problems such as fault toleranc ...

Posted by Andy82 on Tue, 17 Sep 2019 09:51:00 -0700

Implementing secondary sorting with MapReduce

Article directory: 1. Preface 2. Requirements analysis 3. How secondary sorting works 4. Upload files 5. Code implementation 6. Result screenshots. 1. Preface: By default, MapReduce sorts map output keys automatically, but sometimes it needs to ...

Posted by callesson on Tue, 17 Sep 2019 05:37:12 -0700
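
A note on the secondary sorting entry above: the excerpt is truncated before it describes the implementation, so the following is not the author's code. It is a minimal plain-Python simulation of the usual idea, assuming the common composite-key approach: the value is promoted into the sort key so that records are ordered by (key, value), and grouping is then done on the original key alone. In Hadoop the same roles are typically played by a composite key class, a sort comparator, and a grouping comparator. The function name `secondary_sort` and the sample records are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def secondary_sort(records):
    """Simulate secondary sorting over (key, value) pairs.

    Returns a list of (key, values) tuples with keys in ascending order
    and each key's values in ascending order.
    """
    # Sort phase: order by the composite (key, value), not by key alone.
    composite_sorted = sorted(records, key=lambda kv: (kv[0], kv[1]))
    # Group phase: group on the original key only, so each group
    # receives its values already sorted.
    return [
        (key, [value for _, value in group])
        for key, group in groupby(composite_sorted, key=itemgetter(0))
    ]


# Hypothetical sample data.
records = [("b", 3), ("a", 2), ("b", 1), ("a", 5), ("a", 1)]
for key, values in secondary_sort(records):
    print(key, values)
```

Sorting on the composite key lets the sort step do the work, so the grouping step never has to buffer and re-sort all values for a key in memory.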