HBase operations in detail (illustrated and comprehensive)

Purpose: (1) Understand the role of HBase in the Hadoop architecture. (2) Become proficient in the common HBase shell commands. Objectives: (1) Be familiar with HBase-related operations and master creating, modifying, querying, and deleting tables. (2) Create a table by yourself, be familiar ...
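The table operations listed above can be sketched in the HBase shell as follows (a minimal session; the table name `student` and column family `info` are illustrative, and a running HBase instance is assumed):

```shell
hbase shell
create 'student', 'info'                        # create a table with one column family
put 'student', 'r1', 'info:name', 'Alice'       # insert a cell
scan 'student'                                  # look up all rows
describe 'student'                              # inspect the table schema
alter 'student', NAME => 'info', VERSIONS => 3  # modify the table
disable 'student'                               # a table must be disabled before dropping
drop 'student'                                  # delete the table
```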

Posted by djBuilder on Wed, 17 Nov 2021 08:45:22 -0800

MapReduce programming practice -- WordCount running example (Python implementation)

1, Experimental purpose: master basic MapReduce programming methods through experiments; master how to solve common data processing problems with MapReduce, including data merging, data deduplication, data sorting, and data mining. 2, Experimental platform: operating system Ubuntu 18.04 (or Ubuntu 16.04); Hadoop version 3.2.2 ...
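A minimal WordCount sketch in the Hadoop Streaming style (function names and the sample input are illustrative; `sorted()` stands in for the shuffle phase that Hadoop performs between map and reduce):

```python
from itertools import groupby

def mapper(lines):
    # Map phase: emit a (word, 1) pair for every word in every line.
    for line in lines:
        for word in line.strip().split():
            yield word, 1

def reducer(pairs):
    # Hadoop delivers mapper output sorted by key; sorted() simulates that shuffle.
    for word, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)

if __name__ == "__main__":
    text = ["hello world", "hello mapreduce"]
    print(dict(reducer(mapper(text))))  # prints {'hello': 2, 'mapreduce': 1, 'world': 1}
```

In a real Hadoop Streaming job, the mapper and reducer would be two separate scripts reading from stdin and writing tab-separated key/value pairs to stdout.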

Posted by freshneco on Tue, 16 Nov 2021 23:43:47 -0800

04 - system built-in interceptor usage & custom interceptor

Use of the system built-in interceptors. While running, Flume can modify or delete events in flight; this is realized through interceptors. Interceptors have the following characteristics: an interceptor needs to implement the org.apache.flume.interceptor.Interceptor interface. An interceptor can modify ...
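Wiring a built-in interceptor into an agent looks like the fragment below (the agent name `a1` and source name `r1` are illustrative; `timestamp` is one of Flume's standard built-in interceptor types, which stamps each event header with the processing time):

```properties
# Attach the built-in timestamp interceptor to source r1 of agent a1
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = timestamp
```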

Posted by odtaa on Fri, 12 Nov 2021 20:08:28 -0800

Experiment 6 MapReduce data cleaning - meteorological data cleaning

Level 1: data cleaning. Task description: clean the data according to certain rules. Programming requirements: following the prompt, add code in the editor on the right to clean the data according to the given rules. The data are described as follows: file a.txt; data segmentation method: one or more spaces; data location: /user/test/ ...
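The splitting rule described above ("one or more spaces") can be sketched as follows (the record format and the discard-empty-lines rule are assumptions for illustration; the real task defines its own cleaning rules):

```python
def clean(line):
    # str.split() with no argument splits on runs of one or more whitespace
    # characters, which matches the "one or more spaces" delimiter rule.
    fields = line.split()
    return fields if fields else None  # drop blank records

raw = "2021  11   12    0   -5"
print(clean(raw))  # prints ['2021', '11', '12', '0', '-5']
```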

Posted by zoobie on Fri, 12 Nov 2021 02:26:04 -0800

docker builds hbase environment

HBase is a member of the Hadoop ecosystem. To build HBase the traditional way, you need to install Hadoop first, then ZooKeeper, and then HBase. Now HBase can be installed directly through Docker, and Hadoop is not required inside the container. The installation is simple: pull the image and run it directly. docker run -d --name hbase -p 2181 ...
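A hedged sketch of such a standalone container (the image name is an assumption; the ports are HBase's standard defaults: 2181 for the embedded ZooKeeper, 16010 for the Master web UI, 16020 for the RegionServer):

```shell
# Pull and start a standalone HBase container, then open a shell inside it
docker pull harisekhon/hbase
docker run -d --name hbase -p 2181:2181 -p 16010:16010 -p 16020:16020 harisekhon/hbase
docker exec -it hbase hbase shell
```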

Posted by Procode on Mon, 08 Nov 2021 08:11:30 -0800

HA high availability + Hive + HBase + Sqoop + Kafka + Flume + Spark installation and deployment

Contents: preface; data; HA high availability deployment; Hive installation and deployment; HBase installation and deployment; Sqoop installation and deployment (unzip the installation package, modify the sqoop-env.sh environment variables, copy the JDBC driver, test whether Sqoop can successfully connect to the database); Kafka installation and depl ...
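The sqoop-env.sh step mentioned above typically means setting the home directories of the neighbouring components (the variable names are the standard ones from Sqoop's template file; the `/opt/...` paths are assumptions to adjust to your installation):

```shell
# Illustrative sqoop-env.sh entries
export HADOOP_COMMON_HOME=/opt/hadoop
export HADOOP_MAPRED_HOME=/opt/hadoop
export HBASE_HOME=/opt/hbase
export HIVE_HOME=/opt/hive
export ZOOKEEPER_HOME=/opt/zookeeper
export ZOOCFGDIR=/opt/zookeeper/conf
```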

Posted by anthonyv on Mon, 08 Nov 2021 04:33:27 -0800

Hadoop entry note 23: MapReduce performance optimization - data compression optimization

1, Compression optimization design: when running a MapReduce program, disk I/O, network data transfer, and the shuffle and merge phases take a lot of time, especially with large data volumes and intensive workloads. Since disk I/O and network bandwidth are valuable resources in Hadoop, data compression is very helpful to save resources and mi ...
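Map-output compression, the most common form of this optimization, is enabled with two standard Hadoop properties (the property names are Hadoop's own; choosing Snappy as the codec is one reasonable option among several, assuming the native Snappy library is available):

```xml
<!-- mapred-site.xml: compress intermediate map output to cut shuffle I/O -->
<property>
  <name>mapreduce.map.output.compress</name>
  <value>true</value>
</property>
<property>
  <name>mapreduce.map.output.compress.codec</name>
  <value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
```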

Posted by Pandolfo on Sun, 07 Nov 2021 22:03:53 -0800

Review of big data development (MapReduce)

2, MapReduce. 2.1 Introduction to MapReduce: the core idea of MapReduce is "divide and conquer", which suits processing large numbers of complex tasks (large-scale data processing scenarios). Map is responsible for "dividing", that is, splitting a complex task into several "simple tasks" for ...
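The "divide and conquer" idea can be shown with a toy example (the data and chunk size are illustrative): Map splits the work into independent simple tasks, and Reduce merges the partial results.

```python
data = list(range(1, 101))                                  # the "complex task": sum 1..100
chunks = [data[i:i + 25] for i in range(0, len(data), 25)]  # divide into 4 subtasks
partials = [sum(chunk) for chunk in chunks]                 # map: each subtask is simple
total = sum(partials)                                       # reduce: merge partial results
print(total)  # prints 5050
```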

Posted by JCBarry on Sun, 07 Nov 2021 16:18:05 -0800

2021-11-06 Hadoop safe mode

1 What is safe mode. Safe mode is a special state of HDFS: in this state, the file system only accepts read requests and rejects change requests such as deletion and modification. When the namenode master node starts, HDFS first enters safe mode. When a datanode starts, it reports its available blocks and oth ...
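Safe mode can be inspected and controlled with the standard `hdfs dfsadmin` subcommands (a running HDFS cluster is assumed):

```shell
hdfs dfsadmin -safemode get     # report whether safe mode is on
hdfs dfsadmin -safemode enter   # force the namenode into safe mode
hdfs dfsadmin -safemode leave   # force the namenode out of safe mode
hdfs dfsadmin -safemode wait    # block until safe mode exits on its own
```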

Posted by Thumper on Sat, 06 Nov 2021 11:06:32 -0700

[linux] CentOS8 Hadoop pseudo distributed environment construction (three node machines)

Preface 1. This article mainly draws on a reference blog; it integrates the pseudo-distributed environment setup and some of the pitfalls encountered. 2. The environment contains three node machines: pc1, pc2 and pc3. Among them, pc1 is the management machine, and all three node machines use the user Sillyhumans. If t ...
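A multi-node Hadoop build like this normally starts with passwordless SSH from the management machine to every node; a hedged sketch (the hostnames and user come from the article; the key type and empty passphrase are assumptions):

```shell
# On pc1: generate a key pair and push the public key to every node
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
for host in pc1 pc2 pc3; do
  ssh-copy-id "Sillyhumans@$host"
done
```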

Posted by paperthinT on Thu, 04 Nov 2021 17:50:19 -0700