HBase operations in detail (illustrated and comprehensive)
Purpose: (1) Understand the role of HBase in the Hadoop architecture. (2) Become proficient with the common HBase shell commands. Objectives: (1) Be familiar with HBase operations and master creating, modifying, querying, and deleting tables. (2) Be able to create a table yourself, be familiar ...
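A sketch of the table operations those objectives cover, as they might be entered at the `hbase shell` prompt (the table name `student` and column family `info` are made-up examples):

```
create 'student', 'info'                         # create a table with one column family
put 'student', 'row1', 'info:name', 'Tom'        # insert a cell
scan 'student'                                   # look up (scan) the table's rows
alter 'student', NAME => 'info', VERSIONS => 3   # modify the table's schema
disable 'student'                                # a table must be disabled first...
drop 'student'                                   # ...before it can be deleted
```

Note that these commands run inside HBase's JRuby-based shell, not in bash.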
Posted by djBuilder on Wed, 17 Nov 2021 08:45:22 -0800
MapReduce programming practice: running the WordCount example (Python implementation)
1. Experimental purpose
Master basic MapReduce programming methods through experiments; master MapReduce approaches to common data-processing problems, including data merging, data deduplication, data sorting, and data mining.
2. Experimental platform
Operating system: Ubuntu 18.04 (or Ubuntu 16.04); Hadoop version: 3.2.2 ...
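As a taste of the Python implementation the title promises, here is a minimal, self-contained sketch of WordCount in the Hadoop Streaming style (the function names are illustrative, not part of any Hadoop API):

```python
# Mapper step: emit (word, 1) for every word; reducer step: sum per word.
# In a real Hadoop Streaming job these would read stdin and write stdout.

def map_line(line):
    """Mapper: split a line on whitespace and emit (word, 1) pairs."""
    return [(word, 1) for word in line.split()]

def reduce_pairs(pairs):
    """Reducer: sum the counts for each word."""
    counts = {}
    for word, n in pairs:
        counts[word] = counts.get(word, 0) + n
    return counts

# Simulate the map -> shuffle -> reduce pipeline on two lines of input.
lines = ["hello hadoop", "hello mapreduce"]
pairs = [p for line in lines for p in map_line(line)]
result = reduce_pairs(pairs)
print(result)  # {'hello': 2, 'hadoop': 1, 'mapreduce': 1}
```

In an actual run, a mapper script and a reducer script built around these two steps would be submitted with the hadoop-streaming jar via its `-mapper` and `-reducer` options.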
Posted by freshneco on Tue, 16 Nov 2021 23:43:47 -0800
04 - Using the built-in interceptors & writing a custom interceptor
Using the built-in interceptors
While running, Flume can modify or delete events in flight; this is done through interceptors. Interceptors have the following characteristics:
An interceptor must implement the org.apache.flume.interceptor.Interceptor interface. An interceptor can modify ...
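For a concrete picture, a minimal agent configuration that attaches Flume's built-in timestamp interceptor to a source might look like this (the component names a1, r1, c1, i1 are placeholders):

```properties
# Hypothetical Flume agent config: source r1 gets the built-in
# timestamp interceptor, which adds a "timestamp" header to each event.
a1.sources = r1
a1.channels = c1
a1.sources.r1.channels = c1
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = timestamp
```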
Posted by odtaa on Fri, 12 Nov 2021 20:08:28 -0800
Experiment 6 MapReduce data cleaning - meteorological data cleaning
Level 1: data cleaning
Task description
This task: clean the data according to certain rules.
Programming requirements
Following the prompt, add code in the editor on the right to clean the data according to the given rules. Data description: the input file is a.txt; field separator: one or more spaces; data location: /user/test/ ...
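Since the only splitting rule the excerpt states is "one or more spaces", here is a minimal pure-Python sketch of that kind of rule-based cleaning (the three-field record shape and the drop-malformed-lines rule are assumptions, not the exercise's actual spec):

```python
import re

def clean_record(line, expected_fields=3):
    """Split a line on runs of whitespace; drop records with the wrong
    number of fields. Both rules here are assumed for illustration."""
    fields = re.split(r"\s+", line.strip())
    if len(fields) != expected_fields:
        return None  # malformed record: discard it
    return fields

raw = ["2021  11   06", "bad line", "2021 11 07"]
cleaned = [r for r in (clean_record(l) for l in raw) if r is not None]
print(cleaned)  # [['2021', '11', '06'], ['2021', '11', '07']]
```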
Posted by zoobie on Fri, 12 Nov 2021 02:26:04 -0800
docker builds hbase environment
HBase is part of the Hadoop ecosystem. Building HBase the traditional way means installing Hadoop first, then ZooKeeper, and then HBase. With Docker, however, HBase can be installed directly, and Hadoop is not required inside the container.
Installation is simple: just pull the image and run it.
docker run -d --name hbase -p 2181 ...
Posted by Procode on Mon, 08 Nov 2021 08:11:30 -0800
HA (high availability) + Hive + HBase + Sqoop + Kafka + Flume + Spark installation and deployment
Contents
Preface
Materials
HA high-availability deployment
Hive installation and deployment
HBase installation and deployment
Sqoop installation and deployment
Unzip the installation package
Modify the configuration file
Environment variables
sqoop-env.sh
Copy the JDBC driver
Test whether Sqoop can connect to the database
Kafka installation and depl ...
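The sqoop-env.sh step in the list above usually amounts to pointing Sqoop at the components already installed on the cluster; a sketch, with every path a placeholder for your own environment:

```shell
# Hypothetical sqoop-env.sh fragment; every path below is a placeholder,
# adjust to wherever the components are actually installed.
export HADOOP_COMMON_HOME=/opt/hadoop
export HADOOP_MAPRED_HOME=/opt/hadoop
export HBASE_HOME=/opt/hbase
export HIVE_HOME=/opt/hive
export ZOOKEEPER_HOME=/opt/zookeeper
export ZOOCFGDIR=/opt/zookeeper/conf
```

The connectivity test in the final step is typically `sqoop list-databases --connect jdbc:mysql://<host>:3306/ --username <user> -P`, which lists the databases if the JDBC driver and credentials are in place.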
Posted by anthonyv on Mon, 08 Nov 2021 04:33:27 -0800
Hadoop beginner notes 23: MapReduce performance optimization - data compression optimization
1. Compression optimization design
When running a MapReduce program, disk I/O, network data transfer, and the shuffle-and-merge phase take a great deal of time, especially with large data volumes and intensive workloads. Since disk I/O and network bandwidth are valuable Hadoop resources, data compression is very helpful for saving resources and mi ...
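One common first step in this direction is compressing the intermediate map output that is written to disk and shipped across the network during shuffle. A sketch of the relevant mapred-site.xml properties, with Snappy shown as one possible codec:

```xml
<!-- Hypothetical mapred-site.xml fragment: compress map output
     so less data is spilled to disk and transferred in shuffle. -->
<property>
  <name>mapreduce.map.output.compress</name>
  <value>true</value>
</property>
<property>
  <name>mapreduce.map.output.compress.codec</name>
  <value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
```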
Posted by Pandolfo on Sun, 07 Nov 2021 22:03:53 -0800
Review of big data development (MapReduce)
2. MapReduce
2.1. Introduction to MapReduce
The core idea of MapReduce is "divide and conquer", which suits workloads made up of many complex tasks, that is, large-scale data processing.
Map is responsible for "dividing": breaking a complex task down into several "simple tasks" for ...
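"Divide and conquer" in miniature: the sketch below splits the data into chunks, has the map phase summarize each chunk independently, and lets the reduce phase merge the partial results (a per-chunk sum stands in for a real job):

```python
# The map phase handles each data chunk on its own ("simple tasks");
# the reduce phase merges the independent partial results.

def map_chunk(chunk):
    """'Simple task': summarize one chunk independently (a partial sum)."""
    return sum(chunk)

def reduce_parts(partials):
    """Merge the partial results into the global answer."""
    return sum(partials)

data = list(range(1, 101))
chunks = [data[i:i + 25] for i in range(0, len(data), 25)]  # divide
partials = [map_chunk(c) for c in chunks]                   # conquer, independently
total = reduce_parts(partials)
print(total)  # 5050
```

Because each `map_chunk` call touches only its own chunk, the map phase can run on many machines in parallel, which is exactly what makes the pattern fit large-scale data processing.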
Posted by JCBarry on Sun, 07 Nov 2021 16:18:05 -0800
2021-11-06 Hadoop safe mode
1. What is safe mode
Safe mode is a special state of HDFS. In this state, the file system only accepts read requests and rejects change requests such as deletion and modification.
When the NameNode master starts, HDFS first enters safe mode. As each DataNode starts, it reports its available blocks and oth ...
Posted by Thumper on Sat, 06 Nov 2021 11:06:32 -0700
[linux] CentOS 8 Hadoop pseudo-distributed environment setup (three node machines)
Preface
1. This article mainly draws on another blog post, integrating the pseudo-distributed environment setup with some of the pitfalls encountered along the way. 2. The environment has three node machines: pc1, pc2 and pc3. pc1 is the management machine, and all three node machines use the user Sillyhumans. If t ...
Posted by paperthinT on Thu, 04 Nov 2021 17:50:19 -0700