Implementing MapReduce in Eclipse
1. Prepare the environment
Update Hadoop's mapred-site.xml and yarn-site.xml configuration files under Windows to match the ones in the virtual machine.
Copy the mapred-site.xml and yarn-site.xml configuration files into the project.
Add the dependency packages.
2. Run mode
Run locally (start multiple threads in local Eclipse to simul ...
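To make the local-run idea concrete, here is a minimal sketch (mine, not the article's code) of a driver that forces the local job runner; the identity Mapper and Reducer are used only as stand-ins so the class compiles and runs as-is:

// Minimal sketch, not the article's code: run a MapReduce job in-process
// with the local job runner instead of submitting it to YARN.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class LocalDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("mapreduce.framework.name", "local"); // local job runner
        conf.set("fs.defaultFS", "file:///");          // local file system

        Job job = Job.getInstance(conf, "local test");
        job.setJarByClass(LocalDriver.class);
        job.setMapperClass(Mapper.class);   // identity mapper (stand-in)
        job.setReducerClass(Reducer.class); // identity reducer (stand-in)
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}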
Posted by Yanayaya on Tue, 29 Jan 2019 15:15:15 -0800
Flink Read and Write Series - Read and Write HBase
There are two pieces here: reading from HBase by extending RichSourceFunction and overriding its methods, and writing to HBase by implementing the OutputFormat interface. The code is as follows:
Way 1: Extend RichSourceFunction
package com.my.flink.utils.streaming.hbase;
import com.my.flink.utils.config.ConfigKeys;
import org.apache.flin ...
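The excerpt is cut off above; for orientation, a minimal sketch of what such a source can look like (class, table, and column names here are illustrative, not the article's):

// Illustrative sketch, not the article's code: scan an HBase table and
// emit its row keys as a Flink stream.
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.source.RichSourceFunction;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseReader extends RichSourceFunction<String> {
    private transient Connection connection;
    private transient Table table;

    @Override
    public void open(Configuration parameters) throws Exception {
        // HBaseConfiguration.create() picks up hbase-site.xml from the classpath.
        connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
        table = connection.getTable(TableName.valueOf("test_table")); // illustrative name
    }

    @Override
    public void run(SourceContext<String> ctx) throws Exception {
        try (ResultScanner scanner = table.getScanner(new Scan())) {
            for (Result result : scanner) {
                ctx.collect(Bytes.toString(result.getRow()));
            }
        }
    }

    @Override
    public void cancel() {
    }

    @Override
    public void close() throws Exception {
        if (table != null) table.close();
        if (connection != null) connection.close();
    }
}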
Posted by jallard on Sun, 27 Jan 2019 21:03:14 -0800
Big Data Hadoop Tool Python Tutorial 1 - HDFS (Hadoop Distributed File System)
The Hadoop Distributed File System (HDFS) is a Java-based distributed file system: distributed, scalable, and portable, designed to span large clusters of commodity servers. The design of HDFS is based on the Google File System (GFS, https://ai.google/research/pubs/pub51). Like many other distributed file systems, HDFS has a large amo ...
Posted by Nick Zaccardi on Sun, 27 Jan 2019 19:21:14 -0800
Linux Learning from scratch IV
Permission management:
ls -l
rwxrwxrwx:
The left three: define the user (owner) permissions;
The middle three: define the group permissions;
The right three: define the permissions of others.
Process security context:
How a process's access rights to a file are decided:
Whether the ownership of ...
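As a small illustration of the three rwx triplets (my example, not part of the original post), Java's NIO API can read and print them for any path:

// Illustrative example: print a file's owner/group/other permission triplets.
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class ShowPerms {
    public static void main(String[] args) throws Exception {
        Path p = Paths.get(args.length > 0 ? args[0] : "/etc/passwd");
        Set<PosixFilePermission> perms = Files.getPosixFilePermissions(p);
        // Prints e.g. "rw-r--r--": left three = owner, middle three = group,
        // right three = others.
        System.out.println(PosixFilePermissions.toString(perms));
    }
}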
Posted by ntbd on Sun, 27 Jan 2019 18:27:15 -0800
Spark Learning Notes (12) - SparkSQL
1 SparkSQL Introduction
Spark SQL is the module Spark uses to process structured data. It provides a programming abstraction called DataFrame and serves as a distributed SQL query engine.
We have already learned Hive, which converts Hive SQL into MapReduce jobs and submits them to the cluster for execution, greatly simplifying the complexity of pr ...
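As a quick taste of the DataFrame abstraction and the SQL engine (a sketch of mine; the file name and query are illustrative):

// Illustrative sketch: load structured data as a DataFrame and query it with SQL.
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class SparkSqlDemo {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("SparkSqlDemo")
                .master("local[*]") // local mode, for testing only
                .getOrCreate();

        // A DataFrame is a Dataset<Row> in the Java API.
        Dataset<Row> people = spark.read().json("people.json");

        // Register the DataFrame as a temporary view and query it like a table.
        people.createOrReplaceTempView("people");
        spark.sql("SELECT name FROM people WHERE age >= 18").show();

        spark.stop();
    }
}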
Posted by Dorin85 on Sat, 26 Jan 2019 00:24:15 -0800
Hadoop trash (recycle bin)
In production, the HDFS trash feature must be enabled, usually set to 7 days.
fs.trash.interval is how long (in minutes) deleted files are kept in the trash; if it is 0, the trash feature is disabled.
fs.trash.checkpoint.interval is the interval between trash checkpoints, generally set less than or equal to fs.trash.interval. If 0, t ...
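For reference, both properties live in core-site.xml; a 7-day retention (the values are in minutes, and these numbers are only an example) looks like this:

<!-- core-site.xml: keep deleted files in the trash for 7 days (10080 minutes);
     create trash checkpoints once per hour. Example values. -->
<property>
  <name>fs.trash.interval</name>
  <value>10080</value>
</property>
<property>
  <name>fs.trash.checkpoint.interval</name>
  <value>60</value>
</property>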
Posted by NJordan72 on Fri, 25 Jan 2019 14:51:13 -0800
Hadoop 2.9.1 Install Hive 2.3.3 on Ubuntu 16.04
Preface
http://hive.apache.org/downloads.html has compatibility instructions: Hadoop 3.x needs Hive 3.0.0, and Hadoop 2.x needs Hive 2.3.3. Since my Hadoop is 2.9, I chose to download Hive 2.3.3. Hive is a Hadoop tool, so it only needs to be installed on the NameNode; there is no need to install it on the DataN ...
Posted by longtone on Fri, 25 Jan 2019 04:45:13 -0800
IDEA writes a WordCount program under Windows and submits it to a Hadoop cluster as a jar package (foolproof version)
Typically, programs are written in an IDE, packaged as jar files, and then submitted to the cluster. The most common approach is to create a Maven project and let Maven manage the jar dependencies.
1. Generating jar packages for WordCount
1. Open IDEA: File > New > Project > Maven > Next, fill in GroupId and Artif ...
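For reference, a standard WordCount of the kind such a project packages (a generic sketch, not necessarily the article's exact code):

// Generic WordCount sketch: count how often each word occurs in the input.
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE); // emit (word, 1) per token
            }
        }
    }

    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) sum += val.get();
            context.write(key, new IntWritable(sum)); // total per word
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Once packaged, it is submitted with something like hadoop jar wordcount.jar WordCount /input /output (jar and path names are placeholders).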
Posted by kcgame on Thu, 24 Jan 2019 19:45:14 -0800
Java code implements basic operations on HBase
overview
1. Importing jar packages
2. Testing
3. Abnormal handling
First, build the HBase environment, then start the Zookeeper, Hadoop, and HBase clusters.
1. Importing jar packages
Get ready:
1. CentOS 7
2. Zookeeper cluster
3. Hadoop 2.7.3 cluster
4. HBase 2.0.0 cluster
5. Eclipse
Build a Java project in Eclipse, in which you create a new lib fol ...
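For orientation, the basic operations the article goes on to test look roughly like this with the HBase client API (my sketch; the table, family, and column names are illustrative):

// Illustrative sketch of basic HBase operations (Put, Get, Delete).
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseDemo {
    public static void main(String[] args) throws Exception {
        // Reads hbase-site.xml from the classpath (ZooKeeper quorum, etc.).
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("test"))) {

            // Put: write one cell.
            Put put = new Put(Bytes.toBytes("row1"));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes("value1"));
            table.put(put);

            // Get: read the cell back.
            Result result = table.get(new Get(Bytes.toBytes("row1")));
            System.out.println(Bytes.toString(
                    result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("col"))));

            // Delete: remove the row.
            table.delete(new Delete(Bytes.toBytes("row1")));
        }
    }
}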
Posted by torrentmasta on Wed, 23 Jan 2019 19:27:12 -0800
Build a High Availability (HA) cluster using the Quorum Journal Manager (QJM)
Manual HA
1. If you are creating a brand-new HA cluster, please first complete the first 18 steps of this blog post.
2. Modify core-site.xml:
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://mycluster</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  < ...
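The excerpt cuts off here; for orientation, a QJM HA setup typically also adds entries like the following to hdfs-site.xml (a sketch of mine; the host names are placeholders):

<!-- hdfs-site.xml sketch for QJM HA; host names are placeholders. -->
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>node1:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>node2:8020</value>
</property>
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://node1:8485;node2:8485;node3:8485/mycluster</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>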
Posted by dragongamer on Tue, 22 Jan 2019 19:24:12 -0800