Fully distributed cluster
Fully distributed cluster in installation mode
1. Introduction to fully distributed mode
Fully distributed mode means using multiple machines to build a complete distributed file system in a real environment.
In a real deployment, the HDFS daemons are likewise spread across different machines, for example:
1. NameNode: as far as possible, the dae ...
Posted by Stryves on Mon, 06 Dec 2021 19:34:03 -0800
Scala, Java 50 programming questions
⚠️ The correctness of the solution is not guaranteed!
1. A pair of rabbits gives birth to a new pair every month, starting from the third month after its own birth. Each new pair does the same once it reaches its third month. If no rabbits die, what is the total number of rabbits in each month?
def main(args: Array[Strin ...
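The rabbit puzzle is usually read as the Fibonacci recurrence: the pairs alive in a month are last month's pairs plus one newborn pair for every pair that was already alive two months ago. The following is a minimal Java sketch under that assumption, not the post's own solution. The class name `RabbitPairs` and the method `pairsInMonth` are made up for illustration, and the count is in pairs (double it for individual rabbits).

```java
// Sketch only: assumes the classic Fibonacci reading of the puzzle,
// where a pair starts breeding in its third month of life.
final class RabbitPairs {

    // Returns the number of rabbit pairs alive in the given month (1-based).
    static long pairsInMonth(int month) {
        if (month < 1) {
            throw new IllegalArgumentException("month must be >= 1");
        }
        long previous = 1; // pairs alive two steps back (starts as month 1)
        long current = 1;  // pairs alive one step back (starts as month 2)
        for (int m = 3; m <= month; m++) {
            // Pairs from last month survive; pairs from two months ago each add one newborn pair.
            long next = previous + current;
            previous = current;
            current = next;
        }
        return current;
    }

    public static void main(String[] args) {
        for (int month = 1; month <= 12; month++) {
            System.out.println("month " + month + ": " + pairsInMonth(month) + " pairs");
        }
    }
}
```

The loop keeps only the last two values, so it runs in linear time and constant memory, which avoids the exponential blow-up of a naive recursive version.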
Posted by GreyFilm on Mon, 06 Dec 2021 09:13:55 -0800
HBase: quickly importing huge amounts of data with Bulk Loading
Advantages:
If we load a large amount of data into HBase at one time, processing is slow and Region resources are heavily occupied. A more efficient and convenient approach is "Bulk Loading", which uses the HFileOutputFormat class provided by HBase. It relies on the principle that HBase data is ...
Posted by rharter on Sun, 05 Dec 2021 20:42:35 -0800
How HDFS works
1. NameNode and DataNode
HDFS uses a master/slave architecture. An HDFS cluster consists of one NameNode and a number of DataNodes. The NameNode is a central server that manages the file system namespace and client access to files. There is usually one DataNode per node in the cluster, which is resp ...
Posted by SamLiu on Fri, 03 Dec 2021 06:17:26 -0800
Hadoop YARN source code analysis: AsyncDispatcher, the asynchronous event dispatcher 2021SC@SDUSC
1. AsyncDispatcher overview
AsyncDispatcher is YARN's asynchronous event dispatcher, a component in the ResourceManager (RM) that dispatches events from a blocking queue. It dispatches events on a single dedicated thread and hands each event to the corresponding EventHandler registered with the AsyncDispatcher for processi ...
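The general pattern described here, a blocking queue drained by one dedicated thread that routes each event to a registered handler, can be sketched in plain Java. This is a toy illustration and not Hadoop's AsyncDispatcher code or API: the names `ToyAsyncDispatcher`, `Event`, `register`, `dispatch`, and `stop` are all hypothetical, and routing by a string type tag is an assumption made to keep the example short.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Illustrative only: a toy dispatcher, NOT Hadoop's AsyncDispatcher.
final class ToyAsyncDispatcher {

    // Hypothetical event shape: a type tag used for routing plus a payload.
    static final class Event {
        final String type;
        final String payload;

        Event(String type, String payload) {
            this.type = type;
            this.payload = payload;
        }
    }

    // Sentinel event that tells the worker thread to exit.
    private static final Event STOP = new Event("__stop__", "");

    private final BlockingQueue<Event> queue = new LinkedBlockingQueue<>();
    private final Map<String, Consumer<Event>> handlers = new ConcurrentHashMap<>();
    private final Thread worker = new Thread(this::drain, "toy-dispatcher");

    // Registers the handler that will receive events of the given type.
    void register(String type, Consumer<Event> handler) {
        handlers.put(type, handler);
    }

    void start() {
        worker.start();
    }

    // Producers only enqueue; handler code never runs on the caller's thread.
    void dispatch(Event event) {
        queue.add(event);
    }

    // Enqueues the sentinel and waits until everything queued before it is handled.
    void stop() {
        queue.add(STOP);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Runs on the single worker thread: take an event, look up its handler, invoke it.
    private void drain() {
        try {
            while (true) {
                Event event = queue.take(); // blocks while the queue is empty
                if (event == STOP) {
                    return;
                }
                Consumer<Event> handler = handlers.get(event.type);
                if (handler != null) {
                    handler.accept(event); // events without a handler are dropped
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        ToyAsyncDispatcher dispatcher = new ToyAsyncDispatcher();
        dispatcher.register("APP_STARTED", e ->
                System.out.println("handled " + e.payload + " on " + Thread.currentThread().getName()));
        dispatcher.start();
        dispatcher.dispatch(new Event("APP_STARTED", "app-1"));
        dispatcher.stop();
    }
}
```

Because a single thread drains the queue, handlers see events in the order they were enqueued and do not need to coordinate with each other, at the cost that one slow handler delays every event behind it.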
Posted by cyberrate on Wed, 01 Dec 2021 07:10:26 -0800
Scala: Option type, partial functions, exception handling, regular expressions
1. Pattern matching
Scala has a very powerful pattern matching mechanism that is widely used, for example to:
- Match fixed values
- Check types
- Extract data quickly
1.1 Simple pattern matching
A pattern match contains a series of alternatives, each of which starts with the keyword case. Each alternative contains a pattern and one or more expressions. The a ...
Posted by KingPhilip on Tue, 30 Nov 2021 23:38:39 -0800
MapReduce comprehensive experiment -- ranking statistics of Chinese Universities
Ranking statistics of Chinese Universities Based on MapReduce
Overall approach
① FileInputFormat reads the data
② The Mapper stage does simple data processing
③ Serialization implements custom sorting
④ The Partitioner handles partitioning
⑤ The Reducer writes out the data
⑥ Main class settings
The specific implementation is as follows
Driver main class, inclu ...
Posted by ursvmg on Tue, 30 Nov 2021 09:20:18 -0800
MapReduce core design -- job submission and initialization process analysis
Three components
- JobClient (prepares the runtime environment)
- JobTracker (receives the job)
- TaskTracker (initializes the job)
Note: this describes Hadoop 1.x. In Hadoop 2.x, jobs are managed by YARN, and there is no JobTracker or TaskTracker.
Comparison between the old and new Hadoop MapReduce frameworks
1. The client remains unchanged, and most of its call ...
Posted by eyaly on Tue, 30 Nov 2021 04:04:24 -0800
Scala: Set, Map, iterators, flattening, filtering, sorting, grouping, aggregation
4. Set
A Set represents a collection without duplicate elements. Features: unique, unordered
- Unique means that the elements in the Set are unique; there are no duplicate elements.
- Unordered means that the order in which elements are retrieved from the Set may differ from the order in which they were added.
Format I: Create an empty immutabl ...
Posted by wee493 on Mon, 29 Nov 2021 15:14:24 -0800
Building a Hadoop 3 Apache ecosystem platform
System environment
Servers: hadoop0, hadoop1, hadoop2
Operating system: CentOS 7.6
Software list and version:
- jdk-8u202-linux-x64
- hadoop-3.2.28.
- zookeeper-3.4.10
- kafka_2.12-2.7.11
- spark-3.1.2-bin-hadoop3.2
- MySQL-5.1.72-1.glibc23.x86_64.rpm-bundle
- hba ...
Posted by MadTechie on Mon, 29 Nov 2021 11:05:35 -0800