Level 1: Statistics of user churn
Task: using the user behavior data, write a MapReduce program to count user churn.
This is an intermediate-difficulty MapReduce programming exercise that simulates the statistical analysis of e-commerce data in a real-world setting. Therefore, it is ...
Posted by kane007 on Sat, 04 Dec 2021 19:59:57 -0800
1. Problem Description
2. Writing Code + Packaging the Project in IntelliJ IDEA
3. Uploading the jar Package to Linux with Xftp
4. Preparing input data + Running the jar package + Viewing input results in Hadoop
1. Problem Description
With MapReduce, for each user A, suggest 10 users who are not friends with A but have the most common friends w ...
Posted by ADLE on Thu, 02 Dec 2021 11:32:51 -0800
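The friend-suggestion task in the excerpt above can be sketched in a few lines of plain Python. This is a single-process illustration, not the post's Hadoop code (the excerpt is truncated, so its actual method is unknown). The function name recommend, the input shape (a dict of user to friend set), and the tie-break by user id are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from itertools import permutations


def recommend(friends, limit=10):
    """Suggest up to `limit` non-friends per user, ranked by common-friend count.

    friends: dict mapping user -> set of that user's friends
             (assumed symmetric; hypothetical input format).
    """
    # "Map" step: any two friends of `user` have `user` as a common friend,
    # so each ordered pair (a, b) earns one point for suggesting b to a.
    counts = defaultdict(Counter)
    for user, direct in friends.items():
        for a, b in permutations(sorted(direct), 2):
            counts[a][b] += 1

    # "Reduce" step: per user, drop people who are already friends,
    # then keep the `limit` candidates with the most common friends.
    result = {}
    for user in friends:
        candidates = [
            (other, n)
            for other, n in counts[user].items()
            if other not in friends[user]
        ]
        # Highest count first; ties broken by user id (an assumed rule).
        candidates.sort(key=lambda item: (-item[1], item[0]))
        result[user] = [other for other, _ in candidates[:limit]]
    return result


if __name__ == "__main__":
    graph = {
        "A": {"B", "C"},
        "B": {"A", "D"},
        "C": {"A", "D"},
        "D": {"B", "C"},
    }
    print(recommend(graph))
```

In a real MapReduce job the pair counting would happen in the mapper and the filtering and ranking in the reducer; the sketch keeps both in one function only to show the logic.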
Ranking statistics of Chinese Universities Based on MapReduce
① FileInputFormat reads the data
② The Mapper stage does simple data processing
③ Serialization implements custom sorting
④ The Partitioner handles partitioning
⑤ The Reducer writes out the data
⑥ Main class settings
The specific implementation is as follows
Driver main class, inclu ...
Posted by ursvmg on Tue, 30 Nov 2021 09:20:18 -0800
JobClient (prepares the runtime environment), JobTracker (receives the job), TaskTracker (initializes the job). Note that this describes Hadoop 1.x; Hadoop 2.x is managed by YARN, which has no JobTracker or TaskTracker.
Comparison between the old and new Hadoop MapReduce frameworks
1. The client remains unchanged, and most of its call ...
Posted by eyaly on Tue, 30 Nov 2021 04:04:24 -0800
1. Experimental purpose
Master the basic MapReduce programming methods through experiments; master the methods for solving some common data processing problems with MapReduce, including data merging, data deduplication, data sorting and data mining.
2. Experimental platform
Operating system: Ubuntu 18.04 (or Ubuntu 16.04)
Hadoop version: 3.2.2 ...
Posted by freshneco on Tue, 16 Nov 2021 23:43:47 -0800
Level 1: data cleaning
Task: clean the data according to certain rules.
Following the prompt, add code in the editor on the right to clean the data according to certain rules. Data description: file a.txt; field delimiter: one or more spaces; data location: /user/test/ ...
Posted by zoobie on Fri, 12 Nov 2021 02:26:04 -0800
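Splitting on "one or more spaces", as the excerpt above describes, is the step that most often goes wrong in a cleaning mapper. The following is a minimal Python sketch of that step only. The excerpt does not state the cleaning rules, so the rule used here (drop lines that do not have the expected number of fields) and the names clean_line and clean are hypothetical.

```python
import re


def clean_line(line, expected_fields):
    """Split a line on runs of spaces.

    Returns the list of fields, or None if the line is malformed.
    The "wrong field count means malformed" rule is an assumption.
    """
    fields = re.split(r" +", line.strip())
    if len(fields) != expected_fields or "" in fields:
        return None
    return fields


def clean(lines, expected_fields):
    """Keep only the well-formed records from an iterable of lines."""
    cleaned = (clean_line(line, expected_fields) for line in lines)
    return [fields for fields in cleaned if fields is not None]


if __name__ == "__main__":
    sample = ["1  x  y", "", "bad line"]
    print(clean(sample, expected_fields=3))
```

Using a regular expression rather than a single-space split avoids the empty fields that appear when two delimiters are adjacent.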
2.1. Introduction to MapReduce
The core idea of MapReduce is "divide and conquer", which suits scenarios with a large number of complex tasks (large-scale data processing).
Map is responsible for "dividing", that is, splitting a complex task into several "simple tasks" for ...
Posted by JCBarry on Sun, 07 Nov 2021 16:18:05 -0800
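As a rough illustration of the "divide and conquer" idea in the excerpt above, here is a single-process Python sketch of the map, shuffle, and reduce steps. It is not Hadoop code: word counting is an assumed example task, and the function names map_phase, shuffle, and reduce_phase are hypothetical.

```python
from collections import defaultdict


def map_phase(line):
    """Map: turn one input line into independent (word, 1) pairs."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: merge all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three steps in sequence over an iterable of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["hadoop map reduce", "map reduce map"]))
```

Each call to map_phase depends only on its own line, which is what lets a real cluster run the "simple tasks" in parallel before the results are merged.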
This covers a test installation of Hadoop.
Install and configure on server110, then synchronize to server111 and server112.
Environment: CentOS 7, JDK 1.8, hadoop-3.2.1
server110 192.168.1.110
server111 192.168.1.111
server112 192.168.1.112
[root@server110 software]# tar -xzvf hadoop ...
Posted by eazyGen on Sat, 02 Oct 2021 10:14:39 -0700