Experiment 8 project case - e-commerce data analysis
Level 1: Statistics of user churn
Task description
This task: write a MapReduce program that counts churned users based on the user behavior data.
Relevant knowledge
This exercise is an intermediate-difficulty MapReduce programming task that simulates the statistical analysis of e-commerce data in a realistic setting. Therefore, it is ...
Posted by kane007 on Sat, 04 Dec 2021 19:59:57 -0800
Hadoop 2.6.0 + Linux CentOS 7 + IntelliJ IDEA environment: MapReduce second-degree friend recommendation case
Catalog
1. Problem Description
2. Writing Code + Packaging the Project in IntelliJ IDEA
3. Uploading the jar Package to Linux with Xftp
4. Preparing Input Data + Running the jar Package + Viewing Output Results in Hadoop
1. Problem Description
Using MapReduce, for each user A, suggest 10 users who are not friends with A but have the most common friends w ...
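The excerpt is cut off, so the following is only a minimal sketch of the common-friends idea it describes, simulated in plain Python rather than on Hadoop. The function names (`map_friends`, `reduce_recommend`, `recommend`), the `None` marker for direct friendships, and the tie-breaking by user name are all assumptions made for illustration, not the post's actual code.

```python
from collections import defaultdict
from itertools import permutations


def map_friends(user, friends):
    """Map step: emit (key, (other, via)) pairs for one adjacency list."""
    for friend in friends:
        # via=None marks an existing direct friendship (hypothetical marker).
        yield user, (friend, None)
    for a, b in permutations(friends, 2):
        # a and b both know `user`, so `user` is a common friend of a and b.
        yield a, (b, user)


def reduce_recommend(values, top_n=10):
    """Reduce step: count common friends and drop users who are already friends."""
    counts = defaultdict(int)
    direct = set()
    for other, via in values:
        if via is None:
            direct.add(other)
        else:
            counts[other] += 1
    candidates = [(o, c) for o, c in counts.items() if o not in direct]
    # Most common friends first; ties broken by name (an assumption).
    candidates.sort(key=lambda oc: (-oc[1], oc[0]))
    return candidates[:top_n]


def recommend(graph, top_n=10):
    """Simulate the shuffle by grouping mapper output by key, then reduce."""
    grouped = defaultdict(list)
    for user, friends in graph.items():
        for key, value in map_friends(user, friends):
            grouped[key].append(value)
    return {user: reduce_recommend(grouped[user], top_n) for user in graph}
```

For a small hypothetical graph such as `{"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}`, user A is not friends with D but shares B and C with D, so D is the only recommendation for A.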
Posted by ADLE on Thu, 02 Dec 2021 11:32:51 -0800
MapReduce comprehensive experiment -- ranking statistics of Chinese Universities
Ranking statistics of Chinese Universities Based on MapReduce
Overall approach
① FileInputFormat reads the data
② The Mapper stage does simple data processing
③ Serialization implements custom sorting
④ A Partitioner handles partitioning
⑤ The Reducer writes out the data
⑥ Main class settings
The specific implementation is as follows
Driver main class, inclu ...
Posted by ursvmg on Tue, 30 Nov 2021 09:20:18 -0800
MapReduce core design -- job submission and initialization process analysis
Three components
JobClient (prepares the runtime environment), JobTracker (receives the job), TaskTracker (initializes the job). Note that this describes Hadoop 1.x; Hadoop 2.x is managed by YARN and has no JobTracker or TaskTracker.
Comparison between old and new Hadoop MapReduce frameworks
1. The client remains unchanged, and most of its call ...
Posted by eyaly on Tue, 30 Nov 2021 04:04:24 -0800
MapReduce programming practice -- WordCount running example (Python Implementation)
1. Experimental purpose
Master the basic MapReduce programming methods through experiments.
Master the methods for solving common data processing problems with MapReduce, including data merging, data deduplication, data sorting and data mining.
2. Experimental platform
Operating system: Ubuntu 18.04 (or Ubuntu 16.04)
Hadoop version: 3.2.2 ...
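As a rough illustration of what a Python WordCount looks like, here is a minimal sketch that runs locally without Hadoop. The function names and the sort-then-group step that stands in for the shuffle phase are assumptions; the post's own implementation is not shown in this excerpt.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Map step: emit (word, 1) for every whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield word, 1


def reducer(pairs):
    """Reduce step: sum the counts of each word.

    Sorting by key stands in for the shuffle/sort phase that Hadoop
    performs between map and reduce.
    """
    ordered = sorted(pairs, key=itemgetter(0))
    for word, group in groupby(ordered, key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def word_count(lines):
    """Run the two steps back to back on an in-memory list of lines."""
    return dict(reducer(mapper(lines)))
```

Calling `word_count(["hello world", "hello hadoop"])` counts `hello` twice and the other two words once each.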
Posted by freshneco on Tue, 16 Nov 2021 23:43:47 -0800
Experiment 6 MapReduce data cleaning - meteorological data cleaning
Level 1: data cleaning
Task description
This task: clean the data according to certain rules.
Programming requirements
Following the prompts, add code in the editor on the right to clean the data according to the given rules. The data is described as follows: file a.txt; field separator: one or more spaces; data location: /user/test/ ...
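The cleaning rules themselves are not visible in this excerpt, but the stated separator (one or more spaces) can be handled as in the sketch below. The function name `split_fields` and the sample record are hypothetical.

```python
import re


def split_fields(line):
    """Split one record on runs of one or more spaces.

    Leading and trailing whitespace is stripped first so that no empty
    fields appear at either end.
    """
    return re.split(r" +", line.strip())


# Hypothetical record; the real layout of a.txt is not shown in the excerpt.
fields = split_fields("2005  01 01   16 -6")
```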
Posted by zoobie on Fri, 12 Nov 2021 02:26:04 -0800
Review of big data development (MapReduce)
2. MapReduce
2.1. Introduction to MapReduce
The core idea of MapReduce is "divide and conquer", which suits scenarios that involve a large number of complex tasks (large-scale data processing).
Map is responsible for "dividing", that is, dividing complex tasks into several "simple tasks" for ...
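The divide-and-conquer idea can be sketched in a few lines of plain Python: split the input into chunks, process each chunk independently (the "dividing" part handled by Map), then combine the partial results (the job of Reduce). The helper names and the choice of summation as the task are assumptions made for illustration.

```python
from functools import reduce


def split_into_chunks(data, chunk_size):
    """Divide the input into independent pieces of at most chunk_size items."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def map_chunk(chunk):
    """A "simple task": compute the partial sum of one chunk."""
    return sum(chunk)


def reduce_partials(partials):
    """Combine the partial results into the final answer."""
    return reduce(lambda a, b: a + b, partials, 0)


def total(data, chunk_size=3):
    """Sum `data` by mapping over chunks and reducing the partial sums."""
    chunks = split_into_chunks(data, chunk_size)
    return reduce_partials(map_chunk(chunk) for chunk in chunks)
```

Each call to `map_chunk` depends only on its own chunk, which is what lets a real cluster run the map tasks in parallel on different machines.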
Posted by JCBarry on Sun, 07 Nov 2021 16:18:05 -0800
Big Data Platform: Building a Real-Time Data Warehouse from Scratch - 04 Hadoop Installation Test
Summary
This post covers the Hadoop installation test.
Install and configure on server110, then synchronize to server111 and server112.
Environment: CentOS 7, JDK 1.8, hadoop-3.2.1
server110 192.168.1.110
server111 192.168.1.111
server112 192.168.1.112
Installation
# Decompress
[root@server110 software]# tar -xzvf hadoop ...
Posted by eazyGen on Sat, 02 Oct 2021 10:14:39 -0700