Hive MapReduce execution error: Call From hadoop01/192.168.80.128 to 0.0.0.0:10020 failed on connection exception
Today a Hive query launched 42 MapReduce jobs. Halfway through execution, the following error was suddenly reported. I had never encountered it before, and I don't know whether the large number of jobs caused it. The error message said that port 10020 on the host could not be reached.
Check the reason on ...
Posted by pha3dr0n on Thu, 31 Jan 2019 13:30:15 -0800
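For the connection error in the entry above, a quick first check is whether anything is listening on the port at all. Port 10020 is the default of Hadoop's mapreduce.jobhistory.address setting, but the original diagnosis is truncated here, so treat this only as a minimal sketch of a reachability test. The helper name is_port_open is hypothetical.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


# Example call (host name taken from the error message; adjust for your cluster):
# is_port_open("hadoop01", 10020)
```

If the call returns False, nothing is accepting connections on that port from where the check is run.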
Learning the sklearn Library: Linear Models
Linear models make predictions using a linear function of the input features. This post looks at the differences between linear-model algorithms.
(1) How a particular combination of coefficients and intercept is judged to fit the training data. Different algorithms measure the fit to the training set in different ways, whic ...
Posted by joix on Thu, 31 Jan 2019 10:45:15 -0800
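As a minimal sketch of the idea in the entry above: a linear model predicts with a weighted sum of the features plus an intercept, and a fit measure scores a given choice of coefficients against the training data. Mean squared error (the measure minimized by ordinary least squares) is used here purely as an illustration. The function names and the toy data are assumptions, not taken from the original post.

```python
def predict(weights, intercept, features):
    """Linear prediction: y_hat = w[0]*x[0] + ... + w[n-1]*x[n-1] + b."""
    return sum(w * x for w, x in zip(weights, features)) + intercept


def mean_squared_error(weights, intercept, X, y):
    """Average squared difference between predictions and targets."""
    errors = [(predict(weights, intercept, row) - target) ** 2
              for row, target in zip(X, y)]
    return sum(errors) / len(errors)


# Hypothetical toy data generated by y = 2*x + 1.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [1.0, 3.0, 5.0, 7.0]

# The true coefficients fit perfectly; a different choice scores worse.
good_fit = mean_squared_error([2.0], 1.0, X, y)
poor_fit = mean_squared_error([1.0], 0.0, X, y)
```

A lower score means the coefficients and intercept fit the training set better under this particular measure.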
Python crawler example: download multi-page topic content from Baidu Post Bar
Last week's web crawler course left an exercise: download multi-page topic content from Baidu Post Bar. What I built crawled the multi-page content of a single post, which was not what the teacher had asked for. Moreover, after the teacher's comments, I saw the gap between myself and the ...
Posted by maciek4 on Thu, 31 Jan 2019 10:03:15 -0800
Lucene Notes 05-Lucene Index Weighting Operation and Luke's Simple Demonstration
I. Weighting the Index
package com.wsy;
import java.io.File;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader; ...
Posted by steeveherris on Thu, 31 Jan 2019 03:45:14 -0800
Learn matplotlib drawing from scratch (4): Parallel histogram
Stacked histograms have their advantages. For example, we can easily see the trend of the total across categories.
However, in a stacked histogram we cannot easily see the trend of the categories stacked on top, because their baselines differ.
Therefore, when classification is n ...
Posted by nareshrevoori on Thu, 31 Jan 2019 02:48:16 -0800
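For the parallel histogram entry above, a common way to place bars side by side is to shift each series by one bar width so that every bar starts from the same baseline. The sketch below computes those positions and passes them to matplotlib's plt.bar. The function names and sample numbers are hypothetical, and the original post's own code is not shown in this excerpt.

```python
def grouped_bar_positions(n_groups, n_series, bar_width):
    """Return, for each series, the x position of its bar in every group."""
    positions = []
    for s in range(n_series):
        # Shift each series by one bar width so bars in a group sit side by side.
        positions.append([g + s * bar_width for g in range(n_groups)])
    return positions


def plot_grouped(values_by_series, labels, bar_width=0.4):
    """Draw one set of bars per series; all bars share the zero baseline."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    pos = grouped_bar_positions(len(values_by_series[0]),
                                len(values_by_series), bar_width)
    for xs, values, label in zip(pos, values_by_series, labels):
        plt.bar(xs, values, width=bar_width, label=label)
    plt.legend()
    plt.show()


# Positions for 3 groups with 2 series each.
positions = grouped_bar_positions(n_groups=3, n_series=2, bar_width=0.4)

# Example call with hypothetical data:
# plot_grouped([[3, 5, 2], [4, 1, 6]], ["class A", "class B"])
```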
12c Grid Infrastructure Management Repository
After installing Grid Infrastructure 12.1.0.2, the resource list shows two additional resources, ora.MGMTLSNR and ora.mgmtdb. An instance with SID -MGMTDB is also started.
[grid@prodb1 ~]$ crsctl status res -t
--------------------------------------------------------------------------------
Name Target State Serve ...
Posted by fresch on Thu, 31 Jan 2019 02:30:15 -0800
Spark Streaming Integration with Flume (Poll and Push)
As a real-time log collection framework, Flume can be connected to the Spark Streaming real-time processing framework: Flume produces data in real time and Spark Streaming processes it in real time.
Spark Streaming connects to FlumeNG in two ways: one is that FlumeNG pushes messages to Spark Streaming (Push), the other is that S ...
Posted by kobayashi_one on Wed, 30 Jan 2019 17:18:15 -0800
A MapReduce Program Example, Details Determine Success or Failure (IV): In-Map Aggregation
Why use in-map aggregation? What is the difference between in-map aggregation and a combiner? When should you use a combiner, and when should you use in-map aggregation?
Let's start with a picture to see where the combiner sits in an MR job.
The substance follows:
Data files are read by InputFormat and processed in the Map phase. After the Map is processed, the ...
Posted by hedgehog90 on Wed, 30 Jan 2019 15:21:15 -0800
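To illustrate the in-map aggregation idea from the entry above, here is a minimal sketch in plain Python (not Hadoop code) of a word-count mapper written both ways. The plain mapper emits one pair per word. The aggregating mapper keeps counts in memory and emits each word once after all of its input has been read. All function names and the sample input are hypothetical.

```python
from collections import Counter


def map_plain(lines):
    """Plain mapper: emit one (word, 1) pair per word occurrence."""
    for line in lines:
        for word in line.split():
            yield (word, 1)


def map_with_in_map_aggregation(lines):
    """Mapper with in-map aggregation: accumulate counts, emit once at the end."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    # Emitting here plays the role of a cleanup step run after the last record.
    for word, count in counts.items():
        yield (word, count)


def reduce_counts(pairs):
    """Reducer: sum the counts for each word."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


# One hypothetical input split of two lines.
split = ["a b a", "b a"]
plain = list(map_plain(split))                      # 5 pairs leave the mapper
aggregated = list(map_with_in_map_aggregation(split))  # 2 pairs leave the mapper
```

Both mappers lead to the same reduced result, but the aggregating one sends fewer pairs to the shuffle, at the cost of holding the counts in the mapper's memory.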
Spark Streaming sample program
Develop a Spark Streaming program that receives data from a server port in real time and performs a word count.
Environment setup
The pom file for the IDEA + Maven project is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLoc ...
Posted by phpnew on Wed, 30 Jan 2019 12:21:15 -0800
Big Data Notes 06: YARN Setup and Case Study
YARN
Setting up YARN
Cluster planning
Configuration
Test case
wordcount
Use the wordcount test case provided by MapReduce
Setting up YARN
Cluster planning
Configuration
Modify the configuration file mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value> ...
Posted by marli on Wed, 30 Jan 2019 11:00:15 -0800