Introduction to SparkSQL Case 2 (SparkSQL 1.x)

The main steps in this introductory SparkSQL example are as follows: 1. Create a SparkContext 2. Create a SQLContext 3. Create an RDD 4. Create a class and define its member variables 5. Clean up the data and associate it with the class 6. Convert the RDD to a DataFrame (importing the implicit conversions) 7. Register the DataFrame as a temporary table 8. Writi ...

Posted by pietbez on Thu, 31 Jan 2019 19:57:15 -0800

Hive Metastore metadata database initialization failure: java.io.IOException: Schema script failed, errorcode 2

Traceback (most recent call last): File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/scripts/hive_metastore.py", line 259, in <module> HiveMetastore().execute() File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 280, in execute method(env) File ...

Posted by aceconcepts on Thu, 31 Jan 2019 16:00:15 -0800

Hadoop 2.6 + ZooKeeper 3.4 + HBase 1.0 Deployment and Installation

Following the fully distributed Hadoop installation, this post installs ZooKeeper and HBase on top of it, continuing with the previous environment configuration. I. ZooKeeper Installation 1.1 Download and unpack the software: cd /software wget -c http://apache.fayea.com/zookeeper/zookeeper-3.4.10/zookeeper-3.4.10.tar.gz tar -zxf zookeeper-3.4.1 ...

Posted by berrberr on Thu, 31 Jan 2019 14:30:16 -0800

DBUtils Example 2: Implementing a Paging Query

Requirement: when a query returns many rows from the database, display them split across pages. Purpose: improve the user experience. For example: [Previous Page] 1 2 3 4 5 6 7 [Next Page] Current Page/Total Pages. MySQL paging: Selec... from XXX limit n, m (n is the index of the first row to return, m is the number of rows to return) Page 1 ...

Posted by JasonL on Thu, 31 Jan 2019 14:15:15 -0800
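
The relationship between a page number and the n, m arguments of limit n, m in the paging entry above can be sketched as follows. This is a minimal illustration in Python rather than the post's own DBUtils code; the function names page_limit and total_pages are hypothetical, and pages are assumed to be numbered from 1.

```python
def page_limit(page, page_size):
    """Return (n, m) for `LIMIT n, m`, given a 1-based page number.

    n is the index of the first row to return, m is the number of rows.
    Both names are illustrative and do not come from the original post.
    """
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return (page - 1) * page_size, page_size


def total_pages(total_rows, page_size):
    """Number of pages needed to show total_rows rows (ceiling division)."""
    if total_rows < 0 or page_size < 1:
        raise ValueError("total_rows must be >= 0 and page_size >= 1")
    return (total_rows + page_size - 1) // page_size


# Page 3 with 10 rows per page starts at row index 20.
offset, count = page_limit(3, 10)
```

With 65 rows and 10 rows per page, total_pages gives 7, which matches a pager that shows pages 1 through 7.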

"Call From hadoop01/192.168.80.128 to 0.0.0.0:10020 failed on connection exception" when Hive runs MapReduce jobs

Today, Hive launched 42 jobs while running a MapReduce query. Partway through the execution, the following error suddenly appeared. I had never encountered it before, and I don't know whether the large number of jobs caused it. The error message said that port 10020 could not be reached from the host. Check the reason on ...

Posted by pha3dr0n on Thu, 31 Jan 2019 13:30:15 -0800

Installing the GLPI Asset Management System on CentOS 7

I. Environment preparation: CentOS 7 + Apache 2.4.6 + PHP + MariaDB. Apache and MariaDB are installed directly with yum, and PHP is installed from source. The first few steps here are the same as for the Snipe-IT asset management system installed earlier. Pre-installation preparation 1. System update # Note: use CentOS 7.5 ...

Posted by ronald29x on Thu, 31 Jan 2019 05:09:16 -0800

Lucene Notes 05: Index Weighting and a Simple Demonstration of Luke

I. Weighting the Index package com.wsy; import java.io.File; import java.io.IOException; import java.util.HashMap; import java.util.Map; import org.apache.lucene.analysis.standard.StandardAnalyzer; import org.apache.lucene.document.Document; import org.apache.lucene.document.Field; import org.apache.lucene.index.IndexReader; ...

Posted by steeveherris on Thu, 31 Jan 2019 03:45:14 -0800

Introduction to Spark 4 (Advanced RDD Operators 1)

1. mapPartitionsWithIndex. Create an RDD with 2 partitions: scala> val rdd1 = sc.parallelize(List(1,2,3,4,5,6,7),2) View the partitions: scala> rdd1.partitions The output is as follows: res0: Array[org.apache.spark.Partition] = Array(org.apache.spark.rdd.ParallelCollectionPartition@691, org.apache.spark.rdd.ParallelColle ...

Posted by r4ck4 on Wed, 30 Jan 2019 19:06:16 -0800
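
To make the idea behind mapPartitionsWithIndex in the entry above concrete, here is a plain-Python model of it. It does not use Spark at all: the helper names are hypothetical, and the even-slicing rule in split_into_partitions is an assumption about how the seven elements are divided between the two partitions, so the grouping on a real cluster should be checked rather than taken from this sketch.

```python
def split_into_partitions(data, num_partitions):
    """Split a list into contiguous slices (assumed even-slicing rule)."""
    n = len(data)
    return [
        data[i * n // num_partitions:(i + 1) * n // num_partitions]
        for i in range(num_partitions)
    ]


def map_partitions_with_index(partitions, func):
    """Call func(index, iterator) once per partition and collect the results."""
    return [list(func(index, iter(part))) for index, part in enumerate(partitions)]


def tag_with_partition(index, items):
    """Label every element with the index of the partition it belongs to."""
    return (f"[partID:{index}, val:{x}]" for x in items)


partitions = split_into_partitions([1, 2, 3, 4, 5, 6, 7], 2)
tagged = map_partitions_with_index(partitions, tag_with_partition)
```

The point of the operator is visible in the signature of tag_with_partition: the function receives the partition index together with an iterator over that partition's elements, instead of being called once per element.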

Integrating Spark Streaming with Flume (Poll and Push)

As a real-time log collection framework, Flume can be connected to the Spark Streaming real-time processing framework: Flume produces data in real time and Spark Streaming processes it in real time. Spark Streaming integrates with Flume NG in two ways: in one, Flume NG pushes messages to Spark Streaming (Push); in the other, S ...

Posted by kobayashi_one on Wed, 30 Jan 2019 17:18:15 -0800

Spark Streaming Sample Program

Develop a Spark Streaming program that receives data from a server port in real time and performs a word count. Environment setup: IDEA + Maven; the pom file is as follows: <?xml version="1.0" encoding="UTF-8"?> <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLoc ...

Posted by phpnew on Wed, 30 Jan 2019 12:21:15 -0800