Learning Manifest, ClassTag, TypeTag in Scala
Introduction to Manifest
Manifest is a feature introduced in Scala 2.8 that lets the compiler make generic type information available at runtime. On the JVM, a generic type parameter T is erased at runtime, and the compiler treats T as Object, so the concrete type behind T is not available; in order to get the ...
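The teaser cuts off above, but the erasure problem it describes can be sketched in a few lines of Scala. A ClassTag context bound (the modern replacement for Manifest) carries the runtime class of T, which is what makes constructing an Array[T] possible; the helper name makeArray is invented for this sketch.

```scala
import scala.reflect.ClassTag

// At runtime the JVM erases T, so a plain generic method cannot
// construct an Array[T] on its own. A ClassTag context bound smuggles
// the runtime class of T through, letting Array(...) build the
// correctly typed array.
def makeArray[T: ClassTag](elems: T*): Array[T] = Array(elems: _*)

val ints  = makeArray(1, 2, 3)   // backed by a primitive int[]
val words = makeArray("a", "b")  // backed by a String[]
```

Without the ClassTag bound, `Array(elems: _*)` would not compile, because the compiler has no runtime class to hand to the array constructor.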
Posted by SheepWoolie on Sun, 14 Apr 2019 16:21:32 -0700
Installation and Use of Log Collection Framework Flume
1. Introduction to Flume
1.1. Overview of Flume
Flume is a distributed, reliable, and highly available system for collecting, aggregating, and transporting massive volumes of log data.
Flume can collect source data in many forms, such as files and socket data packets.
It can also output the collected d ...
Posted by MichaelR on Thu, 04 Apr 2019 20:12:30 -0700
Hadoop and Those Things (4) - MapReduce Programming Examples (Basics)
Preface
In the previous article, I used WordCount as an example to explain the code structure and execution mechanism of MapReduce. This article deepens that understanding through a few simple examples.
1. Data Retrieval
Problem description
Suppose there are many records of data, and we want to find the lines that contain a given string.
Solu ...
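The solution is truncated above; purely as an illustration, the mapper's core filtering logic can be sketched in plain Scala, outside the Hadoop API (the name grepMap and the sample records are invented here, not taken from the article):

```scala
// Sketch of the map phase of a "grep"-style job: each input line is
// checked against the search string; matches are kept, the rest dropped.
// A real Mapper would emit the matching lines through the MapReduce
// context instead of returning a collection.
def grepMap(lines: Seq[String], keyword: String): Seq[String] =
  lines.filter(_.contains(keyword))

val sample = Seq(
  "2019-03-30 ERROR disk full",
  "2019-03-30 INFO  job started",
  "2019-03-30 ERROR timeout")

val hits = grepMap(sample, "ERROR")  // keeps only the two ERROR lines
```

No reducer logic is needed for this kind of retrieval job; the map phase alone decides which records survive.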
Posted by pornost4r on Sat, 30 Mar 2019 18:39:31 -0700
Hadoop Dual NameNode Configuration (HA)
The purpose of configuring dual NameNodes is to prevent service interruption and data loss when one NameNode goes down. The underlying principle is not explained in detail in this article; instead, the concrete installation process is described here.
Building Hadoop HA depends on ZooKeeper; you can read about setting up ZooKeeper here. hadoop, zookeepe ...
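The setup steps are truncated above; for orientation, a two-NameNode nameservice is declared in hdfs-site.xml along roughly the following lines (the nameservice name mycluster and every hostname below are placeholders, not values from the article):

```xml
<!-- Hypothetical nameservice and hosts; adjust to your cluster. -->
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>nn1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>nn2.example.com:8020</value>
</property>
<!-- Automatic failover is coordinated through ZooKeeper. -->
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
<property>
  <name>ha.zookeeper.quorum</name>
  <value>zk1:2181,zk2:2181,zk3:2181</value>
</property>
```

The ZooKeeper quorum entry is what ties the HA failover controller to the ZooKeeper ensemble mentioned in the article.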
Posted by anothersystem on Sat, 30 Mar 2019 08:21:30 -0700
Install hbase on ubuntu
hbase introduction
HBase is a distributed, column-oriented open source database. The technology originates from Fay Chang's Google paper Bigtable: A Distributed Storage System for Structured Data. Just as Bigtable takes advantage of the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities o ...
Posted by expert_21 on Fri, 29 Mar 2019 11:27:29 -0700
Variables, Properties and Common Commands of Hive Command Line Interface
Commands after "$" are Linux shell commands; commands after "hive>" are entered in the CLI.
CLI options
Use --help to see the parameters hive accepts:
$hive --help --service cli
usage: hive
 -d,--define <key=value>          Variable substitution to apply to hive
commands. ...
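As a quick illustration of -d,--define, a variable set on the command line can be substituted into a query with the ${hivevar:...} syntax (the table name logs_prod here is hypothetical):

```
$ hive -d env=prod -e 'SELECT * FROM logs_${hivevar:env} LIMIT 10;'
```

Here the CLI expands ${hivevar:env} to prod before the statement runs, so the query reads from logs_prod.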
Posted by Visualant on Wed, 27 Mar 2019 20:54:28 -0700
Using Docker to Build Hadoop Distributed Cluster in Linux Environment
First install Docker:
$ sudo apt-get install apt-transport-https
$ sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 36A1D7869245C8950F966E92D8576A8BA88D21E9
$ sudo bash -c "echo deb https://get.docker.io/ubuntu docker main > /etc/apt/sources.list.d/docker.list"
$ sudo apt-get update
$ sudo apt-get install lxc-do ...
Posted by escabiz on Sun, 24 Mar 2019 18:00:29 -0700
Flume monitors spoolDir logs into HDFS: the complete process of a small demo
1. Write Java code that randomly generates a user ID, a district/county number, and a township number (the district and township numbers are three random letters). Sample format: 7762a1-bf04-468a-91b6-a19d772f41fa\\\\\\\\\\\\
2. Run it in a thread loop, using Thread.sleep(100) to pause the thread for 100 ms a ...
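The demo description is truncated above; a minimal Flume agent for the spoolDir-to-HDFS flow it describes would look roughly like the following properties file (the agent name agent1 and all paths are assumptions, not taken from the article):

```
# spooldir source -> memory channel -> HDFS sink (hypothetical names/paths)
agent1.sources = src1
agent1.channels = ch1
agent1.sinks = sink1

# Watch a local directory; files dropped here are ingested as events.
agent1.sources.src1.type = spooldir
agent1.sources.src1.spoolDir = /data/spool/income
agent1.sources.src1.channels = ch1

agent1.channels.ch1.type = memory
agent1.channels.ch1.capacity = 10000

# Write events to date-partitioned HDFS paths as plain text.
agent1.sinks.sink1.type = hdfs
agent1.sinks.sink1.hdfs.path = hdfs://namenode:9000/flume/income/%Y-%m-%d
agent1.sinks.sink1.hdfs.fileType = DataStream
agent1.sinks.sink1.hdfs.useLocalTimeStamp = true
agent1.sinks.sink1.channel = ch1
```

Such an agent is typically started with `flume-ng agent --conf conf --conf-file <file> --name agent1`.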
Posted by furma on Fri, 22 Mar 2019 15:42:52 -0700
HDFS delegation token could not be found in cache when Hive executes group by
When I run the Hive query, an error occurs. Executing a query with a group by statement throws this error:
java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 8, vertexId=vertex_1530387194612_0030_4_00, diagnostics=[Vertex ...
Posted by centerwork on Tue, 12 Mar 2019 04:57:25 -0700
Resolving a Druid task build failure caused by the time zone: "No buckets?? seems there is no data to index."
Error stack as follows:
2017-07-10T08:41:36,134 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[HadoopIndexTask{id=index_hadoop_pageviews_2017-07-10T08:40:32.650Z, type=index_hadoop, dataSource=pageviews}]
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
a ...
Posted by iconicCreator on Thu, 14 Feb 2019 21:45:20 -0800