Learning Manifest, ClassTag, TypeTag in Scala

Introduction to Manifest Manifest is a feature introduced in Scala 2.8 that lets the compiler make generic type information available at runtime. On the JVM, a generic type parameter T is erased at runtime and the compiler treats T as Object, so the concrete type of T is not available; in order to get the ...
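The erasure problem the excerpt describes can be seen in a short sketch. This is my own illustration, not code from the article; it uses `ClassTag`, the modern replacement for `Manifest`, and a hypothetical helper named `makeArray`:

```scala
import scala.reflect.ClassTag

// On the JVM the element type T is erased, so `new Array[T](n)` cannot
// compile without runtime type information. A ClassTag context bound
// carries that information to the call site at runtime.
def makeArray[T: ClassTag](elems: T*): Array[T] = Array(elems: _*)

val ints    = makeArray(1, 2, 3)   // a real int[] at runtime
val strings = makeArray("a", "b")  // a real String[] at runtime
```

Without the `ClassTag` bound, `Array(elems: _*)` would not compile, because the runtime would have no way to know which array class to instantiate.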

Posted by SheepWoolie on Sun, 14 Apr 2019 16:21:32 -0700

Installation and Use of Log Collection Framework Flume

Installation and Use of Log Collection Framework Flume 1. Introduction to Flume 1.1. Overview of Flume Flume is a distributed, reliable, and highly available system for collecting, aggregating, and transmitting massive amounts of log data. Flume can collect source data in many forms, such as files and socket packets, and can also output the collected d ...

Posted by MichaelR on Thu, 04 Apr 2019 20:12:30 -0700

Hadoop and Those Things (4) - MapReduce Programming Examples (Basics)

Preface In the previous article, I used WordCount as an example to explain the code structure and execution mechanism of MapReduce. This article deepens that understanding through a few simple examples. 1. Data Retrieval Problem description: suppose there are many records of data, and we want to find the statements that contain a given string. Solu ...
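The data-retrieval idea (emit only the records that contain a search string) can be sketched in a few lines of plain Scala; the names here are illustrative, not from the article, and a real solution would put the same filter inside a Hadoop Mapper:

```scala
// The "map" step of the retrieval job, outside Hadoop: keep only the
// records containing the search string. In a real MapReduce Mapper this
// filter would run once per input line, writing matches to the context.
def grep(records: Seq[String], needle: String): Seq[String] =
  records.filter(_.contains(needle))

val data = Seq("hadoop rocks", "spark too", "hadoop hdfs")
val hits = grep(data, "hadoop")  // Seq("hadoop rocks", "hadoop hdfs")
```

Because each line is tested independently, the filter parallelizes trivially across input splits, which is what makes it a natural MapReduce example.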

Posted by pornost4r on Sat, 30 Mar 2019 18:39:31 -0700

Hadoop Dual-NameNode Setup (HA)

The purpose of configuring dual NameNodes is to prevent errors and data loss when one NameNode goes down. This article does not explain the underlying principle in detail; instead, it describes the installation process. Hadoop HA is built on ZooKeeper; you can see how to set up ZooKeeper here. hadoop, zookeepe ...

Posted by anothersystem on Sat, 30 Mar 2019 08:21:30 -0700

Install hbase on ubuntu

hbase introduction HBase is a distributed, column-oriented, open-source database. The technology originates from Fay Chang's Google paper, Bigtable: A Distributed Storage System for Structured Data. Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities o ...

Posted by expert_21 on Fri, 29 Mar 2019 11:27:29 -0700

Variables, Properties and Common Commands of Hive Command Line Interface

Commands after "$" are Linux shell commands; commands after "hive>" are entered in the CLI. CLI options: use --help to see hive's parameters: $ hive --help --service cli usage: hive -d,--define <key=value> Variable substitution to apply to hive commands. ...

Posted by Visualant on Wed, 27 Mar 2019 20:54:28 -0700

Using Docker to Build Hadoop Distributed Cluster in Linux Environment

First install Docker: $ sudo apt-get install apt-transport-https $ sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 36A1D7869245C8950F966E92D8576A8BA88D21E9 $ sudo bash -c "echo deb https://get.docker.io/ubuntu docker main > /etc/apt/sources.list.d/docker.list" $ sudo apt-get update $ sudo apt-get install lxc-do ...

Posted by escabiz on Sun, 24 Mar 2019 18:00:29 -0700

Flume Monitoring a spoolDir and Collecting Logs to HDFS: A Small End-to-End Demo

1. Write Java code that randomly generates a user ID (a UUID), a district/county code and a township code (each a random three-letter string), and a total personal income. Sample format: 7762a1-bf04-468a-91b6-a19d772f41fa 2. Run it in a thread loop, with Thread.sleep(100) pausing execution for 100 ms a ...
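Step 1's record generator can be sketched as follows. The article uses Java; this is a Scala sketch of the same idea, and the names (`threeLetters`, `makeRecord`) and the income range are my assumptions, not from the demo:

```scala
import java.util.UUID
import scala.util.Random

// Hypothetical sketch of the generator: a UUID user ID, random
// three-letter district and township codes, and a total income,
// joined into one comma-separated log line.
def threeLetters(): String =
  (1 to 3).map(_ => ('a' + Random.nextInt(26)).toChar).mkString

def makeRecord(): String =
  Seq(
    UUID.randomUUID().toString,               // user ID
    threeLetters(),                           // district/county code
    threeLetters(),                           // township code
    (Random.nextInt(90000) + 10000).toString  // total income (assumed range)
  ).mkString(",")
```

A driver loop would then call `makeRecord()` repeatedly, sleeping 100 ms between writes so the spoolDir source sees a steady stream of files.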

Posted by furma on Fri, 22 Mar 2019 15:42:52 -0700

HDFS delegation token could not be found in cache when executing GROUP BY in Hive

When I run a Hive query, an error occurs. Executing a GROUP BY query throws: java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 8, vertexId=vertex_1530387194612_0030_4_00, diagnostics=[Vertex ...

Posted by centerwork on Tue, 12 Mar 2019 04:57:25 -0700

Resolving a time-zone issue that causes Druid indexing tasks to fail: "No buckets?? Seems there is no data to index."

The error stack is as follows: 2017-07-10T08:41:36,134 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[HadoopIndexTask{id=index_hadoop_pageviews_2017-07-10T08:40:32.650Z, type=index_hadoop, dataSource=pageviews}] java.lang.RuntimeException: java.lang.reflect.InvocationTargetException a ...

Posted by iconicCreator on Thu, 14 Feb 2019 21:45:20 -0800