HBase High Availability Cluster Building

Cluster Resource and Role Planning across five hosts (node1–node5): ZooKeeper on three nodes; NameNodes nn1 and nn2; a DataNode on every node; JournalNodes on three nodes; ResourceManagers rm1 and rm2; a NodeManager on every node; two HMasters; HRegionServer ...
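With a role layout like the one above, HBase is usually pointed at the HA HDFS nameservice and at the ZooKeeper ensemble. A minimal, illustrative hbase-site.xml fragment, assuming the HDFS nameservice is named mycluster and ZooKeeper runs on node1–node3 (both names are assumptions, not taken from the post):

```xml
<!-- Illustrative only: "mycluster" and the node hostnames are assumptions. -->
<property>
  <name>hbase.rootdir</name>
  <!-- Point at the HA nameservice, not a single NameNode host -->
  <value>hdfs://mycluster/hbase</value>
</property>
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>node1,node2,node3</value>
</property>
```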

Posted by nikkio3000 on Tue, 30 Jul 2019 14:17:05 -0700

Big Data Framework Foundations: Initial Hadoop Environment Installation and Setup

Hadoop is supported on the GNU/Linux platform (the recommended choice). You therefore need to install a Linux operating system and set up the Hadoop environment on it. If you do not already have a Linux operating system, you can install one in VirtualBox (with experience installing Linux in VirtualBox, you can lea ...

Posted by welshmike on Fri, 26 Jul 2019 21:10:39 -0700

Common errors in Flink on yarn

Common errors covered: (1) Retrying connect to server; (2) Unable to get ClusterClient status from Application Client; (3) Cannot instantiate user function; (4) Could not resolve substitution to a value: ${akka.stream.materializer}; (5) java.lang.NoClassDefFoundError: org/apache/kafka/common/serialization/ByteArrayDeserializer ...
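Error 5 in the list typically means the Kafka client classes are missing from the job classpath. One common remedy is to bundle the Flink Kafka connector (which pulls in kafka-clients) into the job's fat jar; the artifact name and version below are assumptions for a Flink 1.8 / Scala 2.11 build, so match them to your own setup:

```xml
<!-- Illustrative Maven dependency; adjust the Scala suffix and
     version to the Flink distribution actually running on YARN. -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-connector-kafka_2.11</artifactId>
  <version>1.8.1</version>
</dependency>
```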

Posted by pandhandp on Tue, 23 Jul 2019 02:12:34 -0700

Hadoop Cluster Build-04 Installation Configuration HDFS

Previous posts in this series: Hadoop Cluster Build-03 Compile and Install Hadoop; Hadoop Cluster Build-02 Installation Configuration Zookeeper; Hadoop Cluster Build-01 Preparations. HDFS is the distributed file system used with Hadoop, and its nodes are divided into two roles. namenode: nn1.hadoop, nn2.hadoop; datanode: s1.hadoop, s2.hadoop, s3.hadoop. (If you can't understand th ...
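An HA HDFS with two NameNodes (nn1.hadoop, nn2.hadoop) is declared as a single nameservice in hdfs-site.xml. A minimal sketch, assuming the nameservice is named mycluster and the JournalNodes run on the s1–s3 datanode hosts (both assumptions, not confirmed by the post):

```xml
<!-- Illustrative hdfs-site.xml fragment; "mycluster" and the
     JournalNode placement on s1-s3 are assumptions. -->
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>nn1.hadoop:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>nn2.hadoop:8020</value>
</property>
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://s1.hadoop:8485;s2.hadoop:8485;s3.hadoop:8485/mycluster</value>
</property>
```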

Posted by Chevy on Mon, 22 Jul 2019 03:49:20 -0700

Building a Hadoop Pseudo-Distributed Environment and Configuring HBase on an Ubuntu System

I don't know much about Hadoop, and I ran into many unexpected errors while building the environment. Here I summarize how to build it, along with solutions to common errors, hoping this helps everyone. If any of the following content infringes, please contact the author to delete it, thank you! First, the author uploaded the software version used ...

Posted by BRUm on Mon, 15 Jul 2019 16:09:43 -0700

ROS One-Click Deployment of Spark Distributed Cluster

Apache Spark is a fast, general-purpose computing engine designed for large-scale data processing. It can handle a wide range of workloads, including SQL queries, text processing, machine learning, and more. Before Spark appeared, we generally needed to learn a variety of separate engines to handle these requirements. The main purpose of this a ...

Posted by scottb1 on Mon, 08 Jul 2019 09:48:51 -0700

Phoenix Installation and Its Use

Introduction to Phoenix: Phoenix is an open source SQL engine for HBase. You can use the standard JDBC API instead of the HBase client API to create tables, insert data, and query your HBase data. Since Phoenix is an SQL engine built on top of HBase, you may ask, "Will Phoenix reduce the efficiency of HBase?" or "Is Phoenix inefficient?" ...

Posted by n14charlie on Sun, 07 Jul 2019 13:17:47 -0700

Hadoop Introduction Basic Tutorial: Word Count

Word counting is one of the simplest and most thought-provoking MapReduce programs, known as the MapReduce version of Hello World; its complete code can be found in the src/example directory of the Hadoop installation package. The main function of word count is to count the number of occurrences of each word in a series of text fil ...
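The excerpt refers to the Java example shipped with Hadoop; as a language-neutral illustration of the same idea, here is a minimal Hadoop Streaming-style sketch in Python (not the code from the Hadoop source tree): the mapper emits a (word, 1) pair per token and the reducer sums the counts per word.

```python
from collections import Counter

def map_words(line):
    """Mapper: emit one (word, 1) pair per whitespace-separated token."""
    return [(word, 1) for word in line.strip().split()]

def reduce_counts(pairs):
    """Reducer: sum the counts per word. In a real job, Hadoop's
    shuffle phase groups pairs by key before they reach the reducer."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return dict(totals)

# Local demo on two "input split" lines:
lines = ["Hello World", "Hello Hadoop"]
pairs = [p for line in lines for p in map_words(line)]
print(reduce_counts(pairs))  # {'Hello': 2, 'World': 1, 'Hadoop': 1}
```

In a real cluster run, map_words and reduce_counts would be split into separate mapper and reducer scripts fed by the Streaming framework; the in-process demo above only mirrors the data flow.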

Posted by paulsiew2 on Wed, 03 Jul 2019 09:23:27 -0700

Hive HQL Data Operation and Data Query

HQL Data Operations. Our content is drawn from Hadoop Mass Data Processing Technology Details and Project Practice, People's Posts and Telecommunications Publishing House. 1. Loading data: LOAD DATA INPATH '/user/hadoop/o' INTO TABLE test; If the test table is a partitioned table, you must specify the partition in the HQL: LOAD DATA INPATH '/USER/HA ...
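To make the partitioned and non-partitioned forms of LOAD DATA concrete, here is a small illustrative Python helper (not a Hive API) that builds the statement text; the dt partition column in the demo is a made-up example:

```python
def load_data_hql(path, table, partition=None):
    """Build a Hive LOAD DATA statement. `partition` is an optional
    dict of partition column -> value; for a partitioned table the
    PARTITION clause is required."""
    stmt = f"LOAD DATA INPATH '{path}' INTO TABLE {table}"
    if partition:
        spec = ", ".join(f"{col}='{val}'" for col, val in partition.items())
        stmt += f" PARTITION ({spec})"
    return stmt + ";"

print(load_data_hql("/user/hadoop/o", "test"))
# LOAD DATA INPATH '/user/hadoop/o' INTO TABLE test;
print(load_data_hql("/user/hadoop/o", "test", {"dt": "2019-07-01"}))
# LOAD DATA INPATH '/user/hadoop/o' INTO TABLE test PARTITION (dt='2019-07-01');
```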

Posted by altis88 on Tue, 02 Jul 2019 15:05:17 -0700

HBase Coprocessor Details

1. Brief description: When using HBase with billions of rows or millions of columns, the amount of data a query can return is limited by network bandwidth, and even when the network allows it, the client's compute capacity may not be able to keep up. Coprocessors were introduced for exactly this case. It all ...

Posted by richard-elland on Sun, 23 Jun 2019 09:53:31 -0700