Tech | HBase Migration to SequoiaDB

Background: In a traditional bank IT architecture, the online transaction system and the statistical analysis system often run on different technologies and physical devices, and online transaction data is migrated to the analysis system by ETL jobs that run on a regular schedule. As a data service resource pool, the same data may be accessed by different types of microservices. ...

Posted by WinterDragon on Wed, 18 Sep 2019 01:50:15 -0700

HiveQL: Window Functions (Cumulative Statistics)

Contents: Preface 1. What is a window function 2. Window function syntax 3. Classification of window functions 4. Cumulative statistics window functions 4.1 Cumulative sum: sum(xx) over 4.2 Cumulative average: avg(xx) ...

Posted by AdamSnow on Mon, 09 Sep 2019 18:12:47 -0700
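
The cumulative sum(xx) over and avg(xx) over patterns named in the entry above can be illustrated outside Hive. The following is a minimal Python sketch, not the article's own code: it emulates a running sum and running average per partition, assuming a frame that runs from the first row of the partition to the current row. The column names `user`, `month`, and `amount` and the function `cumulative_stats` are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def cumulative_stats(rows):
    """Emulate sum(amount) and avg(amount) OVER (PARTITION BY user ORDER BY month)."""
    out = []
    # Sorting by (user, month) groups each partition and orders rows inside it.
    ordered = sorted(rows, key=itemgetter("user", "month"))
    for _user, group in groupby(ordered, key=itemgetter("user")):
        running = 0
        for n, row in enumerate(group, start=1):
            running += row["amount"]
            # Every input row is kept and gains its running sum and average.
            out.append({**row, "cum_sum": running, "cum_avg": running / n})
    return out


# Hypothetical sample data.
rows = [
    {"user": "a", "month": "2019-01", "amount": 10},
    {"user": "a", "month": "2019-02", "amount": 30},
    {"user": "b", "month": "2019-01", "amount": 5},
    {"user": "a", "month": "2019-03", "amount": 20},
]
result = cumulative_stats(rows)
for r in result:
    print(r)
```

The running total restarts for each user, which is the effect of the PARTITION BY clause in the window definition.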

Big Data: Hive Installation and Configuration

Installing Hive, a big data component: 1. First, download Hive from the official website (https://hive.apache.org/downloads.html). When choosing a Hive version, check it against your Hadoop version, because Hive runs on top of Hadoop and cannot start without it. The following error is reported in the ...

Posted by Jak-S on Thu, 05 Sep 2019 23:57:36 -0700

Super simple CentOS 7 setup of Hadoop 2.7.7 + Flume 1.8.0 (including examples)

Introduction to Flume: https://blog.csdn.net/qq_40343117/article/details/100119574 1. Download the installation package. Download address: http://www.apache.org/dist/flume/ Choose the ...

Posted by map200uk on Wed, 28 Aug 2019 04:57:15 -0700

17. Hive Mainstream Storage Formats

3. File storage formats 1. Column storage and row storage (1) Features of column storage: because the data for each field is stored together, the amount of data read is greatly reduced when a query needs only a few fields. Each field must ...

Posted by Mr_Mako on Sat, 17 Aug 2019 19:26:26 -0700
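
The column storage claim in the entry above (less data is read when a query needs only a few fields) can be sketched in plain Python. This is an illustrative toy model, not how any Hive file format is actually laid out; the sample records and both helper functions are hypothetical.

```python
# Hypothetical sample records with three fields each.
rows = [
    {"id": 1, "name": "ann", "city": "Oslo"},
    {"id": 2, "name": "bob", "city": "Lima"},
    {"id": 3, "name": "cat", "city": "Rome"},
]

# Row storage: all fields of one record sit together.
row_store = [tuple(r.values()) for r in rows]

# Column storage: all values of one field sit together.
column_store = {field: [r[field] for r in rows] for field in rows[0]}


def read_field_from_rows(store, field_index):
    """Pick one field from a row store; every whole record is walked."""
    touched = 0
    picked = []
    for record in store:
        touched += len(record)  # the full record has to be read
        picked.append(record[field_index])
    return picked, touched


def read_field_from_columns(store, field):
    """Pick one field from a column store; only that column is read."""
    column = store[field]
    return list(column), len(column)


names_row, touched_row = read_field_from_rows(row_store, 1)
names_col, touched_col = read_field_from_columns(column_store, "name")
print(names_row, touched_row)
print(names_col, touched_col)
```

Both layouts return the same names, but the row layout touches every value of every record while the column layout touches only the requested column.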

Spark Learning Example (Python): Loading Data Sources

We mainly use Spark to process large volumes of data quickly. So which data sources do we meet in actual development and production? I summarize them as follows: text, csv, json, parquet, jdbc, hive, kafka, elasticsearch. Next, all the tests are based on Spark local mode, be ...

Posted by Revlet on Thu, 08 Aug 2019 23:40:57 -0700

Tuples, Dictionaries, Sets and Built-in Methods

1. Tuple: similar to a list, a tuple can store multiple values of multiple data types; the difference is that the tuple itself cannot be modified. 1. Purpose: record multiple values; a tuple is the better fit when the values do not need to change. 2. Definition: Compared to the list type, just [] is ...

Posted by mjm on Thu, 04 Jul 2019 10:24:24 -0700
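
The difference between tuples and lists described in the entry above can be checked with a short Python sketch. The variable names are illustrative only.

```python
point = (1, "a", 3.5)     # a tuple can hold several values of mixed types
as_list = [1, "a", 3.5]   # the same values stored in a list

as_list[0] = 99           # a list can be changed in place

error = None
try:
    point[0] = 99         # a tuple rejects item assignment
except TypeError as exc:
    error = str(exc)

print(point)
print(as_list)
print(error)
```

The tuple keeps its original values, while the list reflects the change.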

Hive HQL Data Operation and Data Query

HQL data operations. The content is drawn from "Hadoop Mass Data Processing Technology Details and Project Practice" (People's Posts and Telecommunications Publishing House). 1. Loading data: LOAD DATA INPATH '/user/hadoop/o' INTO TABLE test; If test is a partitioned table, specify the partition in HQL: LOAD DATA INPATH '/USER/HA ...

Posted by altis88 on Tue, 02 Jul 2019 15:05:17 -0700

Hive Learning: Hive Installation

Installation prerequisites: a Hadoop cluster has been installed and configured (single-node or fully distributed). Software download: Hive: https://hive.apache.org/index.html Hive installation: configuring environment variables. Upload the downloaded Hive package to the machine and extract it to the specified path. Edit /etc/profile to configure Hi ...

Posted by Adeus on Sat, 08 Jun 2019 14:45:09 -0700

Hive Analytic Functions and Window Functions

Hive analytic functions and window functions, supported since Hive 0.11, scan multiple input rows to calculate a result for each row. They are usually used with OVER, PARTITION BY, ORDER BY, and WINDOWING. Unlike traditional grouping, which yields only one result per group, the results of an analytic function appear many times, ...

Posted by Vincent III on Fri, 31 May 2019 12:56:36 -0700
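
The contrast drawn in the entry above, one result per group for traditional grouping versus one result per input row for an analytic function, can be sketched in plain Python. This is an illustration under assumed data, not Hive code: the `(key, amount)` pairs are hypothetical, and the windowed list mimics a sum over a partition with no ordering clause.

```python
from collections import defaultdict

# Hypothetical (key, amount) input rows.
rows = [("a", 10), ("a", 30), ("b", 5)]

# Total per key, shared by both styles below.
totals = defaultdict(int)
for key, amount in rows:
    totals[key] += amount

# Grouping style: the rows of each group collapse into one result row.
grouped = sorted(totals.items())

# Window style: every input row is kept and carries its partition's total.
windowed = [(key, amount, totals[key]) for key, amount in rows]

print(grouped)
print(windowed)
```

The group total for key "a" appears once in the grouped output and once per input row in the windowed output.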