Basic knowledge and use of Impala

Chapter 1: Basic concepts of Impala. 1.1 What is Impala? Impala, provided by Cloudera, offers high-performance, low-latency interactive SQL queries over data in HDFS and HBase. Built on Hive, it uses in-memory computing while retaining data warehouse features, and it supports real-time queries, batch processing, and high concurrency. It is the prefer ...

Posted by yaba on Sun, 19 Sep 2021 06:13:40 -0700

MySQL learning path - database transactions, views

1. What is a transaction? A transaction is a unit of program execution that accesses and possibly updates data items in a database. MySQL transactions are mainly used for operations that touch large amounts of data and are highly complex. (1) Four properties (ACID): ...
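As a rough illustration of the atomicity property named in the excerpt, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server; the `account` table, the `transfer` and `balances` helpers, and the sample balances are all hypothetical.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction; undo everything on failure."""
    try:
        with conn:  # commits if the block succeeds, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
            )
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
        return True
    except ValueError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))


# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

print(transfer(conn, "alice", "bob", 30), balances(conn))   # succeeds and commits
print(transfer(conn, "alice", "bob", 500), balances(conn))  # fails and is rolled back
```

The second transfer debits the source account before the check fails, yet the balances are unchanged afterwards, because the rollback discards the partial update.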

Posted by Eggzorcist on Sun, 19 Sep 2021 04:48:15 -0700

Spark: Spark Core Programming (RDD)

The Spark computing framework provides three data structures for high-concurrency, high-throughput data processing in different application scenarios. RDD: resilient distributed dataset. Accumulator: distributed shared write-only variable. Broadcast variable: distributed shared read-only variable. RDD 1. What is RDD? RDD (Resilient Di ...

Posted by ramez_sever on Sat, 18 Sep 2021 11:30:40 -0700

Presto Distributed SQL Query Engine

Introduction to Presto: Presto is a distributed SQL query engine developed by Facebook for efficient, real-time data analysis. Presto can connect to Hive, MySQL, Kafka, and other data sources. Its most common use is querying Hive data sources, which avoids the long run times of Hive's MapReduce queries. Presto is a ...

Posted by wing328 on Thu, 16 Sep 2021 12:43:56 -0700

[hard HBase] HBase optimization: pre-partitioning / RowKey design / memory optimization / basic optimization

This article supplements the HBase part of "[hard big data learning route] learning guide for experts from zero to big data (fully upgraded version)". 1. High availability: In HBase, HMaster is responsible for monitoring the lifecycle of each HRegionServer and balancing the load across RegionServers. If HMaster fails, the whole HBase cluster will fa ...

Posted by svihas on Wed, 15 Sep 2021 19:31:12 -0700

❤️ An 80,000-word introductory tutorial on "Python data analysis" ❤️ Afraid you can't learn it? A hand-in-hand, step-by-step guide ❤️

Preface: Pandas is the most important data analysis toolkit in Python and currently the most popular Python data analysis tool. Its name derives from "panel data". With the digital transformation of the global economy, large amounts of data have accumulated in all walks of life. Companies with unique insights fr ...

Posted by dreamkiller23 on Wed, 15 Sep 2021 16:34:45 -0700

MySQL learning notes - basic use of database

1. Common MySQL database operations. Account login: -h specifies the server's host address, -P (uppercase) specifies the port number, -u specifies the MySQL user name, and -p (lowercase) is the password, which can be left off the command line and entered at the prompt. Create database: A database is a collection of tables. A database server can contain multiple databases. ...
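The login flags above can be sketched as a small Python helper that assembles the `mysql` client command line without running it. The helper name `mysql_login_command` and the host, port, and user values are assumptions for illustration only.

```python
import shlex


def mysql_login_command(host, port, user):
    """Build the argument list for logging in with the mysql command-line client."""
    return [
        "mysql",
        "-h", host,       # server host address
        "-P", str(port),  # port number (uppercase P)
        "-u", user,       # MySQL user name
        "-p",             # lowercase p with no value: prompt for the password
    ]


# Hypothetical connection details.
cmd = mysql_login_command("127.0.0.1", 3306, "root")
print(shlex.join(cmd))
```

Passing `-p` without a value keeps the password out of the shell history, since the client prompts for it instead.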

Posted by AjithTV on Tue, 14 Sep 2021 15:23:04 -0700

MySQL (relational database)

Nature of a database: a way of storing data that persists permanently and can be queried quickly. Three ways to open the database server: through the service management console; through a graphical user interface (Navicat); through the upward triangle on the taskbar. Diagram: Database: table: Each row is called a ...

Posted by marukochan on Sun, 12 Sep 2021 00:20:46 -0700

ClickHouse type errors and aggregate function errors: precautions for field aliases

Recently I started working with ClickHouse, and I have to say its performance is really impressive. Even the simplest use, without any optimization, is much faster than a conventional database. So I have been using it as an ordinary database; it supports SQL statements and is easy to get started with. However, the case when statement is u ...

Posted by healy787 on Fri, 10 Sep 2021 23:12:28 -0700

[Database series] Summary of basic database knowledge points

The content comes from the job hunting Dictionary of kingly programmers. 1, Basic concepts 1. Data model: The data model is the core and foundation of a database system. Generally speaking, a data model is a set of strictly defined concepts that precisely describe the static characteristics, dynamic characteristics and integrity ...

Posted by marcth on Fri, 10 Sep 2021 18:35:40 -0700