This article is MySQL HeatWave Quickstart (quick start) A sequel to.
I watched it today MySQL HeatWave user manual , it has changed from more than 60 pages to 96 pages, and Autopilot, object storage persistence layer and other contents have been added.
Basic concepts:
- HeatWave is memory array, distributed and shared
- HeatWave Cluster includes 1 InnoDB and multiple HeatWave nodes (at least 2 and at most 64).
- HeatWave is only available on Oracle's public cloud.
Let's take a look at the latest architecture:
OCI Object Storage in the figure is the latest function to:
HeatWave Storage Layer automatically persists the data to OCI Object Storage for fast recovery in case of a HeatWave node or cluster failure.
The time required to reload data is constant regardless of data size or HeatWave cluster size.
This paper discusses the concept of push based vectorized query processing. Compared with tuple based, it is described as follows:
HeatWave processes queries by pushing vector blocks (slices of columnar data) through the query execution plan from one operator to another. A push-based execution model avoids deep call stacks and saves valuable resources compared to tuple-based processing models.
I don't quite understand. I'll read it later.
A table is distributed to all HeatWave nodes, so it needs a primary key.
At present, there are 2 kinds of shapes to choose from:
- MySQL.HeatWave.VM.Standard.E3 (16 OCPU,512GB RAM)
- MySQL.HeatWave.BM.Standard.E3(128 OCPU,2TB RAM)
AutoPilot, see Oracle Announces MySQL Autopilot for MySQL HeatWave Service.
as well as MySQL Autopilot - Machine Learning Automation for MySQL HeatWave.
AutoPilot is an automation service based on machine learning.
Let's take a look at the four functions of AutoPilot:
System Setup
Auto provisioning
Help see here.
The number of nodes in HeatWave Cluster can be automatically estimated. See the menu "Estimate Node Count".
When this step is executed, there must be data in the Schema. The output indicates that the number of estimated nodes is 1, but the minimum should be 2.
Data Load
Auto Parallel Load, Auto Encoding and auto data placement.
Auto Parallel Load
Help see here . You can specify a table. For simplicity, the following example specifies the entire Schema.
Let's first look at Auto Parallel Load, which is actually executing the command given during node estimation, which is much more convenient than before!:
MySQL 10.0.1.28:3306 ssl tpch SQL > CALL sys.heatwave_load(JSON_ARRAY("tpch"), NULL); +------------------------------------------+ | INITIALIZING HEATWAVE AUTO LOAD | +------------------------------------------+ | Version: 1.15 | | | | Load Mode: normal | | Load Policy: disable_unsupported_columns | | Output Mode: normal | | | +------------------------------------------+ 6 rows in set (0.0080 sec) +------------------------------------------------------------------------+ | OFFLOAD ANALYSIS | +------------------------------------------------------------------------+ | Verifying input schemas: 1 | | User excluded items: 0 | | | | SCHEMA OFFLOADABLE OFFLOADABLE SUMMARY OF | | NAME TABLES COLUMNS ISSUES | | ------ ----------- ----------- ---------- | | `tpch` 8 61 | | | | Total offloadable schemas: 1 | | | +------------------------------------------------------------------------+ 10 rows in set (0.0080 sec) +-----------------------------------------------------------------------------------------------------------------------------+ | CAPACITY ESTIMATION | +-----------------------------------------------------------------------------------------------------------------------------+ | Default load pool for tables: TRANSACTIONAL | | Default encoding for string columns: VARLEN (unless specified in the schema) | | Estimating memory footprint for 1 schema(s) | | | | TOTAL ESTIMATED ESTIMATED TOTAL DICTIONARY VARLEN ESTIMATED | | SCHEMA OFFLOADABLE HEATWAVE NODE MYSQL NODE STRING ENCODED ENCODED LOAD | | NAME TABLES FOOTPRINT FOOTPRINT COLUMNS COLUMNS COLUMNS TIME | | ------ ----------- --------- --------- ------- ---------- ------- --------- | | `tpch` 8 2.22 GiB 32.00 MiB 29 0 29 10.00 s | | | | Sufficient MySQL host memory available to load all tables. | | Sufficient HeatWave cluster memory available to load all tables. | | | +-----------------------------------------------------------------------------------------------------------------------------+ 13 rows in set (0.0080 sec) +---------------------------------------------------------------------------------------------------------------------------------------+ | EXECUTING LOAD | +---------------------------------------------------------------------------------------------------------------------------------------+ | HeatWave Load script generated | | Retrieve load script containing 24 generated DDL command(s) using the query below: | | SELECT log->>"$.sql" AS "Load Script" FROM sys.heatwave_load_report WHERE type = "sql" ORDER BY id; | | | | Adjusting load parallelism dynamically per table | | Using current parallelism of 32 thread(s) as maximum | | | | Using SQL_MODE: ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION | | | | Proceeding to load 8 tables into HeatWave | | | | Applying changes will take approximately 9.74 s | | | +---------------------------------------------------------------------------------------------------------------------------------------+ 13 rows in set (0.0080 sec) +----------------------------------------+ | LOADING TABLE | +----------------------------------------+ | TABLE (1 of 8): `tpch`.`customer` | | Commands executed successfully: 3 of 3 | | Warnings encountered: 0 | | Table loaded successfully! | | Total columns loaded: 8 | | Table loaded using 3 thread(s) | | | +----------------------------------------+ 7 rows in set (0.0080 sec) +----------------------------------------+ | LOADING TABLE | +----------------------------------------+ | TABLE (2 of 8): `tpch`.`lineitem` | | Commands executed successfully: 3 of 3 | | Warnings encountered: 0 | | Table loaded successfully! | | Total columns loaded: 16 | | Table loaded using 32 thread(s) | | | +----------------------------------------+ 7 rows in set (0.0080 sec) +----------------------------------------+ | LOADING TABLE | +----------------------------------------+ | TABLE (3 of 8): `tpch`.`nation` | | Commands executed successfully: 3 of 3 | | Warnings encountered: 0 | | Table loaded successfully! | | Total columns loaded: 4 | | Table loaded using 1 thread(s) | | | +----------------------------------------+ 7 rows in set (0.0080 sec) +----------------------------------------+ | LOADING TABLE | +----------------------------------------+ | TABLE (4 of 8): `tpch`.`orders` | | Commands executed successfully: 3 of 3 | | Warnings encountered: 0 | | Table loaded successfully! | | Total columns loaded: 9 | | Table loaded using 16 thread(s) | | | +----------------------------------------+ 7 rows in set (0.0080 sec) +----------------------------------------+ | LOADING TABLE | +----------------------------------------+ | TABLE (5 of 8): `tpch`.`part` | | Commands executed successfully: 3 of 3 | | Warnings encountered: 0 | | Table loaded successfully! | | Total columns loaded: 9 | | Table loaded using 4 thread(s) | | | +----------------------------------------+ 7 rows in set (0.0080 sec) +----------------------------------------+ | LOADING TABLE | +----------------------------------------+ | TABLE (6 of 8): `tpch`.`partsupp` | | Commands executed successfully: 3 of 3 | | Warnings encountered: 0 | | Table loaded successfully! | | Total columns loaded: 5 | | Table loaded using 15 thread(s) | | | +----------------------------------------+ 7 rows in set (0.0080 sec) +----------------------------------------+ | LOADING TABLE | +----------------------------------------+ | TABLE (7 of 8): `tpch`.`region` | | Commands executed successfully: 3 of 3 | | Warnings encountered: 0 | | Table loaded successfully! | | Total columns loaded: 3 | | Table loaded using 1 thread(s) | | | +----------------------------------------+ 7 rows in set (0.0080 sec) +----------------------------------------+ | LOADING TABLE | +----------------------------------------+ | TABLE (8 of 8): `tpch`.`supplier` | | Commands executed successfully: 3 of 3 | | Warnings encountered: 0 | | Table loaded successfully! | | Total columns loaded: 7 | | Table loaded using 1 thread(s) | | | +----------------------------------------+ 7 rows in set (0.0080 sec) +-------------------------------------------------------------------------------+ | LOAD SUMMARY | +-------------------------------------------------------------------------------+ | | | SCHEMA TABLES TABLES COLUMNS LOAD | | NAME LOADED FAILED LOADED DURATION | | ------ ------ ------ ------- -------- | | `tpch` 8 0 61 2.55 s | | | +-------------------------------------------------------------------------------+ 6 rows in set (0.0080 sec) Query OK, 0 rows affected (0.0080 sec)
Related viewing commands:
MySQL 10.0.1.28:3306 ssl tpch SQL > SELECT log FROM sys.heatwave_load_report WHERE type="error"; Empty set (0.0007 sec) MySQL 10.0.1.28:3306 ssl tpch SQL > SELECT log FROM sys.heatwave_load_report WHERE type="warn"; Empty set (0.0006 sec) MySQL 10.0.1.28:3306 ssl tpch SQL > SELECT log->>"$.sql" AS "Load Script" FROM sys.heatwave_load_report -> WHERE type = "sql" ORDER BY id; +-------------------------------------------------------+ | Load Script | +-------------------------------------------------------+ | SET SESSION innodb_parallel_read_threads = 3; | | ALTER TABLE `tpch`.`customer` SECONDARY_ENGINE=RAPID; | | ALTER TABLE `tpch`.`customer` SECONDARY_LOAD; | | SET SESSION innodb_parallel_read_threads = 32; | | ALTER TABLE `tpch`.`lineitem` SECONDARY_ENGINE=RAPID; | | ALTER TABLE `tpch`.`lineitem` SECONDARY_LOAD; | | SET SESSION innodb_parallel_read_threads = 1; | | ALTER TABLE `tpch`.`nation` SECONDARY_ENGINE=RAPID; | | ALTER TABLE `tpch`.`nation` SECONDARY_LOAD; | | SET SESSION innodb_parallel_read_threads = 16; | | ALTER TABLE `tpch`.`orders` SECONDARY_ENGINE=RAPID; | | ALTER TABLE `tpch`.`orders` SECONDARY_LOAD; | | SET SESSION innodb_parallel_read_threads = 4; | | ALTER TABLE `tpch`.`part` SECONDARY_ENGINE=RAPID; | | ALTER TABLE `tpch`.`part` SECONDARY_LOAD; | | SET SESSION innodb_parallel_read_threads = 15; | | ALTER TABLE `tpch`.`partsupp` SECONDARY_ENGINE=RAPID; | | ALTER TABLE `tpch`.`partsupp` SECONDARY_LOAD; | | SET SESSION innodb_parallel_read_threads = 1; | | ALTER TABLE `tpch`.`region` SECONDARY_ENGINE=RAPID; | | ALTER TABLE `tpch`.`region` SECONDARY_LOAD; | | SET SESSION innodb_parallel_read_threads = 1; | | ALTER TABLE `tpch`.`supplier` SECONDARY_ENGINE=RAPID; | | ALTER TABLE `tpch`.`supplier` SECONDARY_LOAD; | +-------------------------------------------------------+ 24 rows in set (0.0007 sec)
Auto Encoding
Help see here.
In fact, it is a recommender to decide whether it is VARLEN or DICTIONARY coding
In the following output, it is recommended to change the encoding of some columns from VARLEN to DICTIONARY to improve performance. It also gives commands to generate modification scripts.
MySQL 10.0.1.28:3306 ssl tpch SQL > CALL sys.heatwave_advisor(JSON_OBJECT("auto_enc",JSON_OBJECT("mode","recommend"))); +-------------------------------+ | INITIALIZING HEATWAVE ADVISOR | +-------------------------------+ | Version: 1.15 | | | | Output Mode: normal | | Excluded Queries: 0 | | Target Schemas: All | | | +-------------------------------+ 6 rows in set (0.0098 sec) +---------------------------------------------------------+ | ANALYZING LOADED DATA | +---------------------------------------------------------+ | Total 8 tables loaded in HeatWave for 1 schemas | | Tables excluded by user: 0 (within target schemas) | | | | SCHEMA TABLES COLUMNS | | NAME LOADED LOADED | | ------ ------ ------ | | `tpch` 8 61 | | | +---------------------------------------------------------+ 8 rows in set (0.0098 sec) +--------------------------------------------------------------------------------------------+ | ENCODING SUGGESTIONS | +--------------------------------------------------------------------------------------------+ | Total Auto Encoding suggestions produced for 25 columns | | Queries executed: 1 | | Total query execution time: 0 ps | | Most recent query executed on: Thursday 1st January 1970 00:00:00 | | Oldest query executed on: Thursday 1st January 1970 00:00:00 | | | | CURRENT SUGGESTED | | COLUMN COLUMN COLUMN | | NAME ENCODING ENCODING | | ------ -------- --------- | | `tpch`.`customer`.`C_ADDRESS` VARLEN DICTIONARY | | `tpch`.`customer`.`C_COMMENT` VARLEN DICTIONARY | | `tpch`.`customer`.`C_MKTSEGMENT` VARLEN DICTIONARY | | `tpch`.`customer`.`C_NAME` VARLEN DICTIONARY | | `tpch`.`customer`.`C_PHONE` VARLEN DICTIONARY | | `tpch`.`lineitem`.`L_COMMENT` VARLEN DICTIONARY | | `tpch`.`lineitem`.`L_SHIPINSTRUCT` VARLEN DICTIONARY | | `tpch`.`nation`.`N_COMMENT` VARLEN DICTIONARY | | `tpch`.`nation`.`N_NAME` VARLEN DICTIONARY | | `tpch`.`orders`.`O_CLERK` VARLEN DICTIONARY | | `tpch`.`orders`.`O_COMMENT` VARLEN DICTIONARY | | `tpch`.`orders`.`O_ORDERPRIORITY` VARLEN DICTIONARY | | `tpch`.`part`.`P_BRAND` VARLEN DICTIONARY | | `tpch`.`part`.`P_COMMENT` VARLEN DICTIONARY | | `tpch`.`part`.`P_CONTAINER` VARLEN DICTIONARY | | `tpch`.`part`.`P_MFGR` VARLEN DICTIONARY | | `tpch`.`part`.`P_NAME` VARLEN DICTIONARY | | `tpch`.`part`.`P_TYPE` VARLEN DICTIONARY | | `tpch`.`partsupp`.`PS_COMMENT` VARLEN DICTIONARY | | `tpch`.`region`.`R_COMMENT` VARLEN DICTIONARY | | `tpch`.`region`.`R_NAME` VARLEN DICTIONARY | | `tpch`.`supplier`.`S_ADDRESS` VARLEN DICTIONARY | | `tpch`.`supplier`.`S_COMMENT` VARLEN DICTIONARY | | `tpch`.`supplier`.`S_NAME` VARLEN DICTIONARY | | `tpch`.`supplier`.`S_PHONE` VARLEN DICTIONARY | | | | Applying the suggested encodings might improve query performance and cluster memory usage. | | Estimated HeatWave cluster memory savings: 348.50 MiB | | | +--------------------------------------------------------------------------------------------+ 39 rows in set (0.0098 sec) +----------------------------------------------------------------------------------------------------------------+ | SCRIPT GENERATION | +----------------------------------------------------------------------------------------------------------------+ | Script generated for applying suggestions for 8 loaded tables | | | | Applying changes will take approximately 11.00 s | | | | Retrieve script containing 65 generated DDL commands using the query below: | | SELECT log->>"$.sql" AS "SQL Script" FROM sys.heatwave_advisor_report WHERE type = "sql" ORDER BY id; | | | | Caution: Executing the generated script will alter the column comment and secondary engine flags in the schema | | | +----------------------------------------------------------------------------------------------------------------+ 9 rows in set (0.0098 sec) Query OK, 0 rows affected (0.0098 sec)
Auto Data Placement
Help see here.
Give suggestions on data placement key.
You must have run some queries, at least 5.
Display history query:
SELECT query_id, LEFT(query_text,160) FROM performance_schema.rpd_query_stats;
Executive recommender:
MySQL 10.0.1.28:3306 ssl tpch SQL > CALL sys.heatwave_advisor(JSON_OBJECT("target_schema",JSON_ARRAY("tpch"))); +-------------------------------+ | INITIALIZING HEATWAVE ADVISOR | +-------------------------------+ | Version: 1.15 | | | | Output Mode: normal | | Excluded Queries: 0 | | Target Schemas: 1 | | | +-------------------------------+ 6 rows in set (0.0069 sec) +---------------------------------------------------------+ | ANALYZING LOADED DATA | +---------------------------------------------------------+ | Total 8 tables loaded in HeatWave for 1 schemas | | Tables excluded by user: 0 (within target schemas) | | | | SCHEMA TABLES COLUMNS | | NAME LOADED LOADED | | ------ ------ ------ | | `tpch` 8 61 | | | +---------------------------------------------------------+ 8 rows in set (0.0069 sec) +------------------------------------------------------------------------+ | AUTO DATA PLACEMENT | +------------------------------------------------------------------------+ | Auto Data Placement Configuration: | | | | Minimum benefit threshold: 1% | | | | Producing Data Placement suggestions for current setup: | | | | Tables Loaded: 8 | | Queries used: 6 | | Total query execution time: 5.85 s | | Most recent query executed on: Friday 10th September 2021 08:50:33 | | Oldest query executed on: Friday 10th September 2021 08:47:25 | | HeatWave cluster size: 2 nodes | | | +------------------------------------------------------------------------+ 13 rows in set (0.0069 sec) +--------------------------------------------------------------------------------------+ | DATA PLACEMENT SUGGESTIONS | +--------------------------------------------------------------------------------------+ | No Data Placement suggestion produced | | Issue: No candidate Data Placement suggestion found where performance benefit > 1% | +--------------------------------------------------------------------------------------+ 2 rows in set (0.0069 sec) Query OK, 0 rows affected (0.0069 sec)
Query Execution
Auto Query Plan Improvement
See details here . Automatic features, not repeated.
Auto Query Time Estimation
Also known as Query Insight. See details here.
The premise must have query history.
MySQL 10.0.1.28:3306 ssl tpch SQL > CALL sys.heatwave_advisor(JSON_OBJECT("query_insights", TRUE)); +-------------------------------+ | INITIALIZING HEATWAVE ADVISOR | +-------------------------------+ | Version: 1.15 | | | | Output Mode: normal | | Excluded Queries: 0 | | Target Schemas: All | | | +-------------------------------+ 6 rows in set (0.0074 sec) +---------------------------------------------------------+ | ANALYZING LOADED DATA | +---------------------------------------------------------+ | Total 8 tables loaded in HeatWave for 1 schemas | | Tables excluded by user: 0 (within target schemas) | | | | SCHEMA TABLES COLUMNS | | NAME LOADED LOADED | | ------ ------ ------ | | `tpch` 8 61 | | | +---------------------------------------------------------+ 8 rows in set (0.0074 sec) +--------------------------------------------------------------------------------------------------------------------+ | QUERY INSIGHTS | +--------------------------------------------------------------------------------------------------------------------+ | Queries executed on Heatwave: 13 | | Session IDs (as filter): None | | | | QUERY-ID SESSION-ID QUERY-STRING EXEC-RUNTIME (s) COMMENT | | -------- ---------- ------------ ---------------- ------- | | 1 21 SELECT SUM(l_extendedprice * l_discount) AS revenue ... 0.001 (est.) Explain. | | 2 21 SELECT l_returnflag, l_linestatus, SUM(l_quantit... 0.226 | | 3 21 SELECT l_orderkey, SUM(l_extendedprice * (1 - l_disc... 0.352 | | 4 21 SELECT nation, o_year, SUM(amount) AS sum_profit FROM ... 0.233 | | 5 21 select c_custkey, c_name, sum(l_extendedprice * (1 - l_disco... 4.923 | | 6 21 select ps_partkey, sum(ps_supplycost * ps_availqty) as value... 0.039 | | 7 21 select ps_partkey, sum(ps_supplycost * ps_availqty) as value... 0.079 | | 8 21 SELECT l_orderkey, SUM(l_extendedprice * (1 - l_disc... 0.039 | | 9 21 SELECT l_orderkey, SUM(l_extendedprice * (1 - l_disc... 0.050 | | 10 21 SELECT l_orderkey, SUM(l_extendedprice * (1 - l_disc... 0.040 | | 11 21 SELECT l_orderkey, SUM(l_extendedprice * (1 - l_disc... 0.047 | | 12 21 SELECT l_orderkey, SUM(l_extendedprice * (1 - l_disc... 0.006 (est.) Explain. | | 13 21 SELECT l_orderkey, SUM(l_extendedprice * (1 - l_disc... 0.006 (est.) Explain. | | | | TOTAL ESTIMATED: 3 EXEC-RUNTIME: 0.012 sec | | TOTAL EXECUTED: 10 EXEC-RUNTIME: 6.029 sec | | | | | | Retrieve detailed query statistics using the query below: | | SELECT log FROM sys.heatwave_advisor_report WHERE stage = "QUERY_INSIGHTS" AND type = "info"; | | | +--------------------------------------------------------------------------------------------------------------------+ 26 rows in set (0.0074 sec) Query OK, 0 rows affected (0.0074 sec)
Auto Change Propagation
It refers to the data modified on the InnoDB side, which intelligently determines when to propagate to the storage layer of HeatWave.
Auto Scheduling
HeatWave uses a workload-aware, priority-based, automated scheduling mechanism to schedule concurrently issued queries for execution. The scheduling mechanism prioritizes short-running queries but considers wait time in the queue so that costlier queries are eventually scheduled for execution. This scheduling approach reduces query execution wait times overall.
When there are multiple queries, decide which to execute first.
Failure Handling
Auto Error Recovery
Provisions new HeatWave nodes and reloads data from the HeatWave storage layer if one or more HeatWave nodes becomes unresponsive due to a software or hardware failure.
At this point, the benefits of object storage persistence are reflected.
See details here.
To view SQL status:
MySQL 10.0.1.28:3306 ssl tpch SQL > SHOW STATUS LIKE 'rapid_service_status'; +----------------------+--------+ | Variable_name | Value | +----------------------+--------+ | rapid_service_status | ONLINE | +----------------------+--------+ 1 row in set (0.0015 sec) MySQL 10.0.1.28:3306 ssl tpch SQL > USE performance_schema; Default schema set to `performance_schema`. Fetching table and column names from `performance_schema` for auto-completion... Press ^C to stop. MySQL 10.0.1.28:3306 ssl performance_schema SQL > SELECT NAME, LOAD_STATUS FROM rpd_tables,rpd_table_id WHERE rpd_tables.ID = rpd_table_id.ID; +---------------+---------------------+ | NAME | LOAD_STATUS | +---------------+---------------------+ | tpch.supplier | AVAIL_RPDGSTABSTATE | | tpch.partsupp | AVAIL_RPDGSTABSTATE | | tpch.orders | AVAIL_RPDGSTABSTATE | | tpch.lineitem | AVAIL_RPDGSTABSTATE | | tpch.nation | AVAIL_RPDGSTABSTATE | | tpch.customer | AVAIL_RPDGSTABSTATE | | tpch.region | AVAIL_RPDGSTABSTATE | | tpch.part | AVAIL_RPDGSTABSTATE | +---------------+---------------------+ 8 rows in set (0.0007 sec)
Finally, some useful metadata query SQL are shown in here.