Magic SQL, Join Details: The MySQL JOIN Execution Process

Keywords: MySQL SQL Android iOS

A moment of fun

Me: Hey, boss lady, do you have iced black tea?
Boss lady: Yes.
Me: How much is a bottle?
Boss lady: 3 yuan.
Me: Give me a bottle. Here, 3 yuan.
Boss lady: Here, your iced black tea.
Me: Are you kidding? I asked for iced black tea. Why are you handing me a bottle cap?
Boss lady: That cap says "one more bottle". We're sold out, so go redeem it at the next shop.

Question Background

Have you ever thought about how a MySQL JOIN is actually executed, or doubted your own understanding of it (oh, that confident self)? If you don't know how to check, try answering the following questions.

Selection of the driving table

How does MySQL choose the driving table? Simply in left-to-right order?

The order in which multiple tables are joined

Suppose we have three tables: A, B, C, and the following SQL

-- pseudo SQL, not directly executable
A LEFT JOIN B ON B.aId = A.id
LEFT JOIN C ON C.aId = A.id
WHERE A.name = '666' AND B.state = 1 AND C.create_time > '2019-11-22 12:12:30'

Is the result of joining A and B processed first and then joined with C? Or are A, B, and C all joined first and then filtered? Or are both wrong, and it is handled some other way?

When ON and WHERE take effect

The author happened to come across a blog post that contains the following description

Figure 1, taken from Mysql - JOIN Details

After reading this, the author felt he had discovered a new continent: so this is how a JOIN is executed! (It did not overturn any previous understanding, because the author had never thought about the question before; it just felt good to pick up a new skill.) But the more he thought about it afterwards, the more it felt wrong, as if he had learned the skill incorrectly!

If each table held tens of millions of rows, the Cartesian product of two such tables would be unthinkably large! In other words, the order shown in Figure 1 is open to question, and so is the point at which ON and WHERE take effect.

If you already know all of this inside out, move along and don't get in the way of my showing off; if you are not so sure, sit back and let me start showing off.

Preparation

Before we officially begin, let me set out some peanuts, melon seeds, and beer for everyone. Showing off needs the right atmosphere, don't you think? (Author, you liar, where are the goods?)

Driving table

The driving table is the first table processed in a multi-table join query, also called the base table; its records are then used to look up the other tables. The driving table is chosen on one principle: without affecting the final result set, prefer the table with the smallest result set. This principle is not easy to apply. "The smallest result set" is something we may be able to estimate, but "without affecting the final result set" is hard to judge and hard to generalize. Still, there are some rules of thumb:

LEFT JOIN generally uses the left table as the driving table (RIGHT JOIN generally uses the right table). INNER JOIN generally uses the table with the smaller result set as the driving table. If in doubt, use EXPLAIN to find the driving table: the table in the first row of its output is the driving table.
Do you think EXPLAIN is always right? An execution plan can change when the query is actually executed! Still, these rules apply in most cases, EXPLAIN especially.

In some cases the query optimizer rewrites a LEFT JOIN as an INNER JOIN. "Result set" here means the records left in a table after filtering, not all records in the table; if there is no filter condition, it is all records in the table.

For more information, see: Execution details of Mysql multi-table join query (1)
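The "smallest result set" rule for INNER JOIN can be sketched in a few lines of Python. This is only a toy illustration of the rule of thumb above, not MySQL's actual cost model: the helper names `filtered_size` and `pick_driving_table` are made up for this example, and the rows are synthetic, sized like the 13-row and 16-row tables used later in this article.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]


def filtered_size(rows: List[Row], where: Optional[Callable[[Row], bool]] = None) -> int:
    """Size of a table's result set: rows left after its filter, or all rows if unfiltered."""
    if where is None:
        return len(rows)
    return sum(1 for row in rows if where(row))


def pick_driving_table(
    tables: Dict[str, List[Row]],
    filters: Dict[str, Callable[[Row], bool]],
) -> str:
    """Hypothetical helper: return the table whose filtered result set is smallest."""
    return min(tables, key=lambda name: filtered_size(tables[name], filters.get(name)))


# Synthetic rows (an assumption for illustration): 13 users and 16 login-log rows.
users = [{"user_name": f"user{i}"} for i in range(13)]
logs = [{"user_name": f"user{i % 13}", "client": 1 + i % 4} for i in range(16)]
tables = {"tbl_user": users, "tbl_user_login_log": logs}

# Without any filter the smaller table wins; a selective filter on the log table flips the choice.
no_filter = pick_driving_table(tables, {})
with_filter = pick_driving_table(tables, {"tbl_user_login_log": lambda r: r["client"] == 3})
print(no_filter)
print(with_filter)
```

The point of the sketch is that the comparison is made on filtered row counts, so a table with more rows overall can still become the driving table once a selective condition applies to it.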

Flowchart of SQL execution

What does MySQL do when we send it a request?

Figure: SQL execution path, from High Performance MySQL

You can see that the execution plan is the output of the query optimizer, and that the execution engine queries the data according to the execution plan.

Data preparation

MySQL 5.7.1, InnoDB engine; the table-creation and data-initialization SQL is as follows

-- Table creation and data initialization
DROP TABLE IF EXISTS tbl_user;
CREATE TABLE tbl_user (
  id INT(11) UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'Auto-increment primary key',
  user_name VARCHAR(50) NOT NULL COMMENT 'User name',
  sex TINYINT(1) NOT NULL COMMENT 'Gender, 1:male, 0:female',
  create_time DATETIME NOT NULL COMMENT 'Creation time',
  update_time DATETIME NOT NULL COMMENT 'Update time',
  remark VARCHAR(512) NOT NULL DEFAULT '' COMMENT 'Remarks',
  PRIMARY KEY (id)
) COMMENT='User table';

DROP TABLE IF EXISTS tbl_user_login_log;
CREATE TABLE tbl_user_login_log (
  id INT(11) UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'Auto-increment primary key',
  user_name VARCHAR(50) NOT NULL COMMENT 'User name',
  ip VARCHAR(15) NOT NULL COMMENT 'Login IP',
  client TINYINT(1) NOT NULL COMMENT 'Login client, 1:android, 2:ios, 3:PC, 4:H5',
  create_time DATETIME NOT NULL COMMENT 'Creation time',
  PRIMARY KEY (id)
) COMMENT='Login log';
INSERT INTO tbl_user(user_name,sex,create_time,update_time,remark) VALUES
('Hetianxiang',1,NOW(), NOW(),'Eyebrows, a table of people'),
('Xue Chen Xiang',0,NOW(), NOW(),'Xue Yanhong, the daughter of the main building owner of Tianxing Building, is also the minor chief building owner of Tianxing Building. She is full of body and elegant with dark hair. She refers to the first beautiful woman in Wulin who has spring onions, sleeves and sleeves like Jade with elegant, noble and elegant appearance and is known as "Tianxing Yuxiang".'),
('Mu Ronglanjuan',0,NOW(), NOW(),'The only daughter of Murong, a long-lasting family from the four northwest and southeastern families in Wulin, was born exquisitely and beautifully, but had a fierce temper and a passion for fire. Therefore, she was nicknamed "Fire Phoenix", the second most beautiful woman in Wulin except Xue Chen-xiang in Tianxing Building.'),
('Aiting',0,NOW(), NOW(),'Today''s emperor''s favorite niece, the principal of the northern palace, has slender waists and legs, is full of Roche, her eyebrows are painted in ink, and her lips are cherry red; although there is no deep fragrance and elegance, the orchid Juan is warm, but it shows a kind of empty space.'),
('Willow in marriage',0,NOW(), NOW(),'One of the four most melancholy Fairies in Wulin, Dong Wanwanwan''s disciple, with a slim physique and beautiful appearance, is really autumn water as bone, Hibiscus as waist, eyebrow as ink painting, lips as cherry, half as weak as West Zi, better than Yuhuan; Shake the red building, listen to the rain Xuan, Pipa is worth a thousand gold!'),
('Li Ningxue',0,NOW(), NOW(),'Li Xiangguo''s daughter is radiant, beautiful and clearly loved and hated.'),
('Week''s Dreams',0,NOW(), NOW(),'Sound God passed on, owner of the Xiangfei bamboo piano, cloud bun high plate, dressed in a black cicada wing yarn, the more I feel ice-mysterious bone, flour cherry lips, especially charming'),
('Leaf traces',0,NOW(), NOW(),'Saint Domain Virgin, white as snow, floating in white clothes, like a fairy, with an unspeakable soft beauty in her smile'),
('Guo Shuying',0,NOW(), NOW(),'The disciples of the Right Apostle with gray hair and fine eyebrows, smooth and delicate skin'),
('Zhong Juntian',0,NOW(), NOW(),'Heaven, Xuan Tian Nine - The Minister of Jun Heaven, extraordinary, fabulous'),
('Wang Yanyun',0,NOW(), NOW(),'Miss Dust Margin Villa 2, tough and headless'),
('Xu Shi Cream',0,NOW(), NOW(),'Yao Wang Gu Gu''s master daughter is skilled in medicine'),
('Feng Yunning',0,NOW(), NOW(),'The Porter of peach blossom is charming and charming');
INSERT INTO tbl_user_login_log(user_name, ip, client, create_time) VALUES
('Xue Chen Xiang', '10.53.56.78',2, '2019-10-12 12:23:45'),
('Aiting', '10.53.56.78',2, '2019-10-12 22:23:45'),
('Mu Ronglanjuan', '10.53.56.12',1, '2018-08-12 22:23:45'),
('Hetianxiang', '10.53.56.12',1, '2019-10-19 10:23:45'),
('Willow in marriage', '198.11.132.198',2, '2018-05-12 22:23:45'),
('Feng Yunning', '198.11.132.198',2, '2018-11-11 22:23:45'),
('Week''s Dreams', '198.11.132.198',2, '2019-06-18 22:23:45'),
('Guo Shuying', '220.181.38.148',3, '2019-10-21 09:45:56'),
('Xue Chen Xiang', '220.181.38.148',3, '2019-10-26 22:23:45'),
('Aiting', '104.69.160.60',4, '2019-10-12 10:23:45'),
('Wang Yanyun', '104.69.160.61',4, '2019-10-16 20:23:45'),
('Li Ningxue', '104.69.160.62',4, '2019-10-17 20:23:45'),
('Xu Shi Cream', '104.69.160.63',4, '2019-10-18 20:23:45'),
('Leaf traces', '104.69.160.64',4, '2019-10-19 20:23:45'),
('Wang Yanyun', '104.69.160.65',4, '2019-10-20 20:23:45'),
('Leaf traces', '104.69.160.66',4, '2019-10-21 20:23:45');

SELECT * FROM tbl_user;
SELECT * FROM tbl_user_login_log;

Single-table query

The process of a single-table query is easy to understand, roughly as follows

We won't go into detail on single-table queries. They mainly involve three things: the clustered index, the covering index, and the back-to-table lookup (fetching the full row from the clustered index after a secondary-index match). Once you know these three, the figure above is easy to understand (if you don't, go look them up right away; getting caught out would be embarrassing!).

Join algorithms

MySQL's join algorithms are based on the nested-loop algorithm and a series of algorithms derived from it; which one is used depends on the conditions:

When the join uses an index, there are two algorithms: Index Nested-Loop join and Batched Key Access join.
When the join does not use an index, there are two algorithms: Simple Nested-Loop join and Block Nested-Loop join.

  Simple Nested-Loop

Simple nested loop, SNL for short; rows are matched one by one, like this

for each row in t1 matching range {
  for each row in t2 matching reference key {
    for each row in t3 {
      if row satisfies join conditions, send to client
    }
  }
}

This algorithm is simple and crude, and it performs terribly. Its time complexity is n to the power of m, where n is the number of records in each table and m is the number of tables, so MySQL's optimizer avoids this algorithm for join queries, even when there is no WHERE condition and no index on the ON join key.
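To make the cost concrete, here is a minimal Python sketch of a simple nested loop over three tables. It is a simulation only: each "table" is a plain list of join-key values, and the function name `simple_nested_loop_join` is chosen just for this example.

```python
from typing import Callable, List, Tuple


def simple_nested_loop_join(
    t1: List[int],
    t2: List[int],
    t3: List[int],
    condition: Callable[[int, int, int], bool],
) -> Tuple[List[Tuple[int, int, int]], int]:
    """Join three tables with plain nested loops; return (rows, comparisons)."""
    rows = []
    comparisons = 0
    for r1 in t1:              # outermost loop: driving table
        for r2 in t2:          # full scan of t2 for every row of t1
            for r3 in t3:      # full scan of t3 for every (t1, t2) pair
                comparisons += 1
                if condition(r1, r2, r3):
                    rows.append((r1, r2, r3))
    return rows, comparisons


n = 4
table = list(range(1, n + 1))  # each "table" is just a list of join-key values
rows, comparisons = simple_nested_loop_join(
    table, table, table, lambda a, b, c: a == b == c
)
print(rows)         # matches come out in driving-table order
print(comparisons)  # n ** 3 for three tables of n rows each
```

With three tables of 4 rows each, the join condition is evaluated 4 ** 3 = 64 times to produce only 4 result rows, which is why the cost grows so quickly with table size.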

  Block Nested-Loop

Block nested loop, BNL for short, is an optimization of SNL: rows from the driving table are cached in batches in the join buffer, and each row read by the inner loop is then compared against the whole batch in the join buffer, like this

for each row in t1 matching range {
  for each row in t2 matching reference key {
    store used columns from t1, t2 in join buffer
    if buffer is full {
      for each row in t3 {
        for each t1, t2 combination in join buffer {
          if row satisfies join conditions, send to client
        }
      }
      empty join buffer
    }
  }
}

if buffer is not empty {
  for each row in t3 {
    for each t1, t2 combination in join buffer {
      if row satisfies join conditions, send to client
    }
  }
}

Each row read in the inner loop is compared with all records in the buffer, which reduces the number of times the inner table has to be read. For example, with no join buffer, if the driving table has 30 records and the driven table has 50 records, the inner loop reads 30 * 50 = 1500 rows from the driven table. If a join buffer is available and can hold 10 records, the inner loop reads 30 / 10 * 50 = 150 rows, so reads of the driven table drop by an order of magnitude.
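The arithmetic above can be checked with a small Python simulation. This is a sketch under simplifying assumptions: the join buffer is sized in rows rather than bytes, the tables are lists of join-key values, and `block_nested_loop_join` is a name made up for this example.

```python
from typing import Callable, List, Tuple


def block_nested_loop_join(
    driving: List[int],
    driven: List[int],
    match: Callable[[int, int], bool],
    buffer_size: int,
) -> Tuple[List[Tuple[int, int]], int]:
    """Two-table block nested loop; return (rows, driven_rows_read)."""
    rows = []
    driven_rows_read = 0
    for start in range(0, len(driving), buffer_size):
        # Fill the join buffer with the next batch of driving-table rows.
        join_buffer = driving[start:start + buffer_size]
        # Scan the driven table once per batch instead of once per driving row.
        for inner_row in driven:
            driven_rows_read += 1
            for outer_row in join_buffer:
                if match(outer_row, inner_row):
                    rows.append((outer_row, inner_row))
    return rows, driven_rows_read


driving_table = list(range(30))  # 30 records in the driving table
driven_table = list(range(50))   # 50 records in the driven table


def same_key(a: int, b: int) -> bool:
    return a == b


# A buffer holding a single row behaves like having no join buffer at all.
rows_a, reads_without_buffer = block_nested_loop_join(driving_table, driven_table, same_key, 1)
rows_b, reads_with_buffer = block_nested_loop_join(driving_table, driven_table, same_key, 10)
print(reads_without_buffer, reads_with_buffer)
```

Both calls return the same 30 matching pairs. The number of row comparisons is unchanged; what the buffer saves is repeated reading of the driven table.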

This algorithm is commonly used when the driven table has no index on the join key and no index on the columns in its WHERE filter condition, as shown below.

  Index Nested-Loop

Index nested loop, INL for short, is a join algorithm based on an index on the driven table. Records of the driving table are matched one by one against the index of the driven table, which avoids comparing them with every record of the driven table and so reduces the number of matches against the driven table, as shown in the following figure
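The same idea can be sketched in Python. This is a simplified model, not InnoDB's implementation: a dictionary stands in for the B+ tree index on the driven table's join key, the helper names `build_index` and `index_nested_loop_join` are invented for this example, and the rows are a small subset borrowed from the sample data above.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Dict[str, str]


def build_index(rows: List[Row], key: str) -> Dict[str, List[Row]]:
    """Stand-in for a secondary index: map each key value to its rows."""
    index: Dict[str, List[Row]] = defaultdict(list)
    for row in rows:
        index[row[key]].append(row)
    return index


def index_nested_loop_join(
    driving: List[Row],
    driven_index: Dict[str, List[Row]],
    key: str,
) -> Tuple[List[Tuple[Row, Row]], int]:
    """Inner join via index lookups; return (rows, index_lookups)."""
    rows = []
    lookups = 0
    for outer_row in driving:
        lookups += 1  # one index probe per driving row, no scan of the driven table
        for inner_row in driven_index.get(outer_row[key], []):
            rows.append((outer_row, inner_row))
    return rows, lookups


users = [
    {"user_name": "Aiting"},
    {"user_name": "Wang Yanyun"},
    {"user_name": "Zhong Juntian"},  # has no login record
]
login_logs = [
    {"user_name": "Aiting", "ip": "10.53.56.78"},
    {"user_name": "Aiting", "ip": "104.69.160.60"},
    {"user_name": "Wang Yanyun", "ip": "104.69.160.61"},
    {"user_name": "Wang Yanyun", "ip": "104.69.160.65"},
    {"user_name": "Leaf traces", "ip": "104.69.160.64"},
]

log_index = build_index(login_logs, "user_name")
joined, lookups = index_nested_loop_join(users, log_index, "user_name")
print(len(joined), lookups)
```

Three driving rows cost three index probes, whereas a simple nested loop over the same data would compare 3 * 5 = 15 row pairs.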

Let's look at a real case. First add an index to tbl_user_login_log: ALTER TABLE tbl_user_login_log ADD INDEX idx_user_name (user_name); then look at the join execution plan again

You can see that the index on tbl_user_login_log is being used. Let's look further

Something interesting happened: the driving table became tbl_user_login_log and tbl_user became the driven table. tbl_user_login_log is filtered through the index, and its result set is then matched against tbl_user using the BNL algorithm. This really is MySQL's optimization at work: after filtering through the index, tbl_user_login_log has a smaller result set than tbl_user has records, so tbl_user_login_log is chosen as the driving table. Reasonable, isn't it? Isn't MySQL powerful?

  Batched Key Access

Batched Key Access, BKA for short, is an optimization of the INL algorithm.

BKA's optimization of INL is similar to, but not the same as, BNL's optimization of SNL. For reasons of length, BKA will be explained in the next article. I hope you'll understand! If you really can't accept that, come and hit me!

Summary

1. MySQL has its own set of algorithms for choosing the driving table, which you can study if you are interested; the more reliable way to find out which table it chose is to use EXPLAIN.

2. Tables are not joined by first joining two tables completely and then joining the result with the third. Instead, each record of the driving table is matched through all the joined tables in turn, and the process repeats for the next record of the driving table.

3. MySQL's join algorithms are based on the nested-loop algorithm, with different derived algorithms used in different situations.

4. ON and WHERE will be covered in detail in the next article; in the meantime, think about how they differ and when each takes effect.

References

  Execution details of Mysql multi-table join query (1)

Posted by fansphp on Mon, 09 Dec 2019 08:22:15 -0800