[TOC]
MySQL: Building a Robust Database Architecture — Notes
Database Monitoring
Content to monitor
- Monitor the availability of a database: instead of just monitoring the existence of a database process, connect to the database over the network and determine if it is available
- Monitor database performance:
    - QPS
    - TPS
    - Number of concurrent threads
    - InnoDB blocking and deadlocks
- Monitor master-slave replication:
    - Master-slave link status
    - Master-slave delay
    - Master-slave data consistency
- Monitor server resources:
    - Disk: a large disk does not by itself mean MySQL can use the space, because the partition MySQL writes to may not have been allocated enough of it
    - CPU usage
    - Memory usage
    - Swap partition usage
    - Network I/O usage
Database Availability Monitoring
Verify that the database is available for normal connections over the network
Note that being able to connect to MySQL locally (for example through the local socket) does not mean MySQL is also reachable externally over TCP/IP, because the external environment is more complex.
For example, the TCP connection backlog may be full, so the check must make real connection requests from a remote server:
- Use mysqladmin:
```
# If the database is alive, the command returns "mysqld is alive"
$ mysqladmin -u root -p -h localhost ping
Enter password:
mysqld is alive
```
- Use telnet (usually done manually)
```
# A telnet connection is considered successful as long as no connection-failure message appears;
# the server also replies with its handshake packet (mostly unreadable binary)
$ telnet localhost 3306
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
N
5.6.47-log ... mysql_native_password   (binary handshake data)
```
- Establish a database connection over the network from an application (recommended)
For example, with PHP we can use PDO: connect to MySQL and then run something like `select 1`.
```php
<?php
$pdo = new PDO('mysql:host=localhost;dbname=test', 'root', ''); // open the connection
$result = $pdo->query('select 1')->fetchAll();                  // run the check query
//$result = $pdo->query('select @@version;')->fetchAll();       // or something like this
$pdo = null;                                                    // release the connection
var_dump($result);
/**
[Running] php "/Users/liuhao/Desktop/tmp/pdo.php"
array(1) {
  [0]=>
  array(2) {
    [1]=>
    string(1) "1"
    [2]=>
    string(1) "1"
  }
}
[Done] exited with code=0 in 0.103 seconds
*/
```
- Verify that MySQL can read and write
Because the read_only parameter may have been switched on by mistake (misconfigured as on instead of off), in which case the database can only serve reads.
For a standalone server with no master-slave replication, read-write verification can usually be skipped.
To check this, we can create a simple monitoring table in MySQL with just two fields, an ID and a timestamp, and have the monitoring program write to and read from that table to verify that the database can both read and write.
Not shown in full here; just adapt the PDO code above — a sketch follows below.
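A minimal PHP sketch of this read-write check, assuming a hypothetical monitoring table `test.monitor_rw` (auto-increment `id`, `check_time` datetime) and a hypothetical non-SUPER account `monitor` (a SUPER account is not blocked by read_only); host, credentials and names are placeholders:
```php
<?php
// Read-write check: write one row into a dedicated monitoring table, then read it back.
// Assumes a hypothetical table test.monitor_rw(id INT AUTO_INCREMENT PRIMARY KEY, check_time DATETIME).
$pdo = new PDO('mysql:host=localhost;dbname=test', 'monitor', 'monitor_password');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

try {
    // Write: fails if read_only is on for this non-SUPER account
    $pdo->exec("INSERT INTO monitor_rw (check_time) VALUES (NOW())");
    $id = $pdo->lastInsertId();

    // Read: fetch the row we just wrote
    $row = $pdo->query("SELECT id, check_time FROM monitor_rw WHERE id = " . (int)$id)
               ->fetch(PDO::FETCH_ASSOC);
    echo $row ? "read-write OK\n" : "write OK but read failed\n";

    // Keep the monitoring table small
    $pdo->exec("DELETE FROM monitor_rw WHERE id < " . (int)$id);
} catch (PDOException $e) {
    echo "read-write check failed: " . $e->getMessage() . "\n";
}
$pdo = null; // release the connection
```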
Monitor the number of connections to the database
Many situations can exhaust MySQL's connections, such as blocking or cache penetration, and they cause the number of connections to spike.
So the number of connections must be watched closely.
- Get the maximum number of MySQL connections
show variables like 'max_connections';
- Get the current number of MySQL connections
show global status like 'Threads_connected';
- Calculate alarm value
```
# Alert when the ratio of current connections to maximum connections exceeds the threshold.
# 0.8 is a common cut-off; in my experience it is better to alert above 0.5,
# because by the time the ratio passes 0.6 the connections are often effectively exhausted.
Threads_connected / max_connections > 0.8
```
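A minimal PHP sketch of this connection-usage check; the host and credentials are placeholders, and the 0.5 / 0.8 thresholds simply follow the note above:
```php
<?php
// Connection usage check: compare Threads_connected against max_connections.
$pdo = new PDO('mysql:host=localhost;dbname=test', 'root', '');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$max = (int)$pdo->query("SHOW VARIABLES LIKE 'max_connections'")->fetch(PDO::FETCH_ASSOC)['Value'];
$cur = (int)$pdo->query("SHOW GLOBAL STATUS LIKE 'Threads_connected'")->fetch(PDO::FETCH_ASSOC)['Value'];

$ratio = $max > 0 ? $cur / $max : 1.0;
printf("Threads_connected=%d max_connections=%d ratio=%.2f\n", $cur, $max, $ratio);

if ($ratio > 0.8) {
    echo "CRITICAL: connection usage above 80%\n";
} elseif ($ratio > 0.5) {
    echo "WARNING: connection usage above 50%\n";
}
$pdo = null;
```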
Database Performance Monitoring
Performance monitoring serves a different purpose from availability monitoring: it is mainly about understanding performance trends over time.
Compared with availability monitoring, performance monitoring is less urgent to act on.
Status variables collected for performance monitoring
The counters below — Queries, Com_insert, Com_update, Com_delete and Uptime_since_flush_status — can all be obtained with SHOW GLOBAL STATUS LIKE 'xxx'. Uptime_since_flush_status is the number of seconds since the last FLUSH STATUS; FLUSH STATUS is mainly used to reset counters such as Max_used_connections (the maximum number of connections used simultaneously since the server started).
- Calculate QPS (queries per second)
QPS = (Queries2 - Queries1) / (Uptime_since_flush_status2 - Uptime_since_flush_status1)
MySQL's QPS does not count only SELECT statements; it covers all SQL statements. We therefore take two samples of Queries and compute QPS from the difference between them.
- Calculate TPS (transactions per second)
TPS = ((Com_insert2 + Com_update2 + Com_delete2) - (Com_insert1 + Com_update1 + Com_delete1)) / (Uptime_since_flush_status2 - Uptime_since_flush_status1)
This is also why I don't use the calculation often seen online that relies on Com_commit: applications that run with autocommit and never issue explicit COMMIT statements barely increment that counter, so it is unreliable.
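A minimal PHP sketch of the QPS/TPS sampling described by the formulas above; host and credentials are placeholders, and the 60-second sampling interval is an arbitrary choice:
```php
<?php
// QPS/TPS sketch: sample the counters twice and divide the difference by the interval.
function sampleStatus(PDO $pdo, array $names): array {
    $sample = [];
    foreach ($names as $name) {
        $row = $pdo->query("SHOW GLOBAL STATUS LIKE '$name'")->fetch(PDO::FETCH_ASSOC);
        $sample[$name] = (int)$row['Value'];
    }
    return $sample;
}

$pdo = new PDO('mysql:host=localhost;dbname=test', 'root', '');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$names = ['Queries', 'Com_insert', 'Com_update', 'Com_delete', 'Uptime_since_flush_status'];

$s1 = sampleStatus($pdo, $names);
sleep(60);                       // sampling interval
$s2 = sampleStatus($pdo, $names);

$interval = $s2['Uptime_since_flush_status'] - $s1['Uptime_since_flush_status'];
if ($interval > 0) {
    $qps = ($s2['Queries'] - $s1['Queries']) / $interval;
    $tps = (($s2['Com_insert'] + $s2['Com_update'] + $s2['Com_delete'])
          - ($s1['Com_insert'] + $s1['Com_update'] + $s1['Com_delete'])) / $interval;
    printf("QPS=%.2f TPS=%.2f\n", $qps, $tps);
}
$pdo = null;
```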
Monitor database concurrency
Database performance usually degrades as the number of concurrently processed requests increases, and CPU usage rises with it, so both should be watched together.
- Database concurrency (the number of threads currently running in the database)
show global status like 'Threads_running';
The number of concurrently running threads is usually much smaller than the number of threads connected to the database at the same time.
Note that heavy blocking, for example, can make concurrency surge, and such a surge can fill up the available connections almost instantly.
Monitor InnoDB blocking
MyISAM uses table-level locks and is not considered here; MyISAM blocking is instead found by monitoring the slow query log and optimizing the offending table queries.
Here's how to view it:
Note that querying these information_schema tables requires sufficient privileges (e.g. root). This approach only works for InnoDB, not for MyISAM.
The blocking thread's ID can always be captured, but the blocking thread's SQL may not be, because that thread may have executed several statements; by the time we run the query, the statement that acquired the lock may already have finished, so it can no longer be captured.
The threshold of > 10 seconds below can be adjusted to your business needs; personally I find 30 seconds already quite long, so 30 or 60 seconds is a reasonable choice.
```sql
SELECT b.trx_mysql_thread_id AS 'Blocked Threads',
       b.trx_query           AS 'Blocked SQL',
       c.trx_mysql_thread_id AS 'Blocking Threads',
       c.trx_query           AS 'block SQL',
       (unix_timestamp() - unix_timestamp(c.trx_started)) AS 'Blocking time'
FROM `information_schema`.INNODB_LOCK_WAITS AS a
JOIN `information_schema`.INNODB_TRX AS b ON a.requesting_trx_id = b.trx_id
JOIN `information_schema`.INNODB_TRX AS c ON a.blocking_trx_id = c.trx_id
WHERE (unix_timestamp() - unix_timestamp(c.trx_started)) > 10;
```
Verify that the blocking-detection query is accurate
Required SQL
```sql
# View the ID of the current session
select connection_id();
# Set the time (in seconds) an InnoDB transaction waits for a row lock before giving up
set global innodb_lock_wait_timeout=120;
# Select the database
use test;
# Open a transaction
begin;
# Lock the rows
select * from user for update;
# Roll back
rollback;
```
- Execute lock in session1 window
```
mysql> select connection_id();
+-----------------+
| connection_id() |
+-----------------+
|             191 |
+-----------------+
1 row in set (0.00 sec)

mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> set global innodb_lock_wait_timeout=120;
Query OK, 0 rows affected (0.00 sec)

mysql> begin;
Query OK, 0 rows affected (0.01 sec)

mysql> select * from user for update;
+----+--------+-----+
| id | name   | age |
+----+--------+-----+
|  1 | liuhao |   0 |
|  2 |        |   0 |
+----+--------+-----+
2 rows in set (6.09 sec)
```
- Lock in session2; session2 should now be blocked by session1
```
mysql> select connection_id();
+-----------------+
| connection_id() |
+-----------------+
|             192 |
+-----------------+
1 row in set (0.00 sec)

mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from user for update;
# The current window is blocked at this point
```
- Query blocking in session3
```
mysql> SELECT b.trx_mysql_thread_id AS 'Blocked Threads',
    ->        b.trx_query AS 'Blocked SQL',
    ->        c.trx_mysql_thread_id AS 'Blocking Threads',
    ->        c.trx_query AS 'block SQL',
    ->        (unix_timestamp() - unix_timestamp(c.trx_started)) AS 'Blocking time'
    -> FROM `information_schema`.INNODB_LOCK_WAITS AS a
    -> join `information_schema`.INNODB_TRX AS b ON a.requesting_trx_id = b.trx_id
    -> join `information_schema`.INNODB_TRX AS c ON a.blocking_trx_id = c.trx_id
    -> WHERE (unix_timestamp() - unix_timestamp(c.trx_started)) > 10;
+-----------------+-------------------------------+------------------+-----------+---------------+
| Blocked Threads | Blocked SQL                   | Blocking Threads | block SQL | Blocking time |
+-----------------+-------------------------------+------------------+-----------+---------------+
|             192 | select * from user for update |              191 | NULL      |           127 |
+-----------------+-------------------------------+------------------+-----------+---------------+
1 row in set (0.00 sec)
```
You can see that the blocking is captured correctly, which matches the experiment we set up.
To emphasize again: the blocking thread's ID is captured reliably, but the blocking thread's SQL may not be (note the NULL in the 'block SQL' column above), because that thread may have executed several statements; by the time we run the query the statement that acquired the lock has already finished, so it can no longer be captured.
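To turn this into monitoring rather than a manual check, something like the following PHP sketch could run the blocking query on a schedule; the host, credentials and the 10-second threshold are placeholders taken from this section:
```php
<?php
// Blocking check sketch: run the InnoDB blocking query above and print any long waits.
$sql = "
SELECT b.trx_mysql_thread_id AS blocked_thread,
       b.trx_query           AS blocked_sql,
       c.trx_mysql_thread_id AS blocking_thread,
       c.trx_query           AS blocking_sql,
       (unix_timestamp() - unix_timestamp(c.trx_started)) AS blocking_time
FROM information_schema.INNODB_LOCK_WAITS AS a
JOIN information_schema.INNODB_TRX AS b ON a.requesting_trx_id = b.trx_id
JOIN information_schema.INNODB_TRX AS c ON a.blocking_trx_id = c.trx_id
WHERE (unix_timestamp() - unix_timestamp(c.trx_started)) > 10";

$pdo = new PDO('mysql:host=localhost;dbname=information_schema', 'root', '');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

foreach ($pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC) as $row) {
    // blocking_sql may be NULL if the blocking statement already finished (see the note above)
    printf("thread %d blocked by thread %d for %d seconds (blocking SQL: %s)\n",
           $row['blocked_thread'], $row['blocking_thread'],
           $row['blocking_time'], $row['blocking_sql'] ?? 'N/A');
}
$pdo = null;
```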
MySQL Master-Slave Replication Monitoring
Monitor master-slave replication links
Check the status of the replication link on the slave
```sql
# View slave status
show slave status;
# If either of the following two fields is not "Yes", the replication link is considered broken
Slave_IO_Running
Slave_SQL_Running
```
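A minimal PHP sketch of this link check, run against the slave; 'slave-host' and the credentials are placeholders:
```php
<?php
// Replication link check: run SHOW SLAVE STATUS on the slave and verify both threads are running.
$pdo = new PDO('mysql:host=slave-host;dbname=test', 'root', '');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$status = $pdo->query('SHOW SLAVE STATUS')->fetch(PDO::FETCH_ASSOC);
if ($status === false) {
    echo "not a slave (SHOW SLAVE STATUS returned no rows)\n";
} elseif ($status['Slave_IO_Running'] !== 'Yes' || $status['Slave_SQL_Running'] !== 'Yes') {
    echo "ALERT: replication link is down "
       . "(IO: {$status['Slave_IO_Running']}, SQL: {$status['Slave_SQL_Running']})\n";
} else {
    echo "replication link OK\n";
}
$pdo = null;
```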
Monitor delays in master-slave replication
- A simple way to get the master-slave delay:
This only works when the network between the servers is completely stable.
It is not fully reliable, because the value is derived from the time difference between the timestamp of the binlog event on the master and the moment the slave replays that event.
When there is a problem with the network, it becomes inaccurate.
```sql
# Run on the slave to check its replication status relative to the master
show slave status;
# Judge by the field below; the value is in seconds
Seconds_Behind_Master
```
- A more reliable way to get the master-slave delay:
This requires a multi-threaded (or multi-process) monitoring program that reads the binlog file name and binlog offset on both the master and the slave, as the sketch after this list shows.
- Master Library Monitoring
```sql
# View master status
show master status;
# Judge by the following fields
File: mysql-bin.000001    # binlog file name
Position: 300000          # binlog offset
```
- Slave library monitoring
View the name and offset of the master's binlog that is currently being transferred to the slave
```sql
# View slave status
show slave status;
Master_Log_File: mysql-bin.000001    # binlog file name on the master
Read_Master_Log_Pos: 300000          # offset read from the master's binlog
```
- Name and offset of the master's binlog events that have already been executed on the slave
```sql
Exec_Master_Log_Pos: 300000    # offset of the last master binlog event executed on the slave
Relay_Log_Space: 300001        # combined size of all relay log files
```
Now compare the three results above: if the binlog file names are the same and the offsets are the same, there is no replication delay between the master and the slave.
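A minimal PHP sketch combining both approaches: it reads Seconds_Behind_Master for the quick check, then compares the master's binlog file/position with what the slave has read and executed; 'master-host', 'slave-host' and the credentials are placeholders:
```php
<?php
// Master-slave delay sketch: quick check via Seconds_Behind_Master,
// then a position comparison between master and slave.
$master = new PDO('mysql:host=master-host;dbname=test', 'root', '');
$slave  = new PDO('mysql:host=slave-host;dbname=test',  'root', '');
$master->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$slave->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$m = $master->query('SHOW MASTER STATUS')->fetch(PDO::FETCH_ASSOC);
$s = $slave->query('SHOW SLAVE STATUS')->fetch(PDO::FETCH_ASSOC);
if ($m === false || $s === false) {
    exit("could not read master or slave status\n");
}

// Simple check: seconds reported by the slave (can be inaccurate under network problems)
echo "Seconds_Behind_Master: " . ($s['Seconds_Behind_Master'] ?? 'NULL') . "\n";

// More reliable check: compare binlog file names and offsets
$inSync = $m['File'] === $s['Master_Log_File']
       && (int)$m['Position'] === (int)$s['Read_Master_Log_Pos']
       && (int)$m['Position'] === (int)$s['Exec_Master_Log_Pos'];

echo $inSync ? "no replication delay\n" : "slave is behind the master\n";
$master = null;
$slave  = null;
```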
Master-Slave Data Consistency Check
Note: I haven't verified this myself in a test; these are just study notes.
If a master-slave replication problem occurs, we need to check master-slave data consistency after repairing it.
Use the pt-table-checksum tool for automatic checking
```bash
# Run on the master only; it automatically discovers all slaves of the master
# and checks the specified data on every slave
pt-table-checksum u=<username>,p='<password>' \
    --databases <database name> \
    --replicate test.checksums   # create the checksums table in the test database and write the results to it
```
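Once the check has run, the results sit in test.checksums on every slave. Below is a hedged PHP sketch that runs the difference query (adapted from the pt-table-checksum documentation) against one slave; 'slave-host' and the credentials are placeholders:
```php
<?php
// After pt-table-checksum has run on the master, query test.checksums on each slave
// to list tables whose checksums differ from the master.
$pdo = new PDO('mysql:host=slave-host;dbname=test', 'root', '');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$sql = "
SELECT db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks
FROM test.checksums
WHERE (master_cnt <> this_cnt
       OR master_crc <> this_crc
       OR ISNULL(master_crc) <> ISNULL(this_crc))
GROUP BY db, tbl";

$diffs = $pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC);
if (!$diffs) {
    echo "master and slave data are consistent\n";
} else {
    foreach ($diffs as $d) {
        echo "inconsistent table: {$d['db']}.{$d['tbl']} "
           . "({$d['chunks']} chunks differ, {$d['total_rows']} rows checked)\n";
    }
}
$pdo = null;
```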