In the process of large-scale data processing, sometimes it is necessary to take out a part of data by groups as a reference for the whole data.
For example, there is a table [1], which can be divided into 200 groups by column [1]. Now we want to take 10 records from each group of these 200 groups as a representative.
We can operate MySQL through the following Python statements to achieve this function:
import mysql.connector conn = mysql.connector.connect(host = '127.0.0.1', port = 3306, user = 'root', password = 'xxx', db = 'xxx' ) cursor = conn.cursor() for i in range(0,200): #Note that i is indexed from 0 at limit i,j sql = "INSERT INTO Sample_table_1 SELECT t.column_1, t.column_2 FROM table_1 t, (SELECT * FROM table_1 group by column1 limit %s,1) p where t.column1=p.column1 limit 10;"%i cursor.execute(sql) conn.commit() conn.close()
Of course, because this is an insert operation, we can also use excuteany to implement:
import mysql.connector conn = mysql.connector.connect(host = '127.0.0.1', port = 3306, user = 'root', password = 'xxx', db = 'xxx' ) cursor = conn.cursor() cursor.executemany("INSERT INTO Sample_table_1 SELECT t.column_1, t.column_2 FROM table_1 t, (SELECT * FROM table_1 group by column1 limit %s,1) p where t.column1=p.column1 limit 10;", [(i,) for i in range(0,200)] ) #Note that "," in (i,) must be reserved, otherwise an error will be reported conn.commit() conn.close()
Alternatively, you can use MySQL stored procedures.
import mysql.connector conn = mysql.connector.connect(host = '127.0.0.1', port = 3306, user = 'root', password = 'xxx', db = 'xxx' ) cursor = conn.cursor() sql1="drop procedure if exists test;" sql2="create procedure test() \ begin\ declare i bigint default 0; \ DECLARE var BIGINT DEFAULT 0; \ while i < 200 do \ INSERT INTO Sample_table_1 SELECT t.column_1, t.column_2 FROM table_1 t, (SELECT * FROM table_1 group by column1 limit i,1) p where t.column1=p.column1 limit 10;\ set i = i + 1;\ end while;\ end;" sql3="call test;" cursor.execute(sql1) cursor.execute(sql2) cursor.execute(sql3) conn.commit() conn.close()
In terms of execution efficiency, excuteany is slightly higher than excute, and the efficiency of stored procedures is the lowest.