One article a day of mysql 15: perfect de duplication in mysql

Keywords: SQL

There are a lot of garbage data in the table. Now delete and keep one of them (whether to judge repeatedly is based on multiple fields)

Scenario 1:

DELETE
FROM
 vitae a
WHERE
 (a.peopleId, a.seq) IN (
  SELECT
   peopleId,
   seq
  FROM
   vitae
  GROUP BY
   peopleId,
   seq
  HAVING
   count(*) > 1
 )
AND rowid NOT IN (
 SELECT
  min(rowid)
 FROM
  vitae
 GROUP BY
  peopleId,
  seq
 HAVING
  count(*) > 1
)

No choice but to report mistakes

Solve

DELETE
FROM
 vitae a
WHERE
 (a.peopleId, a.seq) IN (
  SELECT t.* FROM (SELECT
   peopleId,
   seq
  FROM
   vitae
  GROUP BY
   peopleId,
   seq
  HAVING
   count(*) > 1) t
 )
AND rowid NOT IN (
 SELECT t.* FROM (SELECT
  min(rowid)
 FROM
  vitae
 GROUP BY
  peopleId,
  seq
 HAVING
  count(*) > 1) t
)

Scheme 2: perfect SQL

DELETE consum_record
FROM
 consum_record, 
 (
  SELECT
   min(id) id,
   user_id,
   monetary,
   consume_time
  FROM
   consum_record
  GROUP BY
   user_id,
   monetary,
   consume_time
  HAVING
   count(*) > 1
 ) t2
WHERE
 consum_record.user_id = t2.user_id 
 and consum_record.monetary = t2.monetary
 and consum_record.consume_time = t2.consume_time
AND consum_record.id > t2.id;



=======================================================


DELETE users  FROM users , (SELECT user_id FROM users 
    GROUP BY name, age 
    having COUNT(*) > 1) AS t1
WHERE users .user_id = t1.qzkh_id

Take a closer look at the above sql statement. It's not difficult to figure out the thinking. It can be roughly understood in three steps:

(select min (ID) id, user ID, monetary, consumption time from consumption [record group by user ID, monetary, consumption time having count (*) > 1) t2) queries the duplicate records to form a set (temporary table t2), in which is the smallest ID of each duplicate record

Consumption_record. User_id = T2. User_id and consumption_record. Monetary = T2. Monetary and consumption_record. Consumption_time = T2. Consumption_time

Posted by svguerin3 on Thu, 02 Jan 2020 02:58:27 -0800