Deleting many rows from a big table MySql 5.5.46

Hi,

The following statement deletes duplicate rows and keeps the highest id

DELETE t1 FROM contacts t1
INNER JOIN contacts t2 
WHERE 
    t1.id < t2.id AND 
    t1.email = t2.email;

This query references the contacts table twice, therefore, it uses the table alias t1 and t2.

But I have a 6 millions rows table and we need to clean it.

My first approach was create a SP with this lines

REPEAT
DELETE t1 FROM contacts t1
INNER JOIN contacts t2 
WHERE 
    t1.id < t2.id AND 
    t1.email = t2.email
ORDER BY t1.id ASC LIMIT 10000;
UNTIL ROW_COUNT() = 0 END REPEAT;

The error is

1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ‘ORDER BY t1.id ASC LIMIT 10000’ at line 17 Time: 0,063s

Help me to do it.

do it in 4 steps

CREATE TABLE temp_ids
AS 
SELECT t1.id
  FROM contacts t1
INNER 
  JOIN contacts t2 
    ON t2.email = t1.email
   AND t2.id > t1.id
;
ALTER TABLE temp_ids
ADD INDEX index_id ( id )
;
DELETE contacts
  FROM contacts
INNER
  JOIN temp_ids
    ON temp_ids.id = contacts.id
;
DROP TABLE temp_ids
;
1 Like

You really should upgrade your MySQL version. Yours has reached end of life some time ago.

1 Like

This topic was automatically closed 91 days after the last reply. New replies are no longer allowed.