How to delete duplicate records in MySQL, in a table without IDs?

I need to delete the duplicate records in this table. However, there is no id for each row.

Example Data

+---------+--------+----------+
| product | amount | quantity |
+---------+--------+----------+
| table   |   2000 |        5 |
| chair   |    300 |       25 |
| TV      |  30000 |        4 |
| bike    |    300 |       25 |
| table   |   2000 |        5 |
| chair   |    300 |       25 |
| chair   |    300 |       25 |
+---------+--------+----------+

Expected Results

I need to get this result.

+---------+--------+----------+
| product | amount | quantity |
+---------+--------+----------+
| table   |   2000 |        5 |
| chair   |    300 |       25 |
| TV      |  30000 |        4 |
| bike    |    300 |       25 |
+---------+--------+----------+

Script with ID

If there were an id, I could have used:

DELETE p1 FROM products p1
INNER JOIN products p2 
WHERE p1.id < p2.id AND p1.product = p2.product;

edited Aug 15 at 15:10 by Anthony Genovese

asked Aug 15 at 4:50 by Edwin Babu

ROW_NUMBER exists in PostgreSQL; is there a similar function in MySQL?
- Edwin Babu, Aug 15 at 5:11

By "duplicate", you mean that all columns have the same values?
- Rick James, yesterday

Yes, all columns have the same value, @RickJames.
- Edwin Babu, yesterday

3 Answers

Accepted answer (score 8)

No combination of fields identifies a record uniquely.

I see at least two different solutions.

First solution: copy the distinct rows into a copy of the table, then replace the original table with it.

CREATE TABLE temp LIKE products;
INSERT INTO temp 
 SELECT DISTINCT * FROM products;
DROP TABLE products;
RENAME TABLE temp TO products;

Second solution: add a temporary auto-increment column, delete the duplicates using it, then drop the temporary column.

ALTER TABLE products ADD COLUMN temp SERIAL PRIMARY KEY;
DELETE t1.*
  FROM products t1
  LEFT JOIN ( SELECT MIN(temp) AS mintemp
                FROM products
               GROUP BY field1, field2 /* , ... */ , fieldN ) t2
         ON t1.temp = t2.mintemp
 WHERE t2.mintemp IS NULL;
ALTER TABLE products DROP COLUMN temp;

UPDATE

In the second variant, declaring the additional column as a primary key is redundant, because SERIAL already implies a UNIQUE index. It is enough to use:

ALTER TABLE products ADD COLUMN temp SERIAL;
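
For the table in the question, the second solution could be written out as below. This is only a sketch: it assumes the three columns are named product, amount and quantity as in the example data, and that no column called temp exists yet.

```sql
-- Add a temporary auto-increment column so every row gets a distinct value.
ALTER TABLE products ADD COLUMN temp SERIAL;

-- In each group of identical rows, keep the row with the smallest temp
-- value and delete all the others.
DELETE t1
  FROM products t1
  LEFT JOIN ( SELECT MIN(temp) AS mintemp
                FROM products
               GROUP BY product, amount, quantity ) t2
         ON t1.temp = t2.mintemp
 WHERE t2.mintemp IS NULL;

-- Remove the helper column again.
ALTER TABLE products DROP COLUMN temp;
```

Run against the seven example rows, this should leave the four distinct rows shown under Expected Results.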

edited Aug 15 at 9:00

answered Aug 15 at 6:31 by Akina

You could also use a hash function to create a semi-unique ID, and then delete duplicate hash values.
- Russell Fox, Aug 15 at 14:46

In your first suggestion, do it this way so that the table is never missing: RENAME TABLE products TO old, temp TO products; DROP TABLE old;
- Rick James, yesterday
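
Putting Rick James's suggestion together with the first solution gives roughly the following. It is a sketch rather than a tested script; the table names temp and old come from the answer and the comment, and it assumes neither name is already in use.

```sql
-- Build the de-duplicated copy next to the original table.
CREATE TABLE temp LIKE products;
INSERT INTO temp
SELECT DISTINCT * FROM products;

-- Swap both names in a single RENAME TABLE statement, so that a table
-- named products exists at every moment.
RENAME TABLE products TO old, temp TO products;

-- Drop the original data only after the new table has been checked.
DROP TABLE old;
```

Keeping old around until the new products table has been verified also gives a simple way back if something went wrong.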

Answer (score 3)

Apart from Akina's answer, you could delete both rows and then insert one.

You should also really, really add a primary key to your table, even if you don't need one for performance, specifically to avoid situations like this.
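
A minimal sketch of both suggestions, using the chair row from the example data. The column name id is an assumption, and the transaction only protects the two statements on a transactional engine such as InnoDB.

```sql
-- Remove every copy of one duplicated row, then put a single copy back.
START TRANSACTION;
DELETE FROM products
 WHERE product = 'chair' AND amount = 300 AND quantity = 25;
INSERT INTO products (product, amount, quantity)
VALUES ('chair', 300, 25);
COMMIT;

-- Add a surrogate primary key so that each row can be addressed on its own.
-- The column name id is hypothetical.
ALTER TABLE products
  ADD COLUMN id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY;
```

Note that a key on id makes rows distinguishable but does not stop identical product rows from being inserted again; a UNIQUE index over product, amount and quantity would be needed for that.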

answered Aug 15 at 10:49 by Guran

Answer (score 0)

You could run a statement of the form

DELETE FROM <table> WHERE <condition> LIMIT 1;

That deletes only one row, even if multiple rows match the condition.
This is explained in the official manual:

13.2.2 DELETE Syntax

Order of Deletion

If the DELETE statement includes an ORDER BY clause, rows are deleted in the order specified by the clause. This is useful primarily in conjunction with LIMIT. For example, the following statement finds rows matching the WHERE clause, sorts them by timestamp_column, and deletes the first (oldest) one:
DELETE FROM somelog WHERE user = 'jcole'
ORDER BY timestamp_column LIMIT 1;

edited Aug 15 at 12:25 by hot2use

answered Aug 15 at 11:53 by MTilsted

He wants to delete all but one of the duplicates, not just one. And how would you do this across all products?
- Barmar, Aug 15 at 17:53
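
To delete all but one copy with LIMIT, the limit has to be the number of copies minus one, and the statement has to be repeated for every duplicated group. A sketch for the chair row from the example data, which appears three times:

```sql
-- List each group of identical rows together with its number of copies.
SELECT product, amount, quantity, COUNT(*) AS copies
  FROM products
 GROUP BY product, amount, quantity
HAVING COUNT(*) > 1;

-- The chair group has 3 copies, so delete 2 of them (copies - 1).
DELETE FROM products
 WHERE product = 'chair' AND amount = 300 AND quantity = 25
 LIMIT 2;
```

Since the row count after LIMIT is written as a literal, one DELETE per group is needed, which is why this approach only suits tables with a handful of duplicated rows.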
