2022-03-27 18:35:17 +08:00
< p > Table: < code > Person< / code > < / p >
< pre >
+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| id          | int     |
| email       | varchar |
+-------------+---------+
id is the primary key (column with unique values) for this table.
Each row of this table contains an email. The emails will not contain uppercase letters.
< / pre >
< p > < / p >
< p > Write a solution to < strong > delete< / strong > all duplicate emails, keeping for each email only the row with the smallest < code > id< / code >.< / p >
< p > For SQL users, please note that you are supposed to write a < code > DELETE< / code > statement and not a < code > SELECT< / code > one.< / p >
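< p > A minimal sketch of such a < code > DELETE< / code > statement, run here through Python's built-in < code > sqlite3< / code > module so it can be tried without a database server. The helper name < code > delete_duplicate_emails< / code > and the in-memory table are assumptions made for illustration, and the statement is written for SQLite; other engines may need a different formulation (MySQL, for example, restricts subqueries that read from the table being deleted from).< / p >

```python
import sqlite3


def delete_duplicate_emails(conn: sqlite3.Connection) -> None:
    """Delete every row whose id is not the smallest id for its email.

    Hypothetical helper for illustration; the SQL targets SQLite.
    """
    conn.execute(
        """
        DELETE FROM Person
        WHERE id NOT IN (SELECT MIN(id) FROM Person GROUP BY email)
        """
    )
    conn.commit()


# Build an in-memory Person table with three sample rows, two sharing an email.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO Person (id, email) VALUES (?, ?)",
    [(1, "john@example.com"), (2, "bob@example.com"), (3, "john@example.com")],
)

delete_duplicate_emails(conn)

# Read the table back; ordering is only for a stable printout.
remaining = conn.execute("SELECT id, email FROM Person ORDER BY id").fetchall()
print(remaining)
```

< p > The subquery collects the smallest < code > id< / code > per email, and the outer statement removes every row outside that set, so the table itself is modified rather than merely queried.< / p >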
< p > For Pandas users, please note that you are supposed to modify < code > Person< / code > in place.< / p >
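< p > One possible way to meet the in-place requirement with pandas is sketched below. The function name < code > delete_duplicate_emails< / code > and the sample frame are assumptions for illustration; the sketch relies on < code > sort_values< / code > and < code > drop_duplicates< / code > with < code > inplace=True< / code > so that the caller's DataFrame object is changed rather than replaced.< / p >

```python
import pandas as pd


def delete_duplicate_emails(person: pd.DataFrame) -> None:
    """Drop duplicate emails in place, keeping the row with the smallest id.

    Hypothetical helper for illustration; returns None because the
    caller's DataFrame is mutated directly.
    """
    # Sort by id so the first occurrence of each email has the smallest id.
    person.sort_values(by="id", inplace=True)
    # Keep only that first occurrence per email.
    person.drop_duplicates(subset="email", keep="first", inplace=True)


# Sample frame with three rows, two of which share an email.
person = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "email": ["john@example.com", "bob@example.com", "john@example.com"],
    }
)

delete_duplicate_emails(person)
print(person)
```

< p > Reassigning inside the function (for example < code > person = person.drop_duplicates(...)< / code > ) would only rebind the local name and leave the caller's table unchanged, which is why the in-place variants are used here.< / p >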
< p > The driver first compiles and runs your code, then shows the resulting < code > Person< / code > table as the answer. The final order of the < code > Person< / code > table < strong > does not matter< / strong >.< / p >
< p > The result format is in the following example.< / p >
< p > < / p >
< p > < strong class = "example" > Example 1:< / strong > < / p >
< pre >
< strong > Input:< / strong >
Person table:
+----+------------------+
| id | email            |
+----+------------------+
| 1  | john@example.com |
| 2  | bob@example.com  |
| 3  | john@example.com |
+----+------------------+
< strong > Output:< / strong >
+----+------------------+
| id | email            |
+----+------------------+
| 1  | john@example.com |
| 2  | bob@example.com  |
+----+------------------+
< strong > Explanation:< / strong > john@example.com appears twice (ids 1 and 3). We keep the row with the smallest id, which is 1.
< / pre >