leetcode-problemset/leetcode-cn/originData/drop-duplicate-rows.json

{
"data": {
"question": {
"questionId": "3071",
"questionFrontendId": "2882",
"categoryTitle": "pandas",
"boundTopicId": 2467488,
"title": "Drop Duplicate Rows",
"titleSlug": "drop-duplicate-rows",
"content": "<pre>\nDataFrame customers\n+-------------+--------+\n| Column Name | Type |\n+-------------+--------+\n| customer_id | int |\n| name | object |\n| email | object |\n+-------------+--------+\n</pre>\n\n<p>There are some duplicate rows in the DataFrame based on the <code>email</code> column.</p>\n\n<p>Write a solution to remove these duplicate rows and keep only the <strong>first</strong> occurrence.</p>\n\n<p>The result format is in the following example.</p>\n\n<p>&nbsp;</p>\n<pre>\n<strong class=\"example\">Example 1:</strong>\n<strong>Input:</strong>\n+-------------+---------+---------------------+\n| customer_id | name | email |\n+-------------+---------+---------------------+\n| 1 | Ella | emily@example.com |\n| 2 | David | michael@example.com |\n| 3 | Zachary | sarah@example.com |\n| 4 | Alice | john@example.com |\n| 5 | Finn | john@example.com |\n| 6 | Violet | alice@example.com |\n+-------------+---------+---------------------+\n<strong>Output: </strong> \n+-------------+---------+---------------------+\n| customer_id | name | email |\n+-------------+---------+---------------------+\n| 1 | Ella | emily@example.com |\n| 2 | David | michael@example.com |\n| 3 | Zachary | sarah@example.com |\n| 4 | Alice | john@example.com |\n| 6 | Violet | alice@example.com |\n+-------------+---------+---------------------+\n<strong>Explanation:</strong>\nAlic (customer_id = 4) and Finn (customer_id = 5) both use john@example.com, so only the first occurrence of this email is retained.\n</pre>\n",
"translatedTitle": "删去重复的行",
"translatedContent": "<pre>\nDataFrame customers\n+-------------+--------+\n| Column Name | Type |\n+-------------+--------+\n| customer_id | int |\n| name | object |\n| email | object |\n+-------------+--------+\n</pre>\n\n<p>在 DataFrame 中基于&nbsp;<code>email</code>&nbsp;列存在一些重复行。</p>\n\n<p>编写一个解决方案,删除这些重复行,仅保留第一次出现的行。</p>\n\n<p>返回结果格式如下例所示。</p>\n\n<p>&nbsp;</p>\n\n<p><strong>示例 1:</strong></p>\n\n<pre>\n<b>输入:</b>\n+-------------+---------+---------------------+\n| customer_id | name | email |\n+-------------+---------+---------------------+\n| 1 | Ella | emily@example.com |\n| 2 | David | michael@example.com |\n| 3 | Zachary | sarah@example.com |\n| 4 | Alice | john@example.com |\n| 5 | Finn | john@example.com |\n| 6 | Violet | alice@example.com |\n+-------------+---------+---------------------+\n<b>输出:</b>\n+-------------+---------+---------------------+\n| customer_id | name | email |\n+-------------+---------+---------------------+\n| 1 | Ella | emily@example.com |\n| 2 | David | michael@example.com |\n| 3 | Zachary | sarah@example.com |\n| 4 | Alice | john@example.com |\n| 6 | Violet | alice@example.com |\n+-------------+---------+---------------------+\n<b>解释:</b>\nAlice (customer_id = 4) 和 Finn (customer_id = 5) 都使用 john@example.com因此只保留该邮箱地址的第一次出现。\n</pre>\n",
"isPaidOnly": false,
"difficulty": "Easy",
"likes": 1,
"dislikes": 0,
"isLiked": null,
"similarQuestions": "[]",
"contributors": [],
"langToValidPlayground": "{\"cpp\": false, \"java\": false, \"python\": false, \"python3\": false, \"mysql\": false, \"mssql\": false, \"oraclesql\": false, \"c\": false, \"csharp\": false, \"javascript\": false, \"typescript\": false, \"bash\": false, \"php\": false, \"swift\": false, \"kotlin\": false, \"dart\": false, \"golang\": false, \"ruby\": false, \"scala\": false, \"html\": false, \"pythonml\": false, \"rust\": false, \"racket\": false, \"erlang\": false, \"elixir\": false, \"pythondata\": false, \"react\": false, \"vanillajs\": false, \"postgresql\": false}",
"topicTags": [],
"companyTagStats": null,
"codeSnippets": [
{
"lang": "Pandas",
"langSlug": "pythondata",
"code": "import pandas as pd\n\ndef dropDuplicateEmails(customers: pd.DataFrame) -> pd.DataFrame:\n ",
"__typename": "CodeSnippetNode"
}
],
"stats": "{\"totalAccepted\": \"1.5K\", \"totalSubmission\": \"1.9K\", \"totalAcceptedRaw\": 1535, \"totalSubmissionRaw\": 1924, \"acRate\": \"79.8%\"}",
"hints": [
"Consider using a build-in function in pandas library to remove the duplicate rows based on specified data."
],
"solution": null,
"status": null,
"sampleTestCase": "{\"headers\":{\"customers\":[\"customer_id\",\"name\",\"email\"]},\"rows\":{\"customers\":[[1,\"Ella\",\"emily@example.com\"],[2,\"David\",\"michael@example.com\"],[3,\"Zachary\",\"sarah@example.com\"],[4,\"Alice\",\"john@example.com\"],[5,\"Finn\",\"john@example.com\"],[6,\"Violet\",\"alice@example.com\"]]}}",
"metaData": "{\n \"pythondata\": [\n \"customers = pd.DataFrame([], columns=['customer_id', 'name', 'email']).astype({'customer_id':'Int64', 'name':'object', 'email':'object'})\"\n ],\n \"database\": true,\n \"name\": \"dropDuplicateEmails\",\n \"languages\": [\n \"pythondata\"\n ]\n}",
"judgerAvailable": true,
"judgeType": "large",
"mysqlSchemas": [],
"enableRunCode": true,
"envInfo": "{\"pythondata\":[\"Pandas\",\"<p>Python 3.10 with Pandas 2.0.2 and NumPy 1.25.0<\\/p>\"]}",
"book": null,
"isSubscribed": false,
"isDailyQuestion": false,
"dailyRecordStatus": null,
"editorType": "CKEDITOR",
"ugcQuestionId": null,
"style": "LEETCODE",
"exampleTestcases": "{\"headers\":{\"customers\":[\"customer_id\",\"name\",\"email\"]},\"rows\":{\"customers\":[[1,\"Ella\",\"emily@example.com\"],[2,\"David\",\"michael@example.com\"],[3,\"Zachary\",\"sarah@example.com\"],[4,\"Alice\",\"john@example.com\"],[5,\"Finn\",\"john@example.com\"],[6,\"Violet\",\"alice@example.com\"]]}}",
"__typename": "QuestionNode"
}
}
}
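
A minimal reference sketch for the dropDuplicateEmails starter snippet in "codeSnippets" above, not part of the original JSON payload: assuming the judge only compares the returned DataFrame, the built-in pandas DataFrame.drop_duplicates with subset="email" and keep="first" matches the required "keep only the first occurrence" behavior. The sample rows and dtypes are copied from "sampleTestCase" and "metaData".

import pandas as pd

def dropDuplicateEmails(customers: pd.DataFrame) -> pd.DataFrame:
    # Reference sketch, not the official editorial solution.
    # subset="email" restricts the duplicate check to the email column;
    # keep="first" (the default) retains the first occurrence of each email.
    return customers.drop_duplicates(subset="email", keep="first")

# Usage against the sample test case above:
customers = pd.DataFrame(
    [
        [1, "Ella", "emily@example.com"],
        [2, "David", "michael@example.com"],
        [3, "Zachary", "sarah@example.com"],
        [4, "Alice", "john@example.com"],
        [5, "Finn", "john@example.com"],
        [6, "Violet", "alice@example.com"],
    ],
    columns=["customer_id", "name", "email"],
).astype({"customer_id": "Int64", "name": "object", "email": "object"})
print(dropDuplicateEmails(customers))  # the row with customer_id 5 is dropped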