update

2025-10-22 05:26:46 +08:00 · 2022-05-02 23:44:12 +08:00
parent 7ea03594b3
commit 2a71c78585
4790 changed files with 11696 additions and 10944 deletions
--- a/leetcode-cn/originData/utf-8-validation.json
+++ b/leetcode-cn/originData/utf-8-validation.json
@@ -7,12 +7,12 @@
            "boundTopicId": 1862,
            "title": "UTF-8 Validation",
            "titleSlug": "utf-8-validation",
-            "content": "<p>Given an integer array <code>data</code> representing the data, return whether it is a valid <strong>UTF-8</strong> encoding.</p>\n\n<p>A character in <strong>UTF8</strong> can be from <b>1 to 4 bytes</b> long, subjected to the following rules:</p>\n\n<ol>\n\t<li>For a <strong>1-byte</strong> character, the first bit is a <code>0</code>, followed by its Unicode code.</li>\n\t<li>For an <strong>n-bytes</strong> character, the first <code>n</code> bits are all one&#39;s, the <code>n + 1</code> bit is <code>0</code>, followed by <code>n - 1</code> bytes with the most significant <code>2</code> bits being <code>10</code>.</li>\n</ol>\n\n<p>This is how the UTF-8 encoding would work:</p>\n\n<pre>\n<code>   Char. number range  |        UTF-8 octet sequence\n      (hexadecimal)    |              (binary)\n   --------------------+---------------------------------------------\n   0000 0000-0000 007F | 0xxxxxxx\n   0000 0080-0000 07FF | 110xxxxx 10xxxxxx\n   0000 0800-0000 FFFF | 1110xxxx 10xxxxxx 10xxxxxx\n   0001 0000-0010 FFFF | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx</code>\n</pre>\n\n<p><b>Note: </b>The input is an array of integers. Only the <b>least significant 8 bits</b> of each integer is used to store the data. This means each integer represents only 1 byte of data.</p>\n\n<p>&nbsp;</p>\n<p><strong>Example 1:</strong></p>\n\n<pre>\n<strong>Input:</strong> data = [197,130,1]\n<strong>Output:</strong> true\n<strong>Explanation:</strong> data represents the octet sequence: 11000101 10000010 00000001.\nIt is a valid utf-8 encoding for a 2-bytes character followed by a 1-byte character.\n</pre>\n\n<p><strong>Example 2:</strong></p>\n\n<pre>\n<strong>Input:</strong> data = [235,140,4]\n<strong>Output:</strong> false\n<strong>Explanation:</strong> data represented the octet sequence: 11101011 10001100 00000100.\nThe first 3 bits are all one&#39;s and the 4th bit is 0 means it is a 3-bytes character.\nThe next byte is a continuation byte which starts with 10 and that&#39;s correct.\nBut the second continuation byte does not start with 10, so it is invalid.\n</pre>\n\n<p>&nbsp;</p>\n<p><strong>Constraints:</strong></p>\n\n<ul>\n\t<li><code>1 &lt;= data.length &lt;= 2 * 10<sup>4</sup></code></li>\n\t<li><code>0 &lt;= data[i] &lt;= 255</code></li>\n</ul>\n",
+            "content": "<p>Given an integer array <code>data</code> representing the data, return whether it is a valid <strong>UTF-8</strong> encoding (i.e. it translates to a sequence of valid UTF-8 encoded characters).</p>\n\n<p>A character in <strong>UTF8</strong> can be from <strong>1 to 4 bytes</strong> long, subjected to the following rules:</p>\n\n<ol>\n\t<li>For a <strong>1-byte</strong> character, the first bit is a <code>0</code>, followed by its Unicode code.</li>\n\t<li>For an <strong>n-bytes</strong> character, the first <code>n</code> bits are all one&#39;s, the <code>n + 1</code> bit is <code>0</code>, followed by <code>n - 1</code> bytes with the most significant <code>2</code> bits being <code>10</code>.</li>\n</ol>\n\n<p>This is how the UTF-8 encoding would work:</p>\n\n<pre>\n     Number of Bytes   |        UTF-8 Octet Sequence\n                       |              (binary)\n   --------------------+-----------------------------------------\n            1          |   0xxxxxxx\n            2          |   110xxxxx 10xxxxxx\n            3          |   1110xxxx 10xxxxxx 10xxxxxx\n            4          |   11110xxx 10xxxxxx 10xxxxxx 10xxxxxx\n</pre>\n\n<p><code>x</code> denotes a bit in the binary form of a byte that may be either <code>0</code> or <code>1</code>.</p>\n\n<p><strong>Note: </strong>The input is an array of integers. Only the <strong>least significant 8 bits</strong> of each integer is used to store the data. This means each integer represents only 1 byte of data.</p>\n\n<p>&nbsp;</p>\n<p><strong>Example 1:</strong></p>\n\n<pre>\n<strong>Input:</strong> data = [197,130,1]\n<strong>Output:</strong> true\n<strong>Explanation:</strong> data represents the octet sequence: 11000101 10000010 00000001.\nIt is a valid utf-8 encoding for a 2-bytes character followed by a 1-byte character.\n</pre>\n\n<p><strong>Example 2:</strong></p>\n\n<pre>\n<strong>Input:</strong> data = [235,140,4]\n<strong>Output:</strong> false\n<strong>Explanation:</strong> data represented the octet sequence: 11101011 10001100 00000100.\nThe first 3 bits are all one&#39;s and the 4th bit is 0 means it is a 3-bytes character.\nThe next byte is a continuation byte which starts with 10 and that&#39;s correct.\nBut the second continuation byte does not start with 10, so it is invalid.\n</pre>\n\n<p>&nbsp;</p>\n<p><strong>Constraints:</strong></p>\n\n<ul>\n\t<li><code>1 &lt;= data.length &lt;= 2 * 10<sup>4</sup></code></li>\n\t<li><code>0 &lt;= data[i] &lt;= 255</code></li>\n</ul>\n",
            "translatedTitle": "UTF-8 编码验证",
            "translatedContent": "<p>给定一个表示数据的整数数组&nbsp;<code>data</code>&nbsp;，返回它是否为有效的 <strong>UTF-8</strong> 编码。</p>\n\n<p><strong>UTF-8</strong> 中的一个字符可能的长度为 <strong>1 到 4 字节</strong>，遵循以下的规则：</p>\n\n<ol>\n\t<li>对于 <strong>1 字节</strong>&nbsp;的字符，字节的第一位设为 0 ，后面 7 位为这个符号的 unicode 码。</li>\n\t<li>对于 <strong>n 字节</strong>&nbsp;的字符 (n &gt; 1)，第一个字节的前 n 位都设为1，第 n+1 位设为 0 ，后面字节的前两位一律设为 10 。剩下的没有提及的二进制位，全部为这个符号的 unicode 码。</li>\n</ol>\n\n<p>这是 UTF-8 编码的工作方式：</p>\n\n<pre>\n<code>   Char. number range  |        UTF-8 octet sequence\n      (hexadecimal)    |              (binary)\n   --------------------+---------------------------------------------\n   0000 0000-0000 007F | 0xxxxxxx\n   0000 0080-0000 07FF | 110xxxxx 10xxxxxx\n   0000 0800-0000 FFFF | 1110xxxx 10xxxxxx 10xxxxxx\n   0001 0000-0010 FFFF | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx\n</code></pre>\n\n<p><strong>注意：</strong>输入是整数数组。只有每个整数的 <strong>最低 8 个有效位</strong> 用来存储数据。这意味着每个整数只表示 1 字节的数据。</p>\n\n<p>&nbsp;</p>\n\n<p><strong>示例 1：</strong></p>\n\n<pre>\n<strong>输入：</strong>data = [197,130,1]\n<strong>输出：</strong>true\n<strong>解释：</strong>数据表示字节序列:<strong>11000101 10000010 00000001</strong>。\n这是有效的 utf-8 编码，为一个 2 字节字符，跟着一个 1 字节字符。\n</pre>\n\n<p><strong>示例 2：</strong></p>\n\n<pre>\n<strong>输入：</strong>data = [235,140,4]\n<strong>输出：</strong>false\n<strong>解释：</strong>数据表示 8 位的序列: <strong>11101011 10001100 00000100</strong>.\n前 3 位都是 1 ，第 4 位为 0 表示它是一个 3 字节字符。\n下一个字节是开头为 10 的延续字节，这是正确的。\n但第二个延续字节不以 10 开头，所以是不符合规则的。\n</pre>\n\n<p>&nbsp;</p>\n\n<p><strong>提示:</strong></p>\n\n<ul>\n\t<li><code>1 &lt;= data.length &lt;= 2 * 10<sup>4</sup></code></li>\n\t<li><code>0 &lt;= data[i] &lt;= 255</code></li>\n</ul>\n",
            "isPaidOnly": false,
            "difficulty": "Medium",
-            "likes": 167,
+            "likes": 169,
            "dislikes": 0,
            "isLiked": null,
            "similarQuestions": "[]",
@@ -143,7 +143,7 @@
                    "__typename": "CodeSnippetNode"
                }
            ],
-            "stats": "{\"totalAccepted\": \"33.8K\", \"totalSubmission\": \"76.9K\", \"totalAcceptedRaw\": 33802, \"totalSubmissionRaw\": 76937, \"acRate\": \"43.9%\"}",
+            "stats": "{\"totalAccepted\": \"34.2K\", \"totalSubmission\": \"78K\", \"totalAcceptedRaw\": 34245, \"totalSubmissionRaw\": 78008, \"acRate\": \"43.9%\"}",
            "hints": [
                "All you have to do is follow the rules. For a given integer, obtain its binary representation in the string form and work with the rules given in the problem.",
                "An integer can either represent the start of a UTF-8 character, or a part of an existing UTF-8 character. There are two separate rules for these two scenarios in the problem.",