{ "data": { "question": { "questionId": "393", "questionFrontendId": "393", "boundTopicId": null, "title": "UTF-8 Validation", "titleSlug": "utf-8-validation", "content": "
Given an integer array data
representing the data, return whether it is a valid UTF-8 encoding (i.e. it translates to a sequence of valid UTF-8 encoded characters).
A character in UTF8 can be from 1 to 4 bytes long, subjected to the following rules:
\n\n0
, followed by its Unicode code.n
bits are all one's, the n + 1
bit is 0
, followed by n - 1
bytes with the most significant 2
bits being 10
.This is how the UTF-8 encoding would work:
\n\n\n Number of Bytes | UTF-8 Octet Sequence\n | (binary)\n --------------------+-----------------------------------------\n 1 | 0xxxxxxx\n 2 | 110xxxxx 10xxxxxx\n 3 | 1110xxxx 10xxxxxx 10xxxxxx\n 4 | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx\n\n\n
x
denotes a bit in the binary form of a byte that may be either 0
or 1
.
Note: The input is an array of integers. Only the least significant 8 bits of each integer is used to store the data. This means each integer represents only 1 byte of data.
\n\n\n
Example 1:
\n\n\nInput: data = [197,130,1]\nOutput: true\nExplanation: data represents the octet sequence: 11000101 10000010 00000001.\nIt is a valid utf-8 encoding for a 2-bytes character followed by a 1-byte character.\n\n\n
Example 2:
\n\n\nInput: data = [235,140,4]\nOutput: false\nExplanation: data represented the octet sequence: 11101011 10001100 00000100.\nThe first 3 bits are all one's and the 4th bit is 0 means it is a 3-bytes character.\nThe next byte is a continuation byte which starts with 10 and that's correct.\nBut the second continuation byte does not start with 10, so it is invalid.\n\n\n
\n
Constraints:
\n\n1 <= data.length <= 2 * 104
0 <= data[i] <= 255
\r\nmask = 1 << 7\r\nwhile mask & num:\r\n n_bytes += 1\r\n mask = mask >> 1\r\n\r\n\r\nCan you use bit-masking to perform the second validation as well i.e. checking if the most significant bit is 1 and the second most significant bit a 0?", "To check if the most significant bit is a 1 and the second most significant bit is a 0, we can make use of the following two masks.\r\n\r\n
\r\nmask1 = 1 << 7\r\nmask2 = 1 << 6\r\n\r\nif not (num & mask1 and not (num & mask2)):\r\n return False\r\n" ], "solution": { "id": "596", "canSeeDetail": true, "paidOnly": false, "hasVideoSolution": false, "paidOnlyVideo": true, "__typename": "ArticleNode" }, "status": null, "sampleTestCase": "[197,130,1]", "metaData": "{\r\n \"name\": \"validUtf8\",\r\n \"params\": [\r\n {\r\n \"name\": \"data\",\r\n \"type\": \"integer[]\"\r\n }\r\n ],\r\n \"return\": {\r\n \"type\": \"boolean\"\r\n }\r\n}", "judgerAvailable": true, "judgeType": "large", "mysqlSchemas": [], "enableRunCode": true, "enableTestMode": false, "enableDebugger": true, "envInfo": "{\"cpp\": [\"C++\", \"
Compiled with clang 11
using the latest C++ 17 standard.
Your code is compiled with level two optimization (-O2
). AddressSanitizer is also enabled to help detect out-of-bounds and use-after-free bugs.
Most standard library headers are already included automatically for your convenience.
\"], \"java\": [\"Java\", \" OpenJDK 17
. Java 8 features such as lambda expressions and stream API can be used.
Most standard library headers are already included automatically for your convenience.
\\r\\nIncludes Pair
class from https://docs.oracle.com/javase/8/javafx/api/javafx/util/Pair.html.
Python 2.7.12
.
Most libraries are already imported automatically for your convenience, such as array, bisect, collections. If you need more libraries, you can import it yourself.
\\r\\n\\r\\nFor Map/TreeMap data structure, you may use sortedcontainers library.
\\r\\n\\r\\nNote that Python 2.7 will not be maintained past 2020. For the latest Python, please choose Python3 instead.
\"], \"c\": [\"C\", \"Compiled with gcc 8.2
using the gnu99 standard.
Your code is compiled with level one optimization (-O1
). AddressSanitizer is also enabled to help detect out-of-bounds and use-after-free bugs.
Most standard library headers are already included automatically for your convenience.
\\r\\n\\r\\nFor hash table operations, you may use uthash. \\\"uthash.h\\\" is included by default. Below are some examples:
\\r\\n\\r\\n1. Adding an item to a hash.\\r\\n
\\r\\nstruct hash_entry {\\r\\n int id; /* we'll use this field as the key */\\r\\n char name[10];\\r\\n UT_hash_handle hh; /* makes this structure hashable */\\r\\n};\\r\\n\\r\\nstruct hash_entry *users = NULL;\\r\\n\\r\\nvoid add_user(struct hash_entry *s) {\\r\\n HASH_ADD_INT(users, id, s);\\r\\n}\\r\\n\\r\\n\\r\\n\\r\\n
2. Looking up an item in a hash:\\r\\n
\\r\\nstruct hash_entry *find_user(int user_id) {\\r\\n struct hash_entry *s;\\r\\n HASH_FIND_INT(users, &user_id, s);\\r\\n return s;\\r\\n}\\r\\n\\r\\n\\r\\n\\r\\n
3. Deleting an item in a hash:\\r\\n
\\r\\nvoid delete_user(struct hash_entry *user) {\\r\\n HASH_DEL(users, user); \\r\\n}\\r\\n\\r\\n\"], \"csharp\": [\"C#\", \"\\r\\n\\r\\n
Your code is compiled with debug flag enabled (/debug
).
Node.js 16.13.2
.
Your code is run with --harmony
flag, enabling new ES6 features.
lodash.js library is included by default.
\\r\\n\\r\\nFor Priority Queue / Queue data structures, you may use datastructures-js/priority-queue and datastructures-js/queue.
\"], \"ruby\": [\"Ruby\", \"Ruby 3.1
Some common data structure implementations are provided in the Algorithms module: https://www.rubydoc.info/github/kanwei/algorithms/Algorithms
\"], \"swift\": [\"Swift\", \"Swift 5.5.2
.
Go 1.17.6
.
Support https://godoc.org/github.com/emirpasic/gods library.
\"], \"python3\": [\"Python3\", \"Python 3.10
.
Most libraries are already imported automatically for your convenience, such as array, bisect, collections. If you need more libraries, you can import it yourself.
\\r\\n\\r\\nFor Map/TreeMap data structure, you may use sortedcontainers library.
\"], \"scala\": [\"Scala\", \"Scala 2.13.7
.
Kotlin 1.3.10
.
Rust 1.58.1
Supports rand v0.6\\u00a0from crates.io
\"], \"php\": [\"PHP\", \"PHP 8.1
.
With bcmath module
\"], \"typescript\": [\"Typescript\", \"TypeScript 4.5.4, Node.js 16.13.2
.
Your code is run with --harmony
flag, enabling new ES2020 features.
lodash.js library is included by default.
\"], \"racket\": [\"Racket\", \"Run with Racket 8.3
.