{ "data": { "question": { "questionId": "393", "questionFrontendId": "393", "boundTopicId": null, "title": "UTF-8 Validation", "titleSlug": "utf-8-validation", "content": "

Given an integer array data representing the data, return whether it is a valid UTF-8 encoding (i.e. it translates to a sequence of valid UTF-8 encoded characters).

\n\n

A character in UTF8 can be from 1 to 4 bytes long, subjected to the following rules:

\n\n
    \n\t
  1. For a 1-byte character, the first bit is a 0, followed by its Unicode code.
  2. \n\t
  3. For an n-bytes character, the first n bits are all one's, the n + 1 bit is 0, followed by n - 1 bytes with the most significant 2 bits being 10.
  4. \n
\n\n

This is how the UTF-8 encoding would work:

\n\n
\n     Number of Bytes   |        UTF-8 Octet Sequence\n                       |              (binary)\n   --------------------+-----------------------------------------\n            1          |   0xxxxxxx\n            2          |   110xxxxx 10xxxxxx\n            3          |   1110xxxx 10xxxxxx 10xxxxxx\n            4          |   11110xxx 10xxxxxx 10xxxxxx 10xxxxxx\n
\n\n

x denotes a bit in the binary form of a byte that may be either 0 or 1.

\n\n

Note: The input is an array of integers. Only the least significant 8 bits of each integer is used to store the data. This means each integer represents only 1 byte of data.

\n\n

 

\n

Example 1:

\n\n
\nInput: data = [197,130,1]\nOutput: true\nExplanation: data represents the octet sequence: 11000101 10000010 00000001.\nIt is a valid utf-8 encoding for a 2-bytes character followed by a 1-byte character.\n
\n\n

Example 2:

\n\n
\nInput: data = [235,140,4]\nOutput: false\nExplanation: data represented the octet sequence: 11101011 10001100 00000100.\nThe first 3 bits are all one's and the 4th bit is 0 means it is a 3-bytes character.\nThe next byte is a continuation byte which starts with 10 and that's correct.\nBut the second continuation byte does not start with 10, so it is invalid.\n
\n\n

 

\n

Constraints:

\n\n\n", "translatedTitle": null, "translatedContent": null, "isPaidOnly": false, "difficulty": "Medium", "likes": 885, "dislikes": 2847, "isLiked": null, "similarQuestions": "[]", "exampleTestcases": "[197,130,1]\n[235,140,4]", "categoryTitle": "Algorithms", "contributors": [], "topicTags": [ { "name": "Array", "slug": "array", "translatedName": null, "__typename": "TopicTagNode" }, { "name": "Bit Manipulation", "slug": "bit-manipulation", "translatedName": null, "__typename": "TopicTagNode" } ], "companyTagStats": null, "codeSnippets": [ { "lang": "C++", "langSlug": "cpp", "code": "class Solution {\npublic:\n bool validUtf8(vector& data) {\n \n }\n};", "__typename": "CodeSnippetNode" }, { "lang": "Java", "langSlug": "java", "code": "class Solution {\n public boolean validUtf8(int[] data) {\n \n }\n}", "__typename": "CodeSnippetNode" }, { "lang": "Python", "langSlug": "python", "code": "class Solution(object):\n def validUtf8(self, data):\n \"\"\"\n :type data: List[int]\n :rtype: bool\n \"\"\"\n ", "__typename": "CodeSnippetNode" }, { "lang": "Python3", "langSlug": "python3", "code": "class Solution:\n def validUtf8(self, data: List[int]) -> bool:\n ", "__typename": "CodeSnippetNode" }, { "lang": "C", "langSlug": "c", "code": "bool validUtf8(int* data, int dataSize) {\n \n}", "__typename": "CodeSnippetNode" }, { "lang": "C#", "langSlug": "csharp", "code": "public class Solution {\n public bool ValidUtf8(int[] data) {\n \n }\n}", "__typename": "CodeSnippetNode" }, { "lang": "JavaScript", "langSlug": "javascript", "code": "/**\n * @param {number[]} data\n * @return {boolean}\n */\nvar validUtf8 = function(data) {\n \n};", "__typename": "CodeSnippetNode" }, { "lang": "TypeScript", "langSlug": "typescript", "code": "function validUtf8(data: number[]): boolean {\n \n};", "__typename": "CodeSnippetNode" }, { "lang": "PHP", "langSlug": "php", "code": "class Solution {\n\n /**\n * @param Integer[] $data\n * @return Boolean\n */\n function validUtf8($data) {\n \n }\n}", "__typename": "CodeSnippetNode" }, { "lang": "Swift", "langSlug": "swift", "code": "class Solution {\n func validUtf8(_ data: [Int]) -> Bool {\n \n }\n}", "__typename": "CodeSnippetNode" }, { "lang": "Kotlin", "langSlug": "kotlin", "code": "class Solution {\n fun validUtf8(data: IntArray): Boolean {\n \n }\n}", "__typename": "CodeSnippetNode" }, { "lang": "Dart", "langSlug": "dart", "code": "class Solution {\n bool validUtf8(List data) {\n \n }\n}", "__typename": "CodeSnippetNode" }, { "lang": "Go", "langSlug": "golang", "code": "func validUtf8(data []int) bool {\n \n}", "__typename": "CodeSnippetNode" }, { "lang": "Ruby", "langSlug": "ruby", "code": "# @param {Integer[]} data\n# @return {Boolean}\ndef valid_utf8(data)\n \nend", "__typename": "CodeSnippetNode" }, { "lang": "Scala", "langSlug": "scala", "code": "object Solution {\n def validUtf8(data: Array[Int]): Boolean = {\n \n }\n}", "__typename": "CodeSnippetNode" }, { "lang": "Rust", "langSlug": "rust", "code": "impl Solution {\n pub fn valid_utf8(data: Vec) -> bool {\n \n }\n}", "__typename": "CodeSnippetNode" }, { "lang": "Racket", "langSlug": "racket", "code": "(define/contract (valid-utf8 data)\n (-> (listof exact-integer?) boolean?)\n )", "__typename": "CodeSnippetNode" }, { "lang": "Erlang", "langSlug": "erlang", "code": "-spec valid_utf8(Data :: [integer()]) -> boolean().\nvalid_utf8(Data) ->\n .", "__typename": "CodeSnippetNode" }, { "lang": "Elixir", "langSlug": "elixir", "code": "defmodule Solution do\n @spec valid_utf8(data :: [integer]) :: boolean\n def valid_utf8(data) do\n \n end\nend", "__typename": "CodeSnippetNode" } ], "stats": "{\"totalAccepted\": \"119.9K\", \"totalSubmission\": \"266.4K\", \"totalAcceptedRaw\": 119857, \"totalSubmissionRaw\": 266415, \"acRate\": \"45.0%\"}", "hints": [ "Read the data integer by integer. When you read it, process the least significant 8 bits of it.", "Assume the next encoding is 1-byte data. If it is not 1-byte data, read the next integer and assume it is 2-bytes data.", "Similarly, if it is not 2-bytes data, try 3-bytes then 4-bytes. If you read four integers and it still does not match any pattern, return false." ], "solution": { "id": "596", "canSeeDetail": true, "paidOnly": false, "hasVideoSolution": false, "paidOnlyVideo": true, "__typename": "ArticleNode" }, "status": null, "sampleTestCase": "[197,130,1]", "metaData": "{\r\n \"name\": \"validUtf8\",\r\n \"params\": [\r\n {\r\n \"name\": \"data\",\r\n \"type\": \"integer[]\"\r\n }\r\n ],\r\n \"return\": {\r\n \"type\": \"boolean\"\r\n }\r\n}", "judgerAvailable": true, "judgeType": "large", "mysqlSchemas": [], "enableRunCode": true, "enableTestMode": false, "enableDebugger": true, "envInfo": "{\"cpp\": [\"C++\", \"

Compiled with clang 11 using the latest C++ 20 standard.

\\r\\n\\r\\n

Your code is compiled with level two optimization (-O2). AddressSanitizer is also enabled to help detect out-of-bounds and use-after-free bugs.

\\r\\n\\r\\n

Most standard library headers are already included automatically for your convenience.

\"], \"java\": [\"Java\", \"

OpenJDK 17. Java 8 features such as lambda expressions and stream API can be used.

\\r\\n\\r\\n

Most standard library headers are already included automatically for your convenience.

\\r\\n

Includes Pair class from https://docs.oracle.com/javase/8/javafx/api/javafx/util/Pair.html.

\"], \"python\": [\"Python\", \"

Python 2.7.12.

\\r\\n\\r\\n

Most libraries are already imported automatically for your convenience, such as array, bisect, collections. If you need more libraries, you can import it yourself.

\\r\\n\\r\\n

For Map/TreeMap data structure, you may use sortedcontainers library.

\\r\\n\\r\\n

Note that Python 2.7 will not be maintained past 2020. For the latest Python, please choose Python3 instead.

\"], \"c\": [\"C\", \"

Compiled with gcc 8.2 using the gnu11 standard.

\\r\\n\\r\\n

Your code is compiled with level one optimization (-O1). AddressSanitizer is also enabled to help detect out-of-bounds and use-after-free bugs.

\\r\\n\\r\\n

Most standard library headers are already included automatically for your convenience.

\\r\\n\\r\\n

For hash table operations, you may use uthash. \\\"uthash.h\\\" is included by default. Below are some examples:

\\r\\n\\r\\n

1. Adding an item to a hash.\\r\\n

\\r\\nstruct hash_entry {\\r\\n    int id;            /* we'll use this field as the key */\\r\\n    char name[10];\\r\\n    UT_hash_handle hh; /* makes this structure hashable */\\r\\n};\\r\\n\\r\\nstruct hash_entry *users = NULL;\\r\\n\\r\\nvoid add_user(struct hash_entry *s) {\\r\\n    HASH_ADD_INT(users, id, s);\\r\\n}\\r\\n
\\r\\n

\\r\\n\\r\\n

2. Looking up an item in a hash:\\r\\n

\\r\\nstruct hash_entry *find_user(int user_id) {\\r\\n    struct hash_entry *s;\\r\\n    HASH_FIND_INT(users, &user_id, s);\\r\\n    return s;\\r\\n}\\r\\n
\\r\\n

\\r\\n\\r\\n

3. Deleting an item in a hash:\\r\\n

\\r\\nvoid delete_user(struct hash_entry *user) {\\r\\n    HASH_DEL(users, user);  \\r\\n}\\r\\n
\\r\\n

\"], \"csharp\": [\"C#\", \"

C# 10 with .NET 6 runtime

\"], \"javascript\": [\"JavaScript\", \"

Node.js 16.13.2.

\\r\\n\\r\\n

Your code is run with --harmony flag, enabling new ES6 features.

\\r\\n\\r\\n

lodash.js library is included by default.

\\r\\n\\r\\n

For Priority Queue / Queue data structures, you may use 5.3.0 version of datastructures-js/priority-queue and 4.2.1 version of datastructures-js/queue.

\"], \"ruby\": [\"Ruby\", \"

Ruby 3.1

\\r\\n\\r\\n

Some common data structure implementations are provided in the Algorithms module: https://www.rubydoc.info/github/kanwei/algorithms/Algorithms

\"], \"swift\": [\"Swift\", \"

Swift 5.5.2.

\"], \"golang\": [\"Go\", \"

Go 1.21

\\r\\n

Support https://godoc.org/github.com/emirpasic/gods@v1.18.1 library.

\"], \"python3\": [\"Python3\", \"

Python 3.10.

\\r\\n\\r\\n

Most libraries are already imported automatically for your convenience, such as array, bisect, collections. If you need more libraries, you can import it yourself.

\\r\\n\\r\\n

For Map/TreeMap data structure, you may use sortedcontainers library.

\"], \"scala\": [\"Scala\", \"

Scala 2.13.7.

\"], \"kotlin\": [\"Kotlin\", \"

Kotlin 1.9.0.

\"], \"rust\": [\"Rust\", \"

Rust 1.58.1

\\r\\n\\r\\n

Supports rand v0.6\\u00a0from crates.io

\"], \"php\": [\"PHP\", \"

PHP 8.1.

\\r\\n

With bcmath module

\"], \"typescript\": [\"Typescript\", \"

TypeScript 5.1.6, Node.js 16.13.2.

\\r\\n\\r\\n

Your code is run with --harmony flag, enabling new ES2022 features.

\\r\\n\\r\\n

lodash.js library is included by default.

\"], \"racket\": [\"Racket\", \"

Run with Racket 8.3.

\"], \"erlang\": [\"Erlang\", \"Erlang/OTP 25.0\"], \"elixir\": [\"Elixir\", \"Elixir 1.13.4 with Erlang/OTP 25.0\"], \"dart\": [\"Dart\", \"

Dart 2.17.3

\\r\\n\\r\\n

Your code will be run directly without compiling

\"]}", "libraryUrl": null, "adminUrl": null, "challengeQuestion": null, "__typename": "QuestionNode" } } }