{ "data": { "question": { "questionId": "609", "questionFrontendId": "609", "categoryTitle": "Algorithms", "boundTopicId": 1788, "title": "Find Duplicate File in System", "titleSlug": "find-duplicate-file-in-system", "content": "
Given a list paths
of directory info, including the directory path, and all the files with contents in this directory, return all the duplicate files in the file system in terms of their paths. You may return the answer in any order.
A group of duplicate files consists of at least two files that have the same content.
\n\nA single directory info string in the input list has the following format:
\n\n"root/d1/d2/.../dm f1.txt(f1_content) f2.txt(f2_content) ... fn.txt(fn_content)"
It means there are n
files (f1.txt, f2.txt ... fn.txt)
with content (f1_content, f2_content ... fn_content)
respectively in the directory "root/d1/d2/.../dm"
. Note that n >= 1
and m >= 0
. If m = 0
, it means the directory is just the root directory.
The output is a list of groups of duplicate file paths. For each group, it contains all the file paths of the files that have the same content. A file path is a string that has the following format:
\n\n"directory_path/file_name.txt"
\n
Example 1:
\nInput: paths = [\"root/a 1.txt(abcd) 2.txt(efgh)\",\"root/c 3.txt(abcd)\",\"root/c/d 4.txt(efgh)\",\"root 4.txt(efgh)\"]\nOutput: [[\"root/a/2.txt\",\"root/c/d/4.txt\",\"root/4.txt\"],[\"root/a/1.txt\",\"root/c/3.txt\"]]\n
Example 2:
\nInput: paths = [\"root/a 1.txt(abcd) 2.txt(efgh)\",\"root/c 3.txt(abcd)\",\"root/c/d 4.txt(efgh)\"]\nOutput: [[\"root/a/2.txt\",\"root/c/d/4.txt\"],[\"root/a/1.txt\",\"root/c/3.txt\"]]\n\n
\n
Constraints:
\n\n1 <= paths.length <= 2 * 104
1 <= paths[i].length <= 3000
1 <= sum(paths[i].length) <= 5 * 105
paths[i]
consist of English letters, digits, '/'
, '.'
, '('
, ')'
, and ' '
.\n
Follow up:
\n\n给你一个目录信息列表 paths
,包括目录路径,以及该目录中的所有文件及其内容,请你按路径返回文件系统中的所有重复文件。答案可按 任意顺序 返回。
一组重复的文件至少包括 两个 具有完全相同内容的文件。
\n\n输入 列表中的单个目录信息字符串的格式如下:
\n\n\"root/d1/d2/.../dm f1.txt(f1_content) f2.txt(f2_content) ... fn.txt(fn_content)\"
这意味着,在目录 root/d1/d2/.../dm
下,有 n
个文件 ( f1.txt
, f2.txt
... fn.txt
) 的内容分别是 ( f1_content
, f2_content
... fn_content
) 。注意:n >= 1
且 m >= 0
。如果 m = 0
,则表示该目录是根目录。
输出 是由 重复文件路径组 构成的列表。其中每个组由所有具有相同内容文件的文件路径组成。文件路径是具有下列格式的字符串:
\n\n\"directory_path/file_name.txt\"
\n\n
示例 1:
\n\n\n输入:paths = [\"root/a 1.txt(abcd) 2.txt(efgh)\",\"root/c 3.txt(abcd)\",\"root/c/d 4.txt(efgh)\",\"root 4.txt(efgh)\"]\n输出:[[\"root/a/2.txt\",\"root/c/d/4.txt\",\"root/4.txt\"],[\"root/a/1.txt\",\"root/c/3.txt\"]]\n\n\n
示例 2:
\n\n\n输入:paths = [\"root/a 1.txt(abcd) 2.txt(efgh)\",\"root/c 3.txt(abcd)\",\"root/c/d 4.txt(efgh)\"]\n输出:[[\"root/a/2.txt\",\"root/c/d/4.txt\"],[\"root/a/1.txt\",\"root/c/3.txt\"]]\n\n\n
\n\n
提示:
\n\n1 <= paths.length <= 2 * 104
1 <= paths[i].length <= 3000
1 <= sum(paths[i].length) <= 5 * 105
paths[i]
由英文字母、数字、字符 '/'
、'.'
、'('
、')'
和 ' '
组成\n\n
进阶:
\n\n\\u7248\\u672c\\uff1a \\u7f16\\u8bd1\\u65f6\\uff0c\\u5c06\\u4f1a\\u91c7\\u7528 \\u4e3a\\u4e86\\u4f7f\\u7528\\u65b9\\u4fbf\\uff0c\\u5927\\u90e8\\u5206\\u6807\\u51c6\\u5e93\\u7684\\u5934\\u6587\\u4ef6\\u5df2\\u7ecf\\u88ab\\u81ea\\u52a8\\u5bfc\\u5165\\u3002<\\/p>\"],\"java\":[\"Java\",\" \\u7248\\u672c\\uff1a \\u4e3a\\u4e86\\u65b9\\u4fbf\\u8d77\\u89c1\\uff0c\\u5927\\u90e8\\u5206\\u6807\\u51c6\\u5e93\\u7684\\u5934\\u6587\\u4ef6\\u5df2\\u88ab\\u5bfc\\u5165\\u3002<\\/p>\\r\\n\\r\\n \\u5305\\u542b Pair \\u7c7b: https:\\/\\/docs.oracle.com\\/javase\\/8\\/javafx\\/api\\/javafx\\/util\\/Pair.html <\\/p>\"],\"python\":[\"Python\",\" \\u7248\\u672c\\uff1a \\u4e3a\\u4e86\\u65b9\\u4fbf\\u8d77\\u89c1\\uff0c\\u5927\\u90e8\\u5206\\u5e38\\u7528\\u5e93\\u5df2\\u7ecf\\u88ab\\u81ea\\u52a8 \\u5bfc\\u5165\\uff0c\\u5982\\uff1aarray<\\/a>, bisect<\\/a>, collections<\\/a>\\u3002\\u5982\\u679c\\u60a8\\u9700\\u8981\\u4f7f\\u7528\\u5176\\u4ed6\\u5e93\\u51fd\\u6570\\uff0c\\u8bf7\\u81ea\\u884c\\u5bfc\\u5165\\u3002<\\/p>\\r\\n\\r\\n \\u6ce8\\u610f Python 2.7 \\u5c06\\u57282020\\u5e74\\u540e\\u4e0d\\u518d\\u7ef4\\u62a4<\\/a>\\u3002 \\u5982\\u60f3\\u4f7f\\u7528\\u6700\\u65b0\\u7248\\u7684Python\\uff0c\\u8bf7\\u9009\\u62e9Python 3\\u3002<\\/p>\"],\"c\":[\"C\",\" \\u7248\\u672c\\uff1a \\u7f16\\u8bd1\\u65f6\\uff0c\\u5c06\\u4f1a\\u91c7\\u7528 \\u4e3a\\u4e86\\u4f7f\\u7528\\u65b9\\u4fbf\\uff0c\\u5927\\u90e8\\u5206\\u6807\\u51c6\\u5e93\\u7684\\u5934\\u6587\\u4ef6\\u5df2\\u7ecf\\u88ab\\u81ea\\u52a8\\u5bfc\\u5165\\u3002<\\/p>\\r\\n\\r\\n \\u5982\\u60f3\\u4f7f\\u7528\\u54c8\\u5e0c\\u8868\\u8fd0\\u7b97, \\u60a8\\u53ef\\u4ee5\\u4f7f\\u7528 uthash<\\/a>\\u3002 \\\"uthash.h\\\"\\u5df2\\u7ecf\\u9ed8\\u8ba4\\u88ab\\u5bfc\\u5165\\u3002\\u8bf7\\u770b\\u5982\\u4e0b\\u793a\\u4f8b:<\\/p>\\r\\n\\r\\n 1. \\u5f80\\u54c8\\u5e0c\\u8868\\u4e2d\\u6dfb\\u52a0\\u4e00\\u4e2a\\u5bf9\\u8c61\\uff1a<\\/b>\\r\\n 2. \\u5728\\u54c8\\u5e0c\\u8868\\u4e2d\\u67e5\\u627e\\u4e00\\u4e2a\\u5bf9\\u8c61\\uff1a<\\/b>\\r\\n 3. \\u4ece\\u54c8\\u5e0c\\u8868\\u4e2d\\u5220\\u9664\\u4e00\\u4e2a\\u5bf9\\u8c61\\uff1a<\\/b>\\r\\n C# 10<\\/a> \\u8fd0\\u884c\\u5728 .NET 6 \\u4e0a<\\/p>\"],\"javascript\":[\"JavaScript\",\" \\u7248\\u672c\\uff1a \\u60a8\\u7684\\u4ee3\\u7801\\u5728\\u6267\\u884c\\u65f6\\u5c06\\u5e26\\u4e0a lodash.js<\\/a> \\u5e93\\u5df2\\u7ecf\\u9ed8\\u8ba4\\u88ab\\u5305\\u542b\\u3002<\\/p>\\r\\n\\r\\n \\u5982\\u9700\\u4f7f\\u7528\\u961f\\u5217\\/\\u4f18\\u5148\\u961f\\u5217\\uff0c\\u60a8\\u53ef\\u4f7f\\u7528 datastructures-js\\/priority-queue@5.3.0<\\/a> \\u548c datastructures-js\\/queue@4.2.1<\\/a>\\u3002<\\/p>\"],\"ruby\":[\"Ruby\",\" \\u4f7f\\u7528 \\u4e00\\u4e9b\\u5e38\\u7528\\u7684\\u6570\\u636e\\u7ed3\\u6784\\u5df2\\u5728 Algorithms \\u6a21\\u5757\\u4e2d\\u63d0\\u4f9b\\uff1ahttps:\\/\\/www.rubydoc.info\\/github\\/kanwei\\/algorithms\\/Algorithms<\\/p>\"],\"swift\":[\"Swift\",\" \\u7248\\u672c\\uff1a \\u6211\\u4eec\\u901a\\u5e38\\u4fdd\\u8bc1\\u66f4\\u65b0\\u5230 Apple\\u653e\\u51fa\\u7684\\u6700\\u65b0\\u7248Swift<\\/a>\\u3002\\u5982\\u679c\\u60a8\\u53d1\\u73b0Swift\\u4e0d\\u662f\\u6700\\u65b0\\u7248\\u7684\\uff0c\\u8bf7\\u8054\\u7cfb\\u6211\\u4eec\\uff01\\u6211\\u4eec\\u5c06\\u5c3d\\u5feb\\u66f4\\u65b0\\u3002<\\/p>\"],\"golang\":[\"Go\",\" \\u7248\\u672c\\uff1a \\u652f\\u6301 https:\\/\\/godoc.org\\/github.com\\/emirpasic\\/gods@v1.18.1<\\/a> \\u7b2c\\u4e09\\u65b9\\u5e93\\u3002<\\/p>\"],\"python3\":[\"Python3\",\" \\u7248\\u672c\\uff1a \\u4e3a\\u4e86\\u65b9\\u4fbf\\u8d77\\u89c1\\uff0c\\u5927\\u90e8\\u5206\\u5e38\\u7528\\u5e93\\u5df2\\u7ecf\\u88ab\\u81ea\\u52a8 \\u5bfc\\u5165\\uff0c\\u5982array<\\/a>, bisect<\\/a>, collections<\\/a>\\u3002 \\u5982\\u679c\\u60a8\\u9700\\u8981\\u4f7f\\u7528\\u5176\\u4ed6\\u5e93\\u51fd\\u6570\\uff0c\\u8bf7\\u81ea\\u884c\\u5bfc\\u5165\\u3002<\\/p>\\r\\n\\r\\n \\u5982\\u9700\\u4f7f\\u7528 Map\\/TreeMap \\u6570\\u636e\\u7ed3\\u6784\\uff0c\\u60a8\\u53ef\\u4f7f\\u7528 sortedcontainers<\\/a> \\u5e93\\u3002<\\/p>\"],\"scala\":[\"Scala\",\" \\u7248\\u672c\\uff1a \\u7248\\u672c\\uff1a \\u6211\\u4eec\\u4f7f\\u7528\\u7684\\u662f JetBrains \\u63d0\\u4f9b\\u7684 experimental compiler\\u3002\\u5982\\u679c\\u60a8\\u8ba4\\u4e3a\\u60a8\\u9047\\u5230\\u4e86\\u7f16\\u8bd1\\u5668\\u76f8\\u5173\\u7684\\u95ee\\u9898\\uff0c\\u8bf7\\u5411\\u6211\\u4eec\\u53cd\\u9988<\\/p>\"],\"rust\":[\"Rust\",\" \\u7248\\u672c\\uff1a \\u652f\\u6301 crates.io \\u7684 rand<\\/a><\\/p>\"],\"php\":[\"PHP\",\" With bcmath module.<\\/p>\"],\"typescript\":[\"TypeScript\",\" TypeScript 5.1.6<\\/p>\\r\\n\\r\\n Compile Options: --alwaysStrict --strictBindCallApply --strictFunctionTypes --target ES2022<\\/p>\\r\\n\\r\\n lodash.js<\\/a> \\u5e93\\u5df2\\u7ecf\\u9ed8\\u8ba4\\u88ab\\u5305\\u542b\\u3002<\\/p>\\r\\n\\r\\n \\u5982\\u9700\\u4f7f\\u7528\\u961f\\u5217\\/\\u4f18\\u5148\\u961f\\u5217\\uff0c\\u60a8\\u53ef\\u4f7f\\u7528 datastructures-js\\/priority-queue@5.3.0<\\/a> \\u548c datastructures-js\\/queue@4.2.1<\\/a>\\u3002<\\/p>\"],\"racket\":[\"Racket\",\"clang 11<\\/code> \\u91c7\\u7528\\u6700\\u65b0C++ 20\\u6807\\u51c6\\u3002<\\/p>\\r\\n\\r\\n
-O2<\\/code>\\u7ea7\\u4f18\\u5316\\u3002AddressSanitizer<\\/a> \\u4e5f\\u88ab\\u5f00\\u542f\\u6765\\u68c0\\u6d4b
out-of-bounds<\\/code>\\u548c
use-after-free<\\/code>\\u9519\\u8bef\\u3002<\\/p>\\r\\n\\r\\n
OpenJDK 17<\\/code>\\u3002\\u53ef\\u4ee5\\u4f7f\\u7528Java 8\\u7684\\u7279\\u6027\\u4f8b\\u5982\\uff0clambda expressions \\u548c stream API\\u3002<\\/p>\\r\\n\\r\\n
Python 2.7.12<\\/code><\\/p>\\r\\n\\r\\n
GCC 8.2<\\/code>\\uff0c\\u91c7\\u7528GNU11\\u6807\\u51c6\\u3002<\\/p>\\r\\n\\r\\n
-O1<\\/code>\\u7ea7\\u4f18\\u5316\\u3002 AddressSanitizer<\\/a>\\u4e5f\\u88ab\\u5f00\\u542f\\u6765\\u68c0\\u6d4b
out-of-bounds<\\/code>\\u548c
use-after-free<\\/code>\\u9519\\u8bef\\u3002<\\/p>\\r\\n\\r\\n
\\r\\nstruct hash_entry {\\r\\n int id; \\/* we'll use this field as the key *\\/\\r\\n char name[10];\\r\\n UT_hash_handle hh; \\/* makes this structure hashable *\\/\\r\\n};\\r\\n\\r\\nstruct hash_entry *users = NULL;\\r\\n\\r\\nvoid add_user(struct hash_entry *s) {\\r\\n HASH_ADD_INT(users, id, s);\\r\\n}\\r\\n<\\/pre>\\r\\n<\\/p>\\r\\n\\r\\n
\\r\\nstruct hash_entry *find_user(int user_id) {\\r\\n struct hash_entry *s;\\r\\n HASH_FIND_INT(users, &user_id, s);\\r\\n return s;\\r\\n}\\r\\n<\\/pre>\\r\\n<\\/p>\\r\\n\\r\\n
\\r\\nvoid delete_user(struct hash_entry *user) {\\r\\n HASH_DEL(users, user); \\r\\n}\\r\\n<\\/pre>\\r\\n<\\/p>\"],\"csharp\":[\"C#\",\"
Node.js 16.13.2<\\/code><\\/p>\\r\\n\\r\\n
--harmony<\\/code> \\u6807\\u8bb0\\u6765\\u5f00\\u542f \\u65b0\\u7248ES6\\u7279\\u6027<\\/a>\\u3002<\\/p>\\r\\n\\r\\n
Ruby 3.1<\\/code>\\u6267\\u884c<\\/p>\\r\\n\\r\\n
Swift 5.5.2<\\/code><\\/p>\\r\\n\\r\\n
Go 1.21<\\/code><\\/p>\\r\\n\\r\\n
Python 3.10<\\/code><\\/p>\\r\\n\\r\\n
Scala 2.13<\\/code><\\/p>\"],\"kotlin\":[\"Kotlin\",\"
Kotlin 1.9.0<\\/code><\\/p>\\r\\n\\r\\n
rust 1.58.1<\\/code><\\/p>\\r\\n\\r\\n
PHP 8.1<\\/code>.<\\/p>\\r\\n\\r\\n