mirror of
https://gitee.com/coder-xiaomo/leetcode-problemset
synced 2025-01-10 18:48:13 +08:00
29 lines
1.7 KiB
HTML
<p>The similarity of two documents (each with distinct words) is defined as the size of their intersection divided by the size of their union. For example, if the documents consist of integers, the similarity of {1, 5, 3} and {1, 7, 2, 3} is 0.4, because the intersection has size 2 and the union has size 5. We have a long list of documents (with distinct values and each with an associated ID) where the similarity is believed to be "sparse". That is, any two arbitrarily selected documents are very likely to have similarity 0. Design an algorithm that returns a list of pairs of document IDs and the associated similarity.</p>
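<p>As a concrete reading of the definition above, here is a minimal Python sketch of the similarity of two documents. The function name <code>similarity</code> is hypothetical, and returning 0 for two empty documents is an assumption, since the statement does not cover that case.</p>

```python
def similarity(a, b):
    """Return |a ∩ b| / |a ∪ b| for two collections of distinct values."""
    set_a, set_b = set(a), set(b)
    union_size = len(set_a | set_b)
    if union_size == 0:
        # Assumption: two empty documents are treated as having similarity 0.
        return 0.0
    return len(set_a & set_b) / union_size


# The example from the statement: intersection {1, 3}, union of size 5.
print(similarity([1, 5, 3], [1, 7, 2, 3]))  # 0.4
```

<p>Comparing every pair this way is quadratic in the number of documents, so it illustrates the definition rather than exploiting the sparsity the problem describes.</p>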
<p>The input is a 2D array <code>docs</code>, where <code>docs[i]</code> is the document with id <code>i</code>. Return an array of strings, one for each pair of documents whose similarity is greater than 0. Each string should be formatted as <code>{id1},{id2}: {similarity}</code>, where <code>id1</code> is the smaller of the two document ids, and <code>similarity</code> is the similarity rounded to four decimal places. You can return the array in any order.</p>
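<p>The output format can be sketched with a small helper. The name <code>format_pair</code> is hypothetical, and the sketch relies on Python's built-in fixed-point formatting; the statement does not say how exact ties at the fifth decimal place should be rounded, so that behaviour is an assumption here.</p>

```python
def format_pair(id_a, id_b, sim):
    """Format one result as '{id1},{id2}: {similarity}' with the smaller id first."""
    id1, id2 = sorted((id_a, id_b))
    # ':.4f' keeps exactly four digits after the decimal point.
    return f"{id1},{id2}: {sim:.4f}"


print(format_pair(3, 2, 1 / 7))   # 2,3: 0.1429
print(format_pair(0, 1, 0.25))    # 0,1: 0.2500
```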
<p><strong>Example:</strong></p>
<pre>
<strong>Input:</strong>
<code>[
  [14, 15, 100, 9, 3],
  [32, 1, 9, 3, 5],
  [15, 29, 2, 6, 8, 7],
  [7, 10]
]</code>
<strong>Output:</strong>
[
  "0,1: 0.2500",
  "0,2: 0.1000",
  "2,3: 0.1429"
]</pre>
<p><strong>Note: </strong></p>
<ul>
	<li><code>docs.length &lt;= 500</code></li>
	<li><code>docs[i].length &lt;= 500</code></li>
	<li>The number of document pairs with similarity greater than 0 will not exceed 1000.</li>
</ul>
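<p>One possible way to take advantage of the sparsity is an inverted index from each word to the documents that contain it, so that only pairs sharing at least one word are ever examined. The following Python sketch is one such approach under that assumption, not a reference solution; the name <code>compute_similarities</code> is hypothetical.</p>

```python
from collections import defaultdict


def compute_similarities(docs):
    """Return '{id1},{id2}: {similarity}' strings for all pairs with similarity > 0."""
    # Inverted index: word -> ids of the documents containing it.
    # Ids are appended in increasing order because enumerate() walks docs in order.
    word_to_ids = defaultdict(list)
    for doc_id, words in enumerate(docs):
        for word in words:
            word_to_ids[word].append(doc_id)

    # Count shared words, touching only pairs that co-occur under some word.
    shared = defaultdict(int)
    for ids in word_to_ids.values():
        for i in range(len(ids)):
            for j in range(i + 1, len(ids)):
                shared[(ids[i], ids[j])] += 1

    result = []
    for (id1, id2), inter in shared.items():
        # Words within a document are distinct, so |A ∪ B| = |A| + |B| - |A ∩ B|.
        union = len(docs[id1]) + len(docs[id2]) - inter
        result.append(f"{id1},{id2}: {inter / union:.4f}")
    return result


docs = [
    [14, 15, 100, 9, 3],
    [32, 1, 9, 3, 5],
    [15, 29, 2, 6, 8, 7],
    [7, 10],
]
print(sorted(compute_similarities(docs)))
# ['0,1: 0.2500', '0,2: 0.1000', '2,3: 0.1429']
```

<p>Because each word appears at most once per document, the pair counts equal the intersection sizes, and the union size follows from the document lengths without building any sets.</p>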