164. Maximum Gap

时间:2022-09-19 10:53:39

题目:

Given an unsorted array, find the maximum difference between the successive elements in its sorted form.

Try to solve it in linear time/space.

Return 0 if the array contains less than 2 elements.

You may assume all elements in the array are non-negative integers and fit in the 32-bit signed integer range.

链接: http://leetcode.com/problems/maximum-gap/

题解:

给定未排序数组,求两个相邻元素间的差的最大值。想了一会不确定,于是去看了一眼Discuss的标题,果然是用Radix Sort。其实一开始看到题目里面所有数字都可以被表示为32-bit signed integer时,就应该要想到才对。 估计做法就是用Radix Sort,排序完毕后再扫描一边数组。正好加深一下对Radix Sort的理解和书写。要好好研究一下Key-indexed Counting,LSD,MSD和Bucket Sort。这道题用Bucket Sort的话就是把每个元素放入一个bucket中,然后计算非空bucket间的最大距离,原理都差不多。一刷用了LSD,代码大都参考Sedgewick大神,二刷的话要试一试MSD以及bucket sort。注意的一点是LSD处理最高8位bit时, 0 ~ 127为正, 128 ~ 255为负,所以要特殊处理一下accumulated count数组。做完这题以后就发现上过算法2真好...

LSD Radix Sort, Time Complexity - O(n),  Space Complexity - O(n)

public class Solution {
public int maximumGap(int[] nums) {
if(nums == null || nums.length < 2)
return 0;
lsdSort(nums);
int maxDiff = Integer.MIN_VALUE;
for(int i = 1; i < nums.length; i++)
maxDiff = Math.max(nums[i] - nums[i - 1], maxDiff); //may overflow
return maxDiff;
} private void lsdSort(int[] nums) {
int len = nums.length;
int[] aux = new int[nums.length];
int totalBits = 32;
int bitsPerByte = 8;
int window = totalBits / bitsPerByte;
int R = 1 << bitsPerByte; // R is radix
int mask = R - 1; // for(int d = 0; d < window; d++) { // for each window compared
int[] count = new int[R + 1]; for(int i = 0; i < len; i++) { // count least significant 8 bits of each num in nums
int c = (nums[i] >> (d * bitsPerByte)) & mask; // use mask to get least significant 8 bits
count[c + 1]++;
} for(int i = 0; i < R; i++) // count accumulated position for each nums
count[i + 1] += count[i]; if(d == window - 1) { // for most significant 8 bits, 0 ~ 127 is pos, 128 ~ 255 is neg, so 128 ~ 255 goes first
int shift1 = count[R] - count[R / 2];
int shift2 = count[R / 2];
for(int i = 0; i < R / 2; i++)
count[i] += shift1;
for(int i = R / 2; i < R; i++)
count[i] -= shift2;
} for(int i = 0; i < len; i++) { // move data
int c = (nums[i] >> (d * bitsPerByte)) & mask;
aux[count[c]++] = nums[i];
} for(int i = 0; i < len; i++) // copy back
nums[i] = aux[i];
} }
}

二刷:

依然使用了LSD radix sort,代码大都来自塞神的index counting。

  1. 我们把32位的数字分为4个batch,每个batch有8个digit,使用LSD的话我们是从最低一级的batch开始计算。这里我们要使用一个mask = 255,也就是二进制的11111111来找到每个batch。也要做一个auxiliary array - aux[] 用来保存临时结果
  2. 对于每个batch,我们都要做一次index counting sort,一共四次,从最低8位开始,到最高8位
    1. 每次确定了batch之后我们就可以创建buckets数组。这里buckets数组 count = new int[R + 1] , 这里的 + 1很重要
    2. 我们第一次遍历数组,根据最低8位,统计所有数字出现的频率,然后放入相应的bucket里,这里有个小trick很关键,我们虽然将低8位映射到从[0 - 255]这样的256个bucket里,但写入count数组的时候,我们有意把count[0]留下,把数字的count写到 [1 - 256]的bucket中。这样做的好处,要在后面才能看到,就是从经过aggregation后的count数组,将排序后的数字填入到aux[]数组中时。
    3. 然后,我们对统计好的数字和频率进行aggregation,利用count数组求一个accumulated count。经历过这一步以后,我们的count[i]代表的意思就是,有多少个数字应该被放在 i 之前的buckets里。
    4. 接下来,我们根据aggregation的结果,我们把这些数字填到aux[]里
    5. 最后把aux[]中的数字拷贝回nums中,继续计算下一个batch。 因为LSD的操作是stable的,所以我们进行完全部4个batch以后就会得到最终结果
  3. 这里要注意的一点是,32位数字最高8位digit的这个batch和其他三个不同。在这8位里0 ~ 127为正数,而128 ~ 255为负数。所以在这个batch里,计算完aggregation数组后,我们要做一个很巧妙的shift操作
    1. 先求出shift1 = count[R] - count[R / 2], 也就是在输入的nums[]中有多少个数字为负
    2. 再求出shift2 = count[R / 2], 输入的nums[]中有多少个正数
    3. 对于[0 - 127]的这些数字,他们都应该排在负数前面,所以我们对每一个count[i] 增加 shift1
    4. 对于[128 - 255]的这些数字,他们都应该排在正数后面,所以我们对每一个count[i] 减少 shift2
    5. 之后再根据新的count[]数组,把数字放入aux[]数组中,再拷贝回原数组nums[]里。

Java:

Time Complexity - O(n),  Space Complexity - O(n)

public class Solution {
public int maximumGap(int[] nums) { // LSD
LSDradix(nums);
int maxGap = 0;
for (int i = 1; i < nums.length; i++) maxGap = Math.max(maxGap, nums[i] - nums[i - 1]);
return maxGap;
} private void LSDradix(int[] nums) {
if (nums == null || nums.length == 0) return;
int batchSize = 8;
int batchNum = 32 / batchSize;
int R = 1 << batchSize; // radix, each bucket we consider only [0 - 255] , 256 numbers
int mask = R - 1; // 11111111, each batch 8 digits
int len = nums.length;
int[] aux = new int[len]; for (int d = 0; d < batchNum; d++) {
int[] count = new int[R + 1];
for (int num : nums) {
int idx = (num >> (d * batchSize)) & mask;
count[idx + 1]++; // index counting
}
for (int k = 1; k < count.length; k++) count[k] += count[k - 1]; // aggregation, find out the buckets boundaries
if (d == batchNum - 1) { // for first 8 digits 01111111 is max, 11111111 is min, we shift nums
int shift1 = count[R] - count[R / 2];
int shift2 = count[R / 2];
for (int k = 0; k < R / 2; k++) count[k] += shift1;
for (int k = R / 2; k < R; k++) count[k] -= shift2;
}
for (int num : nums) {
int idx = (num >> (d * batchSize)) & mask;
aux[count[idx]++] = num; // we place each num in each buckets, from the start slot of the bucket
}
for (int k = 0; k < len; k++) nums[k] = aux[k];
}
}
}

164. Maximum Gap

Reference:

https://www.cs.princeton.edu/~rs/AlgsDS07/18RadixSort.pdf

https://leetcode.com/discuss/18499/bucket-sort-java-solution-with-explanation-o-time-and-space

https://leetcode.com/discuss/18487/i-solved-it-using-radix-sort

https://leetcode.com/discuss/19951/use-average-gap-to-achieve-o-n-run-time-java-solution

https://leetcode.com/discuss/34289/pigeon-hole-principle

https://leetcode.com/discuss/53636/radix-sort-solution-in-java-with-explanation