leetcode Question: Maximum Gap

Maximum Gap

Given an unsorted array, find the maximum difference between the successive elements in its sorted form.
Try to solve it in linear time/space.
Return 0 if the array contains less than 2 elements.
You may assume all elements in the array are non-negative integers and fit in the 32-bit signed integer range.

Analysis:


It took me a long time figuring out how to solve it, because I was going to a wrong direction after seeing the linear time/space requirement.  To be honest, I was only thinking  of common sorting algorithms (merge, heap, quick).   However, I missed the another type of sorting algorithm ---- Distribution sort (bucket, counting, radix), which happens to be the key of this problem.

We all know that, merge, heap or quick sorting algorithm can achieve no better than O(n Log n) time complexity.  But for linear time requirement, bucket sort, counting sort and radix sort are often good options.

For this specific problem, bucket sort is not a good option because it usually works well on uniform distributions. Otherwise, in each bucket, the insertion sort may cause heavy burden on time complexity. Counting sort, has time complexity O(n + k), where k is the range of the elements. But when k > n^2, it is worse than the common sorting algorithms. So we use the radix sort for solving this problem.

The idea of radix sort is as follows:

  • Sort elements according to each digit (from lower to higher).
  • For each digit, use counting sort.

For example, A = [36 504 79 98 8 55 2 113].

The Radix sorting goes like following steps:

Sort 1st digit:
A = [2, 113, 504, 55, 36, 98, 8, 79]

Sort 2nd digit:
A = [2, 504, 8, 113, 36, 55, 79, 98]

Sort 3rd digit:
A = [2, 8, 36, 55, 79, 98, 113, 504]

Done.


To sort array according to each digit, counting sort is used. Note that, we only need to have a array of size 10 to store the frequencies of elements. This is because we only consider and sort a single digit in each iteration of Radix sort.

The general form of counting sort is:
(1) Count the frequencies of each elements.  (count[a[i]] ++, this also considers the duplicates)
(2) Get the index of each number in the output array. (count[i]+= count[i-1])
(3) Put numbers in order. (output[count[a[i]] = a[i], count[a[i]]-- to handle the duplicates)

For radix sort, there is minor modifications, details see the code below.


Code(C++):

class Solution {
public:
    int getMax(vector<int> num){
        int m = INT_MIN;
        for (int i=0;i<num.size();i++){
            m = max(m,num[i]);
        }
        return m;
    }
    
    void countSort(vector<int> &num, int nd){
        int n = num.size();
        vector<int> output(n,0);
        vector<int> count(10,0);
 
        for (int i = 0; i < n; i++){
            count[ (num[i]/nd)%10 ]++;
        }
 
        for (int i = 1; i < 10; i++){
            count[i] += count[i - 1];
        }
 
        for (int i = n - 1; i >= 0; i--){
            output[ count[(num[i]/nd)%10] - 1] = num[i];
            count[ (num[i]/nd)%10 ]--;
        }
 
        for (int i = 0; i < n; i++){
            num[i] = output[i];
        }
    }

    void radixsort(vector<int> &num){
         int max_n = getMax(num);
         for (int nd = 1; max_n/nd > 0; nd *= 10){
             countSort(num, nd);
         }
    }
    
    int maximumGap(vector<int> &num) {
        if (num.size()<2){
            return 0;
        }
        radixsort(num);
        int res = abs(num[1] - num[0]);
        for (int i=2;i<num.size();i++){
            if (num[i]-num[i-1]>res){
                res = abs(num[i] - num[i-1]);
            }
        }
        return res;
        
    }
};

Code(Python):

class Solution:
    # @param num, a list of integer
    # @return an integer
    
    def countsort(self, num, nd):
        count = [0]*10
        n = len(num)
        for i in xrange(0,n):
            count[ (num[i]/nd) %10] += 1
            
        for i in xrange(1,10):
            count[i] += count[i-1]
        
        output = [0]*n
        for i in xrange(n-1,-1,-1):
            output[count[ (num[i]/nd)%10 ] - 1 ] = num[i]
            count[ (num[i]/nd)%10 ] -= 1
        
        return output
    
    def radixsort(self, num):
        max_n = max(num)
        nd = 1
        while max_n / nd > 0:
            num = self.countsort(num, nd)
            nd = nd * 10
        return num
        
    def maximumGap(self, num):
        if len(num) < 2:
            return 0
        num = self.radixsort(num)
        res = abs(num[1] - num[0])
        for i in xrange(2,len(num)):
            res = max(res, abs(num[i] - num[i-1]))
        return res
        
        

6 comments:

  1. Actually, for maximum gap, we only need to maintain the minimum and maximum in each bucket (we don't need insertion sort in each bucket), thus bucket-sort based algorithm is faster than radix-sort based algorithms.

    ReplyDelete
  2. It seems radix sort is not O(n). Instead of A = [36 504 79 98 8 55 2 113], why not try A = [36 504 79 98 8 55 2 111111111111111111111111111111111113]

    ReplyDelete
    Replies
    1. The time complexity is O(mn), where m is the number of digits of the largest number in the array. For 32 bit signed integer, m <= 10. So the final time complexity is O(10n), which is still O(n). It's a very fast algorithm.

      Delete
  3. This comment has been removed by the author.

    ReplyDelete