How to use C++ STL heap? A toy example

How to use C++ STL heap?



In my previous post (click here), heap sort algorithm is presented You can also find the basic concept of the heap and how heap data structure is created and maintained. Generally speaking, heap is a tree structure where its parent node is the biggest(max heap) or smallest(min heap) of the values. Note that heap is NOT the BST, by traversing the tree cannot lead you to the sorted sequence as BST.


So, in our daily use or for the interview, sometime you don't have to implement the heap but using it is required. For example, let's see the toy example (this is also an interview question) below:



  • How to find pair with kth largest sum? Given two sorted arrays of numbers, we want to find the pair with the kth largest possible sum. (A pair is one element from the first array and one element from the second array). For example, with arrays:

          A [2, 3, 5, 8, 13]
          B [4, 8, 12, 16]
          The pairs with largest sums are
          13 + 16 = 29
          13 + 12 = 25
          8 + 16 = 24
          13 + 8 = 21
          8 + 12 = 20
         So the pair with the 4th largest sum is (13, 8). How to find the pair with the kth largest    possible sum?



Analysis:
When the problem requires the largest, smallest, most frequent, etc. element in a big range of data, or the data is not 'static' but in stream, using heap is a good direction to try (always with the hashmap). For this problem, a max heap can be created according to the sum value, which is the sum of the two values in A and B array respectively. For better illustration, we consider the sorted arrays are in descending order (A[13,9,5,3,2], B[16,12,8,4]).

The algorithm is straightforward it you are familiar with the heap data structure:


(1) Push the A[0],B[0] into the heap.
(2) Do the following k times:
                 Pop the top node (A[i],B[j]) in the heap and output it.
                 Push the A[i+1],B[j] and A[i],B[j+1] into the heap.
(3) Update the heap


In C++ STL, heap can be created using the make_heap , which is in the <algorithm> lib.
And pop_heap, push_heap is used to add and remove element in the heap.
Below I show the code for this problem and you can see how to use the heap with STL:

Code:
#include <iostream>
#include <vector>
#include <algorithm>

using namespace std;


bool _cmp(pair<pair<int,int>, int> a, pair<pair<int,int>, int> b){
    return a.second < b.second;
}

int main()
{
    //Initialize the data
    int a[5]= {13,2,3,8,5};
    int b[4] = {8,4,16,12};
    vector<int> A(a,a+5);
    vector<int> B(b,b+4);
    sort(A.rbegin(),A.rend());  // rbegin to get the descending order
    sort(B.rbegin(),B.rend());
    //Print the sorted array
    cout << "Original Arrays: "<< endl;
    for (int i=0;i<A.size();i++){cout << A[i] <<" ";}
    cout <<endl;
    for (int i=0;i<B.size();i++){cout << B[i] <<" ";}
    cout <<endl;

    //Use pair<<id_A,id_B>, A[id_A]+B[id_B]> as the node in the heap
    vector <pair<pair<int,int>, int> > C; // vector to maintain the heap.
    C.push_back(make_pair(make_pair(0,0),A[0]+B[0])); //push the first node
    make_heap(C.begin(),C.end(),_cmp); // make the heap

    int n =5; //get the first 5 biggest pairs
    for (int i=0;i<n;i++){
        pair<pair<int,int>, int> hroot = C.front(); // get the root of the heap (biggest sum)
        int maxi = hroot.first.first;
        int maxj = hroot.first.second;
        cout << "[" << A[maxi] << "," << B[maxj]<< "]," << hroot.second << endl;
        //pop the root node
        pop_heap(C.begin(),C.end(),_cmp);
        C.pop_back();

        //push the two new nodes
        C.push_back(make_pair(make_pair(maxi+1,maxj),A[maxi+1]+B[maxj]));
        push_heap(C.begin(),C.end(),_cmp);

        C.push_back(make_pair(make_pair(maxi,maxj+1),A[maxi]+B[maxj+1]));
        push_heap(C.begin(),C.end(),_cmp);

    }
    return 0;
}

No comments:

Post a Comment