Friday, April 20, 2018

Replicating Matlab sort function in c++

Like most scientist out there, I also use Matlab often. Sometimes for my papers, I need to replicate functionality of some Matlab functions. For a recent project, I needed to replicate the functionality of the Matlab sort function i.e. ([sorted, indices] = sort(data)) in C++. As always, I went online and searched Google, which gave me quite a few links but the best of them all which gives a very detailed explanation of the function was Alec Jacobsen's http://www.alecjacobson.com/weblog/?p=1527. The function he gives is almost correct however it has one slight issue. The indices array returned is not sorted on the range of repeats of the value which is how Matlab does in the sort function. 

To do this, the index array has to be range sorted (before the indices array is returned ) as follows:

//get index ranges and sort
int start = 0;
for (int i = 0; i < sorted_values.size() - 1; ++i) {
   if (sorted_values[i] != sorted_values[i + 1]) {
      sort(idx.begin() + start, idx.begin() + (i + 1));
      start = i + 1;
   }

}

This gives the correct result. This is added in the sort_indexes function.

To sum up, the following is the complete implementation. I am replicating my complete code here to keep this post self contained for reference later.

// This implementation is O(n), but also uses O(n) extra memory
template< class T >
void reorder(
    std::vector< T > & unordered,
    std::vector< size_t > const & index_map,
    std::vector< T > & ordered)
{
    // copy for the reorder according to index_map, because unsorted may also be
    // sorted
    std::vector< T > copy = unordered;
    ordered.resize(index_map.size());
    for (int i = 0; i < index_map.size(); i++)
    {
        ordered[i] = copy[index_map[i]];
    }
}


std::vector< size_t > sort_indexes(std::vector< float > &data,
                                 std::vector< float >& sorted_values)
{
    // initialize original index locations
    std::vector< size_t > idx(data.size());
    std::iota(idx.begin(), idx.end(), 0);

    //sort data
    //copy data
    std::copy(data.begin(), data.end(), std::back_inserter(sorted_values));
    sort(sorted_values.begin(), sorted_values.end(), [](float a, float b) {return a < b; });
   
    // sort indexes based on comparing values in data
    sort(idx.begin(), idx.end(),[data](size_t i1, size_t i2) {return data[i1] < data[i2]; });
       
    //get index ranges and sort
    int start = 0;
    for (int i = 0; i < sorted_values.size() - 1; ++i) {
        if (sorted_values[i] != sorted_values[i + 1]) {
            sort(idx.begin() + start, idx.begin() + (i + 1));
            start = i + 1;
        }
    }

    return idx;
}