Let's say I need to retrieve the median from a sequence of 1000000 random numeric values.
If using anything but std::list, I have no (
Armadillo has an implementation that looks like the one in the answer https://stackoverflow.com/a/34077478 by Matthew Fioravante (https://stackoverflow.com/users/2608582/matthew-fioravante). It uses one call to nth_element and one call to max_element, and it is here:
https://gitlab.com/conradsnicta/armadillo-code/-/blob/9.900.x/include/armadillo_bits/op_median_meat.hpp#L380
//! find the median value of a std::vector (contents is modified)
template<typename eT>
inline
eT
op_median::direct_median(std::vector<eT>& X)
  {
  arma_extra_debug_sigprint();

  const uword n_elem = uword(X.size());
  const uword half   = n_elem/2;

  typename std::vector<eT>::iterator first    = X.begin();
  typename std::vector<eT>::iterator nth      = first + half;
  typename std::vector<eT>::iterator pastlast = X.end();

  std::nth_element(first, nth, pastlast);

  if((n_elem % 2) == 0)  // even number of elements
    {
    typename std::vector<eT>::iterator start   = X.begin();
    typename std::vector<eT>::iterator pastend = start + half;

    const eT val1 = (*nth);
    const eT val2 = (*(std::max_element(start, pastend)));

    return op_mean::robust_mean(val1, val2);
    }
  else  // odd number of elements
    {
    return (*nth);
    }
  }
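You can use this approach. It also handles a sliding window.
Here days is the number of trailing elements over which the median is computed, and only a copy of that window is sorted, so the original container is not changed. Note that findMedian returns twice the median (the sum of the two middle values for an even window, twice the middle value for an odd one), which keeps the result an integer.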
#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;

// Returns TWICE the median of the d elements that precede index i,
// i.e. arr[i-d] .. arr[i-1], so the result is always an integer.
// Only a copy of the window is sorted; arr itself is left unchanged.
int findMedian(const vector<int>& arr, int d, int i)
{
    vector<int> brr(arr.begin() + (i - d), arr.begin() + i);
    sort(brr.begin(), brr.end());
    if (d % 2 == 0)
    {
        return brr[d / 2] + brr[d / 2 - 1];   // sum of the two middle values
    }
    else
    {
        return 2 * brr[d / 2];                // twice the middle value
    }
}

int main()
{
    int n;
    int days;
    int input;
    cin >> n >> days;
    vector<int> arr;
    for (int i = 0; i < n; ++i)
    {
        cin >> input;
        arr.push_back(input);
    }
    for (int i = days; i < n; ++i)
    {
        int twiceMedian = findMedian(arr, days, i);
        cout << twiceMedian / 2.0 << "\n";    // print the actual median
    }
    return 0;
}
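The routine above sorts a fresh copy of the window at every step. As a possible alternative (a sketch, not part of the answer), the window can be kept sorted and updated incrementally: remove the value that leaves, insert the value that enters. The function name trailingMedians and its signature are hypothetical, and unlike findMedian it returns the median itself as a double rather than twice the median.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: for every index i >= d, computes the median of the
// d values preceding i, i.e. arr[i-d] .. arr[i-1]. The window is kept
// sorted and patched in place, so each step costs O(d) instead of the
// O(d log d) of a full re-sort. arr itself is never modified.
std::vector<double> trailingMedians(const std::vector<int>& arr, std::size_t d)
{
    assert(d > 0);  // a median over zero elements is undefined
    std::vector<double> medians;
    if (arr.size() <= d) return medians;  // no index has d predecessors

    // Sorted copy of the first window arr[0] .. arr[d-1].
    std::vector<int> window(arr.begin(), arr.begin() + d);
    std::sort(window.begin(), window.end());

    for (std::size_t i = d; i < arr.size(); ++i)
    {
        // Median of the current sorted window.
        if (d % 2 == 0)
            medians.push_back((static_cast<double>(window[d / 2 - 1]) + window[d / 2]) / 2.0);
        else
            medians.push_back(window[d / 2]);

        // Slide by one: drop arr[i-d], insert arr[i], keeping the order.
        window.erase(std::lower_bound(window.begin(), window.end(), arr[i - d]));
        window.insert(std::upper_bound(window.begin(), window.end(), arr[i]), arr[i]);
    }
    return medians;
}
```

For very large windows a balanced structure such as std::multiset, or a pair of heaps, would avoid the linear-time erase and insert, at the cost of more bookkeeping.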
Putting together all the insights from this thread, I ended up with this routine. It works with any STL container, or any class providing input iterators, and handles both odd- and even-sized containers. It also does its work on a copy of the container, so the original content is not modified.
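#include <algorithm>
#include <cstddef>
#include <iterator>
#include <stdexcept>
#include <vector>

template <typename T = double, typename C>
inline T median(const C &the_container)
{
  // Work on a copy so the caller's container is left untouched.
  std::vector<T> tmp_array(std::begin(the_container),
                           std::end(the_container));
  // Without this check an empty container would dereference end() below.
  if(tmp_array.empty()){ throw std::domain_error("median of an empty container"); }
  const std::size_t n = tmp_array.size() / 2;
  // Afterwards tmp_array[n] is the element a full sort would put there,
  // and every element before it is <= tmp_array[n].
  std::nth_element(tmp_array.begin(), tmp_array.begin() + n, tmp_array.end());
  if(tmp_array.size() % 2){ return tmp_array[n]; }
  else
  {
    // even sized vector -> average the two middle values
    auto max_it = std::max_element(tmp_array.begin(), tmp_array.begin() + n);
    return (*max_it + tmp_array[n]) / 2.0;
  }
}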
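To illustrate the "any container" claim, here is a small usage sketch. It restates the routine under the hypothetical name median_of so that it compiles on its own; the empty-container check and the demo function are additions for illustration, not part of the answer.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <list>
#include <stdexcept>
#include <vector>

// Restatement of the routine above so this sketch is self-contained.
// The empty-container check is an assumption added here.
template <typename T = double, typename C>
T median_of(const C& the_container)
{
    std::vector<T> tmp(std::begin(the_container), std::end(the_container));
    if (tmp.empty()) throw std::domain_error("median of an empty container");
    const std::size_t n = tmp.size() / 2;
    std::nth_element(tmp.begin(), tmp.begin() + n, tmp.end());
    if (tmp.size() % 2) return tmp[n];
    // Even size: the lower middle value is the largest element before n.
    const auto max_it = std::max_element(tmp.begin(), tmp.begin() + n);
    return (*max_it + tmp[n]) / 2.0;
}

// Hypothetical usage: the same call works for a std::list and a built-in array.
inline bool demo()
{
    const std::list<int> l{5, 1, 4, 2};  // even size: (2 + 4) / 2
    const int raw[] = {9, 7, 8};         // odd size: middle value
    return median_of(l) == 3.0
        && median_of(raw) == 8.0
        && median_of<int>(l) == 3;       // explicit result type
}
```

One design consequence worth knowing: the result type T is also the element type of the working copy, so choosing an integral T truncates the average of an even-sized input (the median of {1, 2} comes out as 1 with T = int, and as 1.5 with the default double).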
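Here is an answer that follows the suggestion by @MatthieuM., i.e. it does not modify the input vector. It uses a single partial sort (on a vector of indices) for ranges of both even and odd cardinality. Empty ranges are rejected up front with an exception: for an empty vector the middle iterator passed to partial_sort would lie past the end, which is undefined behaviour, so relying on the exception thrown by a vector's at method would come too late.
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <vector>
using namespace std;

double median(vector<int> const& v)
{
  if (v.empty()) throw domain_error("median of an empty range");
  bool isEven = !(v.size() % 2);
  size_t n = v.size() / 2;
  vector<size_t> vi(v.size());
  iota(vi.begin(), vi.end(), 0);
  // Sort only the n + 1 smallest indices, ordered by the values they refer to.
  partial_sort(begin(vi), vi.begin() + n + 1, end(vi),
    [&](size_t lhs, size_t rhs) { return v[lhs] < v[rhs]; });
  return isEven ? 0.5 * (v[vi.at(n-1)] + v[vi.at(n)]) : v[vi.at(n)];
}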
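The same index-based idea can also be combined with the nth_element and max_element technique used elsewhere in this thread, which runs in linear time on average instead of the O(n log n) of a partial sort over half the range. The following is only a sketch of that variant; the name median_by_index and the choice of std::domain_error are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical variant: selects the median through a vector of indices,
// so the input vector v is never reordered.
double median_by_index(const std::vector<int>& v)
{
    if (v.empty()) throw std::domain_error("median of an empty vector");

    const std::size_t n = v.size() / 2;
    std::vector<std::size_t> vi(v.size());
    std::iota(vi.begin(), vi.end(), std::size_t{0});

    // Compare indices by the values they refer to.
    const auto less = [&v](std::size_t lhs, std::size_t rhs) { return v[lhs] < v[rhs]; };

    // After this call vi[n] indexes the upper middle value, and every index
    // before position n refers to a value that is not greater than it.
    std::nth_element(vi.begin(), vi.begin() + n, vi.end(), less);
    const double upper = v[vi[n]];
    if (v.size() % 2) return upper;

    // Even size: the lower middle value is the largest one before position n.
    const double lower = v[*std::max_element(vi.begin(), vi.begin() + n, less)];
    return 0.5 * (lower + upper);
}
```

Converting both middle values to double before adding them also sidesteps the integer overflow that v[a] + v[b] could hit for very large values.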