Calculate Mean and Standard Deviation from a Vector of Samples in C++ Using Boost

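Boost.Accumulators is the idiomatic way to compute a mean and a standard deviation with Boost. Requesting tag::variance also makes the mean available, because the variance depends on it.

// Needs <boost/accumulators/accumulators.hpp>, <boost/accumulators/statistics.hpp>,
// <boost/bind.hpp> and <boost/ref.hpp>; names are shown unqualified for brevity.
accumulator_set<double, stats<tag::variance> > acc;
for_each(a_vec.begin(), a_vec.end(), bind<void>(ref(acc), _1));

cout << mean(acc) << endl;            // mean of the samples
cout << sqrt(variance(acc)) << endl;  // standard deviation (variance divides by N, not N - 1)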

 

Boost Accumulator to calculate mean and variance - Is it fast?

Yes, it's fast. It's also correct, which is the more important feature.

No, it would not be faster to store the information first (simple logic: you'll spend time allocating memory and traversing it again).
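To illustrate why a single pass over the data is enough, here is a minimal sketch of a streaming mean and variance based on Welford's update. It is not Boost's implementation; OnlineStats and its members are hypothetical names chosen for this example.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical one-pass accumulator: no samples are stored, only three numbers.
struct OnlineStats {
    std::size_t count = 0;
    double mean = 0.0;
    double m2 = 0.0;  // sum of squared deviations from the current mean

    // Welford's update: fold one sample into the running mean and m2.
    void add(double x) {
        ++count;
        const double delta = x - mean;
        mean += delta / static_cast<double>(count);
        m2 += delta * (x - mean);
    }

    // Population variance (divides by N); 0 when no samples were added.
    double variance() const {
        return count ? m2 / static_cast<double>(count) : 0.0;
    }

    double stddev() const { return std::sqrt(variance()); }
};
```

Each sample is passed to add() as it arrives, and mean, variance() and stddev() can be read at any point without revisiting earlier samples.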

Calculating Standard Deviation & Variance in C++

As the other answer by horseshoe correctly suggests, you have to use a loop to calculate the variance. Otherwise, the statement

var = ((Array[n] - mean) * (Array[n] - mean)) / numPoints;

considers only a single element of the array.

Here is an improved version of horseshoe's suggested code:

var = 0;
for( n = 0; n < numPoints; n++ )
{
    var += (Array[n] - mean) * (Array[n] - mean);
}
var /= numPoints;
sd = sqrt(var);

Your sum works without an explicit loop because you are using the accumulate function, which already contains a loop that is simply not visible in your code. Take a look at the equivalent behavior of accumulate for a clear understanding of what it does.
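As a rough sketch of that hidden loop, the function below mirrors the documented behavior of the three-argument std::accumulate. The names my_accumulate and sum_of are hypothetical and exist only for this illustration.

```cpp
#include <vector>

// Hypothetical re-implementation of the loop inside std::accumulate:
// start from init and add every element in [first, last) to it.
template <class InputIt, class T>
T my_accumulate(InputIt first, InputIt last, T init) {
    for (; first != last; ++first) {
        init = init + *first;
    }
    return init;
}

// Sums a vector of doubles; passing 0.0 (not 0) keeps the sum in double.
double sum_of(const std::vector<double>& values) {
    return my_accumulate(values.begin(), values.end(), 0.0);
}
```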

Note: X ?= Y is shorthand for X = X ? Y, where ? stands for any binary arithmetic or bitwise operator (for example += or /=).
You can also use pow(Array[n] - mean, 2) to square the difference instead of multiplying it by itself, which some find tidier.
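Putting the pieces together, a minimal self-contained sketch might look like the following. The Stats struct and compute_stats function are hypothetical names, and the variance is the population variance (divided by N), matching the loop above.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical result type for this sketch.
struct Stats {
    double mean;
    double variance;  // population variance (divides by N)
    double sd;        // square root of the variance
};

Stats compute_stats(const std::vector<double>& samples) {
    assert(!samples.empty());  // the mean of an empty set is undefined
    const double n = static_cast<double>(samples.size());

    // accumulate hides the summation loop.
    const double mean = std::accumulate(samples.begin(), samples.end(), 0.0) / n;

    // The variance needs an explicit loop over every element.
    double var = 0.0;
    for (std::size_t i = 0; i < samples.size(); ++i) {
        const double diff = samples[i] - mean;
        var += diff * diff;
    }
    var /= n;

    return Stats{mean, var, std::sqrt(var)};
}
```

For the samples 2, 4, 4, 4, 5, 5, 7, 9 this yields a mean of 5, a variance of 4 and a standard deviation of 2.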

Is it possible to use boost accumulators with vectors?

I've looked into your question a bit, and it seems to me that Boost.Accumulators already provides support for std::vector. Here is what I could find in a section of the user's guide:

"Another example where the Numeric Operators Sub-Library is useful is when a type does not define the operator overloads required to use it for some statistical calculations. For instance, std::vector<> does not overload any arithmetic operators, yet it may be useful to use std::vector<> as a sample or variate type. The Numeric Operators Sub-Library defines the necessary operator overloads in the boost::numeric::operators namespace, which is brought into scope by the Accumulators Framework with a using directive."

Indeed, after verification, the file boost/accumulators/numeric/functional/vector.hpp does contain the necessary operators for the 'naive' solution to work.

I believe you should try:

  • Including either

    • boost/accumulators/numeric/functional/vector.hpp before any other accumulators header, or
    • boost/accumulators/numeric/functional.hpp, with BOOST_NUMERIC_FUNCTIONAL_STD_VECTOR_SUPPORT defined before the include
  • Bringing the operators into scope with using namespace boost::numeric::operators;

One last detail remains: execution will fail at runtime because the initial accumulated value is default-constructed, and an assertion fires when a vector of size n is added to an empty vector. To avoid this, it seems you should initialize the accumulator with a vector of the right size (where n is the number of elements in your vector):

// The extra parentheses keep this from being parsed as a function declaration when n is a variable.
accumulator_set<std::vector<double>, stats<tag::mean> > acc((std::vector<double>(n)));

I tried the following code; mean gives me a std::vector of size 2:

// Needs the accumulators headers (with std::vector support enabled as described above)
// and <boost/assign/list_of.hpp>.
int main()
{
    accumulator_set<std::vector<double>, stats<tag::mean> > acc(std::vector<double>(2));

    const std::vector<double> v1 = boost::assign::list_of(1.)(2.);
    const std::vector<double> v2 = boost::assign::list_of(2.)(3.);
    const std::vector<double> v3 = boost::assign::list_of(3.)(4.);
    acc(v1);
    acc(v2);
    acc(v3);

    // Element-wise mean of v1, v2 and v3; expected to be {2, 3}.
    const std::vector<double> &meanVector = mean(acc);
}

I believe this is what you wanted?

Getting the variance of a vector of long doubles

x < (v.size() - 1)? This skips the last element. Use <= or omit the - 1.

The index of the last element is v.size() - 1, and since x must be strictly less than that, the loop ends before the last element is processed.
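The difference between the two bounds can be seen in a small sketch. The function names sum_all and sum_skipping_last are hypothetical. Note also that v.size() is unsigned, so v.size() - 1 wraps around to a huge value for an empty vector, which is another reason to prefer the plain x < v.size() form.

```cpp
#include <cstddef>
#include <vector>

// Correct bound: visits indices 0 .. v.size() - 1, i.e. every element.
long double sum_all(const std::vector<long double>& v) {
    long double sum = 0.0L;
    for (std::size_t x = 0; x < v.size(); ++x) {
        sum += v[x];
    }
    return sum;
}

// Buggy bound from the question: stops one element early.
// Only call this with a non-empty vector, since size() - 1 wraps around on an empty one.
long double sum_skipping_last(const std::vector<long double>& v) {
    long double sum = 0.0L;
    for (std::size_t x = 0; x < (v.size() - 1); ++x) {
        sum += v[x];
    }
    return sum;
}
```

For the values 1, 2, 3, 4 the first function returns 10, while the second returns 6 because the final 4 is never added.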

Calculate rolling / moving average in C++

You simply need a circular array (circular buffer) of 1000 elements in which each slot stores the new element added to the value in the previous slot.

The buffer then holds a cumulative sum: the sum of the elements between any two positions is the difference of the two stored values, and dividing that by the number of elements between them yields the average.
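The scheme can be sketched roughly as follows. RollingMean, its members and the window parameter are hypothetical names; the buffer holds window + 1 cumulative sums so that the sum of the last window samples is the difference of two slots. Because the cumulative sum keeps growing, floating-point precision degrades on very long streams, so treat this as an illustration rather than production code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical rolling mean over the last `window` samples,
// backed by a circular buffer of cumulative sums.
class RollingMean {
public:
    explicit RollingMean(std::size_t window)
        : cumulative_(window + 1, 0.0), window_(window), head_(0), count_(0) {
        assert(window > 0);
    }

    // Store (previous cumulative sum + sample) in the next slot.
    void add(double sample) {
        const std::size_t next = (head_ + 1) % cumulative_.size();
        cumulative_[next] = cumulative_[head_] + sample;
        head_ = next;
        if (count_ < window_) ++count_;
    }

    // Mean of the most recent min(window, samples seen) values; 0 if empty.
    double mean() const {
        if (count_ == 0) return 0.0;
        const std::size_t size = cumulative_.size();
        const std::size_t oldest = (head_ + size - count_) % size;
        return (cumulative_[head_] - cumulative_[oldest]) /
               static_cast<double>(count_);
    }

private:
    std::vector<double> cumulative_;  // circular buffer of running sums
    std::size_t window_;              // maximum number of samples averaged
    std::size_t head_;                // index of the newest cumulative sum
    std::size_t count_;               // samples currently in the window
};
```

With a window of 3, feeding 1, 2, 3, 4 gives means of 1, 1.5, 2 and then 3, since the last call averages only 2, 3 and 4.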

What is the meaning of bind<void>(ref(acc), _1)?

I'm assuming that you understand what an iterator is. std::for_each takes a starting iterator, an ending iterator, and a function to call on each element in that range.

  1. bind<void>(ref(acc), _1) is a functor (or function object; think of it as a function with internal state) that takes one double and returns nothing. It is roughly equivalent to void function(double).
  2. ref(acc) wraps acc in a reference so that bind stores a reference instead of a copy. This avoids the cost of copying and, more importantly, means the samples are accumulated into acc itself rather than into a temporary copy.
  3. acc in this case is an accumulator, which has the following function within its definition: operator()(double value).
  4. _1 is known as a placeholder. Roughly speaking, a placeholder is the mechanism that passes the double supplied by for_each into the functor.
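The role of ref can be sketched with the standard library alone. This example uses std::bind, std::ref and std::placeholders::_1 from <functional> as stand-ins for their Boost counterparts, and a hypothetical RunningSum functor in place of accumulator_set; the function names are likewise made up for illustration.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical stand-in for accumulator_set: a function object with state.
struct RunningSum {
    double total = 0.0;
    void operator()(double value) { total += value; }
};

// With std::ref, bind stores a reference, so acc itself receives every sample.
double sum_by_reference(const std::vector<double>& samples) {
    RunningSum acc;
    std::for_each(samples.begin(), samples.end(),
                  std::bind(std::ref(acc), std::placeholders::_1));
    return acc.total;
}

// Without std::ref, bind stores a copy of acc; the original is never updated.
double sum_by_copy(const std::vector<double>& samples) {
    RunningSum acc;
    std::for_each(samples.begin(), samples.end(),
                  std::bind(acc, std::placeholders::_1));
    return acc.total;  // still 0.0: only the copy inside bind was changed
}
```

For the samples 1, 2, 3, sum_by_reference returns 6 while sum_by_copy returns 0, which is why the original code wraps acc in ref.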

