Apply a Function to All Eigen Matrix Elements

Apply function to all Eigen matrix elements

Yes, use the Eigen::MatrixBase<>::unaryExpr() member function. Example:

#include <cmath>
#include <iostream>

#include <Eigen/Core>

double Exp(double x) // the function we want to apply
{
    return std::exp(x);
}

int main()
{
    Eigen::MatrixXd m(2, 2);
    m << 0, 1, 2, 3;
    std::cout << m << std::endl << "becomes: ";
    std::cout << std::endl << m.unaryExpr(&Exp) << std::endl;
}

Apply function to all elements in Eigen Matrix without loop

You can pass a lambda expression to unaryExpr, like so:

Eigen::Matrix<int,2,2> new_m = m.unaryExpr(
    [](const Foo& x) {
        return x.member_of_foo_returns_int();
    });

If you can't use C++11, you need to write a small helper function:

int func_wrapper(const Foo& x) {
    return x.member_of_foo_returns_int();
}

and pass that using std::ptr_fun:

Eigen::Matrix<int,2,2> new_m = m.unaryExpr(std::ptr_fun(func_wrapper));

For calling member functions, the standard library already provides a helper named std::mem_fun_ref (it takes a member function pointer and returns a functor object that unaryExpr accepts):

Eigen::Matrix<int,2,2> new_m = m.unaryExpr(
    std::mem_fun_ref(&Foo::member_of_foo_returns_int));

All these variants are type safe, i.e., trying to store the result in a non-int-Matrix will not compile.
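
Note that std::ptr_fun and std::mem_fun_ref were deprecated in C++11 and removed in C++17, so on a current toolchain a lambda or std::mem_fn is the usual substitute. The sketch below only illustrates that substitution: Foo, the body of its member function, and the apply_to_all helper are hypothetical stand-ins (the helper mimics what unaryExpr does element by element, so the snippet compiles without Eigen). I would expect the same callables to work with unaryExpr, but that is an assumption and is not checked here.

```cpp
#include <array>
#include <cstddef>
#include <functional>

// Hypothetical element type standing in for the Foo used above.
struct Foo {
    int value;
    int member_of_foo_returns_int() const { return value; }
};

// Hypothetical stand-in for unaryExpr: calls f on every element and
// collects the int results.
template <typename F, std::size_t N>
std::array<int, N> apply_to_all(const std::array<Foo, N>& in, F f)
{
    std::array<int, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        out[i] = f(in[i]);
    }
    return out;
}

// Variant 1: std::mem_fn wraps the member function pointer in a callable.
std::array<int, 4> via_mem_fn(const std::array<Foo, 4>& m)
{
    return apply_to_all(m, std::mem_fn(&Foo::member_of_foo_returns_int));
}

// Variant 2: a lambda that forwards to the member function.
std::array<int, 4> via_lambda(const std::array<Foo, 4>& m)
{
    return apply_to_all(m, [](const Foo& x) {
        return x.member_of_foo_returns_int();
    });
}
```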

Apply function in matrix elements of a list in R

We may use lapply to loop over the list, apply the function, and extract the eigenvalues, then convert the result to a data.frame at the end:

eigenvalues <- as.data.frame(do.call(cbind,
    lapply(DATA, function(x) round(eigen(x)$values, 2))))

-output

> eigenvalues
V1 V2 V3 V4 V5
1 1.77+3.73i 5.33+0.00i 5.11+0.00i -2.52+3.53i -1.87+4.42i
2 1.77-3.73i 1.72+4.13i -5.08+0.00i -2.52-3.53i -1.87-4.42i
3 -0.50+3.97i 1.72-4.13i 2.41+3.87i 2.12+3.32i 2.96+3.44i
4 -0.50-3.97i -4.02+1.85i 2.41-3.87i 2.12-3.32i 2.96-3.44i
5 -3.38+2.06i -4.02-1.85i -2.60+3.46i 3.72+0.00i -4.15+0.00i
6 -3.38-2.06i -3.27+0.00i -2.60-3.46i -3.16+0.30i 1.67+3.35i
7 3.89+0.00i 1.48+2.89i 0.10+3.78i -3.16-0.30i 1.67-3.35i
8 -2.47+3.00i 1.48-2.89i 0.10-3.78i 2.50+1.89i 3.28+1.47i
9 -2.47-3.00i 3.05+0.00i 3.74+0.00i 2.50-1.89i 3.28-1.47i
10 3.51+0.00i -0.97+2.79i 2.38+2.10i -2.69+1.46i -2.88+1.40i
11 2.04+2.29i -0.97-2.79i 2.38-2.10i -2.69-1.46i -2.88-1.40i
12 2.04-2.29i -1.86+2.07i -2.44+0.01i -1.04+2.51i -1.32+2.89i
13 -3.03+0.00i -1.86-2.07i -2.44-0.01i -1.04-2.51i -1.32-2.89i
14 -1.97+1.67i -2.18+0.00i -1.52+1.78i 0.69+2.32i -0.77+2.12i
15 -1.97-1.67i 2.14+0.00i -1.52-1.78i 0.69-2.32i -0.77-2.12i
16 0.81+1.91i 1.61+0.77i 1.93+0.86i 2.23+0.85i 1.40+1.09i
17 0.81-1.91i 1.61-0.77i 1.93-0.86i 2.23-0.85i 1.40-1.09i
18 1.02+0.00i 0.14+1.55i -0.04+1.88i -0.77+0.57i 0.65+0.35i
19 -0.57+0.47i 0.14-1.55i -0.04-1.88i -0.77-0.57i 0.65-0.35i
20 -0.57-0.47i -0.99+0.00i 0.26+0.00i 0.67+0.00i 0.58+0.00i

Lambda Function Elementwise in Eigen

As o11c noticed, this is indeed a nullary expression, and the documentation contains almost exactly the same example. I copied it here for convenience:

#include <Eigen/Core>
#include <iostream>
#include <random>
using namespace Eigen;
int main() {
    std::default_random_engine generator;
    std::poisson_distribution<int> distribution(4.1);
    auto poisson = [&] () {return distribution(generator);};
    RowVectorXi v = RowVectorXi::NullaryExpr(10, poisson );
    std::cout << v << "\n";
}

Eigen 3.3.x: How to lambda-operate across all rows?

With Eigen 3.4 and later (the development branch when this answer was written), you can access the elements of Eigen objects using std-compatible iterators. To iterate row-wise through an Eigen expression, write data.rowwise().begin() and data.rowwise().end(); to iterate column-wise, use .colwise().begin() and .colwise().end() instead. You can pass these directly to the constructor of std::vector if the types are compatible.

Since your example also needs to convert between double and float, you can write something like this:

auto const& data_ = data.cast<float>().rowwise();
std::vector<Obj> payloads(data_.begin(), data_.end());

Full working example: https://godbolt.org/z/2x2jFJ

What is the most efficient way to repeat elements in a vector and apply a set of different functions across all elements using Eigen?

So, if I understand you correctly, you don't want to replicate (in terms of Eigen methods) the vector, you want to apply different methods to the same elements and store the result for each, correct?

In this case, computing it sequentially once per function is the easiest route. Most CPUs can only do one (vector) memory store per clock cycle, anyway. So for simple unary or binary operations, your gains have an upper bound.

Still, you are correct that one load is technically always better than two and it is a limitation of Eigen that there is no good way of achieving this.

Note that even if you manually write a loop that generates multiple outputs, you should limit the number of outputs. CPUs have a limited number of line-fill buffers. IIRC Intel recommended using fewer than 10 "output streams" in tight loops; otherwise you could stall the CPU on those.

Another aspect is that C++'s weak aliasing restrictions make it hard for compilers to vectorize code with multiple outputs. So it might even be detrimental.

How I would structure this code

Remember that Eigen is column-major, just like Matlab. Therefore use one column per output function. Or just use separate vectors to begin with.

Eigen::VectorXd v = ...;
Eigen::MatrixX2d out(v.size(), 2);
out.col(0) = v.array().floor();
out.col(1) = v.array().ceil();

Following the KISS principle, this is good enough. You will not gain much if anything by doing something more complicated. A bit of multithreading might gain you something (less than factor 2 I would guess) because a single CPU thread is not enough to max out memory bandwidth but that's about it.

Some benchmarking

This is my baseline:

#include <Eigen/Core>

int main()
{
    int rows = 100013, repetitions = 100000;
    Eigen::VectorXd v = Eigen::VectorXd::Random(rows);
    Eigen::MatrixX2d out(rows, 2);
    for(int i = 0; i < repetitions; ++i) {
        out.col(0) = v.array().floor();
        out.col(1) = v.array().ceil();
    }
}

Compiled with gcc-11, -O3 -mavx2 -fno-math-errno I get ca. 5.7 seconds.

Inspecting the generated assembly shows good vectorization.

Plain old C++ version:

double* outfloor = out.data();
double* outceil = outfloor + out.outerStride();
const double* inarr = v.data();
for(std::ptrdiff_t j = 0; j < rows; ++j) {
    const double vj = inarr[j];
    outfloor[j] = std::floor(vj);
    outceil[j] = std::ceil(vj);
}

40 seconds instead of 5! This version actually does not vectorize because the compiler cannot prove that the arrays don't alias each other.
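
One way to give the compiler the missing guarantee is a no-alias qualifier. The sketch below uses __restrict, a non-standard extension accepted by GCC, Clang and MSVC (C++ has no standard restrict keyword); the function name floor_and_ceil is made up for illustration. Whether the loop then vectorizes, and how it would compare to the timings quoted here, was not measured and will depend on the compiler and flags.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical helper: writes floor(in[j]) and ceil(in[j]) to two outputs.
// __restrict is a compiler extension promising that the three arrays do
// not overlap; calling this with overlapping ranges is undefined behavior.
void floor_and_ceil(const double* __restrict in,
                    double* __restrict out_floor,
                    double* __restrict out_ceil,
                    std::ptrdiff_t n)
{
    for (std::ptrdiff_t j = 0; j < n; ++j) {
        const double vj = in[j];  // one load feeds both outputs
        out_floor[j] = std::floor(vj);
        out_ceil[j] = std::ceil(vj);
    }
}
```

With the objects from the benchmark, the call would look like floor_and_ceil(v.data(), out.data(), out.data() + out.outerStride(), rows).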

Next, let's use fixed size Eigen vectors to get the compiler to generate vectorized code:

double* outfloor = out.data();
double* outceil = outfloor + out.outerStride();
const double* inarr = v.data();
std::ptrdiff_t j;
// main loop: four doubles per iteration
for(j = 0; j + 4 <= rows; j += 4) {
    const Eigen::Vector4d vj = Eigen::Vector4d::Map(inarr + j);
    const auto floorval = vj.array().floor();
    const auto ceilval = vj.array().ceil();
    Eigen::Vector4d::Map(outfloor + j) = floorval;
    Eigen::Vector4d::Map(outceil + j) = ceilval;
}
// at most one leftover pair of doubles
if(j + 2 <= rows) {
    const Eigen::Vector2d vj = Eigen::Vector2d::MapAligned(inarr + j);
    const auto floorval = vj.array().floor();
    const auto ceilval = vj.array().ceil();
    Eigen::Vector2d::Map(outfloor + j) = floorval;
    Eigen::Vector2d::Map(outceil + j) = ceilval;
    j += 2;
}
// at most one leftover scalar
if(j < rows) {
    const double vj = inarr[j];
    outfloor[j] = std::floor(vj);
    outceil[j] = std::ceil(vj);
}

7.5 seconds. The assembly looks fine and is fully vectorized. I'm not sure why performance is lower. Maybe cache line aliasing?

Last attempt: We don't try to avoid re-reading the vector but we re-read it blockwise so that it will be in cache by the time we read it a second time.

const int blocksize = 64 * 1024 / sizeof(double);
std::ptrdiff_t j;
for(j = 0; j + blocksize <= rows; j += blocksize) {
    const auto& vj = v.segment(j, blocksize);
    auto outj = out.middleRows(j, blocksize);
    outj.col(0) = vj.array().floor();
    outj.col(1) = vj.array().ceil();
}
const auto& vj = v.tail(rows - j);
auto outj = out.bottomRows(rows - j);
outj.col(0) = vj.array().floor();
outj.col(1) = vj.array().ceil();

5.4 seconds. So there is some gain here but not nearly enough to justify the added complexity.


