Fixing templates with constexpr’s

For my hundredth (and a bit) c++ post I decided to do something I never did before: fix my old code!

A long time ago I wrote about template metaprogramming devices. There, I tried to explain that many atrocities have been commited in the name of performance and “compile time evaluation”. Template metaprogramming is probably one of the worse culprits of job security. Its (ab)use can create monstrosities, all in the name of runtime performance. Like, for example, my template device to calculate e. Let’s remember what that atrocious code looks like (follow the link if you want an explanation on how this works):

template <int N, int D> struct Frak {
        static const long Num = N;
        static const long Den = D;
};
  
template <int N, typename F> struct ScalarMultiplication {
    typedef Frak<N*F::Num, N*F::Den> result;
};
  
template <int X, int Y>   struct MCD {
        static const long result = MCD<Y, X % Y>::result;
};
  
template <int X> struct MCD<X, 0> {
        static const long result = X;
};
  
template <class F> struct Simpl {
        static const long mcd = MCD<F::Num, F::Den>::result;
        typedef Frak< F::Num / mcd, F::Den / mcd > result;
};
  
template <typename X1, typename Y1> struct SameBase {
    typedef typename ScalarMultiplication< Y1::Den, X1>::result X;
    typedef typename ScalarMultiplication< X1::Den, Y1>::result Y;
};
  
template <typename X, typename Y> struct Sum {
    typedef SameBase<X, Y> B;
    static const long Num = B::X::Num + B::Y::Num;
    static const long Den = B::Y::Den; // == B::X::Den
    typedef typename Simpl< Frak<Num, Den> >::result result;
};

template <int N> struct Fact {
    static const long result = N * Fact<N-1>::result;
};
template <> struct Fact<0> {
    static const long result = 1;
};
 
template <int N> struct E {
    // e = S(1/n!) = 1/0! + 1/1! + 1/2! + ...
    static const long Den = Fact<N>::result;
    typedef Frak< 1, Den > term;
    typedef typename E<N-1>::result next_term;
    typedef typename Sum< term, next_term >::result result;
};
template <> struct E<0> {
    typedef Frak<1, 1> result;
};
 
int main() {
  typedef E<8>::result X;
  std::cout << "e = " << (1.0 * X::Num / X::Den) << "\n";
  std::cout << "e = " << X::Num <<"/"<< X::Den << "\n";
  return 0;
}

While this is just a toy example to play with templates, it does illustrate code I’ve seen in the wild. Would this look cleaner in c++11? Yes, it would. Constexprs are, in my opinion, one of the most overlooked “killer” features of c++11.

Starting with a simple example:

constexpr int foo(int a, int b) { return a+b; }
static constexpr int n = foo(1, 2);
int bar() { return n; }

Try to compile it with “g++ -std=c++11 -fverbose-asm -O0 -c -S -o /dev/stdout” and see what happens. You should get the equivalent of “return 3” – just as anyone would expect – but note that no optimizations were enabled. What about loops? Let’s try this:

constexpr int f(int n) {
    return (n<2)? 1 : n + f(n-1);
}

constexpr int n = f(999);

You’ll probably get an error about maximum depth exceeded, but that’s alright: we have loops in constexprs too! (note that some of these restrictions have been relaxed in C++17).

In general, if you can express your function as a single const return statement, it should be a valid constexpr. With this new knowledge, let’s convert the template meta-atrocity above to something a bit less hideous:

struct PodFrac {
    int num;
    int den;
};

constexpr int mcd(int a, int b) {
    return (b==0)? a : mcd(b, a%b);
}

constexpr PodFrac simpl(const PodFrac &f) {
    return PodFrac{f.num / mcd(f.num, f.den), f.den / mcd(f.num, f.den)};
}

constexpr PodFrac sum(const PodFrac &a, const PodFrac &b) {
    return simpl(PodFrac{a.num*b.den + b.num*a.den, a.den*b.den});
}

constexpr int fact(int n) {
    return (n==0)? 1 : n*fact(n-1);
}

constexpr PodFrac e(int n) {
    return (n==0)? PodFrac{1, 1} :
                   sum(PodFrac{1, fact(n)}, e(n-1));
}

constexpr float e_num = 1.0 * e(8).num / e(8).den;

float get_e() {
    return e_num;
}

Disclaimer: while I explicitly stated this multiple times in my “C++ template metaprogramming introduction” article, it’s worth re-stating it: this code is meant as an example to showcase a c++ feature, not as a proper way of deriving a mathematical constant in production code.

First thoughts after comparing the two versions: much, much [, much]*100 cleaner.

As you may notice, all constexprs need to be a return statement. There’s no multi-statement constexpr in c++11, which explains why loops are not really supported. For the same reason the implementation of e() is a bit hindered by this limitation: its code would be much more readable splitting it in a few lines with proper names. Good news: some of these restrictions have been relaxed in C++17.

Note that if you analyze your compiler’s output when building without optimizations, you may see either a const with e’s value, or a static initializer that does some trivial operation, like loading e’s value from a fraction: gcc seems to get tired of constexpr evaluation after a few recursive calls, so your results may vary (slightly).

I called constexpr’s one of c++11’s killer features, and hopefully you can see why I’m so enthusiastic about them now: there’s much less incentive for people to write horrible template metaprogramming devices when simply adding a little keyword to a normal function has the same effect, only cleaner.

Advertisements

Quick refresher: argument dependent lookup

Since I wasted a few precious minutes stuck on an ADL problem, I figured I needed a quick reminder on how they work. Check this code: does it compile?

namespace N {
    int foo() {
    }
}

int main() {
    return foo();
}

Of course it doesn’t! You’d expect a ‘foo’ not declared/out of scope error from your compiler. What about this other example?

namespace N {
    struct Dummy;

    int foo(Dummy*) {
        return 0;
    }

    int foo() {
    }
}

int main() {
    return foo((N::Dummy*)0);
}

You’d be tempted to say it won’t work either. (Un?)fortunately, ‘argument dependant lookup’ is a thing, and the second code sample works. How? The compiler will look for ‘foo’ in the global namespace, and also in the namespace of the arguments to ‘foo’. Seeing ‘N::Dummy’ in there, the compiler is allowed to peak into the namespace N for method ‘foo’. Why? Short: operator overloading. Long: check here (the ‘Why ADL’ section is very good).