Friday, December 28, 2012

Clang and Generic (Polymorphic) Lambdas.

Recently, Faisal Vali put forth an implementation of n3418, which he co-authored with Herb Sutter and Dave Abrahams, allowing generic lambdas in a fork of Clang. It also includes automatic return type deduction, which I wrote about being implemented in GCC 4.8. There are a few caveats before continuing: this has not been merged into the mainline; it has a few bugs, but Vali is quick to fix them if you point one out; the implementation itself is a proof of concept (similar to automatic return type deduction), so it isn't unreasonable to assume some things might change; and section 2.4 of the proposal (named lambdas) has not yet been implemented. And while this doesn't let us do much that was previously impossible, the previously possible used to be so verbose that no one would want to do it!

Generic lambdas are profound and may have a great impact on the style of code written in C++. Consider this a light (and lacking) overview of what is possible.

Before continuing, I want to note that I found evidence that some GCC developers had begun working on generic lambdas (from the mailing list: Latest experimental polymorphic lambda patches); however, I cannot find anything more recent than 2009 discussing this, and code using auto parameters in lambdas does not compile.


Being terse.

This patch allows the writing of incredibly terse polymorphic functions, such as these:

auto add = [](auto x, auto y) x + y;
auto times = [](auto x, auto y) x * y; 
auto print_int = [](int x) void(std::cout << x);

No {}, no return, auto-deduced types, and void can even be used to throw away the value of stateful operations. x and y can be anything and the return type is entirely dependent on them. Why is this interesting? Say you want to find the product of a vector.

auto prod = std::accumulate ( 
    v.begin(), v.end(), 1,
    []( int x, int y ){ return x * y; }
);

Bit of a mouthful, right? v might store ints today, but tomorrow, maybe it will store long long ints to avoid overflow or just unsigned ints to avoid negatives. When the vector's declaration changes, the lambda's argument types need to change, which is a maintenance problem. I currently solve this by writing polymorphic function objects.

constexpr struct {
    template< class X, class Y >
    auto operator () ( X x, Y y )
        -> decltype(x*y)
    {
        return x * y;
    }
} timesObj{}; 

But the above and times are mostly equivalent. (A lambda can be thought of as an automatic function object; it even has a unique type. See below: Overloading.)

auto prod = std::accumulate ( 
    v.begin(), v.end(), 1,
    times
);

This never needs to change unless the underlying operation (multiplication) changes.

Sometimes, an underlying operation is consistent across types. Using zipTuple as an example from my article "Zipping and Mapping Tuples", one could write:

std::tuple<int,std::string> a{1,"hello "}, 
                            b{2,"world!"};

// r = {3,"hello world!"}
auto r = zipTuple( ([](auto x, auto y) x + y), a, b );

Because of the comma operator, we must put the lambda in parentheses to make clear where it ends. 

Up until now, things like composition could not be lambdas.

template< class F, class G > struct Composition {
    F f;
    G g;

    template< class ...X >
    auto operator () ( X&& ...x ) 
        -> decltype( f(g(std::forward<X>(x)...)) )
    {
        return f( g( std::forward<X>(x)... ) );
    }
};

template< class F, class G, class C = Composition<F,G> >
C compose( const F& f, const G& g ) {
    return C{f,g};
}

int main () {
    auto add1 = []( auto x ) x + 1;
    auto Char = []( auto c ) char(c);
    // Prints "a + 1 = b"
    std::cout << "a + 1 = " << compose(Char,add1)('a') << std::endl;
}

compose is generic and it returns a generic function object. Generic lambdas make the same possible by returning another generic lambda that captures f and g.

auto compose = [](auto f, auto g) 
    [=](auto x) f(g(x));

However, this version of compose only allows one argument. Luckily, generic lambdas can be fully templated, variadic, and perfect forwarding.

auto compose = 
    []( auto f, auto g )
        [=]<class ...X>( X&& ...x ) 
            f( g(std::forward<X>(x)...) );

However, the syntax for these lambdas is so convenient, one might as well drop the functional programming theory and write

auto f = [](char c) char(c+1);

For an example of the power of nested lambdas, consider currying:

auto curry3 = 
    []( auto f )
        [=](auto x) [=](auto y) [=](auto z) 
            f(x,y,z);

auto sum3 = [](auto x, auto y, auto z) x + y + z;
auto ten = curry3(sum3)(2)(3)(5);

Nested lambdas especially aid in the use of monads, as I have written about previously ("Monads in C++").

auto justThree = Just(1) >>= [](auto x)
                 Just(2) >>= [](auto y)
                 mreturn<std::unique_ptr>( x + y ); // Or Just(x+y).

This also takes care of the return mreturn problem I discussed in that article.

Overloading

Overloading functions is obviously useful, but impossible with lambdas alone. To take full advantage of their brevity, we must have a way to overload them. In the proposal, Mathias Gaunard is attributed with the following:

template<class F1, class F2> struct overloaded : F1, F2
{
    overloaded(F1 x1, F2 x2) : F1(x1), F2(x2) {}
    using F1::operator();
    using F2::operator();
};

template<class F1, class F2>
overloaded<F1, F2> overload(F1 f1, F2 f2)
{ return overloaded<F1, F2>(f1, f2); } 

(See also: "Unifying Generic Functions and Function Objects")

This works because lambdas are function objects with a unique type and can therefore act as base classes for overloaded. It may seem an unlikely solution because this fact is so rarely taken advantage of; however, there is much advantage to take!
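
For instance, a minimal use might look like this (my own example, not from the proposal):

auto f = overload (
    []( int x )         { return x + 1; },  // chosen for f(1)
    []( const char* s ) { return s[0]; }    // chosen for f("hi")
);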

Unfortunately, one cannot inherit from function pointers, so overloading lambdas and regular functions together requires a little more work. First, we must define a base type that can handle both function pointers and function objects. Its job is just to forward the arguments to its function.

template< class F > struct Forwarder : F {
    constexpr Forwarder( const F& f ) : F(f) { }
};

template< class R, class ...X > struct Forwarder<R(*)(X...)> {
    using type = R(*)(X...);
    type f;

    constexpr Forwarder( type f ) : f(f) { }

    constexpr R operator () ( X... x ) {
        return f(x...);
    }
};

template< class R, class ...X > 
struct Forwarder<R(X...)> : Forwarder<R(*)(X...)>
{
    using type = R(*)(X...);
    constexpr Forwarder( type f ) 
        : Forwarder<R(*)(X...)>(f)
    {
    }
};

Function pointers get two specializations because decltype(f) = R(X...) and decltype(&f) = R(*)(X...). It makes the most sense to specialize for pointers, but doing only that would require us to take the address of our functions when we call overload.

Next, Overloaded inherits from two Forwarders.

template< class F, class G >
struct Overloaded : Forwarder<F>, Forwarder<G> {
    constexpr Overloaded( const F& f, const G& g )
        : Forwarder<F>(f), Forwarder<G>(g)
    {
    }

    using Forwarder<F>::operator();
    using Forwarder<G>::operator();
};

template< class F > F overload( F&& f ) {
    return std::forward<F>(f);
}

template< class F, class G, class ...H,
          class O1 = Overloaded<F,G> > 
auto overload( const F& f, const G& g, const H& ...h ) 
    -> decltype( overload(O1(f,g),h...) )
{
    return overload( O1(f,g), h... );
}
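
A small usage sketch (again my own example, not from the linked gists): a plain function and a lambda combined into one overload set, with Forwarder<int(int)> handling the free function.

int twice( int x ) { return 2 * x; }

auto h = overload (
    twice,
    []( const char* s ) { return *s; }
);

// h(3) == 6; h("abc") == 'a'.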

Overloads can differ in arity and domain (argument type). The simplest example, for demonstration purposes, is a printing function.

auto prnt = overload (
    // Anything cout is already defined for.
    []( const auto& x ) 
        -> decltype( void(std::cout << x) )
    { std::cout << x; },

    // Any STL sequence.
    []<class Sequence>( const Sequence& s )
        -> decltype( void(std::begin(s)) )
    {
        std::cout << "[ ";
        for( const auto& x : s ) 
            std::cout << x << ' ';
        std::cout << ']';
    },

    // These are both sequences for which cout is defined.
    // Specializing disambiguates this.
    []( const char* const s ) { std::cout << s; },
    []( const std::string& s ) { std::cout << s; }
);

// ...
prnt("abc"); // Prints abc.
prnt(std::vector<int>{1,2,3}); // Prints [ 1 2 3 ]. 

Although defining all overloads in a single statement is an annoyance, they are grouped together, they don't require a template<...> line, and the visual clutter is overall less than if prnt were defined as the equivalent (explicit) function object.

Perhaps a function must be overloaded, but decltype or std::enable_if is too accepting and specializing for each type is redundant. For example, one might be annoyed by the last two string specializations of prnt. One solution is to define yet another overload type.

template< class X, class F >
struct UnaryOverload {
    F f;
    UnaryOverload( const F& f ) : f(f) { }

    using result_type = typename std::result_of< F(X) >::type;

    result_type operator () ( X x ) const {
        return f(x);
    }
};

template< class ...X, class F >
auto unary_overload_set( const F& f ) 
    -> decltype( overload(UnaryOverload<X,F>(f)...) )
{
    return overload( UnaryOverload<X,F>(f)... );
}

auto prnt = overload (
    // ...

    unary_overload_set<const char* const,
                       const std::string&>(
        []( auto&& s ) { std::cout << s; }
    )
);

One might write an overloading class to specialize for specific types, or a category of types, or more generally, a class might be written to do type conversion before calling the inner function, to prepare the output, or whatever one's needs may be. An overloading type might even select one of two functions based on an enable_if.

    // Where pred is a templated type defining pred<X>::value.
    auto h = enable_when<pred>( f, g );
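
The article doesn't define enable_when, but a rough sketch of one possible shape (an assumption on my part) could dispatch on a unary trait: call f when pred<X>::value holds for the argument's type, otherwise call g.

#include <type_traits>
#include <utility>

template< template<class> class Pred, class F, class G >
struct EnableWhen {
    F f;
    G g;

    // Chosen when the predicate holds for the (decayed) argument type.
    template< class X >
    auto operator () ( X&& x ) const
        -> typename std::enable_if<
               Pred<typename std::decay<X>::type>::value,
               typename std::result_of<F(X)>::type >::type
    {
        return f( std::forward<X>(x) );
    }

    // Chosen otherwise.
    template< class X >
    auto operator () ( X&& x ) const
        -> typename std::enable_if<
               not Pred<typename std::decay<X>::type>::value,
               typename std::result_of<G(X)>::type >::type
    {
        return g( std::forward<X>(x) );
    }
};

template< template<class> class Pred, class F, class G >
EnableWhen<Pred,F,G> enable_when( F f, G g ) {
    return { std::move(f), std::move(g) };
}

// e.g. auto h = enable_when<std::is_integral>( f, g );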

The downsides of overloading function objects include that each overload must be defined all at once and none can be added. That isn't too bad since the point of lambdas is to be brief, but one should be mindful of extensibility when writing generic functions. (In other words, if an overload must be added, is it OK to modify the function object, or must the user be able to add overloads later?)


Recursion.

Without generic lambdas, recursion is possible.

std::function<int(int)> fact = [&]( int x ) not x ? 1 : x * fact(x-1);

Or, with function pointers, which a lambda may implicitly convert to.

// In global scope:
using IntToInt = int(*)(int);
IntToInt fact = []( auto x ) not x ? 1 : x * fact(x-1);

With generic lambdas, we could write it like this:

auto fact1 = []( auto f, int n ) -> int 
    not n ? 1 : f(f,n-1) * n;
auto fact = [&]( int x ) -> int 
    fact1( fact1, x );

One might notice that the Fibonacci sequence could be implemented in a similar fashion. In researching recursive lambdas, I came across the fixed-point combinator. Haskell has fix, which can be implemented like this:

auto fix = []( auto f )
    [=]( auto x ) -> decltype(x) f( f, x );

auto fact = fix (
    []( auto f, int n ) -> int
    not n ? 1 : f(f,n-1) * n
);

auto fib = fix (
    []( auto f, int n ) -> int
    n == 0 ? 0 :
    n == 1 ? 1 :
    f(f,n-1) + f(f,n-2)
); 

fix is a generalization of a certain type of recursion. For an idea of how one would implement fix without lambdas, see this Stack Overflow post.
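
For a rough idea of the shape such a fix might take (my sketch, not the code from that post), a function object can do the same self-application:

template< class F > struct Fixed {
    F f;

    template< class X >
    auto operator () ( X&& x ) const
        -> decltype( f( f, std::forward<X>(x) ) )
    {
        return f( f, std::forward<X>(x) );
    }
};

struct Fix {
    template< class F >
    Fixed<F> operator () ( F f ) const {
        return { std::move(f) };
    }
};

// Usage mirrors the lambda fix: Fix()( factLikeFunctionObject )( 5 ).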

Making prnt above variadic requires a different kind of recursion.

// Variadic void unary.
auto vvu_impl = overload (
    [] (auto,auto) {},
    []<class X, class ...Y>( const auto& self, const auto& u, 
                             X&& x, Y&& ...y ) 
    {
        u( std::forward<X>(x) );
        self( self, u, std::forward<Y>(y)... );
    }
);

auto vvu = []( const auto& u )
    [=]<class ...X>( const X& ...x )
        vvu_impl( vvu_impl, u, x... );

// Variadic print.
// vprint(x,y...) = prnt(x); prnt(y)...
auto vprint = vvu( prnt );

auto print_line = []<class ...X>( const X& ...x ) 
    vprint( x..., '\n' );
 
print_line( "abc ", 123, std::vector<int>{1} ); // prints "abc 123 [1]\n" 

We can generalize left-associativity as well.

auto chainl_impl = overload (
    []( auto self, auto b, auto r ) { return r; },
    []<class ...Z>( auto self, auto b, auto x, auto y, Z ...z ) 
        self( self, b, b(x,y), z... )
);

auto chainl = []( auto b )
    [=]<class ...X>( const X& ...x )
        chainl_impl( chainl_impl, b, x... );

auto sum = chainl(add);
auto three = sum(1,1,1);

// Variadic compose.
auto vcompose = chainl( compose );

auto inc = []( auto x ) ++x;
auto addThree = vcompose( inc, inc, inc );

A good exercise might be to (a) write a variadic version of fix and (b) use that version to reimplement chainl and vprint.

There are, of course, many types of recursion. Implementing recursive lambdas is more complicated than for regular functions, but not by too much.


Conclusions.

Polymorphic (generic) lambdas are very powerful indeed, but it may take a while before GCC, MSVC, and others catch up, much less before Faisal Vali's branch is merged back into Clang. Still, they may have a strong impact on how C++ code is written in the future. Some thought templates revealed a sort of functional language in C++, and others thought the same of constexpr. Generic lambdas reveal another, more flexible one.

Lambdas cannot be marked constexpr. In terms of efficiency, I do not think this matters: they are implicitly inline, so the compiler may still take advantage of any compile-time information it can gather. However, the result of a lambda expression can never be used as a template argument, for example, which means lambdas don't replace generic constexpr function objects.
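
For instance (using a trivial constexpr function of my own for contrast):

    template< int N > struct Int {};

    constexpr int square( int x ) { return x * x; }

    Int< square(3) > a;                       // fine: constexpr function call
    // Int< [](int x){ return x*x; }(3) > b;  // error: not a constant expression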

Also, explicitly specifying a type is more verbose because the rules are the same as for template member functions, so lambdas can't replace template functions that require explicit parameters.

    auto f = []<class X>() { return X(); }; 
    f.operator()<int>(); // bad

The proposal to add polymorphic lambdas to C++ is not finalized and a few things are up in the air. For example, can we elide auto and just write [](x) f(x)? Should we be allowed to elide the enclosing braces and return? Are the implemented parts of this proposal useful? Remember that the standardization process is open to the public and that we can make our voices heard about the features that impact us.

Personally, I like the proposal as currently implemented (requiring auto, but eliding { return ... }). I would go a step further and say that auto should be extended to allow variadic parameters (i.e. [](auto ...x) f(x...)). And named lambdas (section 2.4) will be a very nice addition.

What are your thoughts?




Source for this article: https://gist.github.com/4347130 and https://gist.github.com/4381376
Another (google translated) article on generic lambdas: http://translate.google.com/translate?hl=en&sl=ja&u=http://d.hatena.ne.jp/osyo-manga/20121225/1356368196&prev=/search%3Fq%3Dgeneric%2Blambda%2Bc%252B%252B%2Bclang%26start%3D10%26hl%3Den%26safe%3Doff%26client%3Dubuntu%26sa%3DN%26tbo%3Dd%26channel%3Dfs%26biw%3D1535%26bih%3D870&sa=X&ei=v-LbUOG5BOmQ2gXw7ICABg&ved=0CFsQ7gEwBjgK
A long and thorough article on the fixed-point combinator: http://mvanier.livejournal.com/2897.html

Friday, December 14, 2012

Quick and Easy -- Manipulating C++ Containers Functionally.

Update: Added examples for dup and foldMap.

Probably the most useful parts of the standard C++ library are its container and algorithm support. Who has worked in C++ for any non-trivial amount of time without using std::vector, list, map, or any of the others? <algorithm>, on the other hand, is more something everyone should know about: it solves many of the problems that C++ developers encounter on a daily basis.

    "How do I test if there exists an element, x, where p(x) is true?" : std::any_of
    "How do I copy each element, x, where p(x)?" : std::copy_if
    "How do I removed each element, x, where p(x)?" : std::remove_if
    "How do I move elements from one container to another?" : std::move, <algorithm> version.
    "How do I find a subsequence?" : std::search
    "How do I sort an array?" std::sort
    "How do I find the sum of an array?" : std::accumulate

Any programmer worth half their salt could write any of these functions in their sleep--they're basic--and yet these algorithms do get written, over and over and over again, either because one does not realize a specific <algorithm> function exists, or because one is thinking on a low level and unable to see the higher-level abstractions.

What I like most about the STL is that the only requirements for adapting any data type to a sequence are (1) define an iterator, and (2) define begin() and end(). After that, most (if not all) of <algorithm> becomes instantly usable with that type (as well as the range-based for loop). This makes it incredibly generic and useful.

What I dislike is its verbosity. For example:

    std::transform( xs.begin(), xs.end(), xs.begin(), f );

Wouldn't this be more clear if written...


    xs = std::transform(xs,f);

And this allows us to compose functions.

    std::transform( xs.begin(), xs.end(), xs.begin(), f );
    std::transform( xs.begin(), xs.end(), xs.begin(), g );

    // vs
    xs = std::transform( std::transform(xs,f), g );

    // Or, using actual composition:
    xs = std::transform( xs, compose(g,f) );

That's what this article will be about. An abstraction over the STL that lends itself to writing more terse, concise code without losing any clarity. This abstraction is less general, by design, because it works on entire containers, not iterators. I am not writing about a replacement for any <algorithm> functions, but an alternative inspired by functional programming.

However, I do go over many <algorithm> functions, so this can also be thought of as a review.


Filtering, Taking, and Dropping: Collecting data.

I've always found the erase-remove idiom an unintuitive solution to such a common problem. I certainly would not have figured it out on my own without the help of the C++ community to point it out. Requiring containers to define a predicated erase wouldn't be generic, and <algorithm> knows only of iterators, not containers, so the standard library can't offer anything simpler. filter fills this gap by combining its knowledge of containers and iterators.

template< class P, class S >
S filter( const P& p, S s ) {
    using F = std::function< bool(typename S::value_type) >;

    s.erase (
        std::remove_if( std::begin(s), std::end(s), std::not1(F(p)) ),
        std::end(s)
    );

    return s;
}

// ...

std::vector<int> naturals = {1,2,3,4,5,6,7,8,9/*,...*/};
auto evens = filter( [](int x){ return x%2==0; }, naturals );

See also: std::not1.

It does two things: first, it inverts the predicate, meaning we can use positive logic (defining what we want to keep rather than what to throw away); and second, it abstracts the erase-remove idiom.

Using filter, we can write a rather quick-and-dirty qsort.

// Splits s into pair( {x : p(x)}, {x : not p(x)} ).
template< class P, class S >
std::pair<S,S> partition( const P& p, S s ) {
    using F = std::function< bool(typename S::value_type) >;

    // There does exist std::partition_copy, 
    // however this function demonstrates a use of filter.
    return std::make_pair ( 
        filter( p,    s ),
        filter( std::not1(F(p)), s )
    );
}

// Fake Quick-Sort: A non-in-place, qsort-inspired function.
template< class S >
S fake_qsort( S s ) {
    using X = typename S::value_type;

    if( s.size() < 2 )
        return s;

    X pivot = s.back();
    s.pop_back();

    S low, high; 
    std::tie(low,high) = partition (
        [&]( const X& x ) { return x <= pivot; },
        std::move(s)
    );

    low = fake_qsort( std::move(low) );
    low.push_back( pivot );
    
    // Append the sorted high to the sorted low.
    high = fake_qsort( std::move(high) );
    std::move( std::begin(high), std::end(high), 
               std::back_inserter(low) );

    return low;
}
See also: std::partition, std::partition_copy, and std::sort.

take is a function that may seem entirely trivial, at least at first.

template< class S, class _X = decltype( *std::begin(std::declval<S>()) ),
          class X = typename std::decay<_X>::type >
std::vector<X> take( size_t n, const S& s ) {
    std::vector<X> r;
    std::copy_n( std::begin(s), n, std::back_inserter(r) );
    return r;
}

template< class P, class S, 
          class _X = decltype( *std::begin(std::declval<S>()) ), 
          class X  = typename std::decay<_X>::type >
std::vector<X> takeWhile( const P& p, const S& s ) {
    std::vector<X> r;
    std::copy( std::begin(s), 
               std::find_if( std::begin(s), std::end(s), p ),
               std::back_inserter(r) );
    return r;
}

It also breaks the convention of returning s's type. There's a reason for that. Infinite lists. Consider this Haskell code:

    take 10 [1..] == [1,2,3,4,5,6,7,8,9,10]

[1..] is an infinite list, starting at one. Obviously, it doesn't actually exist in memory. take returns a finite list that does.

The concept of iterators that represent infinite ranges isn't new in C++, but neither is it common. std::insert_iterator can insert a theoretically infinite number of elements into a container. std::istream_iterator and std::ostream_iterator may read from or write to a stream indefinitely.

We can create pseudo-containers to represent infinite ranges and plug them into take.

template< class X > struct Reader {
    using iterator = std::istream_iterator<X>;

    iterator b;
    iterator e;

    Reader( iterator b = iterator( std::cin ), 
            iterator e = iterator() )
        : b(b), e(e)
    {
    }

    iterator begin() const { return b; }
    iterator end()   const { return e; }
};

// Probably ill-advised, 
// but this is /one/ way of doing IO before main().
std::vector<int> three = take( 3, Reader<int>() );

Sometimes we want to take the contents of an entire container, so dup may be helpful.

template< class S, class _X = decltype( *std::begin(std::declval<S>()) ),
          class X = typename std::decay<_X>::type >
std::vector<X> dup( const S& s ) {
    std::vector<X> r;
    std::copy( std::begin(s), 
               std::end(s),
               std::back_inserter(r) );
    return r;
}

std::ifstream in( "in.txt" );
// Reader's constructor implicitly converts in to an iterator.
// Some may consider this bad style and require the constructor be "explicit".
std::vector<int> contents = dup( Reader<int>(in) );

The counterpart to take is drop, but it does not have take's quirks.

template< class S >
S drop( size_t n, const S& s ) {
    return S (
        std::next( std::begin(s), n ),
        std::end(s)
    );
}

// Predicate version
template< class P, class S >
S dropWhile( const P& p, const S& s ) {
    return S (
        std::find_if_not( std::begin(s), std::end(s), p ),
        std::end(s)
    );
}

Reader<int> r = drop( 2, Reader<int>() );

drop makes no promises about infinite lists, but unlike most container- or range-based algorithms, it can work on them. In the above example, two integers are read from std::cin, and their values lost.

For another example of the use of pseudo-containers, consider this solution to the first Project Euler problem using boost::irange.

#include <boost/range/irange.hpp>
void euler1() {
    // multiples of...
    auto three = boost::irange( 3, 1000, 3 );
    auto five  = boost::irange( 5, 1000, 5 );

    // Ensure that the final sum has no duplicates.
    std::vector<int> all;
    std::set_union( std::begin(three), std::end(three),
                    std::begin(five),  std::end(five),
                    std::back_inserter(all) );

    std::cout << "The sum of every multiple of 3 or 5 bellow 1000 :" 
        << std::accumulate( std::begin(all), std::end(all), 0 ) 
        << std::endl;
}


Folding: Reducing a list from many to one. (std::accumulate)

Accumulating is the "imperative" description of folding. Historically, you'd call the variable you update with the results of each calculation the accumulator. To accumulate, then, is to iterate through a sequence, updating the accumulator with each iteration.

Folding is another way to think of it. A fold is a transformation from a list of values to just one value. Haskell defines foldl and foldr, meaning "fold left" and "right".

template< class F, class X, class S >
constexpr X foldl( F&& f, X x, const S& s ) {
    return std::accumulate (
        std::begin(s), std::end(s),
        std::move(x), std::forward<F>(f) 
    );
}

int main() {
    std::vector<int> v = { 5, 3, 2 };
    std::cout << "((10 - 5) - 3) - 2) = " << foldl( std::minus<int>(), 10, v ) << std::endl;
}

foldl is really just another name for accumulate. The accumulation function (here, std::minus) expects the accumulator as its left argument and the value to accumulate as its right. foldr is reversed: not only does it iterate in reverse, it expects the accumulator as the right-hand argument.

// A function wrapper that flips the argument order.
template< class F > struct Flip {
    F f = F();

    constexpr Flip( F f ) : f(std::move(f)) { }

    template< class X, class Y >
    constexpr auto operator () ( X&& x, Y&& y )
        -> typename std::result_of< F(Y,X) >::type
    {
        return f( std::forward<Y>(y), std::forward<X>(x) );
    }
};

template< class F, class X, class S >
constexpr X foldr( F&& f, X x, const S& s ) {
    using It = decltype(std::begin(s));
    using RIt = std::reverse_iterator<It>;
    return std::accumulate (
        // Just foldl in reverse.
        RIt(std::end(s)), RIt(std::begin(s)),
        std::move(x), 
        Flip<F>(std::forward<F>(f))
    );
}

int main() {
    std::vector<int> v = { 5, 3, 2 };
    std::cout << "(2 - (3 - (5-10))) = " << foldr( std::minus<int>(), 10, v ) << std::endl;
}

Folding is great for monoids: types that have a binary operation with, often, the signature "X(const X&, const X&)".

std::vector<std::string> strs = { "st", "ri", "ng" };
// std::string associated with (+) is a monoid.
std::cout << "'st' + 'ri' + 'ng' = " << 
    foldl( std::plus<std::string>(), std::string(), strs ) << std::endl;

using Func = std::function< int(int) >;

auto comp = []( Func f, Func g ) {
    return [f,g]( int x ){ return f(g(x)); };
};

auto inc = []( int x ) { return x+1; };
auto id  = []( int x ) { return x;   };

std::vector<Func> incs = { inc, inc, inc };
// Functions can be monoids under composition.
std::cout << "(inc . inc . inc)(1) = " << 
    foldl( comp, Func(id), incs )(1) << std::endl;

Functional programmers also like to build lists using fold. They build lists starting at the tail, so they typically prefer foldr to foldl. std::forward_list works like [] in Haskell and linked lists in other functional languages. This snippet simply copies the values from the std::vector, v.

using List = std::forward_list<int>;
auto cons = []( int x, List l ) {
    l.push_front( x );
    return std::move(l);
};

List l = foldr( cons, List(), v );

Note: This one is not an example of a monoid.


Zip and Map: many to many. (std::transform)

To zip two sequences together by some function is the same as calling std::transform. Transform implies modifying each member by some function. Zip implies the same, but with the visual metaphor of combining two lists into one, starting at one end and working up.

template< class F, template<class...>class S, class X, class Y,
          class Res = typename std::result_of< F(X,Y) >::type >
S<Res> zip( F&& f, const S<X>& v, const S<Y>& w ) {
    S<Res> r;
    std::transform( std::begin(v), std::end(v),
                    std::begin(w), 
                    std::back_inserter(r),
                    std::forward<F>(f) );
    return r;
}

int main() {
    std::vector<int> v = { 5, 3, 2 };
    auto doubleV = zip( std::plus<int>(), v, v );
}

Note: The only way I have discovered to write zip variadically is with tuples. Since this article is not on tuples, refer to the definition of transform in "Zipping and Mapping Tuples".

Note2: An in-place version of this function is possible, but showing both general and optimized versions of each function would be redundant, and the topic of optimization is worth discussing on its own.

Mapping is similar to zipping--in fact, the two-argument forms zip(f,xs) and map(f,xs) are equivalent. The three-argument form, map(f,xs,ys), applies f to every combination of x and y.

    map(f,{x,y},{a,b}) == { f(x,a), f(x,b), f(y,a), f(y,b) }

If xs is size N and ys is of size M, then map(f,xs,ys) returns a sequence of size N x M.

template< class F, template<class...>class S, class X,
          class Res = typename std::result_of< F(X) >::type >
S<Res> map( const F& f, const S<X>& s ) {
    S<Res> r;
    std::transform( std::begin(s), std::end(s),
                    std::back_inserter(r),
                    std::forward<F>(f) );
    return r;
}

template< class F, template<class...>class S, class X, class Y,
          class Res = typename std::result_of< F(X,Y) >::type >
S<Res> map( const F& f, const S<X>& xs, const S<Y>& ys ) {
    S<Res> r;

    for( const X& x : xs ) 
        for( const Y& y : ys )
            r.emplace_back( f(x,y) );

    return r;
}

int main() {
    std::vector<int> v = { 5, 3, 2 };
    std::vector<int> w = { 9, 8, 7 };
    auto sums = map( std::plus<int>(), v, w );
}

map is a bread and butter function in functional programming.

    // Convert a sequence from one type to another:
    auto ints = map( toInt, floats );

    // In a game loop:
    actors = map( update, actors ); 

    // A deck of cards (four suits with thirteen values).
    auto deck = map( make_card, suits, values );

    // Making variants of the same thing from simpler data.
    auto inits = { 1, 2, 3, 4 };
    auto cs = map (
        []( int i ) { return std::complex<float>(i,0.1); },
        inits
    );

    // Checking for collisions:
    auto collisions = map( make_collision, actors, actors );

    // AI:
    states = map( successor, actions, states );

One downfall of map is that it may create redundancies, which makes filter useful in conjunction.

    states = filter (
        state_is_valid,
        map( successor, actions, states )
    );

While this may turn an algorithm from one pass (update and add if valid) into two passes (update all states, then filter), it also makes for simpler algorithms that the compiler can sometimes optimize more easily. For example,

    for( auto x : xs ) {
        for( auto y : ys ) {
            z = x * y;
            if( pred(z) ) r.push_back(z);
        }
    }

    // or:
    auto r = filter( pred, map(std::multiplies<int>(),xs,ys) );
    
While only profiling can tell in any given instance, the second example may be faster under some circumstances. The compiler may be able to vectorize the call to map, but have difficulties applying the same optimization to the first because it cannot evaluate both the multiplication and predicate in one vectorized step.

Sometimes, the goal is to calculate something given the data, rather than map it. Naively, one might write something like

    auto r = fold( f, map(g,xs) );

But isn't creating the new container inefficient? Even if an in-place version of map were implemented, wouldn't transforming xs before folding still be inefficient? Thus, foldMap is useful.
  
template< class Fold, class Map, class X, class S >
X foldMap( const Fold& f, const Map& m, X x, const S& s ) {
    for( auto& y : s )
        x = f( std::move(x), m(y) );
    return x;
}

#include <cctype>
int main() {
    const char* names[] = { "jonh", "mary", "cary" };
    auto appendNames = []( std::string x, std::string y ) {
        return x + " " + y; 
    };
    auto capitalizeName = []( std::string name ) {
        name[0] = std::toupper( name[0] );
        return name;
    };
    std::cout << "Names : " 
        << foldMap (
            appendNames,
            capitalizeName,
            std::string(),
            names
        ) << std::endl;
}



Conclusions.

Haskell's Data.List is actually a lot like <algorithm>, though on a higher level of abstraction. There are some things that can only be done with iterators, but many that can also only be done with whole containers. Data.List gives some good inspiration for helpful algorithms, even in C++.

But unlike C++, Haskell uses simple linked lists by default, and all of Data.List's functions work only on linked lists. This gives both Haskell and functional programming a bad name when people compare Haskell code using linked lists to C++ code using std::vector. (See "C++ Benchmark -- std::vector vs. std::list vs. std::deque".) When libraries are written to optimize away the inefficiencies of the linked list, like Data.Text, they re-implement Data.List's interface and often achieve efficiency equivalent to well-optimized C, but not without plenty of redundancy.

In C++, we can write one static interface that is both generic and efficient. Writing functional code does not mean writing slow code. The mathematical nature of these operations can even help the compiler optimize. The high-level interface of Data.List fits snugly atop the low-level interface of iterators.


Source for this article: https://gist.github.com/4290166

Wednesday, December 12, 2012

Zipping and Mapping tuples.

Previously, I discussed some basic things that can be done with tuples. I showed how a tuple can be applied to a function, however I did not show how member-wise transformations could be done.

The code of this article builds on the code of the prior.


Zipping.

If we have several tuples, what if we want to apply a function to the nth element of each one?

template< template<size_t> class Fi = Get, size_t i,
          class F, class ...T >
constexpr auto zipRow( const F& f, T&& ...t )
    -> decltype( f(Fi<i>()(std::forward<T>(t))...) )
{
    return f( Fi<i>()( std::forward<T>(t) )... );
}

Not hard at all! It basically squishes that row (thinking of t as a column and ti... as a row), using f, into one value. Now, let's say we want to zip together t... into one tuple.

template< template<size_t> class Fi = Get, size_t ...i,
          class Ret, class F, class ...T >
constexpr auto zipIndexList( IndexList<i...>, 
                             const Ret& r, const F& f, T&& ...t )
    -> decltype( r(zipRow<Fi,i>(f,std::forward<T>(t)...)...) )
{
    return r( zipRow< Fi, i >( f, std::forward<T>(t)... )... );
}

template< template<size_t> class Fi = Get,
          class Ret, class F, class T, class ...U,
          class _T = typename std::decay<T>::type,
          class IL = typename IListFrom<_T>::type >
constexpr auto zipTupleTo( const Ret& r, const F& f, T&& t, U&& ...u )
    -> decltype( zipIndexList<Fi>(IL(),r,f,std::forward<T>(t),std::forward<U>(u)...) )
{
    return zipIndexList<Fi>( IL(), r, f, std::forward<T>(t), std::forward<U>(u)... );
}

int main() {
    auto zipped = zipTupleTo( tuple, std::plus<int>(), tuple(1,10), 
                                                       tuple(2,20) );
    std::cout << " 1 +  2 = " << std::get<0>(zipped) << std::endl;
    std::cout << "10 + 20 = " << std::get<1>(zipped) << std::endl;
}

In zipIndexList, r represents the function defining how the output is returned. tuple (gist), from the previous article, is just a function object form of std::make_tuple that can be passed to higher order functions. By supplying it as our r, we're saying "just make it a tuple again."

Since most often, we want to zip back into a tuple, it makes sense to define zipTuple like so:

template< template<size_t> class Fi = Get,
          class F, class ...T >
constexpr auto zipTuple( const F& f, T&& ...t )
    -> decltype( zipTupleTo<Fi>(tuple,f,std::forward<T>(t)...) )
{
    return zipTupleTo<Fi>( tuple, f, std::forward<T>(t)... );
}

zipTuple is to tuples what std::transform is to sequences. The drawback of std::transform is that it only allows unary or binary transformations. Let's write a version that accepts any number of arguments.

// We require these polymorphic function objects.
constexpr struct Inc {
    template< class X >
    constexpr X operator () ( X x ) { return ++x; }
} inc{};

constexpr struct Eq {
    template< class X >
    constexpr bool operator () ( const X& a, const X& b ) 
    { return a == b; }
} eq{};

struct Or {
    template< class X >
    constexpr bool operator () ( const X& a, const X& b ) 
    { return a || b; }
};

// Wrapper to dereference arguments before applying.
// indirect : (a -> b) -> (a* -> b)
template< class F > struct Indirect {
    F f = F();

    constexpr Indirect( F f ) : f(std::move(f)) { }

    template< class ...It >
    constexpr auto operator () ( It ...it )
        -> decltype( f(*it...) )
    {
        return f( *it... );
    }
};

template< class F, class I = Indirect<F> > 
constexpr I indirect( F f ) {
    return I( std::move(f) );
}

#include <vector>
#include <algorithm>
template< class F, class ...X,
          class Result = typename std::result_of<F(X...)>::type,
          class Ret = std::vector<Result> >
Ret transform( const F& f, const std::vector<X>& ...vs )
{
    Ret r;

    const auto ends = tuple( vs.end()... );

    // Iterate through each vector in parallel.
    for( auto its  = tuple( vs.begin()... ); 
         // This unrolls to: not (it0==end0 || it1==end1 || ...)
         not foldl( Or(), zipTuple(eq,its,ends) );
         // Increment each iterator.
         its = zipTuple( inc, its ) )
    {
        r.emplace_back (
            applyTuple( indirect(f), its )
        );
    }

    return r;
}

int main() {
    std::vector<int> v = {1,10,100},
                     w = {2,20,200},
                     x = {3,30,300};

    auto vw = transform (
        [](int x, int y, int z){ return x+y+z; }, 
        v, w, x 
    );
    std::cout << "  1 +   2 +   3 = " << vw[0] << std::endl;
    std::cout << " 10 +  20 +  30 = " << vw[1] << std::endl;
    std::cout << "100 + 200 + 300 = " << vw[2] << std::endl;
}

Note: foldl (gist).

Mapping.

Suppose we want to know the results of adding every combination of {1,2,3} with {9,8,7}. We could write a function that cross-applied every variable from each tuple, but slightly more generally, we can start by taking the Cartesian product.

// product : {x,y} x {a,b} -> {{x,a},{x,b},{y,a},{y,b}}
constexpr struct Product {
    // {...xi...} x {...aj...} -> {xi,aj}
    template< size_t i, size_t j, class T, class U >
    static constexpr auto zip( const T& t, const U& u )
        -> decltype( tuple(std::get<i>(t),std::get<j>(u)) )
    {
        return tuple( std::get<i>(t), std::get<j>(u) );
    }

    // {...xi...} x {a0,a1,a2...} -> { {xi,a0}, {xi,a1}, ... }
    template< size_t i, size_t ...j, class T, class U >
    static constexpr auto withI( IndexList<j...>, const T& t, const U& u )
        -> decltype( tuple(zip<i,j>(t,u)...) )
    {
        return tuple( zip<i,j>(t,u)... );
    }
        
    // {x...} x {a...} -> { {x,a}... }
    template< size_t ...i, size_t ...j, class T, class U >
    static constexpr auto withIndexes( IndexList<i...>, IndexList<j...> js,
                                       const T& t, const U& u )
        -> decltype( std::tuple_cat(withI<i>(js,t,u)...) )
    {
        return std::tuple_cat( withI<i>(js,t,u)... );
    }

    template< class T, class U,
              class IL  = typename IListFrom<T>::type,
              class IL2 = typename IListFrom<U>::type >
    constexpr auto operator () ( const T& t, const U& u )
        -> decltype( withIndexes(IL(),IL2(),t,u) )
    {
        return withIndexes( IL(), IL2(), t, u );
    }
} product{};

We can now define a map operation to apply the product.

template< class F > struct ApplyF {
    F f = F();

    constexpr ApplyF( F f ) : f(std::move(f)) { }

    template< class T >
    constexpr auto operator () ( T&& t ) 
        -> decltype( applyTuple(f,std::forward<T>(t)) )
    {
        return applyTuple( f, std::forward<T>(t) );
    }
};

template< class F > 
constexpr ApplyF<F> applyF( F f ) {
    return ApplyF<F>(std::move(f));
}

constexpr struct MapTuple {
    template< class F, class T, class U >
    constexpr auto operator () ( const F& f, const T& t, const U& u )
        -> decltype( zipTuple(applyF(f),product(t,u)) )
    {
        return zipTuple( applyF(f), product(t,u) );
    }
} mapTuple{};

int main() {
    auto sums = mapTuple( std::plus<int>(), tuple(1,2,3), tuple(7,8,9) );
    std::cout << "map (+) (1,2,3) (7,8,9) = ";
    forEach( printItem, sums );
    std::cout << std::endl;
}

This prints out:

map (+) (1,2,3) (7,8,9) = 8 9 10 9 10 11 10 11 12 

Zipping applies elements across from each other. Mapping applies everything to everything. (Note: a unary definition of map would be equivalent to a unary definition of zip.)


Tuples as function environments.

This might seem a little off topic, but Haskell has this neat function, id. It works like this:

    id x = x

Simple, right?

    (id f) x y = f x y = id f x y

id has this neat property that, if applied to multiple arguments, it applies the tail arguments to the first. This is an artifact of Haskell's curried notation, but we can emulate this behaviour:

constexpr struct Id {
    template< class X >
    constexpr X operator () ( X&& x ) {
        return std::forward<X>(x);
    }

    template< class F, class X, class ...Y >
    constexpr auto operator () ( const F& f, X&& x, Y&& ...y )
        -> typename std::result_of< F(X,Y...) >::type
    {
        return f( std::forward<X>(x), std::forward<Y>(y)... );
    }
} id{};
 
And now tuples take on a new role: Contained function environments. Consider:

    applyTuple( id, tuple(std::plus<int>(),1,2) );

What does this output? How about

    mapTuple( id, tuple(inc,dec), tuple(5,9) );

    auto pm = tuple(std::plus<int>(),std::minus<int>());
    zipTuple( id, pm, tuple(10,5), tuple(10,5) );

 Or:

    mapTuple( id, pm, tuple(1,2), tuple(3,4) );

I leave implementing the three-tuple version of mapTuple as an exercise, but here's a hint: cross( cross({f},{x}), {y}) = {{{f,x},{y}}}, but you need to take it from that to {{f,x,y}}. (Another good exercise might be to write zipTuple in terms of transposition (wiki).)


Conclusions.

This fleshes out some basic applications of tuples to functions. applyTuple unpacks a tuple and applies it to a function. foldl and foldr let one apply binary functions to n-ary tuples, or even singletons (the maths concept, not the design pattern). zipTuple transforms multiple tuples by a function, member-wise. mapTuple performs a function for every combination of arguments.

Tuples have unusual mathematical properties compared to other data structures due to the profundity of what they generalize. They can help us shorthand functions that operate in parallel (zip), be passed around as partial or complete function environments, control variadic template parameters, and much, much more that I have either not discussed or not yet realized.

One use I haven't discussed, for example, is as a relation, but for an example of this use of tuples, look no further than std::map.

I hope this post has been interesting. Happy coding!


Source for this article: https://gist.github.com/4268029

Monday, December 10, 2012

Fun with tuples.

std::tuple is an odd-but-fun part of the new standard. A lot can be done with them. The members of a tuple can be applied to functions, both in groups and individually. The tuple itself can be treated as a function environment. They can be reorganized, appended, and truncated. The list goes on.

The thing is, what can tuples be used for? Any POD struct or class can be a tuple instead, although this may or may not be desirable. Still, we can write generic algorithms with tuples, whereas we cannot with structs. If we wrote a function that printed any tuple, then any POD we turned into a tuple would suddenly become printable.


Indexing.

Tuples are indexed, which makes accessing them odd.

// A normal struct
struct Vec { int x, y; };
Vec a = { 1, 2 };
a.x = 2; // Normal access.

std::tuple<int,int> b(1,2);
std::get<0>(b) = 2; // Weird access. 

One might think "I'll just write an accessor function!", and that certainly would work. It becomes easier if std::get is made into a function object.

// std::tuple_element<i,T> does not perfect forward.
template< size_t i, class T >
using Elem = decltype( std::get<i>(std::declval<T>()) );

template< size_t i > struct Get {
    template< class T >
    constexpr auto operator () ( T&& t ) 
        -> Elem< i, T >
    {
        return std::get<i>( std::forward<T>(t) );
    }
};

constexpr auto getx = Get<0>();
constexpr auto gety = Get<1>();

getx(b) = 2; // Less weird.

I define Elem because using std::tuple_element returns what the tuple actually holds. std::get<0>(b) would return an int&, but tuple_element<0,decltype(b)>::type would be int.
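
A quick check of that difference, using the tuple b from above (my addition; assumes <type_traits> for std::is_same):

static_assert( std::is_same< Elem<0, decltype(b)&>, int& >::value, "" );
static_assert( std::is_same< std::tuple_element<0, decltype(b)>::type, int >::value, "" );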

One might find it useful to index the tuple backwards, so let's define a function, rget.

 template< size_t i, class T, 
          class _T = typename std::decay<T>::type,
          size_t N = std::tuple_size<_T>::value - 1, // Highest index
          size_t j = N - i >
constexpr auto rget( T&& t ) 
    -> Elem< j, T >
{
    return std::get<j>( std::forward<T>(t) );
}

Now we can also define functions to get the first and last elements of any tuple.

constexpr auto head = Get<0>();
constexpr auto last = RGet<0>(); // RGet is defined similarly to Get.

int main() {
    constexpr auto t = std::make_tuple( 1, 'a', "str" );
    std::cout << "head = " << head(t) << std::endl; // prints 1
    std::cout << "last = " << last(t) << std::endl; // prints str
}

Just for fun, let's also write a function that, if the index is too high, wraps around to the beginning.

template< size_t i, class T, 
          class _T = typename std::decay<T>::type,
          size_t N = std::tuple_size<_T>::value,
          size_t j = i % N >
constexpr auto mod_get( T&& t ) 
    -> Elem< j, T >
{
    return std::get<j>( std::forward<T>(t) );
} 

Now, let's say we want to call a function for every member of a tuple. We need a way of indexing it and applying some function for each index. It starts with the definition of a type to "hold" a list of indexes.

template< size_t ...i > struct IndexList {};

Now, if given a tuple of size 3, we want an IndexList<0,1,2> to represent each index. There are many solutions for how to do this, but they all tend to be obtuse or difficult to understand. The following solution is designed first and foremost to be obvious and intuitive.

template< size_t ... > struct EnumBuilder;

// Increment cur until cur == end.
template< size_t end, size_t cur, size_t ...i >
struct EnumBuilder< end, cur, i... > 
    // Recurse, adding cur to i...
    : EnumBuilder< end, cur+1, i..., cur >
{
};

// cur == end; the list has been built.
template< size_t end, size_t ...i >
struct EnumBuilder< end, end, i... > {
    using type = IndexList< i... >;
};

template< size_t b, size_t e >
struct Enumerate {
    using type = typename EnumBuilder< e, b >::type;
};

template< class > struct IListFrom;

template< class ...X > 
struct IListFrom< std::tuple<X...> > {
    static constexpr size_t N = sizeof ...(X);
    using type = typename Enumerate< 0, N >::type;
};


Now, a function that applies f at each index, and one that prints a tuple's elements for testing.

template< size_t i, size_t ...j, class F, class T >
void forEachIndex( IndexList<i,j...>, const F& f, const T& t ) {
    f( std::get<i>(t) );
    
    // Recurs, removing the first index.
    forEachIndex( IndexList<j...>(), f, t ); 
}

template< class F, class T >
void forEachIndex( IndexList<>, const F& f, const T& t ) {
    // No more indexes.
}

template< class F, class T >
void forEach( const F& f, const T& t ) {
    constexpr size_t N = std::tuple_size<T>::value;
    using IL = typename Enumerate<0,N>::type;
    forEachIndex( IL(), f, t );
}

constexpr struct PrintItem {
    template< class X >
    void operator () ( const X& x ) const {
        std::cout << x << ' ';
    }
} printItem{};

int main() {
    constexpr auto t = std::make_tuple( 1, "and", 2 );
    std::cout << "t = "; 
    forEach( printItem, t ); // Prints "1 and 2"
    std::cout << std::endl;
}

Applying a tuple to a function.

This is probably one of the most pondered questions about tuples. "How do I apply one to a function?" This is easy since we already have IndexList defined.

template< size_t ...i, class F, class T >
constexpr auto applyIndexList( IndexList<i...>, F f, T&& t )
    -> typename std::result_of< F( Elem<i,T>... ) >::type
{
    return f( std::get<i>(std::forward<T>(t))... );
}

template< class F, class T,
          class _T = typename std::decay<T>::type,
          class IL = typename IListFrom<_T>::type >
constexpr auto applyTuple( const F& f, T&& t )
    -> decltype( applyIndexList(IL(),f,std::forward<T>(t)) )
{
    return applyIndexList( IL(), f, std::forward<T>(t) );
}

// Functional programmers may recognize this as cons.
constexpr struct PushFront {
    template< class ...X, class Y >
    constexpr auto operator () ( std::tuple<X...> t, Y y )
        -> std::tuple< Y, X... >
    {
        return std::tuple_cat( tuple(std::move(y)), std::move(t) );
    }
} pushFront{};

// Chain Left.
constexpr struct ChainL {
    template< class F, class X >
    constexpr X operator () ( const F&, X x ) {
        return x;
    }

    template< class F, class X, class Y, class ...Z >
    constexpr auto operator () ( const F& b, const X& x, const Y& y, const Z& ...z) 
        -> decltype( (*this)(b, b(x,y), z... ) )
    {
        return (*this)(b, b(x,y), z... );
    }
} chainl{};

// Fold Left.
constexpr struct FoldL {
    // Given f and {x,y,z}, returns f( f(x,y), z ).
    template< class F, class T >
    constexpr auto operator () ( const F& f, const T& t ) 
        -> decltype( applyTuple(chainl,pushFront(t,f)) )
    {
        return applyTuple( chainl, pushFront(t,f) );
    }
} foldl{};

auto ten = foldl( std::plus<int>(), std::make_tuple(1,2,3,4) );

We can call applyIndexList with different index lists to get interesting results.

// Because std::make_tuple can't be passed 
// to higher order functions.
constexpr struct MakeTuple {
    template< class ...X >
    constexpr std::tuple<X...> operator () ( X ...x ) {
        return std::tuple<X...>( std::move(x)... );
    }
} tuple{}; // A function object that constructs std::tuples.

// Returns the initial elements. (All but the last.)
// init( {1,2,3} ) = {1,2}
template< class T,
          size_t N = std::tuple_size<T>::value, 
          class IL = typename Enumerate< 0, N-1 >::type >
constexpr auto init( const T& t )
    -> decltype( applyIndexList(IL(),tuple,t) )
{
     // Construct a new tuple from the initial indexes.
     return applyIndexList( IL(), tuple, t );
}

// Returns a new tuple with every value from t except the first.
// tail( {1,2,3} ) = {2,3}
template< class T,
          size_t N = std::tuple_size<T>::value, 
          class IL = typename Enumerate< 1, N >::type >
constexpr auto tail( const T& t )
    -> decltype( applyIndexList(IL(),tuple,t) )
{
    return applyIndexList( IL(), tuple, t );
} 

Remember Get and RGet from above? They're templated function objects based on an index. We can write a more generic applyIndexList that allows specifying such a function and without losing the default behaviour.

template< template<size_t> class Fi = Get, size_t ...i, class F, class T >
constexpr auto applyIndexList( IndexList<i...>, const F& f, T&& t )
    -> typename std::result_of< F( 
        typename std::result_of< Fi<i>(T) >::type...
    ) >::type
{
    return f( Fi<i>()(std::forward<T>(t))... );
}

template< template<size_t> class Fi = Get, class F, class T,
          class _T = typename std::decay<T>::type,
          class IL = typename IListFrom<_T>::type >
constexpr auto applyTuple( const F& f, T&& t )
    -> decltype( applyIndexList<Fi>(IL(),f,std::forward<T>(t)) )
{
    return applyIndexList<Fi>( IL(), f, std::forward<T>(t) );
}

// Reconstruct t in reverse.
template< class T >
constexpr auto reverse( const T& t ) 
    -> decltype( applyTuple<RGet>(tuple,t) )
{
    return applyTuple< RGet >( tuple, t );
}

// Fold Right.
constexpr struct FoldR {
    // Given f and {x,y,z}, returns f( f(z,y), x ).
    template< class F, class T >
    constexpr auto operator () ( const F& f, const T& t ) 
        -> decltype( foldl(f,reverse(t)) )
    {
        return foldl( f, reverse(t) );
    }
} foldr{};


This leaves us with two ways of transforming tuples: modifying the index list and defining an alternative get function. foldr and foldl give us a way to fold a tuple into one value.


Tuples and functions.

Perhaps we have a function that takes a tuple, but our arguments are not in a tuple. Haskell defines a function, curry, to transform a pair-taking function into a two-argument function. Haskell does not have the same expressiveness with variadic types, so it can't write this more general C++ version.

// curry : ( (a,b...) -> c ) x a x b... -> c
template< class F, class ...X >
constexpr auto curry( const F& f, X&& ...x )
    -> decltype( f( std::forward_as_tuple(std::forward<X>(x)...) ) )
{
    return f( std::forward_as_tuple(std::forward<X>(x)...) );
}

// Pair Sum.
// psum : (int,int) -> int
unsigned int psum( std::tuple<int,int> p ) {
    return head(p) + last(p);
}

auto five = curry( psum, 3, 2 );

Haskell also defines uncurry, which is the same as applyTuple. The most important distinction between Haskell's curry and uncurry and this version is that Haskell's curry is a unary higher-order function, whereas this one is binary. However, the two are the same if one considers the unary version a partial application of the binary one.

Tuples can be used as a sort of partial application on their own. One might store some values in a tuple, and add more arguments later to be applied to some function. For a somewhat contrived example, consider the following:

constexpr struct PushBack {
    template< class ...X, class Y >
    constexpr auto operator () ( std::tuple<X...> t, Y y )
        -> std::tuple< X..., Y >
    {
        return std::tuple_cat( std::move(t), tuple(std::move(y)) );
    }
} pushBack{};

#include <cmath>

// Quadratic root.
constexpr struct QRoot {
    using result = std::tuple<float,float>;

    result operator () ( float a, float b, float c ) const {
        float root = std::sqrt( b*b - 4*a*c );
        float den  = 2 * a;
        return result( (-b+root)/den, (-b-root)/den );
    }
} qroot{};

std::ostream& operator << ( std::ostream& os, const QRoot::result r ) {
    return os << std::get<0>(r) << " or " << std::get<1>(r);
}

int main() {
    auto ab = std::make_tuple( 1, 3 );
    auto qroot_ab = [&] ( float c ) {
        return applyTuple( qroot, pushBack(ab,c) );
    };
    std::cout << "qroot(1,3,-4) = " << qroot_ab(-4) << std::endl;
    std::cout << "qroot(1,3,-5) = " << qroot_ab(-5) << std::endl;

    auto bc = std::make_tuple( 3, -4 );
    auto qroot_bc = [&] ( float a ) {
        return applyTuple( qroot, pushFront(bc,a) );
    };
    std::cout << "qroot(1,3,-4) = " << qroot_bc(1) << std::endl;
    std::cout << "qroot(1,3,-5) = " << qroot_bc(2) << std::endl;
}

One persuasive use-case of tuples is being able to freely manipulate variadic parameters.

template< class ...X >
constexpr auto third_arg( X&& ...x )
    -> Elem< 2, std::tuple<X...> >
{
    return std::get<2>( std::forward_as_tuple(std::forward<X>(x)...) );
}

Variadic parameters can be forwarded into a tuple, meaning anything we can do with tuples, we can do with variadic parameters. Arguments can be compacted into tuples, modified, reordered, and re-expanded into functions. One annoyance with the ... is that it eats up everything to the right. The following is ill-formed.

template< class ...X, class F >
constexpr auto backwards( X&& ...x, const F& f )
    -> typename std::result_of< F(X...) >::type
{
    return f( std::forward<X>(x)... );
}

I leave the solution as an exercise for the reader.


Conclusions.

This article is introductory, at best. I must admit I have had much less practice with tuples than with other C++11 features, but this seems to be true of GCC, too! Attempting to compile the following has caused an internal compiler error since at least 4.7.

#include <tuple>

template< class X > struct Hold {
    X x;
    constexpr Hold( X x ) : x(std::move(x)) { }
};

constexpr auto a = Hold<int>( 1 ); //ok
auto b = Hold<std::tuple<char>>( std::tuple<char>('c') ); // not
constexpr auto c = Hold<std::tuple<char>>( std::tuple<char>('c') ); // not

int main() {} 

I believe it's quite possible that whole programs could be written with tuples and basic data types, whether or not that should be preferred. We could look at them as a general utility class, but I think it is more appropriate to see them as lightweight, generalized, composable structs. I plan on writing a follow-up to cover applying multiple tuples, transforming tuples by functions, and a few other things. If anyone has any interesting use cases or neat tricks for tuples, I'd love to hear about them in the comments.

The next article covers element-wise application, applying multiple tuples, and more: "Zipping and Mapping Tuples".


Source code: https://gist.github.com/4256092  
Tuple reference: http://en.cppreference.com/w/cpp/utility/tuple

Sunday, December 9, 2012

GCC 4.8 Has Automatic Return Type Deduction.

Note: GCC 4.8 is still in development; this article is based on Ubuntu's snapshot package of 4.8. I do not know about availability on other platforms. I say "has" because it does work and code can be written using it right now, even if it's in testing.  

Update: It turns out this feature has been implemented to test n3386. You can read the discussion and even see the patch on the mailing list: http://gcc.gnu.org/ml/gcc-patches/2012-03/msg01599.html

Are your two favourite C++11 features decltype and declval? I have mixed feelings. On one hand, they let me write code like this

template< class X, class Y >
constexpr auto add( X&& x, Y&& y )
    -> decltype( std::declval<X>() + std::declval<Y>() )
{
    return std::forward<X>(x) + std::forward<Y>(y);
}

and know that it will work on any type and do the optimal thing if x or y should be moved or copied (like if X=std::string). On the other hand, it's tedious. "forward" and "declval" are both seven-letter words that have to be typed out every time, for every function, per variable. Then there's the std:: prefix and the <X>(x) suffix. The only benefit of using declval over forward is a savings of one letter.

But someone must have realized there's a better way. If the function is only one statement, and the return type is the declval of that statement, couldn't the compiler just figure out what I mean when I say this?

template< class X, class Y >
constexpr auto add( X&& x, Y&& y ) {
    return std::forward<X>(x) + std::forward<Y>(y);
}

In March of this year, Jason Merrill proposed just this (n3386), and GCC has already implemented it in 4.8 (change log), though it requires compiling with -std=c++1y. One can play with 4.8 on Ubuntu via the gcc-snapshot package. (Note that it doesn't modify your existing gcc install(s); it lives in /usr/lib/gcc-snapshot/bin/g++. Also, I have been unable to install any but the third-most-recent package.) I hope it is not too much trouble to install on other distros/platforms.

So if your favourite C++11 features are decltype and declval, prepare to never use them again. The compiler can deduce the type for you implicitly, and better, and it works even if the function is longer than one line. Take, for example, a reasonably complex template function like the liftM function I wrote for "Arrows and Kleisli":

constexpr struct LiftM {
    template< class F, class M, class R = Return<typename std::decay<M>::type> >
    constexpr auto operator () ( F&& f, M&& m )
        -> decltype( std::declval<M>() >>= compose(R(),std::declval<F>()) )
    {
        return std::forward<M>(m) >>= compose( R(), std::forward<F>(f) );
    }

    template< class F, class A, class B >
    constexpr auto operator () ( F&& f, A&& a, B&& b )
        -> decltype( std::declval<A>() >>= compose (
                rcloset( LiftM(), std::declval<B>() ),
                closet(closet,std::declval<F>())
            ) )
    {
        return std::forward<A>(a) >>= compose (
            rcloset( LiftM(), std::forward<B>(b) ),
            closet(closet,std::forward<F>(f))
        );
    }
} liftM{};

It could instead be written:

constexpr struct LiftM {
    template< class F, class M >
    constexpr auto operator () ( F&& f, M&& m ) {
        using R = Return< typename std::decay<M>::type >;
        return std::forward<M>(m) >>= compose( R(), std::forward<F>(f) );
    }

    template< class F, class A, class B >
    constexpr auto operator () ( F&& f, A&& a, B&& b ) {
        return std::forward<A>(a) >>= compose (
            rcloset( LiftM(), std::forward<B>(b) ),
            closet(closet,std::forward<F>(f))
        );
    }
} liftM{};

Automatic type deduction didn't exactly make this function more obvious or simple, but it did remove the visual cruft and duplication in the definition. Now, if I improve this function to make it clearer, I won't also have a decltype expression to edit.

To be fair, this doesn't entirely replace decltype. auto deduction strips references and cv-qualifiers from the deduced type, so it doesn't perfectly forward a return value. But it seems to work as expected most of the time.
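
A minimal sketch of the difference, using hypothetical front_auto and front_decltype helpers (the auto version needs -std=c++1y):

#include <vector>

template< class Seq >
auto front_auto( Seq& s ) {            // deduces int: the reference is stripped
    return s.front();
}

template< class Seq >
auto front_decltype( Seq& s ) -> decltype( s.front() ) {
    return s.front();                  // returns int&, as intended
}

int main() {
    std::vector<int> v = { 1, 2, 3 };
    front_decltype(v) = 10;            // modifies v[0]
    // front_auto(v) = 10;             // ill-formed: the result is a prvalue copy
}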

For another example of the use-case of auto return type deduction, consider this program:

#include <tuple>

int main() {
    auto x = std::get<0>( std::tuple<>() );
}

This small, simple, and obviously wrong program generates an error message 95 lines long. Why? GCC has to check whether this is valid for the std::pair and std::array versions of get, and when it checks the tuple version, it has to instantiate std::tuple_element recursively to find the type of the element. It actually checks the pair version first, so one has to search the message for the overload that was actually meant and figure out why it failed. A simple off-by-one error in your program can cause a massive and difficult-to-parse error message. We can improve this simply.

#include <tuple>

template< unsigned int i, class T >
auto get( T&& t ) {
    using Tup = typename std::decay<T>::type;
    static_assert( i < std::tuple_size<Tup>::value,
                   "get: Index too high!" );
    return std::get<i>( std::forward<T>(t) );
}

int main() {
    int x = get<0>( std::tuple<>() );
}

How much does this shrink the error by? Actually, it grew to 112 lines, but right at the top is

auto.cpp: In instantiation of 'auto get(T&&) [with unsigned int i = 0u; T = std::tuple<>]':
auto.cpp:13:36:   required from here
auto.cpp:7:5: error: static assertion failed: get: Index too high!

The error message might be a little bigger, but it tells you right off the bat what the problem is, meaning one has less to parse.

Similar to this static_assert example, typedefs done as default template arguments can be moved to the function body in many cases.

template< class X, class Y = A<X>, class Z = B<Y> >
Z f( X x ) {
    Z z;
    ...
    return z;
}

can now be written

template< class X > // Simpler signature.
auto f( X x ) {
    using Y = A<X>; // Type instantiation logic
    using Z = B<Y>; // done in-function.
    Z z;
    ...
    return z;
}

Looking forward.

This release of GCC also implements inheriting constructors, alignas, and attribute syntax. It also may have introduced a few bugs; for example, my library, which compiles with 4.7, does not with 4.8, producing many undefined references.

The other features of this release might not be quite so compelling, but automatic return type deduction alone is powerful enough to change the way code is written in C++ yet again. Heavily templated code will become a breeze to write and maintain, as figuring out the return type is often the hardest part. I find it encouraging that this has been implemented so quickly. Of course, it's not standard, at least not yet.

On a final note, I wonder how this will interact with the static if. It would be nice were the following well-formed:

template< class X >
auto f( X x ) {
    if( std::is_same<int,X>::value )
        return "Hi"; // Int? Return string.
    else
        return x / 2; // Other? halve.
}

Saturday, December 1, 2012

std::move and lambda? It's just partial application!

So, "Learn how to capture by move" was posted on isocpp.org and then this morning, "Another Alternative to lambda move capture" was on reddit. These are both impressive solutions and I look forward to seeing what other creative answers the C++ community will come up with. I also want to go over how this problem can be solved with partial application.

First, here's the definition of the partial application type.

template< class F, class X >
struct Part {
    F f;
    X x;

    template< class _F, class _X >
    constexpr Part( _F&& f, _X&& x )
        : f(forward<_F>(f)), x(forward<_X>(x))
    {
    }

    template< class ... Xs >
    auto operator () ( Xs&& ...xs ) 
        -> decltype( f(x,declval<Xs>()...) )
    {
        return f( x, forward<Xs>(xs)... );
    }
};

template< class F, class X >
constexpr Part<F,X> closet( F f, X x ) {
    return Part<F,X>( std::move(f), std::move(x) );
}
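
As a quick usage sketch (std::plus lives in <functional>):

#include <functional>

// Partially apply std::plus to get a unary "add one".
auto addOne = closet( std::plus<int>(), 1 );
// addOne(2) == 3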

So, there's this thing called the upward funarg problem. Let's say we wrote a function like this:

std::function<std::vector<int>()> bad_originate() {
    std::vector<int> v = { 1, 2, 3, 4, 5 };
    return [&]{ return v; };
}
 
auto bad = bad_originate();
bad(); // Undefined behaviour: dangling reference!

This fails because the lambda captures v by reference, and that reference dangles once v goes out of scope at the end of bad_originate. We could copy v into the lambda, but our goal is a move.
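
For contrast, the by-value capture is perfectly safe; it just pays for a copy:

// Capturing v by value: correct, but copies the vector into the closure.
std::function<std::vector<int>()> copy_originate() {
    std::vector<int> v = { 1, 2, 3, 4, 5 };
    return [v]{ return v; };
}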

So can we fix this with partial application? Let's add a bit to the example by including a function to partially apply.

std::vector<int> add_all( std::vector<int>& v, int x ) {
    // Arbitrarily move v.
    auto w = std::move(v);

    std::transform( w.begin(), w.end(), w.begin(),
                    closet(std::plus<int>(),x) );

    return w;
}

using Origin = Part<decltype(add_all)*,std::vector<int>>;
Origin originate() {
    std::vector<int> v = { 1, 2, 3, 4, 5 };
    return closet( add_all, std::move(v) );
}

And it ain't hard at all.

Origin f = originate();
print_vec( "original : ", f.x );
print_vec( "added 10 : ", f(10) );
print_vec( "original : ", f.x );

will output

    original :  = 1 2 3 4 5
    added 10 :  = 11 12 13 14 15
    original :  =


Note that the third line is empty: add_all moved the vector out of f.x on the first call. Maybe one would prefer that add_all take its argument by value; now, composition becomes useful.

template< class F, class G > struct Composition {
    F f = F();
    G g = G();

    Composition() { }
    Composition( F f, G g) 
        : f(move(f)), g(move(g)) { }

    template< class X, class ...Y >
    auto operator () ( X&& x, Y&& ...y ) 
        -> decltype( f(g(declval<X>()), declval<Y>()...) )
    {
        return f( g( forward<X>(x) ), forward<Y>(y)... );
    }
};

template< class F, class G > 
Composition<F,G> comp( F f, G g ) {
    return Composition<F,G>( std::move(f), std::move(g) );
}

constexpr struct Mover {
    template< class X >
    X&& operator () ( X& x ) const {
        return std::move(x);
    }
} mover{}; 

std::vector<int> add_all2( std::vector<int> w, int x ) {
    std::transform( w.begin(), w.end(), w.begin(),
                    closet(std::plus<int>(),x) );

    return w;
}
 
using MoveAddAll = decltype( comp(add_all2,mover) );
using Origin2 = Part< MoveAddAll, std::vector<int> >;
Origin2 originate2() {
    std::vector<int> v = { 1, 2, 3, 4, 5 };
    return Origin2( comp(add_all2,mover), std::move(v) );
}

and

Origin2 g = originate2();
print_vec( "original : ", g.x );
print_vec( "added 10 : ", g(10) );
print_vec( "original : ", g.x );

will output the same as above.

The code to this mini-article can be found here: https://gist.github.com/4182570

Friday, November 30, 2012

Arrows and Kleisli in C++

Control.Arrow shows an odd but interesting part of Haskell. Arrows are functions: composable and callable. Arrows and Monads are sometimes referred to as alternatives to each other. Perhaps now would be a good time to relate some category theory.

From A to B.

A category theorist might view functions as a transformation from one type to another. For example,

    std::to_string : X -> std::string

would mean that to_string is a function that maps an X to a string.

In "Generic Function Objects", I talked about type constructors.

/* MakeT T : X -> T<X> */
template< template<class...> class T > struct MakeT {
    template< class ...X, class R = T< typename std::decay<X>::type... > >
    constexpr R operator () ( X&& ...x ) {
        return R( std::forward<X>(x)... );
    }
};

/* pair : X x Y -> std::pair<X,Y> */
constexpr auto pair = MakeT<std::pair>();

I use the notation X x Y to mean that MakeT<pair> takes two arguments. Though, in Haskell, it would actually look like this:

    make_pair :: X -> Y -> (X,Y)

Haskell uses curried notation.

    g : pair<X,A> -> pair<X,B>

Here, g is a function that transforms the second element of the pair to a B, but leaves the first as an X. Although this cannot be inferred from the definition, let's assume that the first value (X) is not transformed in any way. There is a function that represents this non-transformation.

/* id : X -> X */
constexpr struct Id {
    template< class X >
    constexpr X operator () ( X&& x ) {
        return std::forward<X>(x);
    }
} id{};

Arrows provide the tools to take a normal function, f : A -> B, and convert it to a function like g. This is sort of like composition

/* compose : (B -> C) x (A -> B) -> (A -> C) */
template< class F, class G >
struct Composition
{
    F f; G g;

    template< class _F, class _G >
    constexpr Composition( _F&& f, _G&& g ) 
        : f(std::forward<_F>(f)), g(std::forward<_G>(g)) { }

    template< class X >
    constexpr auto operator() ( X&& x) 
        -> decltype( f(g(std::declval<X>())) )
    {
        return f( g( std::forward<X>(x) ) );
    }
};

constexpr auto compose = MakeT<Composition>();

If we look at A -> B as a type itself, say AB, and B -> C as a type, BC, then it is clear that composition is BC x AB -> AC. Functions themselves are values with types, and composition creates a new value and type. We can think of them as being on a geometric plane with the points A, B, and C. Functions are connections from point to point. Similarly, we can think of AB, BC, and AC as points on a different plane with arrows connecting AB and BC to AC (though AB and BC cannot be connected to each other).

 

Functions as Arrows.

Arrows have several operations. The first is arr, which transforms a function to an arrow; but since functions are arrows, it isn't useful, right off the bat. first and second take functions, A -> B, and lift them to pair-oriented functions. fan takes two functions, one A -> B, another A -> C, and returns a function to pair<B,C>. split takes two functions as well, A -> B and X -> Y, and returns a function pair<A,X> -> pair<B,Y>.

    arr : (A -> B) -> (A -> B)
    first : (A -> B) -> (pair<A,X> -> pair<B,X>)
    second : (A -> B) -> (pair<X,A> -> pair<X,B>)
    split : (A -> B) x (X -> Y) -> (pair<A,X> -> pair<B,Y>)
    fan : (A -> B) x (A -> C) -> (A -> pair<B,C>)

First, we fill in the declarations.

template< class A, class F, class Arr = Arrow< Cat<A> > >
constexpr auto arr( F&& f ) ->  decltype( Arr::arr( std::declval<F>() ) )
{
    return Arr::arr( std::forward<F>(f) );
}

template< class A > struct Arr {
    template< class F >
    constexpr auto operator () ( F&& f ) -> decltype( arr(std::declval<F>()) )
    {
        return arr( std::forward<F>(f) );
    }
};

constexpr struct Split {
    template< class F, class G, class A = Arrow<Cat<F>> >
    constexpr auto operator () ( F&& f, G&& g )
        -> decltype( A::split(std::declval<F>(), std::declval<G>()) )
    {
        return A::split( std::forward<F>(f), std::forward<G>(g) );
    }
} split{};

constexpr struct Fan {
    template< class F, class G, class A = Arrow<Cat<F>> >
    constexpr auto operator () ( F&& f, G&& g )
        -> decltype( A::fan(std::declval<F>(),std::declval<G>()) )
    {
        return A::fan( std::forward<F>(f), std::forward<G>(g) );
    }
} fan{};

constexpr struct First {
    template< class F, class A = Arrow<Cat<F>> >
    constexpr auto operator () ( F&& f ) 
        -> decltype( A::first(std::declval<F>()) ) 
    {
        return A::first( std::forward<F>(f) );
    }
} first{};

constexpr struct Second {
    template< class F, class A = Arrow<Cat<F>> >
    constexpr auto operator () ( F&& f ) -> decltype( A::second(std::declval<F>()) ) {
        return A::second( std::forward<F>(f) );
    }
} second{};

arr will be trivial to implement, but the others are tricky. Luckily, we can define them mostly in terms of split, which looks like this:

/* pairCompose : (A -> B) x (X -> Y) -> (pair<A,X> -> pair<B,Y>) */
template< class F, class G > struct PairComposition {
    F f; G g;

    template< class _F, class _G >
    constexpr PairComposition( _F&& f, _G&& g )
        : f(std::forward<_F>(f)), g(std::forward<_G>(g))
    {
    }

    template< class P/*air*/ >
    constexpr auto operator() ( const P& p ) 
        -> decltype( std::make_pair( f(std::get<0>(p)), g(std::get<1>(p)) ) )
    {
        return std::make_pair( f(std::get<0>(p)), g(std::get<1>(p)) );
    }
};

constexpr auto pairCompose = MakeT<PairComposition>();

pairCompose returns a function that expects a pair and returns a pair, threading the first value through f and the second through g. We can compose PairCompositions.

namespace std {
std::string to_string( const std::string& s );
template< class X, class Y >
std::string to_string( const std::pair<X,Y>& p );

std::string to_string( const std::string& s ) {
    return "\"" + s + "\"";
}

template< class X, class Y >
std::string to_string( const std::pair<X,Y>& p ) {
    return "(" + to_string(p.first) + "," + to_string(p.second) + ")";
}
}

constexpr struct ToString {
    template< class X >
    std::string operator () ( const X& x ) const {
        return std::to_string(x);
    }
} string{};

int main() {
    using std::cin;
    using std::cout;
    using std::endl;

    auto plus2 = []( int x ){ return x+2; };

    std::pair<int,int> p( 1, 1 );

    cout << "((+2) . string, (+4))( 1, 1 ) = " 
         << compose( pairCompose(string,plus2), 
                     pairCompose(plus2, plus2) )(p) << endl;
}

This will output ("3",5). It's easiest to look at this as p.first and p.second being on two separate paths. I have written it such that the individual paths are verticle. The first path starts at 1, and goes to plus2(1), to show(plus2(1)). The second path starts at 2 and ends at plus2(plus2(2)). The odd part is that we're defining both paths at once.

 Observe that first(f) = split(f,id). Proof?

    first : (A -> B) -> (pair<A,X> -> pair<B,X>)
    split : (A -> B) x (X -> Y) -> (pair<A,X> -> pair<B,Y>)  
    id : X -> X
    f : A -> B 
    split(f,id) : pair<A,X> -> pair<B,X>

Since we know first(f) = split(f,id), we can intuit that second(f) = split(id,f) and also, split(f,g) = compose( first(f), second(g) ).

fan represents a fork in the road. One variable gets fed into two functions, the results of which get zipped into a pair. duplicate will do the forking, but we'll rely on split to implement fan.

constexpr struct Duplicate {
    template< class X, class P = std::pair<X,X> > 
    constexpr P operator() ( const X& x ) {
        return P( x, x );
    }
} duplicate{};


With this, we can say that fan(f,g)(x) = split(f,g)( duplicate(x) ). Since we know what everything looks like, we can define Arrow<F>.

template< class Func > struct Arrow<Func> {
    template< class F >
    static constexpr F arr( F&& f ) { return std::forward<F>(f); }

    /* split(f,g)(x,y) = { f(x), g(y) } */
    template< class F, class G >
    static constexpr auto split( F f, G g ) 
        -> PairComposition<F,G>
    {
        return pairCompose( std::move(f), std::move(g) );
    }

    /*
     * first(f)(x,y)  = { f(x), y }
     * second(f)(x,y) = { x, f(y) }
     */
     
    template< class F >
    static constexpr auto first( F f ) 
        -> decltype( split(std::move(f),id) )
    {
        return split( std::move(f), id );
    }

    template< class F >
    static constexpr auto second( F f ) 
        -> decltype( split(id,std::move(f)) )
    {
        return split( id, std::move(f) );
    }

    /* fan(f,g)(x) = { f(x), g(x) } */
    template< class F, class G >
    static constexpr auto fan( F f, G g ) 
        -> decltype( compose( split(std::move(f),std::move(g)), duplicate ) )
    {
        return compose( split(std::move(f),std::move(g)), duplicate );
    }
};

Now, we can rewrite the above example:

int main() {
    auto plus2 = []( int x ){ return x+2; };

    cout << "(+2) *** (+2) >>> string *** (+2) $ 1 = " 
         << compose( split(string, plus2), 
                     fan(  plus2,  plus2) )(1) << endl;
}

As before, this outputs ("3",5). One way to think of Arrows is as paths. fan represents a fork in the road, split defines two separate, but parallel, paths at once, and first and second allow progress along one of the paths. For an arbitrary example, consider a branching path that hides some treasure. What sequence of moves is required to recover it?

/* 
 * Get 0 : (X,Y) -> X
 * Get 1 : (X,Y) -> Y
 */
template< size_t N > struct Get {
    template< class P >
    constexpr auto operator () ( P&& p )
        -> decltype( std::get<N>(std::declval<P>()) )
    {
        return std::get<N>( std::forward<P>(p) );
    }
};

int main() {
    constexpr auto fst = Get<0>();
    constexpr auto snd = Get<1>();
    const auto oneHundred = []( int ){ return 100; }; // (lambdas cannot be constexpr in C++11)
    
    // Hide the hundred.
    // hidden = pair( pair( 0, pair(100,0) ), 0 )
    auto hidden = fan( fan(id,fan(oneHundred,id)), id )( 0 );
    // Function to find it again.
    auto find = compose( fst, compose(snd,fst) );
    
    cout << "I found " << find(hidden) << "!" << endl;
}


Enter Kleisli.

 This all works great for free functions, but some functions look like this:

    f : X -> M<Y> -- Where M is some Monad

This f is a member of the Kleisli category. It accepts a normal value and returns a Monadic one. Examples of Kleisli functions (pseudocode):

    Just : X -> unique_ptr<X>
    Just = MakeT<unique_ptr>();

    Seq : X -> std::vector<X>
    Seq(x) = std::vector<X>{x} 

    mreturn<M> : X -> M<X>

unique_ptrs and vectors are monads, so any function that produces one from some x can be considered to be in the Kleisli category. To the compiler, though, there's nothing special about such a function, so we define a wrapper type around it. This is a fairly common pattern, so I define a base class for it.

template< class F > struct Forwarder {
    using function = typename std::decay<F>::type;
    function f = F();

    template< class ...G >
    constexpr Forwarder( G&& ...g ) : f( std::forward<G>(g)... ) { }

    template< class ...X >
    constexpr auto operator() ( X&& ...x )
        -> decltype( f(std::declval<X>()...) )
    {
        return f( std::forward<X>(x)... );
    }

    constexpr operator function () { return f; }
};

The Kleisli itself can be implemented two ways. The Haskell way would be something like this:

/* Kleisli M A B : A -> M<B> */
template< template<class...> class M, class A, class B,
          class F = std::function<M<B>(A)> >
struct Kleisli : Forwarder<F> {
    template< class ...G >
    constexpr Kleisli( G&& ...g ) : Forwarder<F>(std::forward<G>(g)...) { }
};

This is faithful to how Haskell defines it.

    newtype Kleisli m a b = Kleisli { runKleisli :: a -> m b }

But just think about this for a moment. A Kleisli is an Arrow, right? What would Arrow<Kleisli>::first return?

    first : (A -> B) -> (pair<A,X> -> pair<B,X>)
    Kleisli f = A -> M<B>
    first(Kleisli f) : pair<A,X> -> M<pair<B,X>>

What's the type of X? It's truly impossible to know because it depends on what gets passed in, which is what the notation above means.

Is it impossible to define Kleisli in this way? I don't know. I attempted to specialize its composition function for when A or B were pair types, but there are four combinations of whether A or B or both is a pair. I tried assigning X the type of std::placeholders::_1, but none of my attempts to make it really work compiled. (It was horrible.)

But we don't have any of that trouble if we define Kleisli differently.


Kleisli<F>.

/* Kleisli M F : F -- where F : A -> M<B> */ 
template< template<class...> class M, class F = Id >
struct Kleisli : Forwarder<F> {
    template< class ...G >
    constexpr Kleisli( G&& ...g ) : Forwarder<F>(std::forward<G>(g)...) { }
};
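
The examples below use a kleisli<M> helper to build one of these while deducing F; if it is not already defined elsewhere, a minimal sketch might be:

// A minimal sketch of the kleisli<M> helper used in the examples below:
// wrap any callable f : A -> M<B> in a Kleisli, deducing F.
template< template<class...> class M, class F,
          class K = Kleisli< M, typename std::decay<F>::type > >
constexpr K kleisli( F&& f ) {
    return K( std::forward<F>(f) );
}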

An implicit requirement of arrows is that they can be composed, but Kleislies?

    composeKleisli : Kleisli(B -> M<C>) x Kleisli(A -> M<B>) -> (...?)

We can reasonably assume that it should be Kleisli(A -> M<C>), but our naive definition of composition must be specialized. Literally.

/* Composition : Kleisli(B -> M<C>) x Kleisli(A -> M<B>) -> (A -> M<C>) */
template< template<class...> class M, class F, class G >
struct Composition<Kleisli<M,F>,Kleisli<M,G>>
{
    Kleisli<M,F> f; 
    Kleisli<M,G> g;

    template< class _F, class _G >
    constexpr Composition( _F&& f, _G&& g ) 
        : f(std::forward<_F>(f)), g(std::forward<_G>(g)) { }

    template< class X >
    constexpr auto operator() ( X&& x ) 
     -> decltype( g(std::forward<X>(x)) >>= f )
    {
        return g(std::forward<X>(x)) >>= f;
    }
};

/* kcompose : Kleisli(B -> M<C>) x Kleisli(A -> M<B>) -> Kleisli(A -> M<C>) */
constexpr struct KCompose {
    template< template<class...> class M, class F, class G >
    constexpr auto operator () ( Kleisli<M,F> f, Kleisli<M,G> g )
        -> Kleisli< M, Composition<Kleisli<M,F>,Kleisli<M,G>> >
    {
        return kleisli<M> ( 
            compose( std::move(f), std::move(g) )
        );
    }
} kcompose{};
 
int main() {
    auto stars = kleisli<std::basic_string> (
        [] (char c) -> std::string { 
            return c == '*' ? "++" :
                   c == '+' ? "* *" : std::string{c};
        }
    );
    
    auto starsSqr = kcompose( stars, stars );
    
    auto starsCube = kcompose( starsSqr, stars );
    
    cout << "stars of '*' : " << stars('*') << endl;
    cout << "stars^2 of '*' : " << starsSqr('*') << endl;
    cout << "stars^3 of '*' : " << starsCube('*') << endl;
}

This outputs  

    stars of '*' : "++"
    stars^2 of '*' : "* ** *"
    stars^3 of '*' : "++ ++++ ++"


Like before, it would be most convenient to define Arrow<Kleisli> in terms of split. split(f,g), given the pair {x,y}, will have to pass x into f and y into g, both of which will return Monadic values. Finally, a pair will have to be constructed from the values extracted from f(x) and g(y).

    split: Kleisli (A -> M B) x Kleisli (X -> M Y) -> Kleisli (pair<A,X> -> M pair<B,Y>)

To extract the values from f(x) and g(y), we need to call mbind on each, which in Haskell might look like this:

    f(x) >>= (\x' -> g(y) >>= (\y' -> return (x',y')))

Or, Control.Monad defines liftM2.

    liftM2 (,) f(x) g(y) -- where (,) is equivalent to make_pair

template< class M > struct Return {
    template< class X >
    constexpr auto operator () ( X&& x ) 
        -> decltype( mreturn<M>(std::declval<X>()) )
    {
        return mreturn<M>( std::forward<X>(x) );
    }
}; 
 
/*
 * liftM : (A -> B) x M<A> -> M<B>
 * liftM : (A x B -> C) x M<A> x M<B> -> M<C>
 */
constexpr struct LiftM {
    template< class F, class M, class R = Return<typename std::decay<M>::type> >
    constexpr auto operator () ( F&& f, M&& m )
        -> decltype( std::declval<M>() >>= compose(R(),std::declval<F>()) )
    {
        return std::forward<M>(m) >>= compose( R(), std::forward<F>(f) );
    }

    template< class F, class A, class B >
    constexpr auto operator () ( F&& f, A&& a, B&& b )
        -> decltype( std::declval<A>() >>= compose (
                rcloset( LiftM(), std::declval<B>() ),
                closet(closet,std::declval<F>())
            ) )
    {
        return std::forward<A>(a) >>= compose (
            rcloset( LiftM(), std::forward<B>(b) ),
            closet(closet,std::forward<F>(f))
        );
    }
} liftM{};

It acts basically as an n-ary mbind, though one could also define an n-ary mbind! liftM works even if you don't.

Finally, we have all the pieces in place to implement KleisliSplit.

/* kleisliSplit : Kleisli(A -> M<B>) x Kleisli(X -> M<Y>) -> (pair<A,X> -> M<pair<B,Y>>) */
template< template<class...> class M, class F, class G >
struct KleisliSplit {
    F f;
    G g;

    constexpr KleisliSplit( F f, G g ) : f(std::move(f)), g(std::move(g)) { }

    template< class X, class Y >
    constexpr auto operator () ( const std::pair<X,Y>& p )
        -> decltype( liftM(pair,f(std::get<0>(p)),g(std::get<1>(p))) )
    {
        return liftM (
            pair, 
            f( std::get<0>(p) ), 
            g( std::get<1>(p) )
        );
    }
};

The final note before moving on: arr. Kleisli's arr is like a Monad's mreturn.

    arr : (A -> B) -> Kleisli(A -> M<B>)
    arr(f)(x) = mreturn<M>( f(x) )

or:

     arr = liftM

template< template<class...> class M, class F >
struct Arrow< Kleisli<M,F> > {

    template< class G >
    using K = Kleisli<M,G>;

    template< class G >
    static constexpr auto arr( G g ) -> Kleisli< M, Part<LiftM,G> > {
        return kleisli<M>( closet(liftM,std::move(g)) );
    }

    template< class G >
    static constexpr auto first( G g ) 
        -> K<decltype( ::split(std::move(g),arr(id)) )> 
    {
     // id is not a Kleisli. 
     // The call to arr refers to the arr above, not ::arr.
     // arr(id) : Kleisli(X -> M<X>)
        return ::split( std::move(g), arr(id) );
    }

    template< class G >
    static constexpr auto second( G g) 
        -> K<decltype( ::split(arr(id),std::move(g)) )>
    {
        return ::split( arr(id), std::move(g) );
    }

    template< class G >
    static constexpr auto split( Kleisli<M,F> f, Kleisli<M,G> g )
        -> K< KleisliSplit<M,F,G> >
    {
        return KleisliSplit<M,F,G>( std::move(f.f), std::move(g.f) );
    }

    template< class _F, class _G >
    static constexpr auto fan( _F&& f, _G&& g )
        -> decltype( kcompose(split(std::declval<_F>(),std::declval<_G>()),arr(duplicate)) )
    {
        return kcompose( split(std::forward<_F>(f),std::forward<_G>(g)), arr(duplicate) );
    }
};
 
int main() {
    auto stars = kleisli<std::vector> (
        [] (char c) { 
            return c == '*' ? std::vector<char>{'+','+'} :
                   c == '+' ? std::vector<char>{'*',' ','*'} : std::vector<char>{c};
        }
    );
    
    auto hairs = kleisli<std::vector> (
        [] (char c) -> std::vector<char> { 
            return c == '*' ? std::vector<char>{'\'','"','\''} :
                   c == '+' ? std::vector<char>{'"',' ','"'} :
                   c == '"' ? std::vector<char>{'\''} :
                   c == '\'' ? std::vector<char>{} :
                   std::vector<char>{c};
        }
    );

    cout << "hairs of '*' : " << hairs('*') << endl;
    cout << "hairs^2 of '*' : " << compose(hairs,hairs)('*') << endl;
    
    cout << "split(stars,hairs) (*,*) = " << split(stars,hairs)(pair('*','*')) << endl;
    cout << "fan(stars,hairs)    (*)  = " << fan(stars,hairs)('*') << endl;
    cout << "fan(hairs,stars)    (*)  = " << fan(hairs,stars)('*') << endl;
    cout << "split(hairs,stars) . fan(stars,hairs) = " 
      << compose( split(hairs,stars), fan(stars,hairs) )('*') << endl;
}
 
Finally, this outputs

    hairs of '*' : [',",']
    hairs^2 of '*' : [']
    split(stars,hairs) (*,*) = [(+,'),(+,"),(+,'),(+,'),(+,"),(+,')]
    fan(stars,hairs)    (*)  = [(+,'),(+,"),(+,'),(+,'),(+,"),(+,')]
    fan(hairs,stars)    (*)  = [(',+),(',+),(",+),(",+),(',+),(',+)]
    split(hairs,stars) . fan(stars,hairs) = [(",'),( ,'),(",'),(","),( ,"),(","),(",'),( ,'),(",'),(",'),( ,'),(",'),(","),( ,"),(","),(",'),( ,'),(",')]




Extras -- Convenience and Pitfalls.

Perhaps the most important parts of Control.Arrow are here, but there are still a few things that can be added. For example, compose is backwards! compose(f,g) means "do g, then do f." Because of this, we have to read it backwards. fcomp is useful, if only because we read left to right, not the other way around.

/* fcomp : (A -> B) x (B -> C) -> (A -> C) */
constexpr struct FComp {
    template< class G, class F, class C = Composition<F,G> >
    constexpr C operator () ( G g, F f ) {
        return C(std::move(f),std::move(g));
    }
} fcomp{};
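
For example, reusing plus2 and the string function object from the earlier example, fcomp reads in application order (a small usage sketch):

auto plus2 = []( int x ){ return x + 2; };

// "add two, then stringify" -- same as compose( string, plus2 ).
auto plus2ThenString = fcomp( plus2, string );
// plus2ThenString(1) == "3"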

Other tools are prefcomp and postfcomp. Perhaps one wants to compose something that is not a Kleisli with something that is.

/* prefcomp : (A -> B) x (Arrow X Y) -> Arrow A Y */ 
constexpr struct PreFComp {
    template< class F, class A >
    constexpr auto operator () ( F&& f, A&& a )
        -> decltype( arr<A>(declval<F>()) > declval<A>() )
    {
        return arr<A>(forward<F>(f)) > forward<A>(a);
    }
} prefcomp{};

/* postfcomp : Arrow A B x (X -> Y) -> Arrow A Y */ 
constexpr struct PostFComp {
    template< class A, class F >
    constexpr auto operator () ( A&& a, F&& f )
        -> decltype( declval<A>() > arr<A>(declval<F>()) )
    {
        return forward<A>(a) > arr<A>(forward<F>(f));
    }
} postfcomp{};

Beyond that, there's ArrowZero and ArrowPlus to work with Monoids, ArrowChoice based on Either, ArrowApply and ArrowLoop.

ArrowZero, defining zeroArrow, shows the downside to implementing Kleisli as K<M,F> instead of K<M,A,B>. It is defined like so:

    zeroArrow = Kleisli (\_ -> mzero)

The type mzero needs to return is M<B>. Its argument, the _, will be of type A. So we have this problem where zeroArrow(k) (for some Kleisli, k) doesn't know what type to return and can't deduce it!

If we had implemented Kleisli with A and B, this would be no problem, but then it would have been impossible(?) to implement Arrow<Kleisli>::first. In Haskell, all functions carry around information about their domain and codomain, so there is no difference between passing around A and B or just F.

The easiest solution is to diverge from Haskell's definition and require zeroArrow to take an explicit result-type parameter.
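
A rough sketch of that divergence, assuming a hypothetical mzero<T>() that produces the monoid's zero (an empty vector, a null unique_ptr, and so on):

// Hypothetical sketch: zeroArrow takes the result type explicitly because it
// cannot be deduced from the wrapped function alone.
template< template<class...> class M, class B > struct ZeroArrow {
    template< class X >
    constexpr M<B> operator () ( X&& ) const {
        return mzero< M<B> >();  // mzero is assumed, not defined in this article
    }
};

template< template<class...> class M, class B >
constexpr Kleisli< M, ZeroArrow<M,B> > zeroArrow() {
    return Kleisli< M, ZeroArrow<M,B> >( ZeroArrow<M,B>() );
}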


Conclusions.

Arrows let us do interesting types of composition that one would not normally think of. They fall somewhere between Functors and Monads (this article has a graph). My Monadic parser could also have been written with Arrows.

I realize this has been a very long post. Hopefully all the code examples work, there are no inaccuracies, and it is understandable, but with an article this large, something has probably been fudged. Due to the large amount of material, it may be a good idea to split this article up and go more in depth. Please speak up if something seems off.


Source code: https://gist.github.com/4176121
Haskell reference: Control.Arrow