Thursday, August 23, 2012

Implementing the Maybe Monad in C++.

Update: See "fmap in C++" for a wider look at functors (like Maybe).

I decided to take the day off and write this little post about implementing Haskell's Maybe in C++. Haskell is not a requirement, though I will be showing some source code, so some familiarity may help. Those who already know it may note that my code examples for fmap and >>= aren't exactly the same as Haskell's. All of the following code is completely hypothetical and overly simplified so we can focus on the concepts.

I have seen a few Maybe implementations in C++ around the net, but Maybe by itself is entirely useless. It doesn't really do anything on its own. In Haskell, it exists inside an ecosystem of other functions it works well with. Outside of Haskell, it's quite bare. Let's look at some examples of Maybe:

    x = Just 2
    y = Nothing
    z = fromJust x -- 2

    -- Define a function, g, that returns a string.
    let g m = case m of
        Just _ -> "Has a value"
        Nothing -> "Has none."

    g x -- "Has a value"
    g y -- "Has none."


All very simple stuff here. Make a Maybe, extract a value. A small observation: A Maybe has a value, but also has a boolean "is" or "isn't" property to it. There's something in C and C++ that acts almost exactly like this. (Try and guess before reading on.)

The Maybe types


Pointers. "x = Just 2" in C++ would be "unique_ptr<int> x( new int(2) );". A pointer either points to null (Nothing) or to something (Just). A bool with a value! Now let's look at a really silly Maybe implementation that ignores this similarity.

template< class X > struct Maybe {
    X x;
    bool valid;

    Maybe( const X& x ) : x(x), valid(true) { }
    Maybe() : valid(false) { }
};

template< class X > Maybe<X> Just( const X& x ) {
    return Maybe<X>(x);
}

template< class X > Maybe<X> Nothing() { return Maybe<X>(); }

...

auto a = Just(1); // Maybe<int>
auto b = Nothing<int>();

This works fine, and avoids an allocation, so it's more efficient, but the problem comes when we write fromJust. We could return x when valid==true, but what happens when valid==false? This ignores another interesting property of Maybe: for any value x, Just x != Nothing. If fromJust returns x with no checks, then fromJust(Just(x)) == fromJust(Nothing()), which seems to violate that rule. The other option is to throw an exception, but I generally like to avoid that.

This also assumes X is default-constructible. It may not be, or it may have an expensive initialization, so the above may not work for some types at all. Choosing a different implementation seems most obvious to me, so let's try this:

template< class X >
using Maybe = std::unique_ptr<X>; // C++11 syntax

template< class X > Maybe<X> Just( const X& x ) {
    return Maybe<X>( new X(x) );
}

template< class X > Maybe<X> Nothing() { return Maybe<X>(nullptr); }

template< class X > const X& fromJust( const Maybe<X>& m ) {
    return *m;
}

The fromJust problem is solved: no value Just(x) equals Nothing, and fromJust on a null pointer will fail, like fromJust on a Nothing in Haskell. What's more, we can drop fromJust altogether and dereference the pointer directly, which makes the unsafeness of the operation more obvious. I can think of at least one improvement, but this will do for now.

Motivation


So far, we have not done a single useful thing with Maybe, but Maybe in Haskell works inside an ecosystem, so let's implement a part of it.

/*
 * fmap f (Just x) = Just (f x)
 * fmap f Nothing = Nothing
 */
template< class F, class X,
    // decltype and std::declval are good to know if you don't already.
    class R = decltype( std::declval<F>()( std::declval<X>() ) ) >
Maybe<R> fmap( const F& f, const Maybe<X>& m ) {
    return m ? Just( f(*m) ) : Nothing<R>();
}


int plus_two( int x ) { return x + 2; }
int minus_two( int x ) { return x - 2; }

...


fmap( plus_two, Just(2) ); // Just 4
fmap( minus_two, fmap(plus_two, Just(2)) ); // Just 2
fmap( minus_two, fmap(plus_two, Nothing<int>()) ); // Nothing

When I learned of fmap, things suddenly started to make more sense. I have a Maybe Int value and want the Int part transformed through various functions to produce some arbitrary value that gets wrapped in a Maybe. I don't have to constantly check whether the value is valid; that happens automatically.

Where code used to look like...

void f() {
    X* x = return_some_pointer();
    if( x ) {
        Y* y = new Y(transform_value(*x));
        if( y ) {
            // do something
        }
    }
}

it can now be rewritten

void f() {
    fmap (
        []( const Y& y ){ /*do something*/ },
        fmap( transform_value, return_some_pointer() )
    );
}

Certainly more concise, although whether this reads more clearly might be a little subjective. Now that we have fmap, I want to also implement a monadic operation on Maybe values.

// Just x >>= f = f x -- where f returns a Maybe.
// Nothing >>= f = Nothing
template< class X, class F,
    class MR = decltype( std::declval<F>()( std::declval<X>() ) ) >
MR operator>>= ( const Maybe<X>& a, const F& f ) {
    return a ? f(*a) : MR();
}

Maybe<int> twoOver( int x ) { return x ? Just(2/x) : Nothing<int>(); }

...

Maybe<int> x = Just(2);
fmap( plus_two, x >>= twoOver ); // Just 3

Here, x >>= twoOver returns Just 1 (2/2), then fmap applies plus_two to get Just 3. We can use fmap to apply normal functions and >>= to apply Maybe-returning functions. One might think of fmap as lifting normal functions into Maybe and >>= as working with functions already there. (Though nothing actually stops us from using >>= to apply non-Maybe-returning functions, except the honor system. ;)

Here's some code I wrote at one point for an attempt at a WebKit-based internet browser.

    WebKitWebFrame* frame        = webkit_web_view_get_main_frame( web_view );

    WebKitWebDataSource* source  = webkit_web_frame_get_data_source( frame );
    GString*             content = webkit_web_data_source_get_data( source );

    WebKitWebDataSource* reSource   = webkit_web_frame_get_provisional_data_source( frame );
    GString*             reContents = reSource ? webkit_web_data_source_get_data( reSource ) : 0;

I could theoretically rewrite that as...

    // C++'s >>= is right-associative, so the chains need parentheses.
    WebKitWebFrame* frame = web_view >>= webkit_web_view_get_main_frame;

    GString* content = (frame >>= webkit_web_frame_get_data_source)
                              >>= webkit_web_data_source_get_data;

    GString* reContents = (frame >>= webkit_web_frame_get_provisional_data_source)
                                 >>= webkit_web_data_source_get_data;


More practical examples come from using >>= and fmap together. Haskell provides a really handy operator, <$>, which is an alias for fmap. Take a look at this line:

(+2) . read <$> getLine >>= print

This first gets the line as a string, calls read to convert it to a number, then adds 2 and prints the result. The dot between (+2) and read is function composition: g(str) = plus_two(atoi(str)). That is also not difficult to implement in C++, but maybe another day.

Unfortunately, we are more limited in our expressiveness with symbolic operators in C++. We have to choose from, at the least, the following: ^ << <<= % || && | & * / + -, which all have either mathematical or logical meaning. Whether the ability to create arbitrary symbolic operators would benefit or harm C++ code is an interesting topic for debate. It can certainly be difficult to justify using an operator for a non-standard function because of our limited options.


Conclusions



I hope I have demonstrated at least one thing successfully here: Maybe, on its own, is useless. The truly interesting aspects of it only come out in conjunction with other functional operations. fmap and >>= make a good start; there's also >>, liftA2, ap, function composition, and a whole slew of useful tools that can be used not only on pointers, but on any data structure for which they make sense. One interesting thing to note is that Haskell doesn't as easily allow for functions that take a variable number of arguments, so we can do some things more generically than Haskell can. (For example, liftA could be a variadic function; perhaps even fmap.)

Now, remember that I said there was one improvement on implementing Maybe to make? I use the Maybe concept in my library, Pure, and what I do differently is that I do not actually implement it! Instead, I use tag dispatching to determine whether something is a Maybe type, and implement generic versions of fmap and >>= (which I call mbind). Even defining Maybe as a std::unique_ptr, like above, restricts fmap from working on std::shared_ptrs and regular pointers--we have a rich variety of ways of expressing possible values in C++.


Further reading

http://www.generic-programming.org/languages/cpp/techniques.php#tag_dispatching
A good technique for template specialization.

http://members.chello.nl/hjgtuyl/tourdemonad.html
A good (short) list of important monad functions.

Sunday, July 22, 2012

"I just figured out how to do this!" does not constitute a tutorial.

This is my biggest blog-reading pet peeve: authors who show little, say much, or seem oblivious to alternative solutions--in some way or another, they do not provide a good argument. Yet they still expect you to agree with them. The worst part, for me, is when I see "tutorial" in the title. Often following that title come long paragraphs explaining how and why the code below does what it does, is what it is, and on and on. We have all probably read something like this at one point or another, and the format has a lot of problems.

It may not be honest. The author may have no authority on the subject. If you write a library, you may write a small tutorial that succinctly shows its usefulness--you certainly are the authority on your own library. In doing so, you should be able to provide short examples that get the user from point A to point B, and expose enough of the API for the user to go beyond the tutorial. But after finishing a tutorial on "how to build a tile-based map" or "how to calculate the area of a polygon", I generally only expect to leave with one person's idea for solving the issues their particular implementation needed solved, without the ability to expand on their ideas or understand the problem domain and invent new solutions. These types of articles rarely instill a good understanding of the theory they profess and instead show off just one implementation. If such a tutorial can't be copied and pasted, then one needs to write their own implementation and understand the underlying theory anyway.

The sharing of such information is in no way a bad thing, but it's not really a tutorial about how to get from A to B, is it? It's really about how the author got from A to B--her or his personal journey. The authors should therefore not spend time on theory and skip straight to implementation. What I see is the steps to reproduce an experiment.
  1. Observe some problem, or behaviour. (observation)
  2. Try to understand or describe the problem. (hypothesis)
  3. Try to solve the problem, based on your understanding. "If i write this code, then ... should happen" (predictions)
  4. Compile and run. (testing)
  5. Debug (go to step 2) until it works how you expect. (analysis) 
(For a graph, see: http://physics.ucr.edu/~wudka/Physics7/Notes_www/node6.html)

That people want to share their experiments is terrific. It's how scientific communities thrive. They just aren't tutorials.

Readers of these articles may be beginners, in which case we have the blind leading the blind. The given solution may not be idiomatic, and its readers may not know any better. If a considerable proportion of tutorials were written by new programmers, then the experience, wisdom, and knowledge of the old programmers would be lost. I see this especially in newer languages where few or no veterans maintain accessible blogs, but there is no shortage of new programmers, excited ones who want to learn everything.

Overly verbose instructions make them hard to follow. The worst part of these blogs and articles is their verbosity. If an article wants to offer something useful, show, don't tell. Requiring long explanations of abstract concepts before the reader can understand the code that follows, or see why it's sane, indicates how little usefulness the concept really has. The author is attempting to do two things at once: teach you fundamental concepts and show you how to use them. If the author doesn't successfully give you all the knowledge you need, then the code is useless. For example, I would not recommend explaining how to make a binary tree using templates and std::unique_ptr by first explaining how to use templates, then how to use std::unique_ptr, then explaining binary trees.

It's pretty fucking egotistical. The article is all about the author's code, of course. It's supposed to be. There is good content in there, but the reader has to sift through the author's long paragraphs about how and why the code works. This basically gives him or her free license to spend time talking about her or his own code. That's fine, but when it goes on at length about theory, it goes back to being dishonest.

At the end of the day, it's a good thing that we share our experiments, ideas, and beliefs with each other, but we shouldn't treat showing off our code as instructional. We should instead admit that we like showing off code for the sake of showing off code. We like to point at problems and say "look how I solved that!". Let's be more honest about it.

Friday, June 15, 2012

Mapgen in C, C++, LISP, Haskell, Factor, and Forth

I set out to pick up a few new languages (the first being LISP), but my overall goals require time, too. I want to make a simple roguelike, and one simple sub-problem in doing so is generating a map. Rather than writing a library to generate a map, I'd prefer to have an application that does it instead.

Here's a sample:

mapgen$ ./mapgen -n 3 -d '(30,30)'                                                                                                


##############################
##############################
##############################
##############################
#######################.....##
#######################.....##
#######################.....##
###############.............##
###############.#######.....##
###############.#######.....##
###############.#######.....##
###############.########.#####
###############.########.#####
###############.########.#####
###############.########.#####
###############.########.#####
###############.########.#####
###############.########.#####
###############.########.#####
###############.####.....#####
###########......###.....#####
###########......###.....#####
###########......###.....#####
###########......###....######
###########......###....######
####################....######
####################....######
####################....######
##############################
##############################





For this simple test, I don't care that the rooms overlap--this just puts (-n) 3 boxes of '.' chars in a map that's (-d) 30x30 tiles. What follows is a quick look at how the experiment went. You can look at all the implementations here. Among them, C and C++ were my only strong languages. I started on a Python version (my first language), but it felt unnecessary. I'd appreciate feedback on any of these examples so that they may be improved.

The problem is overly simplified and some of the implementations are more complete than others, but I thought someone might find it useful to see how one can solve the same problem in a bunch of different languages. Rosetta Code lets you do this, but only for even more trivial examples.

Size and Complexity

Line counts are a notoriously bad measure of code quality, but they do give some useful information.

  • C++ : 374 (574 if you count my pre-built vector class)
  • C : 176
  • LISP: 75
  • Haskell: 125
  • Forth: 96 (incomplete)
  • Factor: 123
mapgen.cpp actually has 134 lines, but the overhead of writing a generic Grid class (which might get used later) and iterators to go along with it had a cost. However, some functions were reduced almost to nothing thanks to this approach. For example, dig_room() takes a room (a struct with the fields left, right, up, down) and replaces every wall tile with a floor tile. In C:

void dig_room( Grid g, const Room r )
{
    int x, y;
    for( x = r.left; x < r.right; x++ ) {
        for( y = r.up; y < r.down; y++ ) {
            Vector v = { x, y };
            *grid_get( g, v ) = '.';
        }
    }
}

And C++

void dig_room( const Room& r )
{
    std::fill( mgMap.reg_begin(r), mgMap.reg_end(r), '.' );
}

In Grid.h, I was able to define an iterator that increments over a room, and any function that requires such an iteration can use it easily. In C, however, I will probably repeat the above loop in every necessary instance.

Theoretically, the templated Grid and generic iterators should save lines of code and developer time down the road, but without the ability to run wild with that in C, I made a very succinct Grid that required almost no time to write (in fact, mapgen.c only took an hour to write, if that).


void init_grid( Grid* g, Vector dims )
{
    const int AREA = dims.x * dims.y;
    
    g->dimensions = dims;
    g->tiles = malloc( AREA );
    memset( g->tiles, '#', AREA );
}


void destroy_grid( Grid* g )
{
    free( g->tiles );
    g->tiles = 0;
}


char* grid_get( Grid g, Vector p )
{
    return g.tiles + g.dimensions.x*p.y + p.x;
}


Comparing LISP and Haskell, I find that Haskell's purity added to the amount of work done.



splitGap :: Int -> Int -> [a] -> ([a],[a],[a])
splitGap start size lst = (before, middle, after)
  where 
    (before,rest) = splitAt start lst
    (middle,after) = splitAt (abs size) rest


digRow :: Range -> MRow -> MRow
digRow (start,end) row = 
  before ++ replicate size TFloor ++ after
  where
    size = end - start + 1
    (before,_,after) = splitGap start size row


digRoom :: RMap -> Area -> RMap
digRoom rmap ((x,y),(u,v)) =
  ybefore ++ map (digRow (x,u)) rows ++ yend
  where 
    (ybefore,rows,yend) = splitGap y (v-y+1) rmap

versus


(defun dig-at (m x y)
  (setf (row-major-aref m (+ x (* y (array-dimension m 0)))) #\.))


(defun dig-room (map room)
    (loop for j from (area-up room) below (1+ (area-down room)) do
          (loop for i from (area-left room) below (1+ (area-right room))
                do (dig-at map i j))))

While Haskell's random monad also produces several extra lines, as in C++, some algorithms get reduced in size thanks to higher-level features:


(defun splatter-pattern (map n)
  (let* ((height (array-dimension map 1))
         (width  (array-dimension map 0))
         (rooms (loop for i from 1 to n 
                      collect (random-area width height))))
    (loop for i from 0 below (length rooms) do
          (dig-room map (elt rooms i))
          (when (> (- (length rooms) i) 1)
            (let* ((a (random-point (elt rooms i)))
                   (b (random-point
                       (elt rooms (randr (1+ i)
                                         (- (length rooms) 1))))))
              (dig-hallway map a b))))))

versus



splatter :: RandomGen r => Int -> r -> RMap -> RMap
splatter n gen m = 
  digRandomHallways (foldl digRoom m rooms) g2 rooms
  where 
    (g1,g2) = split gen
    rooms = take n $ randomRooms g1 (length (m!!0),length m)
    center ((x,y),(u,v)) = ((x+u) `quot` 2, (y+v) `quot` 2)

Next: Forth and Factor. For those who don't know, Forth dates from the 70's, and Factor is a recent language in the same concatenative family. This type of language seems to get ignored almost entirely by the mainstream community, and in research, but I find it fascinating. Factor in particular, which shows a working high-level, functional Forth with first-class functions.

Unfortunately, one unfamiliar with these languages will find the following source code indecipherable.

Forth:

char . CONSTANT FLOOR
...



: ij>tile *width* * + *tiles* + ; 
...


: (dig) ij>tile FLOOR swap C! ;
: dig { room }
    room down  @ 1+ room  up  @ DO 
    room right @ 1+ room left @ DO 
        I J (dig)
    LOOP LOOP ;

Factor: (comments removed) 
: ij>tile ( i j -- i' map   ) *width* * + *map* get ;
...

: left-right ( room -- l r ) [ left ] [ right ] bi ;
: up-down    ( room -- u d ) [  up  ] [ down  ] bi ;
...

: (room-map) ( room quot: ( i j -- ) -- )
    [ [ up-down [a,b] ] keep ] dip 
    '[  _ left-right [a,b] swap
        _ curry each ] 
    each ; inline recursive

: (dig) ( i  j -- ) [ FLOOR ] 2dip ij>tile set-nth ;
: dig   ( room -- ) [ (dig) ] (room-map)  ;

Again, we see that we have written more lines in the higher-level language! However, Factor's left-right is used in several places and adds clarity, though admittedly, (room-map) just felt like a good thing to implement when I could have simplified by not writing a generic map function. (The idea that something like that would be useful violates the YAGNI principle.)

But both my Factor and Forth solutions could probably be improved; I didn't invest as much time as I would have liked to.

Even though higher-level languages tend toward more lines in this small example, their strength lies in scalability. Then again, compared to C, there's no reason I couldn't have done the exact same thing in C++.

Future Plans

For my roguelike, mapgen will read stdin to initialize the options for constructing the map. For now, the mapgen repository acts as a scratchpad for trying different approaches. C has probably been my favorite so far. While I enjoy sculpting a clean interface in C++, C shines in its simplicity. Russia's pencil to NASA's zero-gravity pen. (But you can't blame me for thinking a zero-gravity pen is neat!)

I want to continue developing the C and C++ versions in parallel, and at least one other as well. Of all the languages I tried, I had the most hope for Factor. Stack-based languages take a unique approach that piques my interest. Programmers commonly write procedures that take one value, convert it to an intermediate, and then to an output value. In an imperative language, it might look like this:

a = f()
b = g(a)
c = h(b)

Functionally,

c = h( g( f() ) )

But, stack based:

f g h  ( -- c )

This leads to interesting quirks, and it's fun to play around with. Stack manipulation is like a puzzle on top of your otherwise normal, every-day algorithm. It can lead to cryptic functions and terse generalizations, but still, I think, it deserves much more attention.

And yet I don't plan to continue implementing either Forth or Factor. I will learn them and keep an eye on the development of concatenative languages.

Conclusions

If high-level languages do provide productivity gains, then they should scale better than lower-level ones. (I.e., given enough complexity, the C implementation would grow larger than the C++, the LISP larger than the Haskell, and the Forth larger than the Factor.) Still, for a short program, low-level languages show an elegant brevity.