bartosz, posts by tag: programming - LiveJournal

So far I have audited two Computer Science grad courses at the UW. The first one was about Distributed Systems. It turned out to be very useful in my work at Reliable Software, since we have implemented a distributed peer-to-peer version control system.

I was able to position our system among other distributed systems. I saw how other people solved similar problems in various contexts, and how unique our solution was. Essentially, one could try to build a distributed VCS on top of a distributed database, and it would work, but progress would be hard to make, since distributed transactions could wait for a very long time until they fully commit or abort. We have to be able to make progress based on transactions that haven't committed yet. Very interesting problem, as it turns out. And our solution seems to be very close to optimal.

The second course was completely different. The title, Concepts of Programming Languages, sounds deceptively vague, but it was hard-core CS. One of the side effects of the course was that I learned functional programming in OCaml. There is something very elegant and, at the same time, very unnatural about functional programming. No matter what some people say, recursive thinking is not built into our reptilian brains. You can learn it and even get used to it, but it doesn't just flow naturally from your brain.

But functional programming is not only about recursion. It's also about using functions as first-class objects. Constructing them on the fly, passing them back and forth, storing them in containers, etc.
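
To make the three uses concrete, here is a minimal sketch in D, assuming a current D2 compiler. The names makeAdder and applyTwice are made up for illustration and do not come from any library.

```d
import std.stdio : writeln;

// Construct a function on the fly: the returned delegate captures `n`.
int delegate(int) makeAdder(int n)
{
    return (int x) => x + n;
}

// Pass a function around: `f` is an ordinary argument.
int applyTwice(int delegate(int) f, int x)
{
    return f(f(x));
}

void main()
{
    // Store functions in a container.
    int delegate(int)[] fs = [makeAdder(1), makeAdder(10)];

    int total = 0;
    foreach (f; fs)
        total += applyTwice(f, 0);

    assert(total == 22); // (0 + 1 + 1) + (0 + 10 + 10)
    writeln(total);
}
```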

This course turned out to be very useful in my work on D. I started implementing a thread library in D. The cool thing about the D programming language is that it makes generic programming easy. I would not have been able to implement such a library in C++.

For instance, how do you create a thread? Thread creation has to be parametrized by a thread function--each thread must execute its own function. When the function returns, the thread dies. But the realm of all possible functions is huge! Functions are parametrized by the types of the arguments they take and by the return type (possibly void). And you are supposed to write a generic function, newThread, to which you can pass any thread function and the values with which it should be called.

Imagine a C++ implementation. You'd have to write separate templates for newThread0 (thread function taking zero arguments), newThread1, newThread2, and so on, up to some magic maximum number of arguments. Then you would have to parametrize these functions on the return type of the thread function. Finally, you'd want it to work not only with function pointers, but also with functors (objects that have operator() defined). It's a monumental task.

In D it took me a few hours to write one generic function, newThread, that works with an arbitrary thread function, functor, delegate, or closure. Ta dah! Here it is:

ThreadHandle newThread (F, A...) (F f, A a)
{
    alias ReturnType!(F) RetType;
    alias Thunk!(F, A) ThreadThunk;

    auto thunk = new ThreadThunk;
    thunk.Init (f, a);

    uint tid;
    
    auto h = _beginthreadex (
        null, // security attribute
        0, // stack size (default)
        &ThreadFunction !(ThreadThunk),
        thunk,
        CreationFlags.CREATE_SUSPENDED,
        & tid);

    if (IsNull (h))
        throw new Errno ("Thread creation failed");

    auto handle = ThreadHandle (h, tid);
    handle.Resume;
    return handle;
}

Here F is the type of the function (or function-like object), and A... is the (possibly empty) list of parameter types. Not only is this code simple, but it's very easy to use. Here's how you can call it:

    // execute a closure returning void
    int shared;
    void fvoid ()
    {
       shared = 13;
    }
    auto h = newThread (&fvoid);
    h.WaitForDeath ();
    assert (shared == 13);

In this case the thread runs a closure that takes zero arguments and returns nothing. You can tell it's a closure, because it grabs the caller's environment with it, in this case the variable "shared". Once the function is executed in a separate thread, this variable has a new value. Doesn't it just blow your mind?
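
The same call shape also covers thread functions that take arguments. The sketch below is not the newThread shown above: it is a hypothetical portable stand-in, newThreadSketch, which assumes a current D2 compiler and hands the work to druntime's core.thread.Thread instead of calling _beginthreadex. The variable is called result here because shared is a keyword in current D.

```d
import core.thread : Thread;

// Hypothetical stand-in for newThread: bundle the function and its
// arguments into a delegate and let druntime's Thread run it.
Thread newThreadSketch(F, A...)(F f, A a)
{
    auto t = new Thread({ f(a); });
    t.start();
    return t;
}

void main()
{
    int result;

    // A closure taking two arguments; it writes into the caller's variable.
    void store(int x, int y)
    {
        result = x + y;
    }

    auto t = newThreadSketch(&store, 6, 7);
    t.join(); // plays the role of WaitForDeath
    assert(result == 13);
}
```

Because both F and A... are deduced from the call, the same template accepts zero, one, or many arguments without any per-arity overloads.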

And I'm still not totally happy with the language. There are some rough corners in the definition of Thunk.

struct Thunk (F, A...)
{
    F _f;
    A _a;
    static if (is (ReturnType!(F): void))
    {
        void Exec () { _f (_a); }
    }
    else
    {
        ReturnType!(F) _r;
        void Exec () { _r = _f (_a); }
        ReturnType!(F) Result () { return _r; }
    }

    // Revisit: turn into constructor
    void Init (F f, A a)
    {
        _f = f;
        //_a = a;
        foreach (i, T; A)
            _a [i] = a [i];
    }
}

static extern (Windows) uint ThreadFunction (ThunkType) (void * p)
{
    auto thunk = cast(ThunkType *) p;
    thunk.Exec ();
    return 0;
}

I don't like the "static if" (compile-time if). Here it is used to special-case the void return type. I have already submitted the proposal that would eliminate the special case and treat void just like any other type (and it was already accepted in our meeting last night). The Init function is used in place of a constructor, since structs in D don't yet have constructors or destructors (coming up soon, though). Consequently, I couldn't implement reference counting for thread handles, but that is also coming soon in D (I could have used classes rather than structs, since they are garbage collected, but for thread handles you really need better deallocation control).
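
For comparison, here is a sketch of how Thunk might look with a struct constructor, which current D does provide. It assumes a recent D2 compiler and std.traits.ReturnType. The lowercase method names and the add helper are illustrative and not part of the original library. The static if for void is still needed, since a field of type void cannot be declared.

```d
import std.traits : ReturnType;

struct Thunk(F, A...)
{
    F _f;
    A _a;

    // Constructor taking over the job of Init; copies arguments one by one.
    this(F f, A a)
    {
        _f = f;
        foreach (i, T; A)
            _a[i] = a[i];
    }

    static if (is(ReturnType!F == void))
    {
        void exec() { _f(_a); }
    }
    else
    {
        ReturnType!F _r;
        void exec() { _r = _f(_a); }
        ReturnType!F result() { return _r; }
    }
}

// Illustrative thread function with a non-void return type.
int add(int x, int y) { return x + y; }

void main()
{
    auto thunk = Thunk!(typeof(&add), int, int)(&add, 6, 7);
    thunk.exec();
    assert(thunk.result() == 13);
}
```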

I'm having a lot of fun with D!
