chaource: (Default)
[personal profile] chaource
Continuing my exploration of OCAML, I found that there is an extension called JoCaml, which supports concurrency. I don't yet know if it really supports multicore computing - probably not, a real shame! But JoCaml's view of concurrent execution is very nice.

Now, there is almost no tutorial documentation either on the formal model (the "join calculus") or on the JoCaml, except for the JoCaml manual. Other tutorials and presentations about the join calculus merely confused me after a few pages (and probably will just as well confuse you). Things were quite unclear to me until I discovered the following formulation, in the introduction of one of the quite formal early papers on the join calculus. Now I would like to write down a distilled explanation that makes sense to me.

"Join calculus", also known as the "Reflexive Chemical Abstract Machine", is a model of concurrent computations that seems to be a lot easier to think about than anything I've seen so far on the topic of concurrency. It really looks like we can forget all that synchronize-mutex-semaphore-exception-fork-thread-race-condition-deadlock baggage and just get down to coding!

Ordinary, non-concurrent programming is viewed in OCAML as evaluation of expressions, possibly with side-effects. Expressions always have a result value. For example, if sum is a function that computes the sum of numbers in a list, then we can write sum [1; 2; 3]. This expression evaluates to the result value 6.

(The syntax of OCAML resembles that of ordinary mathematics, where we can write cos 0 and get the result value 1; we do not have to separate the function from its argument by parentheses.)

Roughly speaking, an OCAML program is a single (but large) expression that needs to be evaluated. The result value of this single expression is most often discarded by the computer. This is so because the usefulness of the program is, in most cases, not in delivering this single result value, but in doing something else (producing "side effects") while evaluating the large expression. The side effects could be, for example, drawing something into a window on the screen, writing files, and so on. Nevertheless, computation proceeds by evaluating an expression.

Given this view of computation, what should a "concurrent" program do? It should evaluate several expressions in parallel, - that is, concurrently and in arbitrary order. The task now is to initiate and to organize this kind of computation.

In the "join calculus", concurrent computations are conceptualized through the following "chemical" analogy, which I find very helpful, and I hope you do too.

The "chemical abstract machine"

Imagine that we prepare a large tank of water and dissolve many different chemical substances in it. Many different chemical reactions then become possible in this "chemical soup". The participants of these reactions are dissolved molecules and radicals that we could denote by a, b, c, ... Then we can symbolically list all the possible reactions, for example by simply listing the input and the output reagents:

a & b = c (* reaction 1 *)

a = c & d & e (* reaction 2 *)

Here, reaction (1) means that a and b instantaneously disappear and instead c appears. I will call a, b, c, ... reagents for now, although this is perhaps not the word used in chemistry.

It is important to know that the reactions do not all happen at once, because any given reaction can happen only when several suitable reagents come together. For example, the first reaction happens only when the reagents a and b come together. The second reaction happens to the reagent a alone, and produces c, d, e.

The input reagents are separated by the symbol & in order to stress that all of these input reagents have to be present together in the soup. In real life, we cannot force this "meeting of reagents" to happen exactly when we want. All we can do is to stir the soup, giving different molecules a chance to move around and meet other molecules. So in our system we also cannot predict when particular reactions will happen.

As an example, consider a chemical machine with four reactions:
a & b = c
b = a
c = d
d = c

If initially the soup contains a & b and the first reaction occurs then we will have c in the soup. Once we have c, there will be infinitely many transitions between c and d, so the reactions will never stop. If, on the other hand, the first reaction was b=a, then we will have two a's in the soup, and no further reactions will ever occur. In this situation, the chemical machine will stop. So the machine can either stop or go on forever, depending on the (unpredictable) choice of the initial reaction.

This is our imaginary "chemical abstract machine". In order to fully describe the machine, we need to write only two things:

  • A list of all possible reactions, like the reactions (1) and (2) above.

  • A list of reagents that are initially present in the soup. (For instance, we could specify that initially there are ten a's and five b's, or whatever.)

    The machine runs for a while, and some reactions occur, perhaps producing new reagents, which then enter new reactions, and so on. The machine will stop when no more reactions can occur with any reagents present in the soup.

    It is possible, of course, that some reactions will indefinitely keep happening. This depends on the specific definition of our "abstract chemistry". We cannot, however, control the order in which reactions will occur. So the end state of the machine could be unpredictable. For instance, some reactions can become impossible if the required reagents disappear from the soup, but we cannot predict with certainty whether they will disappear, because this depends on the order of other reactions.

    Using the chemical machine for doing computations

    Now, let us assume that this "chemical abstract machine" has been already implemented (this is done in the JoCaml system). We would like to use this machine for running some concurrent computations. The basic idea is this: we are going to give the machine some extra expressions to be computed whenever reactions occur.

    Specifically, we are now granted the following modifications to the "chemical abstract machine":

  • We are allowed to "decorate" every reaction with an OCAML expression. This expression becomes the "computational payload" of the "chemical reaction". The abstract machine will evaluate this "payload" expression whenever the reaction occurs, before producing the output reagents. (So reactions are not instantaneous any more, because evaluating the "payload" expressions will take time.)

    For example, instead of reaction (1) we might write:

    a & b = print_string "Reaction 1\n" ; c

    Every time this reaction occurs, the machine will print the text "Reaction 1" before destroying the reagents a, b and creating c.

    A side note: The whole right-hand side of a reaction now looks like a single OCAML expression, e.g. we can write a reaction such as

    a & b = f(); g(); h(); c

    where f(), g(), h() are some OCAML function calls. Indeed, this is an OCAML expression; but the result value on the right-hand side must be a set of output reagents. For example, reaction (2) could be decorated by an expression like this:

    a = print_string "reaction 2\n"; c & d & e

  • We are allowed to "decorate" the reagents with arbitrary OCAML values. The "decoration" value can be written in parentheses next to the name of the reagent.

    For example, we could write a(1) or b("xyz") or c([0;1;2;3;4;5]) instead of just a, b, c. This means that the reagent a carries the integer 1 as its "decoration", the reagent b carries the string "xyz", and the reagent c carries the list [0;1;2;3;4;5].

  • We can now use the decoration values from all the input reagents as known values in the "payload" expression. We can use these decoration values both for performing side effects and, importantly, for computing the decoration values of the newly produced output reagents.

    For example, we could write the following reaction:

    a(x) & b(y) =
    let z = x+y in ( Printf.printf "reaction 1 yields c(%d)" z; c(z) )
    (* reaction 3 *)

    Each of the reagents a,b,c is here decorated with an integer value. Whenever, say, a(2) and b(3) meet in the soup, they will react, and the machine will evaluate the expression let z = x+y in ..., producing the value 5 for the temporary variable z, performing the side effect (printing the message "reaction 1 yields c(5)"), and putting a new reagent c(5) into the soup.

  • A reaction can be such that it produces no output reagents but just consumes the input reagents. To denote the absence of output reagents, the symbol 0 is used on the right-hand side of a reaction.

    For example, we can specify that the reagent c simply prints its decoration value and disappears from the soup:

    c(x) = Printf.printf "reaction 5 consumes c(%d)\n" x; 0 (* reaction 4 *)

    It looks as if c produces the output reagent 0. However, there is no such reagent; so you cannot use the symbol "0" as an input reagent. The meaning of this "0" is to denote the absence of any output reagents in this reaction. (Writing the output reagents as 0 & a is the same as writing the reagent a.)

  • We may inject arbitrary new reagents to the soup at any time (not only at the initial time).

    That's all!

    Example

    Since the "chemical machine" requires two parts for its description - the list of "chemical reactions" and the list of initial "reagents" - the JoCaml system adds two new keywords to OCAML. These keywords are "def" for defining a chemical reaction, and "spawn" for injecting reagents to the soup. It is not necessary to define the names of the reagents separately; they are already defined by listing the reactions. The JoCaml system uses the symbol & to separate reagents in reactions, and also for "spawn" (injecting several reagents at the same time).

    To test that my understanding is correct, I just tried to run reactions (3) - (4) in the JoCaml system. Here is the output (one needs a double semicolon to force evaluation in the interactive OCAML).

    $ jocaml
    JoCaml version 3.12.1

    # def c(z) = Printf.printf "reaction 5 consumes c(%d)\n" z; 0
    ;;
    def a(x) & b(y) = let z = x+y in ( Printf.printf "reaction 1 yields c(%d)\n" z; c(z) )
    ;;
    val c : int Join.chan = <abstr>
    val a : int Join.chan = <abstr>
    val b : int Join.chan = <abstr>
    # spawn a(2) & b(3) & c(4) ;;
    - : unit = ()
    #
    reaction 5 consumes c(4)
    reaction 1 yields c(5)
    reaction 5 consumes c(5)


    (There are two further facilities in JoCaml for defining reactions: one can define several reactions at once, so to say, "recursively", and one can define several variations of reactions with the same reagents. For now, let's skip those details.)

    So this is the view of concurrent computations known as the "join calculus" in the research literature: First, we define the reactions and the reagents allowed in our "abstract chemistry". We also define the computational payloads for reactions. Then we start the machine and inject some initial reagents into it. The machine automatically creates and synchronizes all the required parallel processes for us. The processes wait until suitable "reagents" are present in the soup, then automatically evaluate the "payload" expressions, then perhaps go to sleep or create new processes, according to the custom rules of the "abstract chemistry". In this way, quite a complicated system of interacting processes can be conceptualized as a "chemical machine".

    In order to implement a particular concurrent program in the join calculus, we must somehow invent the appropriate "reagents" and design "reactions" that will create and destroy the reagents in such a way that the "payload" computations are performed correctly.

    It remains to see how we can use the "chemical machine" for performing some concurrent computations. For instance, to sort arrays or to apply functions to lists, in parallel. Another interesting application would be a concurrent GUI interaction together with some jobs in the background.
    (To be continued.)

  • Profile

    chaource: (Default)
    chaource

    February 2026

    S M T W T F S
    123 45 67
    8 910 11121314
    151617181920 21
    22232425262728

    Most Popular Tags

    Style Credit

    Expand Cut Tags

    No cut tags
    Page generated Feb. 25th, 2026 04:34 am
    Powered by Dreamwidth Studios