chaource: (Default)
[personal profile] chaource
I described in Part I the "chemical abstract machine" model of concurrent computation. In that model, one imagines a soup of "abstract chemical reagents" that interact randomly among themselves, according to a predefined set of possible "chemical reactions". Each "reaction" carries a computational "payload", which is an expression computed when the reaction takes place. We can organize a complicated parallel computation by designing appropriate rules of an "abstract chemistry" and by running the machine with chosen initial "reagents".

The JoCaml system adds several other important features to this model, and I would like to interpret all of JoCaml using the "chemical" view since it makes everything much simpler. (Other tutorials talk about "channels", "sending messages", "processes" and so on, which is quite confusing. In particular, I think that the "channels/messages" metaphor does not work well in this context.)

The features of JoCaml that I have not yet discussed are (in order of decreasing complexity):
  • Synchronous (or "instant") reagents (the reply .. to construction),
  • Different reactions defined on the same reagents,
  • OCAML expressions involving reagent constructors and reagents as first-class values,
  • "Remote injection of reagents": TCP/IP socket connections to other "chemical machines" running either on the same physical computer or on remote computers.

    Synchronous ("instant") reagents

    Normally, all reactions occur asynchronously, that is, at random times. Suppose we define this simple reaction,
    def a(x) & b(y) = print_int (x+y); 0
    We can now inject some a and b reagents into the soup:
    spawn a(1) & b(2)
    Injecting these reagents is a non-blocking operation. In other words, spawn returns right away with the empty value, and our program continues to run. Quite soon, perhaps, the reaction will start, the payload will be computed, and we will get 3 printed. However, this computation is asynchronous. That is, we will have to wait for an unknown duration of time until the "3" is printed. (If the chemical machine is very busy running lots of other reactions, we might have to wait quite a bit until our computation even begins.)

    JoCaml introduces a special kind of reagents that are synchronous. Perhaps a fitting word for them is "instant reagents". Instant reagents are different from the usual (asynchronous, or "slow") reagents in several respects:
  • instant reagents are injected into the soup without the spawn keyword: instead of spawn a(3) we simply write a(3) if "a" is an instant reagent.
  • instant reagents have a result value (in addition to the decoration value): so "a(3)" returns a value, just like a simple function call!
  • instant reagents are always immediately consumed by the reaction, i.e. they cannot remain in the soup when the reaction is finished.

    The major difference is that injecting an instant reagent into the soup is a blocking function call. Namely, when we inject an instant reagent, the chemical machine will immediately check whether a reaction involving this reagent is possible. If yes, the machine will immediately run this reaction and deliver the result value to the injecting expression. If no reaction is possible, the injecting call will block and wait until a reaction becomes possible (e.g. until some other required reagent appears in the soup).

    How does JoCaml define the result value of an instant reagent? Let us imagine a reaction where the reagent "a(x)" is "slow" and the reagent "s(y)" is "instant". We would like the reaction between "a" and "s" to produce some other slow reagents "b" and "c", and at the same time we would like the reagent "s" to return the value x+y after the reaction. Our wish can be written informally (i.e., not yet with the correct JoCaml syntax) as
    a(x) & s(y) = b & c & "s returns x+y"
    We can then use this reaction by calling s(3), which looks like a synchronous function that blocks until someone (say, another reaction) injects, say, "a(2)" into the soup. Then "s(3)" will return the value "5", while the reagents "b" and "c" will be injected into the soup.

    JoCaml allows us to implement this reaction by using the following syntax,
    def a(x) & s(y) = b() & c() & reply x+y to s
    Once we have used the construction reply ... to with the name "s", the JoCaml system recognizes the reagent "s" as synchronous ("instant").

    In my view, the choice of the new keywords "reply ... to" is confusing because the value "x+y" is not our "reply to s", nor is this value "sent to s". This value is actually sent to the expression that called "s", as the result of evaluating "s(3)". It would be easier to remember the correct semantics if the new keywords were something like "finish s returning x+y" instead.

    It is important to note that the synchronous reagent "s" must be removed from the soup after the reaction is finished. It does not make sense for "s" to remain in the soup. If a reagent remains in the soup after a reaction, it means that the reagent is going to be available for some later reaction without blocking its injecting call; but this is the behavior of an asynchronous reagent.

    To summarize: Instant reagents are like functions except that they will block when their reactions are not available.
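
    As a minimal sketch, the reaction above can be completed into a small program. The reactions for "b" and "c" are my own illustrative additions (the text does not say what these reagents do), and the code assumes the JoCaml toplevel or compiler rather than plain OCAML.

    ```ocaml
    (* Hypothetical reactions for the output reagents "b" and "c":
       each one only reports that it was consumed. *)
    def b() = print_endline "b consumed"; 0;;
    def c() = print_endline "c consumed"; 0;;

    (* "a" is slow; "s" becomes instant because of the reply clause. *)
    def a(x) & s(y) = b() & c() & reply x+y to s;;

    spawn a(2);;     (* non-blocking: a(2) now waits in the soup *)
    let r = s(3);;   (* blocks until the reaction with a(2) has run *)
    print_int r;;    (* r is x+y, i.e. 2+3 *)
    ```

    If s(3) were called before any "a" reagent had been injected, the call would simply block until one appeared. The order in which the two "consumed" messages are printed relative to each other is not determined.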

    Same reagents with different reactions

    JoCaml allows simultaneous definitions of reactions with the same reagents, for example,

    def a(x) & b(y) = c(x+y)
    or a(x) & c(y) = a (x+y)

    These two reactions cannot occur in parallel; either one of them occurs or the other (even if lots of "a" reagents are available).

    These two definitions actually specify not only two reactions but also the fact that "a", "b", and "c" are (asynchronous) reagents. This is so because "a", "b", and "c" all occur on the left-hand side of the definitions.

    We can also use the "and" keyword to define reactions simultaneously defining all the reagents:
    def a(x) & b(y) = c(x+y)
    and c(x) = print_int x ; a(x)

    The "and" construction is necessary: the first reaction uses the "c" reagent, and we have to define it as a reagent. We could define the "c" reaction first, but it uses the "a" reagent. So these two reactions must be defined simultaneously. (This is similar to defining mutually recursive functions.)

    Finally, the standard OCAML pattern matching is available for choosing reactions. For instance, the decorating values could be a variant, a list, etc., and a reaction can require a particular matching value. An example with the Option type:
    def a(Some x) & b(_) = b(x) or a(None) = 0
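
    Pattern matching combines naturally with instant reagents. Here is a sketch of a stack kept in the soup; the names "state", "push" and "pop" are hypothetical and chosen only for this illustration. The second reaction requires a non-empty list, so "pop" on an empty stack blocks until something is pushed.

    ```ocaml
    (* "state" is a slow reagent decorated with the current list;
       "push" and "pop" are instant reagents. *)
    def state(xs) & push(x) = state(x :: xs) & reply () to push
     or state(x :: xs) & pop() = state(xs) & reply x to pop
    ;;

    spawn state([]);;     (* start with an empty stack *)
    push(1);;
    push(2);;
    print_int (pop());;   (* the value pushed last comes out first *)
    ```

    There is only ever one "state" reagent in the soup, so the two reactions cannot both run on the same list at once.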

    Using reagents in OCAML expressions

    JoCaml is a superset of OCAML and uses the existing OCAML type system. What are the OCAML types of the reagents?

    A reagent must have the syntactic form of a function call with a single argument, like "a(2)". Here "2" is the "decorating value" of the reagent. Reagents without decorating values are not allowed, so we need to use the empty value (), e.g. "b()". (When the decorating value is not empty, the parentheses are optional; we could write "a 2" instead of "a(2)". But I find it initially helpful to write parentheses with reagents.)

    Now, we have two kinds of reagents: instant and slow.

    "Instant" reagents have the type of functions, returning a value. If "s(2)" is an instant reagent returning an integer then "s(2)" has the type int and "s" has the type int->int.

    However, the story becomes more complicated with "slow" (i.e. asynchronous) reagents. A slow reagent such as "b(2)" is not an OCAML value at all! Slow reagents cannot be used in ordinary OCAML expressions; they cannot be, say, stored in an array or passed as arguments to functions.

    # def b(x) = print_int x; 0;;
    val b : int Join.chan = <abstr>
    # b(2);;
    Characters 0-1:
    b(2);;
    ^
    Error: This expression is not a function; it cannot be applied
    # b;;
    - : int Join.chan = <abstr>


    We see that the "b" in "b(2)" is an ordinary OCAML value that has the type int Join.chan. But this is not a function type, so "b(2)" is not a valid OCAML expression.

    Nevertheless, we can regard "b" as the "constructor" for slow reagents of the form "b(x)". The OCAML value "b" can be passed to functions, stored in an array, or even used as the "decorating value" of another reagent. For example, this code creates a "higher-order" slow reagent that injects two copies of a given other reagent into the soup:
    # def a(b,x) = b(x-1) & b(x-2);;
    val a : (int Join.chan * int) Join.chan = <abstr>


    Here we are using a tuple as the decorating value of the reagent "a".

    In this way we can also implement a "continuation-passing" style, passing reagent constructors to each other.
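
    A small sketch of this continuation-passing style, with hypothetical reagent names "add" and "print_result": the reagent "add" is decorated with two numbers and with the constructor of the reagent that should receive the sum.

    ```ocaml
    (* "k" is a reagent constructor of type int Join.chan, passed in as data. *)
    def add(x, y, k) = k(x + y);;

    (* A slow reagent that serves as the continuation. *)
    def print_result(n) = print_int n; print_newline (); 0;;

    (* Inject add(...) and tell it where to deliver the result. *)
    spawn add(1, 2, print_result);;
    ```

    Here "add" never returns anything to its caller; the result reaches the rest of the program only through the injected reagent "print_result(3)".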

    While reagent constructors are ordinary OCAML values and can be used in any expression, a fully constructed slow reagent like "a(2)" can be used only within a "reagent expression". This is a new type of OCAML expression introduced by the JoCaml system. A "reagent expression" describes a slow reagent (or, more generally, a set of slow reagents) that must be injected into the soup. Thus, a "reagent expression" may be used only where it is appropriate to describe newly injected reagents: either after spawn or on the right-hand side of a def.

    Note that spawn NNN is an ordinary OCAML expression that immediately returns the empty value (), while def is a binding construction similar to let. The right-hand side of a "def" is a "reagent expression" that describes at once the payload expression and the output reagents of a reaction.

    A "reagent expression" can be:

  • the special empty reagent, "0"
  • a single slow reagent like "a(2)"
  • the reply ... to construction, describing the result value of an instant reagent
  • several reagent expressions separated by "&": a(2) & b(3)
  • an arbitrary OCAML expression (a "payload expression") that evaluates to a reagent: for example f(); g(); a(2) or if f() then a(2) else b(3) where "f" and "g" are ordinary OCAML functions. Of course, all OCAML features like let ... in, pattern matching, etc. can be used for building the OCAML expression, as long as its final value is a reagent.
  • a while loop or a for loop containing reagents, e.g. for i = 1 to 10 do a(i) done -- note that all these ten reagents will be injected into the soup simultaneously rather than sequentially.
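
    The forms listed above can be tried out with a sketch like the following. The reagents "a" and "b" and their reactions are assumptions made just for this example.

    ```ocaml
    def a(x) = Printf.printf "a got %d\n" x; 0;;
    def b(x) = Printf.printf "b got %d\n" x; 0;;

    (* several reagents joined by "&" *)
    spawn a(2) & b(3);;

    (* a payload expression followed by a reagent *)
    spawn (print_endline "injecting a(4)"; a(4));;

    (* a conditional whose branches are both reagent expressions *)
    spawn (if Random.bool () then a(5) else b(5));;

    (* a for loop: a(1), a(2), a(3) are injected together *)
    spawn (for i = 1 to 3 do a(i) done);;
    ```

    Since all of these injections are asynchronous, the order of the printed lines may differ from run to run.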

    Reactions and reagents can be defined locally, lexically scoped within a "reagent expression". For example:
    spawn (def a(x)=b(x) in a(2))
    The chemical machine will, of course, see all defined reactions and reagents, including "a". But the OCAML program will not see "a" outside this spawn expression.

    The let ... in and def ... in constructions can be mixed, i.e. we can write things like
    let i = ref 0 in def a() & b() = incr i; a() .

    Interactions between different chemical machines

    We can run many independent instances of the "chemical soup machine" on the same physical computer, or on several different computers. (One JoCaml program can run only one "soup", but we can run several compiled JoCaml executables.) Different JoCaml instances can connect to each other and interact in the following manner:

  • we can inject appropriate reagents into a "remote soup", that is, a chemical soup running in another JoCaml instance,
  • we can react (synchronously or asynchronously) to the termination of the connection with a remote soup,
  • we can allow remote JoCaml programs to inject reagents into our own soup.

    Suppose we wanted to inject a reagent such as "a(2)" into a "remote soup". But what if the remote soup does not even have a reagent called "a"? The mechanism provided by JoCaml is that we first need to obtain a reagent constructor, "a", from the remote chemical machine. This is done through the "reagent name server" mechanism. Each chemical machine has its own "reagent name server" and can register reagents with it. Another program can then query the server and obtain reagent constructors by name.

    Here are the typical use cases:

  • we want to register our local reagent "b" with our name server.
    def b(x) = print_int x; 0;; (* defined a reaction involving "b(x)", thus also defined "b" *)
    Join.Ns.register Join.Ns.here "reagent_b" b;; (* registered "b" with our own reagent name server *)

  • we want to retrieve a reagent constructor from our own name server (e.g. to pass this reagent between disconnected parts of our own program).
    let rb = (Join.Ns.lookup Join.Ns.here "reagent_b" : int Join.chan);;
    Now we can use the constructor "rb", say spawn rb(0) or whatever.

    Note that we need to specify a type constraint explicitly. This is so because the name server registers the reagents merely by their names (strings), and there is no way to send OCAML type information over the network. A reagent constructor for an asynchronous ("slow") reagent has a type like 'a Join.chan, where 'a is the type of the decorating value. A reagent constructor for a synchronous ("instant") reagent has a type 'a->'b like an ordinary function. So we could obtain an instant reagent from the server like this:
    let rs = (Join.Ns.lookup Join.Ns.here "reagent_s" : int->int);;

  • we want the reagent name server to start listening on some TCP/IP port, say, 12345.
    Join.Site.listen (Unix.ADDR_INET (Join.Site.get_local_addr(), 12345))
    (The JoCaml manual says that this expression needs to be evaluated from some process that does not terminate as long as the server has to keep listening. For instance, within a reaction that blocks and never finishes until the program is terminated.)

  • we want to obtain a reagent constructor "b" from a remote machine whose reagent name server is listening on a known TCP/IP port and host name. This code is a bit longer (each binding uses the previous one, so they must be nested with let ... in rather than joined with "and"):
    let server_addr = Unix.gethostbyname "some.remote.host" in
    let server = Join.Site.there (Unix.ADDR_INET(server_addr.Unix.h_addr_list.(0),12345)) in
    let ns = Join.Ns.of_site server in
    let b = (Join.Ns.lookup ns "reagent_b": int Join.chan) in
    spawn b(2);;

    The spawn operation will inject the reagent "b(2)" into the remote soup because the constructor "b" was defined by binding with a remote name server. Once the constructor "b" is defined as a local OCAML value, we do not need to do anything special to inject the reagent into the remote soup. The JoCaml system does this automatically.
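
    For completeness, here is a sketch of what the remote side might look like, put together from the use cases above. The blocking reaction at the end is one way to keep the program alive; the reagent names "wait" and "never" are hypothetical, and "never" is deliberately never injected.

    ```ocaml
    (* A soup that accepts remote injections of b(x). *)
    def b(x) = print_int x; print_newline (); 0;;

    (* Make "b" available under the name that the client looks up. *)
    let () = Join.Ns.register Join.Ns.here "reagent_b" (b : int Join.chan);;

    (* Start listening for connections from other chemical machines. *)
    let () = Join.Site.listen (Unix.ADDR_INET (Join.Site.get_local_addr (), 12345));;

    (* Block forever: wait() needs never(), which nobody injects. *)
    def wait() & never() = reply () to wait;;
    let () = wait ();;
    ```

    The type annotation on "b" is optional here, but it documents the type that the client must repeat in its own lookup.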

    Injecting a slow reagent into a remote soup does not produce any immediate results. However, the remote machine could also inject something into our own soup in response. In this way, we could eventually get an asynchronous response.

    We can also inject instant reagents into a remote soup, which is analogous to a remote procedure call: some function will be evaluated synchronously on a remote machine, while our process is waiting, until a result value is delivered to us. Note that instant reagents might pass on exceptions generated on the remote machine to us!

    (If an exception is generated within an asynchronous payload expression, the exception is absorbed and the payload expression may remain incompletely evaluated.)
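
    The "remote procedure call" use of instant reagents could look like the sketch below. It assumes that a remote soup is running on the local machine on port 12345 and has registered an instant reagent under the hypothetical name "square"; neither the name nor the remote definition comes from the manual.

    ```ocaml
    (* Assumed to exist on the remote side:
         def square(x) = reply x*x to square;;
         Join.Ns.register Join.Ns.here "square" (square : int -> int);;  *)

    let server = Join.Site.there (Unix.ADDR_INET (Unix.inet_addr_loopback, 12345));;
    let ns = Join.Ns.of_site server;;
    let square = (Join.Ns.lookup ns "square" : int -> int);;

    (* Looks like a local function call, but the reaction runs remotely.
       Any exception that reaches us is reported instead of ending the program. *)
    let () =
      try Printf.printf "square 7 = %d\n" (square 7)
      with e -> prerr_endline ("remote call failed: " ^ Printexc.to_string e);;
    ```

    The call to "square" blocks our process until the remote reaction has delivered its result value.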

    In order to detect whether a remote machine is available, JoCaml provides two mechanisms.

    Synchronous: If we try to inject an instant reagent into a remote soup but the remote machine has been disconnected, the exception Join.Exit is raised. Let us assume that the remote soup defines an instant reagent named "is_alive" of type unit->unit. We can obtain the constructor for this reagent (assuming that the name server "ns" has been already defined):
    let is_alive = (Join.Ns.lookup ns "is_alive": unit -> unit);;
    Now we can define a function that checks whether the connection is alive. The function is evaluated synchronously and may block.
    let remote_is_alive() = try is_alive(); true with Join.Exit -> false
    This looks like we are injecting the reagent is_alive into our own soup. However, we have defined is_alive through binding with a remote name server, so the JoCaml system will transparently inject this reagent into whatever remote soup we connected with.

    Asynchronous: When the remote host closes the reagent server socket, the JoCaml system can inject a slow reagent of our choice into our own soup. This is done by the function Join.Site.at_fail. This function takes two arguments: a site (a value of type Join.Site.t, such as the one returned by Join.Site.there) and a constructor of type unit Join.chan for a slow reagent (i.e. the slow reagent is decorated with an empty value). We can then react to injection of this reagent as we see fit.

    Here is an example with a remote soup running as a separate process on the local machine and listening on port 12345.
    let server = Join.Site.there (Unix.ADDR_INET(Unix.inet_addr_loopback, 12345));;
    spawn (
    def remote_was_disconnected() = print_string "remote machine disconnected!"; 0 in
    Join.Site.at_fail server remote_was_disconnected; 0
    )

    A side note: I used spawn here, just so that the reagent constructor remote_was_disconnected can be locally defined with def ... in, which is permissible only within a reagent expression. Since this expression evaluates to the reagent "0", the spawn does not actually inject any reagents.

    How to understand the manual

    The official manual says:

    "Channels, or port names, are the main new primitive values of JoCaml.
    Users can create new channels with a new kind of def binding, which should not be confused
    with the ordinary value let binding. The right hand-side of the definition of a channel a is a process that will be spawned whenever some message is sent on a. Additionally, the contents of messages received on a are bound to formal parameters.
    "

    In our terminology, "channels" or "ports" are called "constructors for reagents", and "processes" are called "reagents". It is confusing to think about "constructors for reagents" as "ports" because "sending messages on a port" suggests having a queue of messages, whereas reagents are not queued in any particular order. It is confusing to think about "reagents" as "processes", because reagents by themselves do not perform calculations; only reactions between (perhaps several) reagents can run and perform calculations. (In the manual, reactions are called "join definitions".)

    The "port/message" metaphor also fails because there are actually no fixed "ports" waiting for messages and executing the right-hand sides of reactions when messages "arrive". Consider the definition
    def a(x) & b(y) = print_int (x+y); 0
    This is a definition of a reaction, not a definition of the "channel a" or a "channel b". In the "chemical" view, the reaction will start only when both reagents "a" and "b" are present in the soup. If only "a" is present but not "b", the reaction will not start. So it is incorrect to say that this definition describes the behavior of "channel a", or that a process is spawned whenever a "message arrives on the channel a".

    There is one aspect of the "chemical" model which is not quite precise: in JoCaml, we are not allowed to define a reaction with several copies of the same reagent, such as a(x) & a(y) & a(z) = b(x+y+z). However, the naive application of the "chemical" intuition would allow this kind of reaction: in real chemistry, such reactions are of course common. Now, the "channel/message" metaphor implies that there can be only one channel named "a". We may send three messages on this channel, but these three messages cannot be processed all at once. This analogy suggests that reactions with several copies of the same reagent ("port") are not allowed.

    The rest of the introductory parts of the manual are written in a similarly confusing way. For instance, it is said that "Processes are the new core syntactic class of JoCaml. The most basic process sends a message on an asynchronous channel. Since only declarations and expressions are allowed at top-level, processes are turned into expressions by “spawning” them: they are introduced by the keyword “spawn”."

    A "process" is spawned using the spawn keyword not because processes are "not allowed at top level" (why not?), but because we would like to add a new reagent to the soup, which may or may not start new reactions immediately. Thus, the spawn keyword does not necessarily start new payload computations. There is a confusion between starting a process that computes an OCAML value, which is what one would expect here, and the statement that a process merely "sends a message on a channel". There is also a confusion between "spawning" processes by using the spawn keyword, and "spawning whenever a message arrives on a channel" as mentioned before. According to this description, the expression spawn a(2) will "spawn a process", which "sends a message on a channel", which again "spawns a process"...? At this point, the new reader has no idea what is going on and probably gives up reading the manual.

    I think that explanations based on the "chemical" intuition are simpler and much clearer than explanations based on the "channel/message" metaphor.

    But I must say that, among all the scant information available about the JoCaml system and the "join calculus", the official manual is still less confusing than most other papers and tutorials.

    So far it seems that the "chemical" model works quite well. Next I will try to figure out some typical use cases and maybe do some testing.

    (To be continued.)