chaource: (smiling face)
There are lots of people who understand moderately advanced to very advanced mathematics and who may want to learn how to write code. These people include mathematicians, physicists, engineers, and others. I will call them "math-savvy".

When math-savvy people look at almost any programming language in use today, they are appalled by its lack of logical elegance, and so they conclude that programming is unpleasant.

There is now a practically useful programming language, Haskell, based upon the principle that programs should be as similar as possible to mathematical expressions. There are several important similarities between Haskell and mathematics. For example, values in Haskell are immutable and locally scoped, just like symbols such as x or f in a mathematical text: these symbols are defined once and then used throughout a certain part of the text (but not outside that part). If a value is defined but not used for any calculation, that value will be forgotten and never computed. The basic elements of a Haskell program are functions that are very similar to mathematical functions: a function takes an argument value and computes a result value. Functions are defined using formulas and/or inductive reasoning, similarly to how mathematical texts define functions. Each function in a mathematical text admits only a certain type of argument (e.g. a real number, a group element, etc.) and computes a value of a similarly fixed type; the same is true in Haskell.
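To make the analogy concrete, here is a minimal sketch of what such definitions look like in Haskell (the names `square` and `factorial` are my own illustrations, not taken from any particular tutorial):

```haskell
-- A function defined by a formula, with an explicit type:
-- it accepts an Integer and computes an Integer.
square :: Integer -> Integer
square x = x * x

-- A function defined by induction, as in a mathematical text:
-- the factorial is given by its value at 0 and a recurrence.
factorial :: Integer -> Integer
factorial 0 = 1
factorial n = n * factorial (n - 1)

main :: IO ()
main = print (factorial (square 3))  -- prints 362880
```

Note that `square` and `factorial` are equations, not sequences of instructions: each definition can be read exactly as one would read a definition in a mathematical text.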

I think that math-savvy people will find it quite easy and even interesting to learn Haskell. Once one learns some Haskell, it becomes much easier to learn most other programming languages. Techniques of "functional programming", i.e., similar to the basic idioms of Haskell, have become standard in many programming languages. So learning Haskell should be a good entry point into modern programming.

Ideally, there should be a tutorial on Haskell oriented towards math-savvy people. When I first learned about Haskell, I couldn't find one. Does such a tutorial exist now? If not, what would be a good way of writing it - what specific programming tasks should be chosen as running examples, what kind of exercises, what subset of the language to present first? Should the tutorial talk about implementing number theory algorithms, numerical solutions of equations, optimization problems, symbolic algebra?

PS

There exists one book on using Haskell for machine learning and data analysis. It might be a good start; I haven't looked at it yet.
chaource: (smiling face)
http://chaource.livejournal.com/84503.html Prologue. Specification and synthesis in Boolean logic
http://chaource.livejournal.com/84914.html Part I. Temporal logic. Basic definitions
http://chaource.livejournal.com/85196.html Part II. Temporal fixed points
http://chaource.livejournal.com/85401.html Part III. A catalog of operations
http://chaource.livejournal.com/85714.html Part IV. Examples
http://chaource.livejournal.com/86203.html Part V. Normal forms
http://chaource.livejournal.com/86279.html Part VI. Satisfiability

In the prologue post I described the solution to the Boolean synthesis problem. If we write a specification as a Boolean formula, we can compute the required values of the unknown variables as explicit Boolean functions of the given parameters.

For example, given a specification

φ(a,b,x,y) = (a -> x)(b -> x'y) = (a' + x)(b' + x'y),

we demand that x and y be such that φ(a,b,x,y)=1. Here the values of a, b are treated as given parameters. The point of synthesis is to avoid guessing but instead to derive the rules of computation for x and y. Ideally, we merely state the specification and the computer will automatically deduce how to compute x, y so that the specification is always satisfied.

In the propositional Boolean logic, the synthesis problem is solved by a calculation. For the example above, we get the formulas

x=b', y=a'+b'.

These formulas can be seen as a "Boolean program" that calculates x and y given a,b. We have thus "synthesized" a program from the Boolean specification. The synthesized program is correct by construction; we can make a mistake only if we start with a wrong specification. (Incidentally, this is what happens in the example above: the specification is unsolvable if a=b=1.)
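As a sanity check, the synthesized formulas can be verified by exhaustive enumeration. The sketch below (the function names `phi`, `xOf`, `yOf` are mine) confirms that the specification is satisfied for every input except a = b = 1:

```haskell
-- The specification φ(a,b,x,y) = (a' + x)(b' + x'y), as a Boolean function.
phi :: Bool -> Bool -> Bool -> Bool -> Bool
phi a b x y = (not a || x) && (not b || (not x && y))

-- The synthesized "Boolean program": x = b', y = a' + b'.
xOf, yOf :: Bool -> Bool -> Bool
xOf _ b = not b
yOf a b = not a || not b

-- For every (a,b), does the synthesized output satisfy the specification?
main :: IO ()
main = print [ phi a b (xOf a b) (yOf a b)
             | a <- [False, True], b <- [False, True] ]
-- prints [True,True,True,False]: only a = b = 1 fails, as expected
```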

Can we find a similar procedure that allows us to synthesize "temporal Boolean programs" (also called reactive programs) from temporal logic specifications? If so, we will have a powerful tool for writing programs that are correct by construction.
Read more... )
chaource: (smiling face)
http://chaource.livejournal.com/84914.html Part I. Basic definitions
http://chaource.livejournal.com/85196.html Part II. Temporal fixed points
http://chaource.livejournal.com/85401.html Part III. A catalog of operations
http://chaource.livejournal.com/85714.html Part IV. Examples
http://chaource.livejournal.com/86203.html Part V. Normal forms

R. Kipling's temporal specification of the cobra Nagaina's behavior ("If you move, I will strike; if you do not move, I will also strike") gave us an example of a curious recursion where the same formula φ seems to be both a pessimistic and an optimistic fixed point:

φ = ALL(SOME s) = s N+(φ) + N-(φ).

Is this a new kind of recursion that we might want to explore? What exactly is the meaning of N+ and N- in this equation?

Answers to these questions will help us understand the use of normal forms in temporal logic, which is mainly for deciding the satisfiability of a temporal formula.
Read more... )
chaource: (Default)
http://chaource.livejournal.com/84914.html Part I. Basic definitions
http://chaource.livejournal.com/85196.html Part II. Temporal fixed points
http://chaource.livejournal.com/85401.html Part III. A catalog of operations
http://chaource.livejournal.com/85714.html Part IV. Examples

In ordinary Boolean logic, we are able to perform calculations by straightforwardly rewriting the formulas, much like in ordinary arithmetic. For example, we can expand the parentheses, cancel terms, and merge duplicated terms:

φ = (a+b)(a'+bc) = aa' + abc + a'b + bbc = 0 + abc + a'b + bc = a'b + bc(a+1) = a'b + bc.

The result of this kind of simplification -- expanding parentheses and canceling trivial terms -- is a normal form of the initial formula.

In this kind of calculation we do not have to guess anything or apply any tricks. We simply keep rewriting the expression, using the rules of Boolean algebra such as a(b+c)=ab+ac; (a+b)'=a'b'; aa'=0; bb=b; c+1=1, and so on, until no more rules can be applied. Some simplifications, such as a+ab=a, are peculiar to Boolean algebra; otherwise the calculation is quite similar to what we would do when simplifying a formula in the ordinary arithmetic.
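Any such simplification can be cross-checked by brute force over the truth table; here is a small sketch (the names `lhs` and `rhs` are illustrative):

```haskell
-- Truth-table check that (a+b)(a'+bc) and its normal form a'b + bc
-- agree on all eight assignments.
lhs, rhs :: Bool -> Bool -> Bool -> Bool
lhs a b c = (a || b) && (not a || (b && c))
rhs a b c = (not a && b) || (b && c)

equivalent :: Bool
equivalent = and [ lhs a b c == rhs a b c
                 | a <- [False, True], b <- [False, True], c <- [False, True] ]

main :: IO ()
main = print equivalent  -- prints True
```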

Now we would like to obtain a normal form for temporal logic expressions. We want to be able to simplify temporal formulas and to check whether a temporal formula is equal to FALSE.

Read more... )
chaource: (Default)
http://chaource.livejournal.com/84914.html Part I
http://chaource.livejournal.com/85196.html Part II
http://chaource.livejournal.com/85401.html Part III

At this point I'd like to see some examples of properties specified through temporal logic. The goal is to introduce temporal Boolean properties A, B, ... and to write a temporal formula χ(A,B,...). This formula expresses the specification: the sequences A, B, ... should be such that χ(A,B,...) is equal to TRUE = {1, 1, ...}.

Example 1. "If you do not have the passport or the ticket, you are not allowed to board the airplane."

Denote by P and T the properties "have passport" and "have ticket", and by B the property "boarding the airplane now". We have a restriction that B is false at any time if one of P or T is false at that time. So we write the formula

P' + T' -> B' -- or, equivalently, B -> P T

This formula must evaluate to "true" at all times, so the final specification is

χ(B,P,T) = ALL (B -> P T)

Note that this specification does not imply that you can board only once: B can be "true" at many instances of time, or even at all instances. All χ implies is that if you are trying to board the plane at any time, you need to have P and T equal to "true" at that time.
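On finite prefixes of the sequences B, P, T, the specification χ can be checked directly; here is a sketch, with illustrative names, that treats each prefix as a list of Booleans:

```haskell
-- Finite prefixes of the temporal propositions B, P, T.
-- χ(B,P,T) = ALL (B -> P T) holds on a prefix iff the implication
-- B -> P T holds at every instant.
type Prefix = [Bool]

satisfies :: Prefix -> Prefix -> Prefix -> Bool
satisfies bs ps ts = and (zipWith3 implies bs ps ts)
  where implies b p t = not b || (p && t)

main :: IO ()
main = do
  -- boarding only when both passport and ticket are present: OK
  print (satisfies [False, True] [True, True] [False, True])  -- True
  -- trying to board without a ticket: violates the spec
  print (satisfies [True] [True] [False])                     -- False
```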

Example 2. "Whenever a message is sent, another message will be eventually received."
Read more... )
chaource: (Default)
http://chaource.livejournal.com/84914.html Part I
http://chaource.livejournal.com/85196.html Part II

Temporal logic can express properties such as "x must become true when y first becomes true" or "x stays true until y first becomes false". To even come up with such properties, we used intuition or guessing. It would be useful to derive the list of all the different temporal logic operations we can use.

There are some relationships between logic operations. Ordinary Boolean logic has relations like (a+b)' = a'b', so the "and" operation can be expressed through "not" and "or". Knowing these relations, we can transform or simplify Boolean expressions.

Similarly, there are relationships between temporal operations. Some of these we have seen: ALL(x') = NOT SOME(x); ALL(x) = x ⊵ TRUE. It would be useful to have a small set of temporal operations from which all others can be expressed, and to determine the relationships between temporal operations. Our goal is to be able to perform calculations in temporal logic as effectively as in Boolean logic.
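Relations such as ALL(x') = NOT SOME(x) can be checked mechanically on finite prefixes of the sequences, representing each temporal proposition as a list of Booleans. A sketch (`allOp` and `someOp` are my names for ALL and SOME; on a finite prefix they are exactly the suffix folds computed by `scanr1`):

```haskell
-- On a finite prefix, ALL is the "and" of each suffix and SOME is the
-- "or" of each suffix.
allOp, someOp :: [Bool] -> [Bool]
allOp  = scanr1 (&&)
someOp = scanr1 (||)

-- Check the De Morgan-style relation ALL(x') = NOT SOME(x) on a prefix.
dualityHolds :: [Bool] -> Bool
dualityHolds xs = allOp (map not xs) == map not (someOp xs)

main :: IO ()
main = print (dualityHolds [False, True, False, True])  -- prints True
```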

In all texts about temporal logic I've seen so far, these questions are not considered. Instead one postulates a fixed, small set of temporal operations. Some authors use "strict UNTIL", others "weak UNTIL". I would like to get to the bottom of it.

Update: what I am trying to develop here is known in the research literature as "fixpoint linear temporal logic". Rigorous definitions are in [Banieqbal, Barringer: Temporal logic with fixed points. 1989. Springer lecture notes in computer science, volume 398. Pages 62-74.]

Read more... )
chaource: (Default)
I talked about a recursion approach to temporal logic in part I. Temporal logic operates on infinite sequences of Boolean values and includes ordinary Boolean operations (AND, OR, NOT) as well as new, "temporal" operations. The only allowed "temporal" operations are those defined recursively, through an equation of the form

F x = φ(x, NEXT(F x))

where φ(x,f) is a Boolean formula.

For instance, the temporal operations ALL(x) and SOME(x) are defined recursively by

ALL x = x NEXT(ALL x)

SOME x = x + NEXT(SOME x) (*)

I am now using the notation "F x" instead of "F(x)" for temporal logic operations. Where needed for clarity, we will still use parentheses.
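The recursive definitions of ALL and SOME can be evaluated directly on finite prefixes of the sequences; the sketch below mirrors the recursions term by term, taking ALL = TRUE and SOME = FALSE beyond the end of the prefix (this choice of "base value" is my own device for selecting the intended solutions):

```haskell
-- ALL x = x AND NEXT(ALL x), SOME x = x OR NEXT(SOME x),
-- evaluated on finite prefixes by recursing on the tail.
allOp, someOp :: [Bool] -> [Bool]
allOp []       = []
allOp (x : xs) = case allOp xs of
  []           -> [x]            -- beyond the prefix, ALL is taken TRUE
  rest@(r : _) -> (x && r) : rest

someOp []       = []
someOp (x : xs) = case someOp xs of
  []           -> [x]            -- beyond the prefix, SOME is taken FALSE
  rest@(r : _) -> (x || r) : rest

main :: IO ()
main = print (someOp [False, False, True, False, False])
-- prints [True,True,True,False,False]
```

This reproduces SOME({0, 0, 1, 0, 0, ...}) = {1, 1, 1, 0, 0, ...} for the sequence discussed in this post.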

Ambiguity of recursive definitions


There is, however, an important deficiency of the recursive definitions: they cannot, by themselves, define temporal operations unambiguously.

A recursive definition "F x = φ(x, NEXT(F x))" is usually understood as an "equation" for the unknown function F; because of the form of this equation, a solution F may be called a fixed point of the map F ↦ φ(x, NEXT(F x)). In any case, a solution of this equation is the function F we are defining. However, what if this equation has several solutions? Which of the solutions is the function F we are trying to define?

An example of this problem is shown already by the temporal operations ALL and SOME. Initially, we defined SOME(x) as the logical "or" of all values of the sequence x, starting from the current value. For instance, if x = {0, 0, 1, 0, 0, ...}, with all values 0 after time 3, then SOME(x) = {1, 1, 1, 0, 0, ...}. However, the recursive definition (*) admits a solution where SOME(x)=TRUE regardless of x:

TRUE = x + NEXT(TRUE) -- this is obviously an identity, for any Boolean sequence x.

Therefore, the recursive definition (*) by itself does not uniquely define the function SOME.

Similarly, the recursive definition for ALL(x) admits a solution when ALL(x)=FALSE for any x.

How can we fix this?

Read more... )
chaource: (Default)
By now I have read many expositions of temporal logic, and I am getting an interesting picture. I am striving for a presentation that is simpler and more understandable than the standard definitions in the literature. So I will start with a particular simple kind of temporal logic (called "propositional linear" temporal logic), rather than with a general definition, and I will use the familiar propositional Boolean logic as a basis.

Boolean logic talks about variables such as x, y, ... that can have values 0 or 1, i.e. Boolean variables. One computes functions of these variables, and the resulting values of these functions are also 0 or 1.

Temporal logic talks about a different kind of variable, the "temporal proposition", whose values are infinite sequences of Boolean values.

A temporal proposition x is an infinite sequence {x1, x2, ...}, where each of the x1, x2, ... is a Boolean value (0 or 1). The interpretation of a "temporal proposition" is that a certain property or statement "x" (such as, "it is raining") can be true or false at different times. We represent time by a discrete sequence {t=1, t=2, t=3, ...}, like a sequence of days or other discrete intervals. So, "x3"=1 says that the property "x" holds at time "t=3".

The main task of temporal logic is to compute Boolean functions involving temporal propositions, i.e. "temporal functions". A "temporal function" is a map between temporal propositions. We can apply a temporal function to a temporal value, and the result is again a temporal value, i.e. an infinite sequence of Boolean values.

How does one write a "temporal function"? For example, if x = {x1, x2, ...} is a temporal value then a function of x could be something quite arbitrary, such as {x1 + x3 x5', x2 x8 x10, x2+x3, ... whatever .... }. Each element of the resulting sequence could be an arbitrary Boolean function of some of the elements of the sequence x.

The important idea in temporal logic is that we admit only a narrow class of temporal functions. The admissible temporal functions must be defined by a single-step recursion: we define what happens with the first element of the sequence, and then we repeat the function on the "tail" of the sequence.
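This restriction can be captured by a single generic combinator (a sketch on finite prefixes; the name `temporal` and the explicit `base` parameter are my own, with `base` standing for the value assumed beyond the end of the prefix):

```haskell
-- Single-step recursion: given a Boolean formula phi(x, next), build the
-- temporal function F satisfying F x = phi(x, NEXT (F x)).
temporal :: (Bool -> Bool -> Bool) -> Bool -> [Bool] -> [Bool]
temporal phi base = go
  where
    go []       = []
    go (x : xs) = case go xs of
      []           -> [phi x base]    -- last element uses the base value
      rest@(r : _) -> phi x r : rest  -- phi applied to head and NEXT

-- Two familiar instances:
allOp, someOp :: [Bool] -> [Bool]
allOp  = temporal (&&) True
someOp = temporal (||) False

main :: IO ()
main = print (allOp [True, True, False, True])
-- prints [False,False,False,True]
```

Choosing `base = True` versus `base = False` selects different fixed points of the same recursion, which foreshadows the ambiguity of recursive definitions discussed elsewhere in this series.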

But let us start from simple examples.

Read more... )
chaource: (Default)
I am still hopeful that temporal logic will yield a workable method for programming. So far the most useful paper has been that by L. Lamport (of LaTeX fame).

However, I realized that it would certainly be helpful, as a first step, to understand the solution to the same problem in ordinary Boolean logic (i.e. classical propositional logic). This turned out to be a useful exercise, indeed.

An example: We are given a Boolean formula f(a,b,x,y) where a, b, x, y are elementary propositions (i.e. variables with values "true" or "false"). The variables a, b are considered the "input", and x, y the "output". We are required to write an algorithm that takes the values a, b and computes some values x, y such that f(a,b,x,y) evaluates to "true". In this way, the formula f is a specification of the desired properties of our program.

What we would like to do is to derive our program directly from the formal specification. (This is the task of "program synthesis".) In other words, we want an algorithm that takes any formula "f" and outputs a program that correctly computes x and y. Of course, this should work for any number of input and output variables.
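The problem statement can be made concrete with a brute-force sketch: given a specification and input values, search all output values for one that satisfies it. (The names below are illustrative, and the hypothetical `spec` is one possible formula of the required shape; a real synthesis procedure would derive closed-form Boolean functions instead of searching.)

```haskell
-- Brute-force "synthesis": for inputs (a, b), find outputs (x, y)
-- making the specification f true, if any exist.
synthesize :: (Bool -> Bool -> Bool -> Bool -> Bool)
           -> Bool -> Bool -> Maybe (Bool, Bool)
synthesize f a b =
  case [ (x, y) | x <- [False, True], y <- [False, True], f a b x y ] of
    []      -> Nothing   -- the specification is unsatisfiable here
    (s : _) -> Just s

-- A hypothetical specification: (a -> x)(b -> x'y).
spec :: Bool -> Bool -> Bool -> Bool -> Bool
spec a b x y = (not a || x) && (not b || (not x && y))

main :: IO ()
main = mapM_ (print . uncurry (synthesize spec))
             [ (a, b) | a <- [False, True], b <- [False, True] ]
```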

Read more... )
chaource: (Default)
Part 1.
Part 2.
Part 3.

The join calculus has some further limitations, beyond those I already described:

  • It is impossible to define a reaction with a variable number of input reagents, say a reaction that starts when a(x) is present or when a(x) and b(y) are present. It is also impossible to define a reaction with duplicated reagents. We cannot have a reaction that starts when, say, exactly three reagents a(x), a(y), a(z) are present.

  • More generally, we cannot have reactions with a computed set of input reagents. For instance, if we wanted to define a reaction with 1000 input reagents, we could not write the input reagents as "a_n for n = 0, ..., 999". We would have to write each of the reagents a000, a001, ..., a999 explicitly.

  • It is not possible to specify reactions that depend on the set of input reagents. A reaction starts when its input reagents, statically specified, are present; the chemical engine only checks that the reagents are present. We cannot specify any additional computations to determine the set of input reagents for a reaction.

    For example, if we wanted to have 1000 reagents named a000, a001, ..., a999, and a reaction that starts when any 100 of them are present, the chemical engine would have to perform a computation before deciding whether to start a reaction. By design, the join calculus prohibits any nontrivial computations before starting a reaction.

  • For the same reason, we cannot specify that a reaction starts only if some computed condition holds on the decoration values of the input reagents.

    So we cannot have a reaction a(x) & b(y) = ... that starts only when some computed condition p(x,y) holds on the values x, y.


Nevertheless, it seems that the join calculus can express essentially all concurrent computations. I would like to consider three examples now:

  1. Given an array of functions, produce an array of their result values. The functions should be evaluated asynchronously in parallel.
  2. Given an array of integers, compute their sum. All pairwise summations should be evaluated in parallel in arbitrary order, and the partial sums should be also added in parallel in arbitrary order.
  3. Sort an array of integers using the merge-sort algorithm, again doing as much as possible in parallel.
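For example 2, the shape of the computation is a tree of pairwise additions. The sketch below is purely sequential Haskell; it only models the reduction order that a join-calculus program would carry out concurrently (the name `pairwiseSum` is mine):

```haskell
-- Add elements pairwise, then add the partial sums pairwise, and so on,
-- until a single sum remains. Each level's additions are independent of
-- one another and could run in parallel.
pairwiseSum :: [Int] -> Int
pairwiseSum []  = 0
pairwiseSum [x] = x
pairwiseSum xs  = pairwiseSum (pairs xs)
  where
    pairs (a : b : rest) = (a + b) : pairs rest
    pairs rest           = rest

main :: IO ()
main = print (pairwiseSum [1 .. 10])  -- prints 55
```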


Now let us figure out the implementation of these examples in join calculus. We will be using the purely "chemical" approach to concurrent computation. We will never say the words "semaphore", "thread", "deadlock", "mutex", or "synchronize". Instead, we will talk about reagents and reactions. My goal is to see what kind of tricks and patterns emerge from this paradigm.

Read more... )
chaource: (Default)
I described in Part I the "chemical abstract machine" model of concurrent computation, and in Part II some additional features of the JoCaml system. Now I would like to see how the "abstract chemistry" needs to be designed if we would like to perform some common tasks of concurrent programming. My goal is to see whether one can actually think in terms of "reagents" and "reactions" rather than in terms of "processes", "channels", or "messages". Is the "chemical" approach more productive for designing concurrent software?

The "chemical" way of thinking has only two primitives:

  • defining a chemical reaction between reagents (keyword def),
  • injecting reagents into the soup (keyword spawn).

Computations are imposed as "payloads" on running reactions. Thus, we organize a concurrent computation by defining the possible reactions and injecting the appropriate reagents into the soup.

The JoCaml notation for reactions is actually so concise that one can design the reactions directly by writing executable code!

Read more... )
chaource: (Default)
I described in Part I the "chemical abstract machine" model of concurrent computation. In that model, one imagines a soup of "abstract chemical reagents" that interact randomly among themselves, according to a predefined set of possible "chemical reactions". Each "reaction" carries a computational "payload", which is an expression computed when the reaction takes place. We can organize a complicated parallel computation by designing appropriate rules of an "abstract chemistry" and by running the machine with chosen initial "reagents".

The JoCaml system adds several other important features to this model, and I would like to interpret all of JoCaml using the "chemical" view, since it makes everything much simpler. (Other tutorials talk about "channels", "sending messages", "processes", and so on, which is quite confusing. In particular, I think that the "channels/messages" metaphor does not work well in this context.)

Read more... )
chaource: (Default)
Continuing my exploration of OCaml, I found that there is an extension called JoCaml, which supports concurrency. I don't yet know if it really supports multicore computing -- probably not, a real shame! But JoCaml's view of concurrent execution is very nice.

Now, there is almost no tutorial documentation either on the formal model (the "join calculus") or on JoCaml itself, except for the JoCaml manual. Other tutorials and presentations about the join calculus merely confused me after a few pages (and will probably confuse you just as well). Things were quite unclear to me until I discovered the following formulation in the introduction of one of the quite formal early papers on the join calculus. Now I would like to write down a distilled explanation that makes sense to me.

"Join calculus", also known as the "Reflexive Chemical Abstract Machine", is a model of concurrent computation that seems to be a lot easier to think about than anything I've seen so far on the topic of concurrency. It really looks like we can forget all that synchronize-mutex-semaphore-exception-fork-thread-race-condition-deadlock baggage and just get down to coding!

Read more... )
chaource: (Default)
Can't stop thinking about "parsing expression grammars" (PEGs).

First of all, the choice of the term "PEG" seems rather infelicitous to me. (I would rather say "deterministic top-down parsers".) "PEG" really means "a grammar described through expressions intended for parsing the input text". We are given some input text, and we use pattern expressions as instructions for processing that input. PEGs are an alternative to the Backus-Naur Form (BNF).

There seem to be relatively few papers about PEGs. I read a paper by Mizushima et al. that explains how Prolog's "cut" operator can be used with PEGs, together with Ierusalimschy's lucid description of an implementation of PEGs through a simple virtual machine. Then I tried to see whether it would be easy to implement the "cut" operator in that virtual machine. It turned out that the "cut" operator is more basic than most of the PEG operators. So I realized that one can describe PEGs by introducing only two easy constructions. What follows is a tutorial about this, which, I hope, is more understandable than other descriptions of PEGs.

[Even though I introduce different (more "low-level") operators than the original PEG papers do, I will still call this description a PEG, since the original PEG operators generate the same class of languages. So here goes.]

PEG patterns are instructions for parsing an input string. The input string is a linear array of characters. The parser looks at each character and moves on to the next character when some given pattern matches. When the parser moves on to the next character, we say that the current character "has been consumed from the input".

In its work, the parser follows a fixed set of rules, as described by the given PEG pattern.
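The flavor of such rules can be sketched in a few lines (my own simplified operators, not the exact set from the papers): a pattern consumes a prefix of the input and either fails or returns the remaining input, and ordered choice tries the second alternative only if the first one fails.

```haskell
-- A minimal PEG-style pattern language.
data Pattern
  = Lit Char               -- match one given character
  | Seq Pattern Pattern    -- match one pattern, then the other
  | Choice Pattern Pattern -- ordered choice: try the first, else the second

-- Match a pattern against an input string; on success, return the
-- unconsumed remainder of the input.
match :: Pattern -> String -> Maybe String
match (Lit c) (x : rest) | x == c = Just rest
match (Lit _) _                   = Nothing
match (Seq p q)    s = match p s >>= match q
match (Choice p q) s = case match p s of
                         Nothing -> match q s
                         ok      -> ok

main :: IO ()
main = print (match (Seq (Lit 'a') (Choice (Lit 'b') (Lit 'c'))) "acx")
-- prints Just "x"
```

Note that once an alternative of `Choice` succeeds, the parser never backtracks into it; this determinism is what distinguishes PEGs from BNF-style grammars.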

The definition of a PEG pattern is recursive; it builds up larger patterns from smaller ones:
Read more... )
chaource: (Default)
Part I
Part II
Part III

Both with syllogisms and with soriteses in the sense of Carroll, we are given information that has an existentially quantified part and a universally quantified part. Previously, we made a lot of progress once we reduced the universally quantified part to a statement with just one universal quantifier. For example, the information "For all x, all y : A(x)B(y)" is equivalent to "For all x : A(x)B(x)". Now, this kind of simplification is impossible with existential quantifiers. For example, the information

(*) "For some x, some y : A(x)B(y)"

means that some objects have property A while some (possibly other) objects have property B. This is not equivalent to "For some x : A(x)B(x)", because we do not know that the same objects have both properties. We could instead derive from (*) the disjunction "For some x : A(x) + B(x)". [In our notation, A+B means the disjunction "A or B".] However, this is a weaker consequence, since it admits situations where no objects have property A, while (*) does not admit those situations.
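The non-equivalence is easy to exhibit on a small finite model (the domain and predicates below are illustrative):

```haskell
-- A two-element model where "some x: A(x)" and "some y: B(y)" both hold,
-- yet no single object has both properties.
domain :: [Int]
domain = [1, 2]

propA, propB :: Int -> Bool
propA x = x == 1   -- only object 1 has property A
propB x = x == 2   -- only object 2 has property B

someA, someB, someAB :: Bool
someA  = any propA domain
someB  = any propB domain
someAB = any (\x -> propA x && propB x) domain

main :: IO ()
main = print (someA, someB, someAB)  -- prints (True,True,False)
```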

If we are looking for a consequence of the kind "Some objects have property A", then we are looking for a statement with just one existential quantifier. Therefore, as a first step, we are obliged to reduce the information to a Boolean formula with a single quantifier. We could make a choice of how to do this:
Read more... )
chaource: (Default)
Part I
Part II

If we are given information as universally quantified statements ("All objects satisfy properties A, B, ...") together with existentially quantified statements ("Some objects satisfy properties C, D, ..."), and we are asked to derive a statement of the form "All objects satisfy properties X, Y, ...", we are compelled to use only the universally quantified part of the given information. We cannot use information of the form "Some objects are ..." to derive "All objects are ...".

We already know that our universally quantified information is equivalent to a formula of the form "For all x : [ F(x) ]", with a single variable "x" and a Boolean formula F that depends on various predicates. Let us now see what we can do with such a formula to derive consequences from it.

Consider two examples:

1. "All bald creatures are conceited. All aardvarks are bald creatures."

Denote aardvarks = A, bald creatures = B, conceited creatures = C. Then the universally quantified part of the information looks like this expression:

For all x : [B(x) -> C(x)] [A(x) -> B(x)].

Now, we can easily see that the desired consequence is "For all x : A(x) -> C(x)", but we would like to avoid any heuristics that require us to guess how to proceed. We would like to obtain consequences from the given expression by an algorithm, and not by "inspecting" a graphical diagram, by guessing how to build a derivation tree, or by other heuristic methods.
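As a sanity check on the example, the consequence can be verified by brute force: treating A, B, C as Boolean variables, the premise [B -> C][A -> B] implies A -> C on every assignment (a sketch; the names are mine):

```haskell
bools :: [Bool]
bools = [False, True]

premise :: Bool -> Bool -> Bool -> Bool
premise a b c = (not b || c) && (not a || b)   -- [B -> C][A -> B]

conclusion :: Bool -> Bool -> Bool
conclusion a c = not a || c                    -- A -> C

-- Check the implication over all eight assignments.
entails :: Bool
entails = and [ not (premise a b c) || conclusion a c
              | a <- bools, b <- bools, c <- bools ]

main :: IO ()
main = print entails  -- prints True
```

Of course, this truth-table check only confirms a guessed consequence; the point of the post is to find consequences algorithmically, without guessing first.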

We note that the formula after the quantifier can be seen as a Boolean expression F(A,B,C) in the variables A, B, and C: that is, we can temporarily ignore the fact that these are predicates depending on "x". We can then simplify this Boolean expression using the rules of Boolean algebra and bring it to the conjunctive normal form or to the disjunctive normal form:
Read more... )
chaource: (Default)
Part I

So now we have a Boolean formula with quantifiers, something like this:

For all x, some y, some z : [ (B(x)C'(x))' B(y) A(z) B(z) ].

(Here we denote the logical "not" by the apostrophe, and the logical "and" is left without a symbol, like the algebraic product.)

It seems that Carroll's methods are based on graphical diagrams and heuristic choices of predicates, and are not formalized as an algorithm. Moreover, Carroll only presents examples of two kinds, syllogisms and soriteses, and does not discuss a more general problem of this kind. (The strange word "soriteses" is an invention of Carroll's, intended to be the plural of "sorites". He also advocates using "serieses" as the plural of "series". I will use "soriteses" since the word "sorites" seems to be almost forgotten already.)

Books on logic say that full predicate logic (allowing arbitrarily nested quantifiers and predicates) is algorithmically undecidable. However, we are not dealing with full predicate logic here. The kind of expressions arising from syllogisms and soriteses differs from full predicate logic in two important ways:
Read more... )
chaource: (Default)
Recently I have been studying formal logic; so far mostly at an elementary level, up to predicate logic. Then a book by Lewis Carroll, "Symbolic Logic", caught my attention. I read it and tried to understand what those "syllogisms" and "soriteses" were all about. After some struggling (Carroll doesn't go very far in generalizing from his examples), I got the following picture.

All problems of "syllogism" or "sorites" are equivalent to the following single problem. Read more... )
chaource: (smiling face)
Recently I read some books on probability theory and realized that there is no empirical and consistent definition of probability. If probability is defined as "the frequency with which events occur", then there is no way to compute or predict a probability: there is no theory of probability beyond a description of what events have already occurred, and with what frequency. If probability is defined axiomatically, there is no proof that it has any relation to experiments, and no way to verify experimentally that a certain process is well described by a theoretically derived probability distribution. There is only one real justification for using probabilistic reasoning in practice:
Read more... )
