<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[language engineering - Medium]]></title>
        <description><![CDATA[Insights, rants and experiences in language design and implementation - Medium]]></description>
        <link>https://languageengineering.io?source=rss----20d877a4bb8d---4</link>
        <image>
            <url>https://cdn-images-1.medium.com/proxy/1*TGH72Nnw24QL3iV9IOm4VA.png</url>
            <title>language engineering - Medium</title>
            <link>https://languageengineering.io?source=rss----20d877a4bb8d---4</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Fri, 10 Apr 2026 16:00:42 GMT</lastBuildDate>
        <atom:link href="https://languageengineering.io/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[Testing in JetBrains Meta Programming System: What and Why?]]></title>
            <link>https://languageengineering.io/testing-in-jetbrains-meta-programming-system-what-and-why-3efd4f52f236?source=rss----20d877a4bb8d---4</link>
            <guid isPermaLink="false">https://medium.com/p/3efd4f52f236</guid>
            <category><![CDATA[programming]]></category>
            <category><![CDATA[software-development]]></category>
            <category><![CDATA[testing]]></category>
            <dc:creator><![CDATA[Kolja Dummann]]></dc:creator>
            <pubDate>Tue, 27 Nov 2018 09:21:12 GMT</pubDate>
            <atom:updated>2018-11-27T11:05:24.086Z</atom:updated>
<content:encoded><![CDATA[<p>There are a number of questions around testing in JetBrains Meta Programming System (MPS) that I get a lot. For example: How does MPS affect the way I do testing? How can I do TDD in MPS? Should I test this in MPS?</p><p>In this post I will try to answer these questions, but also enable you to answer them for yourself. To do that we have to take a step back and reflect on why we do testing in the first place. This post is not a guide on how to implement tests in MPS; it focuses on the bigger picture: what you want to test, and why.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*e7wD12JdFbxtQZkpK7XFzA.png" /></figure><p>For this post I assume that you have some basic knowledge about JetBrains MPS and are familiar with its terminology. If you are unfamiliar with MPS I suggest taking a look at this ongoing <a href="https://www.twitch.tv/collections/CEnssF4wYhUalw">video series</a> I’m doing on <a href="https://twitch.tv/dumdidum">Twitch</a> or checking out the “<a href="https://confluence.jetbrains.com/display/MPSD20182/Fast+Track+to+MPS">Fast track to MPS</a>” tutorials by JetBrains.</p><h4>Why Tests?</h4><p>One question we need to answer before we can look at what we should test is <strong>why</strong> we test at all. Obviously we don’t do testing for the purpose of having tests. The reasons why we test mostly fall into one of three categories:</p><h4>Validation</h4><p>Tests validate that the software is in accordance with a specification, norms, or legal requirements. Examples here might be the law demanding that passwords are stored encrypted, or that exported data from the software fits a schema.</p><h4>Maintenance</h4><p>This is probably the most obvious reason why we test. We want to ensure that changes to the software do not accidentally change the way it behaves.
For instance, that a calculation is working correctly or that an error message is presented to the user if an input was invalid. Obviously there is some overlap here with testing for <em>validation</em>, but the motivation is different. The tests are not performed because a change in the behavior of the software would violate a specification, but because any behavior change needs to be detected. In some cases the software may still behave according to the specification and yet the test fails.</p><h4>Quality</h4><p>It sounds pretty obvious that we want to improve the quality of our software through tests, but what does that actually mean?</p><p>Do we get better quality because a test fails? Mostly not. In fact, when doing <em>Test Driven Development</em> (TDD) with its test-first approach, the test cases rarely fail even during development. How, then, do we get better quality if not through tests that find bugs in our implementation during development?</p><p>The fact that you have to write the test first requires you to think about the actual problem and how to test it. It enforces thinking.</p><p>As <a href="https://medium.com/u/c9a23f66b276">Mike Gehard</a> has put it: TDD stands for Thought Driven Development</p><p><a href="https://medium.com/@mikegehard/tdd-thought-driven-development-b5c43e7c5329">TDD = Thought Driven Development</a></p><p>This whole part on why we test is heavily inspired by <a href="https://www.youtube.com/watch?v=gmasnR_Cml0">this talk</a> by Michael Feathers. If you want to get a deeper understanding of <strong>why</strong> and maybe <strong>why not</strong> to test something, I really recommend watching it.</p><h4>Testing in MPS</h4><p>JetBrains MPS ships with built-in support for testing various aspects of a language, for instance editors, type systems, constraints, or scoping. The newest addition is support for generator tests, which at the time of writing is still very basic.
These built-in testing facilities also integrate nicely with the well-known ways of writing unit tests in plain Java.</p><p>One thing that MPS requires is that you think about your problem and how to encode it. Its dedicated language aspects and their DSLs enforce a certain separation of concerns. To encode a simple language concept that also reports errors to the user, you need to divide it into at least its structure, some constraints for its values, and the other means of checking supported by MPS. Doing this often results in a much deeper understanding of the problem and its solution. For me, the idea that tests force me to think about a problem more than I would otherwise do is much less prominent in MPS than in, for instance, Java.</p><p>In this aspect, MPS is similar to programming languages with powerful type systems like Scala or Haskell. When you want to encode a problem in a way that is native to these languages, you need to think about the problem a lot and thereby gain a better understanding of it. These languages also often make illegal state unrepresentable. With its individual DSLs for the different aspects of a language, MPS does something similar but goes even a bit further: it also enforces a software architecture. You can’t easily skip a layer in these DSLs. For instance, you can’t do editor-related operations in the constraints aspect of a concept. Of course this is not bullet-proof, and with enough (misguided) energy you can still do these things, but doing so is usually much harder than doing it the <em>right way.</em></p><p>Of course there are exceptions. Not everything that is built with MPS is implemented with language aspects or DSLs that require you to get the abstractions right.
There are also parts that are written in <em>plain Java</em> where an approach like TDD’s test-first is beneficial.</p><p>For the other reasons to write tests, <em>validation</em> and <em>maintenance</em>, MPS is not much different from other software development tools or approaches. If something is missing from the built-in test support or does not fit your needs, it’s often easy to test these things manually by interacting with the respective APIs. Hiding these APIs is often also possible via extensions to the testing languages.</p><p>Because it’s easy to interact with programs written in MPS programmatically, adding new kinds of tests is often possible. Getting new metrics about the tests or their coverage is also easy. These metrics can be much more domain-specific than, for instance, the typical line/branch coverage information you get for programming languages. Well-chosen metrics can help a lot to gain valuable insights about the quality of the tests. In some domains, e.g. medical, these metrics can also help to verify and qualify the tool/approach to certification authorities.</p><p>At the moment MPS does not support property-based testing for anything related to languages. Of course it’s possible to plug in a property-based testing framework for Java code written in MPS.</p><h4>What to Test?</h4><p>Now that we know why we test, it is easy to decide what to test, right? Not quite; there are a lot of factors that you should take into account.</p><h4>Project</h4><p>The purpose and context of the project have a significant influence on what needs testing. A project whose purpose is to prove a concept, or one that is still exploring the domain, most probably does not need to test for changes in behavior, as behavioral changes are intended at that point. Especially given MPS’ ability to quickly evolve languages, testing editor behavior is pretty much not what you want to do at this stage of a project.
In contrast, a project that aims to build a DSL to program pacemakers, with a good understanding of the domain and a lot of users in the field, has stronger requirements on what to test. In the latter case you probably don’t want to change the behavior of the software unintentionally.</p><h4>Risk</h4><p>Another factor to take into account is the actual damage if something goes wrong. What are the consequences if a certain aspect of a language does not work as expected? One good example of this is a constraint on a concept that limits where it can be instantiated. If these constraints aren’t working as intended, what are the consequences? In most cases it will lead to an error in the downstream processing of the model: an interpreter will fail to produce a result, or a generator will generate code that does not compile. In this case an error in the language produces a usability problem (the user can enter something that is unsupported) but will not produce a program/artifact that behaves in an erroneous way. The effort for testing these constraints is also not trivial, and therefore most of the time it’s not done.</p><h4>Maintenance</h4><p>All code that is written needs to be maintained, and this also applies to tests. Tests must evolve along with the software they test. As an example, consider editor tests in MPS. They depend on the layout of the editor; when an editor definition is changed, these tests tend to fail and require maintenance. The same applies to tests that check the scopes of references; these tests need a very well-defined input model to work reliably.</p><h4>Don’t Test the Infrastructure</h4><p>Awareness of what you are testing is important. In a Java project, would you test that method invocation on the JVM works correctly? No. The same applies to MPS. As an example, an editor definition that uses a grammar.wrap needs no editor test; the only thing the test would verify is that the grammar cell wrap generator works correctly.
A complex hand-written contribution to the transformation menu? You most probably want to test that.</p><h4>Conclusion</h4><p>If we take a step back from the concrete implementation of tests and look at the motivation behind testing, we can see that the usage of MPS largely doesn’t affect testing. One aspect, though, is different: the “forced thinking” motivation of TDD’s test-first approach is less prominent than in classical programming environments.</p><p>The decision of what you test is in the end much more influenced by the context than by the tool you use. Testing doesn’t come for free, and the decision whether it is worth the cost depends on the project and the risks involved. As with all things in software development, there is no silver bullet. Think about the context and the risks involved, and then take a deliberate decision about what needs testing and what does not.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=3efd4f52f236" width="1" height="1" alt=""><hr><p><a href="https://languageengineering.io/testing-in-jetbrains-meta-programming-system-what-and-why-3efd4f52f236">Testing in JetBrains Meta Programming System: What and Why?</a> was originally published in <a href="https://languageengineering.io">language engineering</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[MPS, Feature Branches, Language Migrations- DOs and DONTs]]></title>
            <link>https://languageengineering.io/mps-feature-branches-language-migrations-dos-and-donts-bbce593eee4d?source=rss----20d877a4bb8d---4</link>
            <guid isPermaLink="false">https://medium.com/p/bbce593eee4d</guid>
            <category><![CDATA[git]]></category>
            <category><![CDATA[jetbrains]]></category>
            <category><![CDATA[pm]]></category>
            <category><![CDATA[language-engineering]]></category>
            <category><![CDATA[software-development]]></category>
            <dc:creator><![CDATA[Kolja Dummann]]></dc:creator>
            <pubDate>Thu, 10 May 2018 07:58:57 GMT</pubDate>
            <atom:updated>2018-05-10T07:58:57.287Z</atom:updated>
<content:encoded><![CDATA[<h3>MPS, Feature Branches and Language Migrations: DOs and DON’Ts</h3><p>The MPS meet-up in Munich sparked some discussions about how feature branches and language migrations get along. It seems that there is quite some uncertainty about how to approach this. In this blog post I shed some light on the topic and show what we found to work in practice. I assume that you already have some background in MPS. I won’t explain the details of how to write migrations or other aspects of MPS that play a role in the overall picture.</p><p>Let’s first explain what I mean by the two main topics I’m going to talk about here: <strong>Feature Branches</strong> and <strong>Language Migrations</strong>.</p><h4>Feature Branch</h4><p>Here I’m going to stick with Martin Fowler’s <a href="https://www.martinfowler.com/bliki/FeatureBranch.html">definition</a>:</p><blockquote>The basic idea of a feature branch is that when you start work on a feature (or UserStory if you prefer that term) you take a branch of the repository to work on that feature. In a DVCS, you’ll do this in your personal repository, but the same kind of thing works in a centralised VCS too.</blockquote><p>So in general it’s a branch that you create specifically to work on your feature without putting the changes directly into the main development branch. In effect, you isolate the development of your feature from everybody else’s work for as long as you are not yet done.</p><h4>Language Migration</h4><p>In the context of MPS, <em>language migration</em> refers specifically to migration scripts that are written by the language developer in the migration aspect of a language.</p><p>There are also two other reasons why MPS would run migrations on the project. One is refactoring logs, which are automatically generated by MPS when, for instance, a concept is moved into a different language.
While you can’t really influence the content of such auto-generated scripts, most of the suggestions in this post also apply to this kind of migration.</p><p>The second type is called <em>update module dependency versions. </em>It basically updates the transitive dependency versions in the language/solution file. This migration is always safe to execute; even if it causes a merge conflict, the MPS merge driver is pretty good at solving these automatically.</p><h3>DOs</h3><p>None of the following points are strict rules. They should be considered more like guidelines to keep in mind and apply when needed. Sometimes it makes sense to deviate from them, but deviating should be a deliberate decision.</p><h4>Update Dependencies Explicitly</h4><p>One main reason why migrations are required is that some upstream dependency has changed and the updated version contains migrations. When this happens on two branches and the migrations are executed <em>concurrently</em> on them, merging them will cause merge conflicts. This typically happens when using a build system that also manages your dependencies, like Gradle or Maven. An unintended update is avoidable by making the update of an upstream dependency an explicit step. If you are using Maven or Gradle and you specify your dependency by using a <a href="https://ant.apache.org/ivy/history/2.1.0/settings/version-matchers.html">version matcher</a> (for instance: 1.0+), this can cause your build to fetch a newer version of a dependency without you noticing. If there are multiple feature branches active at the same time, it can happen that these migrations are then executed on two branches. This is very likely, because developing multiple features in parallel is one of the reasons why feature branches are used in the first place.
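As an illustration, here is what the two styles of dependency declaration might look like in a hypothetical Gradle build file (the coordinates and versions are made up):

```groovy
dependencies {
    // Version matcher: any build may silently resolve to a newer
    // release of the language, pulling in new migrations as a side
    // effect of an unrelated build.
    implementation 'com.example:mylang:1.0.+'

    // Explicitly pinned version: updating the dependency (and running
    // the migrations it ships) becomes a deliberate, reviewable step.
    // implementation 'com.example:mylang:1.0.3'
}
```

Locking tools can give you the convenience of a matcher with the safety of a pin.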
When these changes are merged back, this will almost certainly cause merge conflicts because the migration will have made the same changes on two branches.</p><p>These conflicts are avoided by making the update an explicit step and not simply letting it happen during a <em>random</em> build on some developer’s machine. If you are using Gradle this is easy to handle. This <a href="https://github.com/nebula-plugins/gradle-dependency-lock-plugin">plugin</a> supports locking your dependencies against unintended updates during the build while still allowing you to use a version matcher. I won’t go into the details of how it works here, but it basically stores the exact version a version matcher has resolved to and uses that version until the user instructs it to update again.</p><h4>Write Resilient Migrations</h4><p>Writing resilient migrations applies more to language developers than to language users, but many projects use their own languages to dogfood their development.</p><p>What do I mean by <em>resilient migration</em>? <em>Resilient</em> means for me that the migration doesn’t entirely depend on the MPS infrastructure, most importantly the language version number, to determine which parts of the model require migration. This automatically results in migrations that are <em>rerunnable</em>. Of course this is not always possible, because in some situations it’s impossible to detect whether parts of a model need migration without relying on the <em>language version</em> that MPS stores alongside the model. Especially in cases where the migration is required to correct or change semantics without a related structural change of the meta model, this is impossible. But in many cases it is possible to detect this independently of MPS, and it also results in a better experience for the end user.</p><p>Let me give an example:</p><p>When adding a new child to an existing concept, the child should have a default value.
You might have a concept constructor or a node factory that correctly initialises this child for each new instance of the concept. Now you need a migration that sets the value to the default for all existing instances. This migration will get executed by MPS automatically for each <em>Module</em> (Solution or Language) that uses an older version of the language. In the implementation of the migration we could simply assume that all instances of our concept need a default value for the new child. Thinking like this is natural, because we assume that the instances are in their old state and we now need to update all of them. But it also means that we could never execute the migration again after it has run; it would simply overwrite all instances where the user chose a value different from the default. And let’s face it: the assumption that users will execute your migration right away when they first see the migration assistant popup isn’t always correct. Users will simply click cancel to delay migrations, perhaps play around with the update, and only later run the migrations. If, before the migrations are executed, the user has already created new instances of the concept and selected a value different from the default, the migration will overwrite them as well.</p><p>To prevent this problem, the migration can be written in a slightly different way which makes it <em>rerunnable</em> and thus also able to handle the situation where the user delays the execution. The migration could simply apply itself only to nodes where the newly created child is still empty. As there is no way for the user to construct a model where the child is not set, such a node must be one that still requires attention from the migration.</p><p>Don’t forget to mark the migration as rerunnable 😉.</p><h4>Migrate Before Merge</h4><p>Once we have <em>resilient migrations</em> we can make use of them.
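The difference between the naive and the rerunnable variant of the migration described in the previous section can be sketched in plain Java. This only illustrates the pattern; the node type and its API are made up, not the actual MPS migration DSL:

```java
import java.util.List;

// Hypothetical stand-in for a node of the migrated concept: the new
// child is 'value', which is null for instances created before the
// language change.
class ConceptNode {
    String value; // null = child not yet set

    ConceptNode(String value) { this.value = value; }
}

class Migration {
    static final String DEFAULT = "default";

    // Naive: assumes every instance is in its old state. Running this a
    // second time would overwrite values the user chose in the meantime.
    static void migrateNaive(List<ConceptNode> nodes) {
        for (ConceptNode n : nodes) {
            n.value = DEFAULT;
        }
    }

    // Rerunnable: only touches nodes where the child is still missing,
    // so executing it again (or late) is always safe.
    static void migrateRerunnable(List<ConceptNode> nodes) {
        for (ConceptNode n : nodes) {
            if (n.value == null) {
                n.value = DEFAULT;
            }
        }
    }
}
```

Running migrateRerunnable a second time changes nothing, and a value the user already picked is never overwritten, which is exactly the property that makes delayed execution safe.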
For instance, to prevent outdated models from showing up on the main development branch even when MPS could not automatically determine that migrations were required. Before a feature branch is merged, simply check whether anything requires migration by selecting the run rerunnable migrations entry from the migration menu. This will execute all migrations marked as <em>rerunnable</em>. Make sure you have committed all your changes before running this. Afterwards, commit the migration result as a single commit.</p><p>I wouldn’t recommend making this part of your normal feature-branch merging routine, but it can help in situations where migrations were executed on incoming changes, to ensure everything is up to date.</p><h4>Commit Migrations Separately</h4><p>When migrations are executed, put them into a separate commit that only contains the result of the migration. In the worst case, this allows the commit to be reverted or dropped from the history if necessary. It also allows easier review of the migration result.</p><h4>Short-Lived and Isolated Branches</h4><p>One key point aside from technical concerns in MPS is that feature branches should be as short-lived as reasonably possible. This keeps the branch isolated during the time it is <em>active</em>. Ideally you don’t need to merge your development branch into it while the feature branch is <em>active</em>. Of course this is not always possible, but reducing the number of merges also reduces the number of potential migrations required.</p><h4>Apply your Migrations before Merging</h4><p>If migrations have been written on a feature branch that affect parts of the project, these migrations should get executed as part of the merge process of the feature branch. This way you ensure that there are no migrations left to be executed on the main development branch.
Combined with the explicit update of dependencies, this is quite effective at preventing accidental migrations.</p><h3>DONTs</h3><p>The points below are common pitfalls that we have observed when working with migrations in MPS in general. They aren’t really specific to feature branches but are still worth considering.</p><h4>Migrate when Solving Merge Conflicts</h4><p>When solving merge conflicts, MPS will also pop up the migration dialog if it detects migrations that are executable. This often happens when languages that had a conflict are built from the IDE to verify that the conflict resolution has worked out, and the merged changes contained a new migration. It’s important <strong>not</strong> to apply this migration while merging. First solve the conflicts, commit the merge, and then apply the migrations. This way it’s easy to commit the migration result as a dedicated commit.</p><p>If you agree that running migrations during a merge is generally a bad idea, you should upvote an MPS feature request to prevent this <a href="https://youtrack.jetbrains.com/issue/MPS-27315">here</a>.</p><h4>Merge “master” and Migrate</h4><p>Merging your main development branch and applying the incoming migrations without thinking about it will sooner or later cause harm in your codebase, which can be a pain to fix. This might be one of the reasons that led to the initial statement that feature branches are dangerous because of migrations. Whenever migrations are going to be executed, think about whether it actually makes sense to execute them right away. MPS does not allow you to execute migrations selectively (yet). You might be in the middle of writing a migration yourself, in which case running the migrations from the incoming changes might also execute your (incomplete) migration.
In these cases, skip them and execute them when you think it’s a good point in time to do so.</p><h3>Take Away</h3><p>I hope I was able to show that language migrations in conjunction with feature branches aren’t something to be scared of. However, it is important to put deliberate thought into writing your migrations, as it will benefit the developer of the language, and even more the end user.</p><p>Languages aren’t much different from other dependencies in software development. Updates to dependencies should be done deliberately to prevent problems with uncoordinated updates. In order to avoid large changes that might require integration against lots of migrations, make sure you work in small increments and merge feature branches early and often.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=bbce593eee4d" width="1" height="1" alt=""><hr><p><a href="https://languageengineering.io/mps-feature-branches-language-migrations-dos-and-donts-bbce593eee4d">MPS, Feature Branches, Language Migrations- DOs and DONTs</a> was originally published in <a href="https://languageengineering.io">language engineering</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Categorising the Complexities in Programming]]></title>
            <link>https://languageengineering.io/categorising-the-complexities-in-programming-6f4df8e2e513?source=rss----20d877a4bb8d---4</link>
            <guid isPermaLink="false">https://medium.com/p/6f4df8e2e513</guid>
            <category><![CDATA[dsl]]></category>
            <category><![CDATA[end-user-programming]]></category>
            <category><![CDATA[complexity]]></category>
            <category><![CDATA[functional-programming]]></category>
            <dc:creator><![CDATA[Markus Voelter]]></dc:creator>
            <pubDate>Tue, 03 Apr 2018 07:48:41 GMT</pubDate>
            <atom:updated>2018-04-03T07:48:41.160Z</atom:updated>
<content:encoded><![CDATA[<h4>Trying to understand what makes programming hard.</h4><p>As you could tell from my <a href="https://medium.com/@markusvoelter/teaching-the-basics-of-programming-to-domain-experts-fa15a3f51b86">previous post on teaching the basics of programming</a>, I am struggling to understand how what I want domain experts to do with DSLs is different from what we all know as programming. Sure, the obvious difference is that one is domain-specific, whereas the other is not, and that the DSLs have more domain-specific concepts and notations. That does make things easier to learn and understand for domain experts.</p><p>But are there more fundamental differences? In the introduction to the <a href="https://markusvoelter.github.io/ProgrammingBasics/">Programming Basics tutorial</a> I wrote that I don’t want domain experts to have to build their own abstractions — they are expected to use <em>existing</em> ones. That’s of course not really true, because an insurance contract described with a DSL is an abstraction. But it’s not a <em>reusable </em>abstraction.</p><p>So, in this post I try to classify different approaches to programming according to the complexities, or challenges in learning, they incur. This is of course just a (hopefully reasonably argued) proposal; there are many other dimensions along which the space can be structured. An obvious one is size: the bigger a program gets, the harder it is to understand.</p><p>Please comment where you agree or disagree.</p><h3>Workbooks</h3><p>The simplest form of programming is what I call creating a workbook. Workbooks are characterised by the fact that the program logic and all relevant data are “on the same page”. The obvious example of this approach is spreadsheets; they are essentially two-dimensional workbooks where every value is addressable by its coordinates.
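As a rough sketch of this idea, a toy workbook can be modelled as a map from coordinates to values and formulas; every name in this snippet is hypothetical:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// A toy functional "workbook": cells hold either plain values or
// formulas over other cells, addressed by coordinates like "A1".
class Workbook {
    private final Map<String, Double> values = new HashMap<>();
    private final Map<String, Function<Workbook, Double>> formulas = new HashMap<>();

    void set(String cell, double value) { values.put(cell, value); }

    void setFormula(String cell, Function<Workbook, Double> formula) {
        formulas.put(cell, formula);
    }

    // Evaluation always reflects the current inputs: change a value and
    // every dependent formula "updates" on the next read, which is the
    // live, no-edit-compile-run feel of a spreadsheet.
    double get(String cell) {
        Function<Workbook, Double> f = formulas.get(cell);
        return f != null ? f.apply(this) : values.get(cell);
    }
}
```

With A1 = 2 and B1 defined as A1 * 10, reading B1 yields 20; change A1 to 3 and the same read yields 30. The state of the “program” is always just what is on the page.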
But there are other document-like interfaces, for example, <a href="http://reference.wolfram.com/language/tutorial/UsingANotebookInterface.html">Mathematica Notebooks</a> or <a href="https://ipython.org/notebook.html">IPython/Jupyter Notebooks</a>. They are essentially glorified REPLs where you alternate inputs and outputs, and you can refer to earlier outputs in downstream input expressions.</p><p>If your language is functional (which is true for spreadsheets and also most Mathematica Notebooks), then you can re-evaluate every input over and over again, and the outputs update accordingly. So the state of the program is right there on the page (in the cells or the outputs). This is really simple, because everything is right there on the page.</p><p>One could also imagine workbooks that organise their state as a tree, where you address parts of the state via path expressions.</p><p>An important ingredient of this style of “programming” is the fact that you see the output immediately (“live”). So there is no edit-compile-run cycle. Instead you change functionality or data, and the workbook immediately and automatically updates.</p><p>(A comment: I could argue why this style is very simple because it doesn’t have all kinds of complexities. However, I won’t, because the following sections introduce each of those complexities step by step — I don’t want to say everything twice.)</p><h3>Purely Functional Program</h3><p>So, is a workbook actually a program? I think one could argue either way. But the fact that all relevant data is <em>on the same page</em> (i.e., part of the “program”) makes it a kind of simplistic, degenerate form of a program (which is why I mentioned it first in this categorisation). A “real” program does not know all of the data it might work with at the time of writing the program.
This, of course, makes writing such a program harder: the programmer has to anticipate the possible input data and ensure that the program works correctly for all possible inputs. To express for which data the program is valid, the programmer uses typed arguments. I think creating things that can be parametrised, i.e., things that have some kind of interface, is where programming really starts.</p><p>More generally, there is this separation between writing the program and running it. This separation is not there in a workbook; a workbook always runs (or never, depending on your perspective). Which is why <em>using </em>Excel or <em>using</em> Mathematica is often not called programming.</p><p>In contrast, in a functional program, the programmer has to abstract from specific inputs. To ensure correctness for all possible inputs, a program must be tested. Here, the programmer supplies (exemplary) data that (hopefully) covers all corner cases of the program. Again, this requires imagining <em>how the program will run</em> once it is used. You don’t really have to test a workbook — you probably want to check the outputs for plausibility, but testing, in the sense of exploring the input space, is not necessary.</p><p>Very often, the person who writes the program (or function, or whatever the granularity is) will be a different person from the person who uses it. Again, this is not the case for creating a workbook.</p><p>Imagine structured workbooks, i.e., workbooks with a schema, where users can fill data into the “holes” of the schema. Like spreadsheets with predefined formulas and certain marked-up cells where users insert values. Then the act of <em>creating</em> such a schema is essentially functional programming.</p><h3>State and Time</h3><p>In both functional programs and (functional) workbooks, the complete computation can be represented as a single, immutable, never-changing state. In a linear notebook, the state is the outputs.
The inputs are essentially step-by-step transformations of the state. Similarly for a functional program: for any given execution (with a set of inputs), you can represent the program execution as a tree of intermediate results; again, the program “is”, it doesn’t “run”. You can debug a functional program through a post-mortem tree of intermediate results.</p><p>Enter effects. You get changeable data, i.e., variables in the classical sense. The distance between a program (as source text) and its execution grows. In fact, now you actually do <em>run</em> the program: running means that the values of variables change over time (as well as producing other effects). Representing all values (of variables) of a program execution becomes visually challenging; at the very least you need some kind of time-travel slider thingy. Practically, you now need an actual debugger or various other means of program animation.</p><p>Note that the notion of time I mention here concerns the execution of the program, not time as part of the program data. You can perfectly well write a functional program that performs computations on time values. We have at least two projects that deal with time as part of the domain.</p><h3>(Re-)Assembling Programs</h3><p>(The title of this one sucks, I know. I couldn’t find a better one). In the interest of avoiding duplication and “complexity”, we split a program into various parts: we decompose into functions that build on each other, we build hierarchical component structures, we separate concerns, we use extension/specialisation/inheritance to separate the more general from the more specific.</p><p>All of this clearly helps to avoid repetition, and it also makes code easier to test because the individual parts are smaller and better isolated through an interface.
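</p><p>As a tiny illustration of this decomposition (a Python sketch with invented names, not from the article): each small function has a clear interface and can be tested in isolation, using integer cents so the arithmetic stays exact.</p><pre>
```python
# Decomposing a price calculation into small, isolated, testable parts.
# Amounts are integer cents so the arithmetic stays exact.

def net_price(gross_cents, discount_percent):
    return gross_cents * (100 - discount_percent) // 100

def vat(amount_cents, rate_percent=19):
    return amount_cents * rate_percent // 100

def total(gross_cents, discount_percent):
    net = net_price(gross_cents, discount_percent)
    return net + vat(net)

# Each part is testable on its own, through its interface...
assert net_price(10000, 10) == 9000
assert vat(9000) == 1710
# ...but the overall behaviour emerges only from their assembly.
assert total(10000, 10) == 10710
```
</pre><p>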
The flipside, of course, is that it is now much harder to understand the overall behavior of a program, because you have to understand how all these previously separated things fit together and/or interact, intentionally or unintentionally.</p><p>It really is a conundrum, isn’t it? We can either keep everything in one place, then there’s no assembly, but the program is a mess, not reusable, and untestable. Or we can separate things, but then it is harder to understand the big picture and be sure that everything works together well. It is not a coincidence that a lot of innovation in programming languages is about solving exactly this challenge.</p><h3>Operational Concerns</h3><p>As software engineers, we don’t just care about implementing an algorithm. We want it to be fast, and scalable, and robust, and secure, and whatever else. It is obvious that this makes things harder, because we have to keep more concerns in our heads when writing the program: not just how data changes, but also how long it takes to change, how this time becomes longer as we add more or bigger inputs, how much memory it will use, what other, unintended (and potentially exploitable) behaviours the program might have, how to recover from (partial) failures, and so on.</p><p>And when trying to write a program this way, we usually have to change the original algorithm. Think about sorting: sorting a list naively is easy to do, and a program that does that is easy to understand. Once you make it fast using quicksort, it’s a different story. It takes a while to understand the algorithm, and when looking at a quicksort implementation, it’s not at all obvious that this program sorts a list!</p><p>So if you want to write code that takes these non-functional concerns into account, this adds another dimension of complexity that was not there before.</p><h3>Building Abstractions</h3><p>In some sense, every function is an abstraction.
So building abstractions already starts when one writes a functional program. And in fact, this is why functional programming is harder than creating a workbook. But as we all know, building “real” abstractions starts with libraries and frameworks (and, ultimately, languages of course). The reason why this is hard is that you have to abstract to a much higher degree than when writing a plain function.</p><p>And often, abstractions are stacked: you build an abstraction and then you use it to build further abstractions. And you potentially use more and more advanced features of your programming language (think: higher-order functions, generic types, or type reflection). Obviously, this is a whole order of magnitude more complex than what we have discussed so far. Because of the much larger input space, testing also becomes much harder.</p><h3>So, what can we learn for DSLs?</h3><p>Of course I have to bring this around to DSLs, right :-) ? Let us see what we can learn from this categorisation for the design of DSLs. And when I say DSLs, I mean DSLs targeted at domain-experts, not DSLs that programmers use to optimise matrix computations.</p><p><strong>Workbooks:</strong> For the vast majority of DSLs, at least those that we build, the workbook style is not enough. It is necessary that the program will run with different sets of input data. But maybe we could provide a workbook-style interface for testing? Users could define functions and other abstractions and directly test them in the workbook. But then these functions can also be run separately. And we should definitely make sure that programs can be run immediately, at least on example/test data. And there is clearly a use for a Workbook-style DSL for engineering, as Mathematica proves.</p><p><strong>Functional:</strong> If at all possible, we should design our DSLs in a way where the domain expert only writes functional code.
That code calculates values, makes decisions, or, in order to manage state, returns deltas or commands. This functional code is driven by an engine that handles state. Many DSLs fit very well into this schema.</p><p><strong>State and Time:</strong> If our domain requires stateful programs, then make sure that the tool support makes the passage of time explicit. Build animators that show how values change over time, build UIs that simulate the resulting program with which users can play, maybe use an immutable persistent data structure to represent the complete state and allow users to go back and forth on the time axis.</p><p><strong>(Re-)Assembling Programs:</strong> This one is hard. We really got stuck a number of times with domain users when we tried to explain inheritance or, worse, aspects or something. One approach that has worked was to build tool support that allows users to see the assembled version of the separated program. And immediate execution and animation also help. But only to a point. In the end, the modularization/re-assembly conundrum limits the size and complexity of the kinds of DSLs and/or programs you can expect domain experts to be able to deal with, at least without significant training.</p><p><strong>Operational Concerns:</strong> Separate them out. Your domain experts should never see any of this in their programs. Of course this is easier said than done, because, for example for performance, you have to build optimising code generators. Or for scalability, you might have to execute the (generated) program reactively or incrementally, so you re-run as little as possible when some of the inputs change. All of this might have an impact on the language design: the programs have to be written to, for example, support reactive execution.</p><p><strong>Building Abstractions:</strong> Well, just don’t :-) Put the abstractions relevant to the domain into the language as first-class citizens.
And when new abstractions are needed, evolve or extend the DSL. Or at the very least, separate library programmers from library users.</p><p>And a note on Excel: we often complain about it being more of a problem than a solution. The reason for this is not that it is a spreadsheet/workbook. The problem is that it is general-purpose. There is no actual DSL involved. And very often, spreadsheets are intended to have a schema (predefined, read-only formulas, plus well-defined places where people can enter values), but Excel (and its users) aren’t good at enforcing one. This is why the use of Excel &amp; Co often ends up in chaos.</p><h3>Conclusions</h3><p>If your DSL represents the domain well, then steps one and two (Workbook and Functional Programs) are very much feasible to master for most domain experts I have worked with, especially if you educate and train them a bit (for example, <a href="https://markusvoelter.github.io/ProgrammingBasics/">with this tutorial</a> :-)). If it is a good fit, and supported by tools, State and Time also works. But everything beyond this generally belongs to software engineers, and not into DSLs targeted at end users.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=6f4df8e2e513" width="1" height="1" alt=""><hr><p><a href="https://languageengineering.io/categorising-the-complexities-in-programming-6f4df8e2e513">Categorising the Complexities in Programming</a> was originally published in <a href="https://languageengineering.io">language engineering</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[The Philosophy behind Language Engineering with MPS]]></title>
            <link>https://languageengineering.io/the-philosophy-behind-language-engineering-with-mps-9e9c48d8e15b?source=rss----20d877a4bb8d---4</link>
            <guid isPermaLink="false">https://medium.com/p/9e9c48d8e15b</guid>
            <category><![CDATA[dsl]]></category>
            <category><![CDATA[projectional-editing]]></category>
            <category><![CDATA[pm]]></category>
            <category><![CDATA[programming]]></category>
            <dc:creator><![CDATA[Markus Voelter]]></dc:creator>
            <pubDate>Fri, 16 Feb 2018 07:07:27 GMT</pubDate>
            <atom:updated>2018-02-17T18:24:48.594Z</atom:updated>
<content:encoded><![CDATA[<h3>Introduction</h3><p>Over the last few years, we have built a lot of interesting, quite different DSLs with <a href="http://jetbrains.com/mps/">MPS</a>. In this post, I explain the thinking behind language engineering with MPS, and why the languages one typically builds are different from languages built with other tools.</p><p>Sure, it is possible to use MPS to define programming languages that work like any other one: a relatively small set of language constructs designed for letting the user define their own abstractions plus a large standard library on which users can build. For example, MPS ships with an implementation of Java (called BaseLanguage) which is essentially unchanged from regular Java. The whole JDK is available for users to build on. While some extensions are available, users can do regular Java programming with MPS’ BaseLanguage.</p><p>However, when exploiting MPS’ unique characteristics, the resulting languages look very different.</p><h3>Differences to “normal” Language Engineering</h3><h4>Syntactic Forms</h4><p>Because of MPS’ projectional editor, it is possible to use a wide range of notations (see figure below). Direct support exists for structured and unstructured text, tables, box-and-line diagrams and math. But it is also possible to define completely custom notations that do not fit any of these paradigms. The notations can also be mixed (nesting one in another, using them next to each other in the same “file”). Since MPS is unique in this respect among industry-strength language workbenches, it is not uncommon that MPS is specifically selected for a language because of this feature.
However, even a language that is fundamentally textual, like <a href="http://voelter.de/data/pub/kernelf-reference.pdf">KernelF</a>, exploits decision tables and trees, and has an extension for math syntax.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1023/1*gdoybAXG6g1UK5bg6LfEPQ.png" /></figure><h4>Language Modules instead of Libraries</h4><p>In general-purpose programming languages, new abstractions are provided through libraries (and frameworks, which we consider a form of library in this article), developed with the language itself. This is possible because GPLs are built for defining abstractions. However, as a means of providing new abstractions for programmers, libraries are limited in the sense that they cannot extend the language syntax, type system and IDE support (this is slightly over-generalized: depending on the language and its meta programming facilities, new abstractions can provide their own syntax, type system and IDE support; generally, however, the statement is true).</p><p>In idiomatic use of MPS, additional abstractions are provided through language extensions, defined outside the language, using MPS’ language definition facilities. A language extension can be seen as a library plus syntax, type system and IDE support (and a semantics definition via an interpreter or generator). The structure definition of languages is object-oriented, and many of the design patterns relevant for libraries and frameworks can also be found in MPS languages (examples include the Adapter/Bridge/Strategy patterns or the separation of the construction of a data structure from its subsequent interpretation or execution). This approach fits extremely well with DSLs, which, because of their purpose and target audience, often do not come with sophisticated means of <em>building</em> custom abstractions.</p><p>One very nice feature of libraries is that, in general, they can be composed.
For example, you can use the collections from the Java standard library together with the <a href="http://www.joda.org/joda-time/">Joda Time</a> library for date and time handling and the <a href="http://spring.io">Spring</a> framework for developing server-side applications. There is no need to explicitly combine the frameworks; the combination “just works”. While this composability is not true for language composition in general (primarily because of syntactic ambiguities), it is true with MPS: for all intents and purposes, language extensions can be composed modularly, just like libraries. The composition also has the same limitations: one cannot statically prove that it will work. And the set of libraries/language extensions might not fit well in terms of their style. However, if language extensions are developed in a coordinated, but still modular way, as a stack of extensions, these limitations do not apply. <a href="http://mbeddr.com/">mbeddr</a> is a very comprehensive example of this approach.</p><p>To illustrate the library vs. language extension point, I provide two examples. The first one concerns the collections in <a href="http://voelter.de/data/pub/kernelf-reference.pdf">KernelF</a>. Consider the following code:</p><pre>// type inferred to list&lt;int&gt;<br><strong>val</strong> l1 = list(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)</pre><pre>// type inferred to list&lt;real&gt;; results in type error for l2<br><strong>val</strong> l2: <strong>list</strong>&lt;<strong>int</strong>&gt; = l1.<strong>where</strong>(|it &gt; 5|).<strong>select</strong>(|it / 2|)</pre><p>As you can see, the collections are generic: the list type carries the type of its elements, either explicitly specified (l2) or inferred (l1). However, KernelF does not generally support generic types.
For example, users <em>cannot</em> write the following:</p><pre><strong>fun</strong>&lt;<strong>type</strong> T1, <strong>type</strong> T2&gt; typedPair(v1: T1, v2: T2): [T1, T2] = [v1, v2]</pre><p>Generics are not generally necessary for DSLs. In fact, their exposure to the user will often be confusing, and it will make the job of the language extender harder, because they have to take generics into account for all extensions. However, for collections, an explicit specification of their element type is useful and intuitive. This is why the language extension for collections supports it. In the list example above you can also see the where and select operators; they are also language extensions, available on list types. These could have been implemented with extension functions in a standard library. However, because they have to work with the collections’ type parameters and because they use a particular kind of type inference for the it argument, not generally supported by KernelF, these are also built using language extension.</p><p>As a second example, take a look at the state machine example below. State machines come with a rich syntax, specific type checks, and dedicated IDE support. In the future, model checking will be available.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/523/1*iHm_N23sqyCjUJUg-F8ktw.png" /></figure><p>The second example is probably more convincing to you; it is hard to imagine how the state machines could be implemented as a library, even in a language with meaningful meta programming facilities. For the collections and their operations a language with a more powerful type system could provide them as a library with the same end-user visible features. However, as mentioned above, it would lead to complications in the end-user experience in other places and for the language extender.
This is why those have been implemented as language extensions as well.</p><p>Because of the ease of developing languages in a modular way, we try to separate generally useful KernelF extensions from actual customer-specific extensions when we run projects; the generally useful parts become a customer-independent KernelF extension — if you will, the equivalent of a standard library, but as languages.</p><p>The last point of comparison between libraries and language extensions is the effort to create them. For an experienced MPS developer, the development of a language extension is not significantly more effort than writing a library. In addition, because language development and language use in MPS happens in the same environment, turn-around time is very quick, supporting iterative, example-driven language development, just as if you develop a library together with representative examples of its use.</p><h4>More First-class Concepts</h4><p>As a consequence of the heavier reliance on language extensions, a (stack of) MPS language(s) will typically be more keyword-heavy than non-MPS languages. While this may offend the sense of style of some developers, this has two distinct advantages.</p><p>First, because more concepts are first-class, the IDE can know the semantics of those concepts and provide better support in terms of analyses. This, in turn, can be used to create meaningful error messages that align with the particular semantics of an extension. For example, in state machines, if the user creates a transition to the start state (assuming scoping allows this in the first place), an error message could read <em>Start states cannot be used as the target of a transition</em>, and in smaller font, below, <em>Start states are pseudo states that are only used internally during startup of the machine</em>.
In a library-based solution, or one that relies on meta programming, very likely this problem cannot be determined statically at all, and would lead to a runtime error. Alternatively, the error message would perhaps be much more generic, as in <em>Type StartState is not a subtype of State</em> or something, which is also not very helpful to the end user.</p><p>Second, the language is easier to explore, primarily because code completion has more sensible things to show. In a minimal language like Scheme, the code completion menu essentially offers only the basic syntactic forms such as atoms, lists or functions, plus calls to existing functions. This makes it harder for the user to explore the things they can do with a language.</p><h4>Focus on Evolution</h4><p>Because languages and their extensions contain comparatively many first-class concepts, and many reflect a business domain that evolves, the languages we build with MPS also evolve quickly. Evolution in this context can mean one of two things. First, we may build additional languages on top of a core language, while keeping the core language stable; we grow the stack of languages into one or more domains.</p><p>The other notion of evolution is the actual invasive evolution of the language itself (to make it concrete: you’ll ship a new version of kernelf.jar, whereas in the extension case above you ship additional jars that rely on an unchanged kernelf.jar). If the new version is compatible with the previous version, this case is simple: just deploy the new version of the language, and users now have more features. If the new version is not backward compatible, then existing programs become invalid. For this case, MPS supports explicit language versioning. When the language developer makes a breaking change to a language, they increase the version counter and provide a migration script.
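</p><p>To sketch the idea behind such a migration script (illustrative Python, not the actual MPS migration API; the renamed property is an invented breaking change): the language carries a version number, and opening an older model runs the pending scripts in order.</p><pre>
```python
# Hypothetical sketch of language versioning with migration scripts.
LANGUAGE_VERSION = 2

def migrate_v1_to_v2(node):
    """v2 renamed the property 'name' to 'label' (invented breaking change)."""
    if "name" in node:
        node["label"] = node.pop("name")
    for child in node.get("children", []):
        migrate_v1_to_v2(child)
    return node

MIGRATIONS = {1: migrate_v1_to_v2}  # maps from-version to migration script

def open_model(model):
    # Run all pending migrations until the model matches the current version.
    while model["version"] < LANGUAGE_VERSION:
        model["root"] = MIGRATIONS[model["version"]](model["root"])
        model["version"] += 1
    return model

m = open_model({"version": 1, "root": {"name": "Machine", "children": []}})
# m["root"] == {"label": "Machine", "children": []}
```
</pre><p>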
When language users open existing models after the new version has been deployed into their IDE, the scripts run automatically, bringing the model up-to-date. If no algorithmic migration is feasible (because the user has to make a semantic decision not previously necessary), the recommended approach is to keep the old construct around, deprecate it, and output an error message that tells the user that they have to make a decision and migrate.</p><p>Note how this is a much more robust infrastructure for dealing with program migration than what is possible with libraries: there, an incompatible change prompts a generic error from the type checker or compiler, and automatic program migration is not available (outside of experimental systems). All in all, iterative development of languages is quite feasible, even when taking into account models that are “in the wild” with language users.</p><h4>Recasting IDE Tools as Languages</h4><p>Traditional programming systems consist of the language and libraries, the compiler and type checker, and an IDE. Many added-value services, for example, those for program understanding, testing, and debugging, are part of the IDE; more specifically, they rely on tool windows and other service-specific UI elements (buttons, menus, etc.). Because of MPS’ flexibility in how editors can be defined, we use languages and language extensions for things that would be tool windows or other IDE add-ons in classical languages and IDEs. Examples include the REPL and its rendering of structured values (1 in the picture below), the overlay of variable values over the program code during debugging (3), test coverage and other assessment results, generated test vectors and their validity state (2) and the diffing of mutated programs vs.
their original in the context of mutation testing (4).</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*p2JmcaDhoks_A5sEyC8enA.png" /></figure><p>As a consequence, the notion of what constitutes a language is much broader in MPS, compared to the traditional understanding. A side-effect of this approach is that the chrome of the development environment — the set of windows, tabs, buttons, menus and such — can be reduced, because “everything happens in the editor, through typing, code completion and intentions”. Since a too cluttered tool UI is among the most-heard complaints among our users, we consider this side-effect an advantage.</p><h4>More Reliance on the IDE</h4><p>There is no standard for the implementation of languages, which means that, once a language is implemented with one particular language workbench, it cannot be ported to another language workbench (unless one implements it completely from scratch). This is all the more true for MPS, which, because of its particular style of language implementation, is unique among language workbenches. Specifically because of its projectional editor, MPS languages cannot be used outside of the MPS tool. While this can be seen as a drawback, the flip side is that one can assume the IDE to always be present, and the language can be designed assuming the IDE and its services. A few examples:</p><ul><li>Different projection modes: Instead of making a design decision on which level of detail should be used for function signatures, the user can switch (see figure below).
This is useful because users with different levels of proficiency will prefer different styles: the newbie prefers the explicitly listed types, and once one gets more proficient, one appreciates the conciseness of alternative (C).</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*lVvrVaw_-Ik6PoudM3XYmg.png" /></figure><ul><li>Read-only editor contents: In many DSLs we use them to create a more form-style editor experience, with non-editable labels. In mbeddr, when a component implements an operation defined in an interface, we use a read-only projection of the operation’s signature in the implementation.</li><li>Intentions: These little in-place transformations of the program are available from a drop-down menu activated with Alt-Enter (aka Quick Fixes in Eclipse). In some languages, especially non-textual ones, these are the only way to access certain constructs — you can’t just type them. Many examples of this can be found in those languages that recast traditional IDE services (see previous paragraph). While relying on intentions might be unintuitive for text-focused programmers, we teach our users to consider the intentions menu to be an integral part of the editor experience.</li></ul><h3>Summary</h3><p>Domain-specific languages in general, and our approach in particular, are a hybrid between modeling and software language engineering. From modeling we borrow declarativeness and high-level, domain specific concepts; multiple integrated languages; meta modeling for defining the structure of languages (named properties and links, inheritance, actual references); notational freedom, and in particular, diagrams.
From the field of software language engineering we adopt a focus on behavior and integration of fine-grained aspects, such as expressions; actual type checking and not just constraint checks; powerful, productivity-focused IDEs; and textual languages.</p><p>We like to think that the approach combines the best of these two worlds and leads to convincing outcomes.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=9e9c48d8e15b" width="1" height="1" alt=""><hr><p><a href="https://languageengineering.io/the-philosophy-behind-language-engineering-with-mps-9e9c48d8e15b">The Philosophy behind Language Engineering with MPS</a> was originally published in <a href="https://languageengineering.io">language engineering</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[An Argument for the Isolation of “Fachlichkeit”]]></title>
            <link>https://languageengineering.io/an-argument-for-the-isolation-of-fachlichkeit-3a67a939d23b?source=rss----20d877a4bb8d---4</link>
            <guid isPermaLink="false">https://medium.com/p/3a67a939d23b</guid>
            <category><![CDATA[modeling]]></category>
            <category><![CDATA[software-architecture]]></category>
            <category><![CDATA[fachlichkeit]]></category>
            <category><![CDATA[dsl]]></category>
            <category><![CDATA[domain-expert]]></category>
            <dc:creator><![CDATA[Markus Voelter]]></dc:creator>
            <pubDate>Tue, 30 Jan 2018 11:18:02 GMT</pubDate>
            <atom:updated>2018-02-03T07:46:20.976Z</atom:updated>
<content:encoded><![CDATA[<p>Yes, dear reader, this is a German word. I wasn’t able to find an English term that captures the exact meaning, so bear with me. And it wouldn’t be the first German word to make it into English either: Kindergarten, Schadenfreude, Doppelgänger, anyone? The exact pronunciation (‘faχlɪçkaɪ̯t) might be a bit of a challenge for an English-speaker, because the German “ch” sound doesn’t exist in English — and its two occurrences are even pronounced slightly differently (you can hear one of them <a href="https://en.wikipedia.org/wiki/Voiceless_uvular_fricative">here</a>). A close approximation is “fahk-lick-kite”; if you are Howard Carpendale, you’ll pronounce the middle syllable as “lish” :-)</p><h3><strong>What is Fachlichkeit</strong></h3><p>Fachlichkeit, in the context of software systems, is the core of the functionality of a software system, as far as the domain is concerned. It is usually what makes the software system valuable to an organization. It embodies the collected expertise of a company; the stuff you don’t buy from consultants. Fachlichkeit is not necessarily contributed to a system by software engineers, but by experts in your domain. To illustrate, let me give you a couple of examples:</p><ul><li>In a system that <strong>creates monthly salary and wage statements</strong> for employees, the Fachlichkeit is all the rules that determine what counts as work time, as well as all the laws that drive the deductions and benefits that apply to the employee’s gross salary. Those are the expertise of employment law experts and tax lawyers.</li><li>In a <strong>medical application</strong> that helps patients deal with the side effects of treatments, the Fachlichkeit is a set of data structures, algorithms, decision procedures and correctness criteria.
They are maintained by doctors and other healthcare professionals.</li><li>In a <strong>tachograph</strong>, the device that monitors driving and break periods in trucks, the rules that govern when a driver has to take a break, and for how long, depending on the driving history over days and weeks, are the core Fachlichkeit of the system.</li><li>In an <strong>observation planning system</strong> for a radio telescope, the Fachlichkeit is the parameters needed to perform a successful observation of a particular spot in the sky, including positioning, focus, filtering and image processing. These are specified by astronomers.</li></ul><p>The Fachlichkeit does not care about software concerns at all. Non-functional (aka operational) requirements such as scalability, security, or timing are simply not relevant to the Fachlichkeit. Of course they are just as crucial for the final software system, but they are different concerns. I will return to this later.</p><p>Many established terms are close, but not identical. <em>Functional requirements</em> are broader; for example, how a particular REST API is designed is a functional requirement, but it is not fachlich. <em>Business logic</em> is closely related, but a technical system such as the telescope control system mentioned above would never use this term. For our German readers, there’s a word Fachkonzept (no, this is not one I want to inject into English :-)), but it is more than the Fachlichkeit, because it usually also includes processes and organizational changes induced by a (to-be-developed) software system.</p><h3><strong>Why do we care?</strong></h3><p>We care because it is absolutely crucial to separate Fachlichkeit from the rest of a software system. There are two primary reasons for this. First, as I mentioned above, the experts on a business’s Fachlichkeit are usually not the software engineers.
If you bury the Fachlichkeit in the software, expressed in a programming language, it is effectively inaccessible to the people who care most. They are forced to write requirements documents. This, as we know, is problematic: they are usually not rigorous, so they can’t be checked by tools, they can’t be tested automatically, they cannot be executed. And they become stale as soon as the “real truth” in the software evolves. Requirements documents are like undead zombies: you can’t quite kill them because they are the best you have, but they also aren’t alive.</p><p>The other reason for separating Fachlichkeit from the implementation in software is that the two have a completely different lifecycle. How many times have you heard the story that a company had to “reverse-understand” and then reimplement the Fachlichkeit because they were forced to use a new platform? Mainframe to Java to web to mobile to what-is-next? At the same time the domain experts feel slowed down when they have to involve the software guys all the time when the Fachlichkeit inevitably changes. I would argue that the horror of legacy systems isn’t really the fact that you have to move from Cobol to whatever, but that you have to reverse-engineer all the Fachlichkeit that’s tangled up in the Cobol thicket in order to bury it again, this time in today’s favourite programming language. Let’s not allow it to get tangled up in the first place!</p><p>Software engineers very much value the notion of <a href="https://en.wikipedia.org/wiki/Separation_of_concerns">separation of concerns</a>. We separate structure from layout in HTML/CSS, we separate transactions, security and scaling from the core behavior in JEE app servers, and in protocol stacks, we separate physical transport from logical request/response. As a community, we have invented all kinds of mechanisms to achieve separation of concerns, from layers to frameworks to aspect-oriented programming (another zombie).
IMHO, Fachlichkeit is the most important concern to separate.</p><h3>How do you do it?</h3><p>From what I said above we can extract the following constraints for any approach to isolate Fachlichkeit: it can’t be code because that’s inaccessible to domain experts. And it can’t be just documents, because they are dead, they can’t be checked for consistency and they can’t be tested.</p><p>In my opinion the only way to approach this is to create some kind of model of the Fachlichkeit. The model must be rigorous enough to avoid ambiguity. It must be formal enough so it can be checked for consistency by tools, at least to some degree. And it must be executable so that domain experts can write tests to verify that it works correctly. Finally, it must be represented in a way that is accessible to domain experts, for example, by using concepts and notations already established in the domain. If you have all these properties, the models are also suitable for code generation or for execution in an interpreter, which gives you the connection to the actual software implementation, in a way that is still decoupled: you just write a new generator or interpreter if your platform changes.</p><p>You will not be surprised that this sounds a lot like using domain-specific languages. However, if you agree with the premises of what I am writing here, I really don’t see another solution. I am genuinely curious whether you can suggest a different approach.</p><p>Just for the fun of it, let’s look at some candidates. Remember analysis models and business analysis and analysis patterns? They aim at some degree of rigor and structure, but they are neither executable nor testable. And UML, which is what was used primarily, isn’t good enough. Ever tried to express insurance contract calculations with UML diagrams? Doesn’t work.</p><p>Using controlled natural language for requirements? This can help. It makes requirements more precise, and removes ambiguity. 
But again: no execution, no testing, no code generation. So it falls short quite a bit. A related approach is to use math, logic, decision tables and other well-defined formalisms to describe behaviors; I’ve seen it for medical algorithms and nuclear reactor shutdown procedures. Good idea. They are potentially checkable and executable. But if you write them down in Word (as was the case in both examples), you get stuck at “potentially”. And if you back it up with a language definition and an IDE … well, then you’re effectively building a DSL.</p><p>Domain-Driven Design? Good and bad, in my opinion. The ubiquitous language is a good idea, it helps establish a common vocabulary in a domain. It can be a milestone on the way to a DSL. DDD also has several good architectural ideas, such as the anti-corruption layer. But in terms of separating Fachlichkeit from the rest, it doesn’t go far enough, because it still buries it in implementation code (admittedly not as deep as if you did “normal” programming).</p><p>What else except a DSL? I genuinely don’t know.</p><h3><strong>What else goes into those models?</strong></h3><p>It is not 100% true that it is only the pure Fachlichkeit that goes into these models. There are three very typical additional concerns that go there (with a few additional ones in particular systems). The first one is tests. You can only be sure of the correctness of your Fachlichkeit if you test it. In this respect it is not different from programming. Second, Fachlichkeit usually evolves over time. You have to manage this somehow. For example, you can use version control systems to track changes over time, or you can model temporality explicitly. For example, in the salary example above, the evolution of the law is represented in the models so you can re-run “old” calculations with the then-applicable laws. Third, variability should be captured. 
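</p><p>To make the temporality and variability concerns concrete, here is a deliberately tiny sketch (illustrative Python; all names, numbers and rules are invented and are not actual regulations): rule versions carry a validity date, which captures temporality, and are keyed by country, which captures variability, so “old” calculations can be re-run with the then-applicable rules.</p>

```python
from datetime import date

# Invented example rules -- NOT actual tachograph regulations.
# Each country maps to a list of (valid_from, rule) versions, sorted by date.
BREAK_RULES = {
    "DE": [
        (date(2000, 1, 1), {"max_driving_minutes": 270}),
        (date(2010, 1, 1), {"max_driving_minutes": 240}),
    ],
    "FR": [
        (date(2000, 1, 1), {"max_driving_minutes": 270}),
    ],
}

def applicable_rule(country: str, on: date) -> dict:
    """Select the rule version in force for a country on a given date."""
    current = None
    for valid_from, rule in BREAK_RULES[country]:
        if valid_from <= on:
            current = rule
    if current is None:
        raise ValueError(f"no rule for {country} on {on}")
    return current

def break_required(country: str, on: date, driving_minutes: int) -> bool:
    return driving_minutes >= applicable_rule(country, on)["max_driving_minutes"]

# Re-running an "old" calculation uses the rule that applied back then:
assert break_required("DE", date(2005, 6, 1), 250) is False
assert break_required("DE", date(2015, 6, 1), 250) is True
```

<p>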
In the tachograph example, the rules that govern break times are similar, but not identical, in different European countries. Expressing these differences in a way that lets domain experts keep the emerging complexity in check requires some thinking and has an influence on the language that gets used.</p><h3><strong>Drawbacks?</strong></h3><p>There’s no free lunch, right? So what are the drawbacks of this approach? The obvious one is that you have to define a suitable language. This requires certain skills that might not be available in your organization. And it’s also just effort. Regarding skills, well, companies buy consulting services to acquire all kinds of skills, so why not for this arguably central one? Regarding effort: the language definition comes in two parts. One is the systematic understanding of your domain so you can build the language. This part of the effort really isn’t a waste at all: it should be done in any case. The second part concerns the language implementation in a language workbench. Here, the effort is probably much lower than what you might expect by extrapolating from your compiler construction course at university, because modern tools have reduced the necessary effort a lot! But yes, developing the language, and maintaining it over the years, is additional effort.</p><p>The other challenge is the cultural and organizational change that goes along with adopting the approach. The domain experts have to get used to increased structure and rigor. Even though good language design and good tool support can go a long way, this typically requires training. And this is a hard sell in most organizations these days, unfortunately. It can also be a change for the technical guys in a company, because of their now stricter focus on core technical issues, and their involvement in language and generator implementation. However, once people are over the change itself, the clearer focus is usually appreciated by both technical and domain people. 
And: while it sounds like I’m trying to tear the two groups further apart because of the clearer separation of responsibilities, it actually improves collaboration because it removes the inefficiencies and sometimes real conflicts around incomplete and inconsistent requirements documents.</p><h3>Wrap up</h3><p>So is this worth doing? It depends. It depends on how well your domain is suited for the approach; not all are. It depends on how you value a clean, lasting and unambiguous specification of your core business expertise. Which in turn depends on how long you plan to be in business. It also depends on the complexity of what your system does: the more complex, the more evolution, the more variability, the more useful the approach becomes. My own experience, though, speaks clearly in its favour!</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=3a67a939d23b" width="1" height="1" alt=""><hr><p><a href="https://languageengineering.io/an-argument-for-the-isolation-of-fachlichkeit-3a67a939d23b">An Argument for the Isolation of “Fachlichkeit”</a> was originally published in <a href="https://languageengineering.io">language engineering</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[A Smart Contract Development Stack, Part II: Game Theoretical Aspects]]></title>
            <link>https://languageengineering.io/a-smart-contract-development-stack-part-ii-game-theoretical-aspects-ca7a9d2e548d?source=rss----20d877a4bb8d---4</link>
            <guid isPermaLink="false">https://medium.com/p/ca7a9d2e548d</guid>
            <category><![CDATA[correctness]]></category>
            <category><![CDATA[game-theory]]></category>
            <category><![CDATA[safety]]></category>
            <category><![CDATA[blockchain]]></category>
            <category><![CDATA[dsl]]></category>
            <dc:creator><![CDATA[Markus Voelter]]></dc:creator>
            <pubDate>Wed, 13 Dec 2017 12:30:31 GMT</pubDate>
            <atom:updated>2017-12-13T12:30:30.686Z</atom:updated>
            <content:encoded><![CDATA[<p>Game theory is “the study of mathematical models of conflict and cooperation between intelligent rational decision-makers” [<a href="https://en.wikipedia.org/wiki/Game_theory">Wikipedia</a>]. In particular, it looks at how rules in cooperative processes (“games”) impact the outcome, and also how the parties taking part in the game can cheat, i.e., exploit the rules for their own benefit.</p><p>In my <a href="https://languageengineering.io/a-smart-contract-development-stack-54533a3a503a">previous post</a> I sketched some ideas of how smart contracts can be expressed declaratively to avoid low-level mistakes, and also emphasised that techniques from functional safety, such as <a href="https://en.wikipedia.org/wiki/Model_checking">model checking</a>, can be used to ensure properties of (state-based) contracts. However, as Björn Engelmann pointed out to me, many (smart) contracts are, by definition, cooperative processes, which is why they are susceptible to “game-theoretical” exploits. In this post I will mention a few, and show some ideas for how they can be remedied.</p><h3>A First Example: Sybil Attacks</h3><p>A <a href="https://en.wikipedia.org/wiki/Sybil_attack">sybil attack</a> is one where a reputation-based system is subverted by one (real-world) party creating loads of fake (logical) identities who then behave in accordance with the real-world party’s goals. For example, consider a decision that is based on majority vote. An attacker could create lots of additional parties and thereby take over the majority, leading to a decision in the interest of the attacker. This attack only works, of course, if it is at all possible to include additional parties in the set of decision makers, but this kind of dynamic group is certainly not rare. So what can be done to prevent it?</p><h4>Vote in</h4><p>One thing that can be done is to require that new parties cannot just join, but have to be voted in by the existing members. 
It would then be their task to ensure, through means outside the contract, that a join request is valid in the sense that it does not originate from a malicious real-world entity. The last example in my <a href="https://languageengineering.io/a-smart-contract-development-stack-54533a3a503a">previous post</a> showed this process. However, ensuring this validity can be a lot of effort, risking that it isn’t done consistently. It can also be exploited by the existing decision makers to keep unwanted additional parties out, reducing the openness of the overall process (if that openness is required in the first place).</p><h4>Rate Limitations</h4><p>Another approach to reducing the risk from Sybil attacks is to limit the rate at which new parties can request to join the process. To support this declaratively, I have implemented a facility by which the rate of commands that come into an interactor (read: the events that come into a state machine) can be limited. The following code expresses that while the machine is in state requesting, only three commands per second are allowed (obviously a value that is useful only for testing). If more requests come in, they are rejected.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/469/1*Wd61XJyMz4wpnbgI2U4VVA.png" /></figure><h3>The Interceptor Framework</h3><p>The rate keyword shown above is an example of an interceptor. Interceptors are associated with a state (or with a parent state in a hierarchical state machine). Every trigger or variable access that enters the machine is first passed to all applicable interceptors (they are collected by walking up the state hierarchy). 
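</p><p>A possible reading of such a rate interceptor as plain code (a hypothetical Python sketch under assumed semantics; this is not the actual KernelF/MPS implementation):</p>

```python
import time

# Illustrative sketch of a rate interceptor: at most `max_per_second`
# commands are let through per sliding one-second window; the rest are
# blocked. The interceptor keeps its own state (the recent timestamps).
class RateInterceptor:
    def __init__(self, max_per_second: int, clock=time.monotonic):
        self.max_per_second = max_per_second
        self.clock = clock          # injectable, so tests can fake time
        self.timestamps = []

    def intercept(self, command):
        now = self.clock()
        # keep only the requests from the last second
        self.timestamps = [t for t in self.timestamps if now - t < 1.0]
        if len(self.timestamps) >= self.max_per_second:
            return None             # block the request completely
        self.timestamps.append(now)
        return command              # let the request through unchanged

# With a fake clock, the fourth request within one second is rejected:
fake_now = [0.0]
r = RateInterceptor(3, clock=lambda: fake_now[0])
results = [r.intercept("join") for _ in range(4)]
assert results == ["join", "join", "join", None]
fake_now[0] = 1.5                   # a second later, requests pass again
assert r.intercept("join") == "join"
```

<p>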
The interceptor can observe the command or variable access and decide what to do about it:</p><ul><li>It can let the request through unchanged,</li><li>It can make arbitrary changes to the request,</li><li>Or it can block the request completely.</li></ul><p>In the example above, the request is completely blocked if the frequency of requests exceeds the one specified in the rate interceptor. Note that interceptors are different from guard conditions because they apply, hierarchically, to all transitions in a state, and they can also maintain their own state. The rate interceptor maintains a counter of the number of requests in the given timespan to decide about rejection.</p><p>Interceptors are a bit similar to <a href="https://solidity.readthedocs.io/en/develop/contracts.html#modifiers">Solidity’s Function Modifiers</a>, although interceptors are associated with states and not single functions.</p><p>I have implemented several other interceptors that can all be used to help with the game-theoretical nature of smart contracts. More examples follow below, after we have introduced context arguments.</p><h3>Context Arguments</h3><p>A command entering an interactor (e.g., a triggering event entering a state machine) can have arguments. For example, when triggering a buy event, that event may carry the price the client wants to pay for whatever they buy. Guard conditions can take the arguments into account.</p><p>In addition, I have extended interactors to support <em>context</em> arguments. 
They are essentially a second list of arguments that are different from the regular arguments in the following ways:</p><ul><li>They are optional in the sense that an interceptor decides whether they are required or not for a given command.</li><li>For the client, special syntax exists to supply values for the arguments without explicitly mentioning them for each command.</li></ul><p>For (Ethereum-style) smart contracts, an obvious context argument is the sender of a message; this allows the contract to make decisions about the validity of a command and also to ensure that the sender’s account pays the transaction fees.</p><p>We are now ready to look at the next interceptor.</p><h3>The SenderIs interceptor</h3><p>It is rather obvious that, for contracts to be valid, one often has to check that commands come from a limited set of parties. To continue with the example above, only the set of already-voted-in decision makers can take part in a vote. The following code enforces this by using the senderIs interceptor:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/620/1*frFpr7av_OQIMWZ9ZFPo3A.png" /></figure><p>This interceptor takes as an argument a collection of parties, and for every command or variable read that comes in, it checks that the sender is in the list of parties (players in the example above). If it isn’t, the request fails. It also fails if no sender context argument is supplied by the client at all.</p><p>Note how within a state that (transitively) has a senderIs interceptor, a variable sender is available that refers to whoever sent this command; interceptors that enforce (by otherwise failing) that a given context argument is specified can also make available special variables with which that context argument can be accessed to make further decisions in the implementation. 
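</p><p>A minimal sketch of this behavior (hypothetical Python; the real senderIs interceptor is a declarative language construct, not a class): the command must carry a sender context argument, the sender must be in the allowed set, and the sender is then exposed for further decisions inside the state.</p>

```python
# Illustrative senderIs interceptor: fails if no sender context argument is
# supplied, or if the sender is not in the allowed set of parties.
class SenderIsInterceptor:
    def __init__(self, allowed_parties):
        self.allowed = set(allowed_parties)

    def intercept(self, command: dict, context: dict) -> dict:
        sender = context.get("sender")
        if sender is None:
            raise PermissionError("no sender context argument supplied")
        if sender not in self.allowed:
            raise PermissionError(f"{sender} is not an allowed party")
        # expose `sender` as a variable usable within the state
        command["sender"] = sender
        return command

players = {"bernd", "markus"}
si = SenderIsInterceptor(players)
cmd = si.intercept({"name": "vote", "value": True}, {"sender": "markus"})
assert cmd["sender"] == "markus"
try:
    si.intercept({"name": "vote", "value": True}, {})   # no sender at all
except PermissionError as e:
    assert "no sender" in str(e)
```

<p>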
In the example above, the sender variable is passed into a multi-party boolean decision (as explained in the <a href="https://languageengineering.io/a-smart-contract-development-stack-54533a3a503a">previous post</a>) that handles the actual decision of whether the requesting sender should be allowed in.</p><p>Obviously, one could perform the validation of the sender manually in every guard condition of every transition; but as you can tell from the words “manually” and “every”, this is tedious and error-prone, and should thus be avoided.</p><h3>Turn-by-Turn Games</h3><p>Many “games” require a fair allocation of opportunities to participating parties. One way of achieving this is to run a game turn-by-turn, where each party can make one “move” in every “round”. In my example, I have a bidding process:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/561/1*WSypbDsOshDe9nl1Ppo1GQ.png" /></figure><p>Note how the bidding state uses a takeTurns interceptor. Again, it takes as an argument a list of parties which have to take turns. You can configure how strictly the turn-by-turn policy is enforced. Unordered means that in each round, everybody has to make a move, but the order is not relevant. Ordered means that the order given by the list of parties passed to the interceptor is strictly enforced. A violation leads to a failure of the command. The interceptor also provides access to the list of allowed next movers; this could potentially be used to notify parties that it’s their turn.</p><p>Now, there is a risk of a denial-of-service attack: assume ordered mode, and the next party P just doesn’t make its move: the whole process is stuck. Nobody else can make a move because it’s P’s turn. But P just doesn’t do anything. This is why a turn-by-turn game should always include a timeout:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/590/1*vzyxlbmgSU1WPRq-ZRqiuA.png" /></figure><p>In the refined example above, we specify a timeout of 500. 
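</p><p>One possible semantics for an ordered takeTurns interceptor with such a timeout can be sketched as follows (illustrative Python; the names and the eviction policy are assumptions, and as discussed in the text, the real policy is configurable):</p>

```python
# Sketch of an ordered takeTurns interceptor: parties must move in the
# given order, and a party that does not move within `timeout` time units
# forfeits its membership and is removed from the game.
class TakeTurnsInterceptor:
    def __init__(self, parties, timeout):
        self.order = list(parties)  # ordered mode: this order is enforced
        self.timeout = timeout
        self.current = 0            # index of the party whose turn it is
        self.last_move_at = 0

    def intercept(self, sender, now) -> bool:
        # evict sleepers: one party is removed per elapsed timeout window
        while self.order and now - self.last_move_at > self.timeout:
            del self.order[self.current % len(self.order)]
            if self.order:
                self.current %= len(self.order)
            self.last_move_at += self.timeout
        if not self.order or sender != self.order[self.current]:
            return False            # not this party's turn: block the command
        self.current = (self.current + 1) % len(self.order)
        self.last_move_at = now
        return True                 # command accepted

t = TakeTurnsInterceptor(["anna", "bernd", "carla"], timeout=500)
assert t.intercept("anna", now=100) is True
assert t.intercept("carla", now=200) is False   # it is bernd's turn
assert t.intercept("carla", now=700) is True    # bernd slept and was removed
assert t.order == ["anna", "carla"]
```

<p>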
If the next party does not make their move within 500 time units after the previous one, that party is removed from the game. It then becomes another party’s turn. It’s also possible to just skip the “sleeping” party and give the turn to the next one in line. Further policies can be imagined, such as skip three times and then remove, or whatever.</p><h3>Summary</h3><p>When I chatted with Björn about the safety of smart contracts and mentioned model checking, he said that he doesn’t know how to do model checking for game-theoretical properties. Obviously I don’t know either. But instead of checking, another way to reduce the risks is to support correctness-by-construction, i.e., make it very simple to define contracts that do not have a particular class of risks, for example by providing declarative means for things like authorisation, rate limiting or turn-by-turn games. This is what I have shown in this post.</p><p>It should also become more and more obvious why higher-level abstractions (compared to a “normal” programming language) are needed to efficiently express safe smart contracts. Trying to implement all of these things “with normal code” or just with libraries will lead to many low-level mistakes. DSLs are really very useful here.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=ca7a9d2e548d" width="1" height="1" alt=""><hr><p><a href="https://languageengineering.io/a-smart-contract-development-stack-part-ii-game-theoretical-aspects-ca7a9d2e548d">A Smart Contract Development Stack, Part II: Game Theoretical Aspects</a> was originally published in <a href="https://languageengineering.io">language engineering</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[A Smart Contract Development Stack]]></title>
            <link>https://languageengineering.io/a-smart-contract-development-stack-54533a3a503a?source=rss----20d877a4bb8d---4</link>
            <guid isPermaLink="false">https://medium.com/p/54533a3a503a</guid>
            <category><![CDATA[ethereum]]></category>
            <category><![CDATA[blockchain]]></category>
            <category><![CDATA[dsl]]></category>
            <category><![CDATA[smart-contracts]]></category>
            <category><![CDATA[functional-safety]]></category>
            <dc:creator><![CDATA[Markus Voelter]]></dc:creator>
            <pubDate>Wed, 06 Dec 2017 08:33:59 GMT</pubDate>
            <atom:updated>2017-12-06T14:04:25.588Z</atom:updated>
            <content:encoded><![CDATA[<h4>Better Abstractions for Correct Smart Contracts</h4><p>Blockchains and Smart Contracts are among the most intense hypes I have ever encountered, a close number two right after micro services :-) Even the <a href="https://www.stuttgart.ihk24.de/">IHK Stuttgart</a>, the local industry association, recently organised a Blockchain Camp, and around 200 people attended, many of them with no computer science background.</p><h3>Two Aspects of Smart Contracts</h3><p>For me, the topic conceptually splits into two aspects. One is the non-functional properties provided by the Blockchain technology, such as trustworthiness and non-repudiability, as a consequence of the distributed nature of block processing and the involved math. Even though it builds on many established ideas and technologies (such as public key crypto), the approach is fundamentally new and interesting. Some of the particular technologies definitely need to change to approaches that are way less wasteful of energy though; <a href="https://arxiv.org/abs/1607.01341">Algorand</a> looks very interesting, as well as Ethereum’s native work on proof-of-stake in the context of <a href="https://github.com/ethereum/wiki/wiki/Proof-of-Stake-FAQ">Casper</a>.</p><p>The other aspect is the idea that non-programmers can specify, analyse, simulate and ultimately execute contracts. And I really mean contracts, i.e., processes where multiple involved parties make (a sequence of) decisions over time. The non-functional properties of blockchains can be <em>beneficial</em> here, but are not <em>necessary</em> — if you trust a central entity, you can delegate the execution of the contract to that entity and use crypto to control who is allowed to do what as part of a contract. 
This second aspect is very much in line with the computerification of other non-technical domains such as <a href="https://en.wikipedia.org/wiki/Computational_law">computational law</a> or <a href="https://www.nextbigfuture.com/2010/02/towards-real-time-computational.html">computational governance</a>. It is probably not a surprise to you, dear reader, that I consider this a very good use case for DSLs.</p><h3>Ethereum and Solidity</h3><p>The preeminent platform for executing smart contracts is <a href="https://www.ethereum.org/">Ethereum</a>. Except for the energy issue, it is very suitable for providing the non-functional properties of blockchains I mentioned above. However, it falls short in the second aspect. The primary programming language for the Ethereum VM (EVM) is <a href="http://solidity.readthedocs.io/en/develop/">Solidity</a>. It is essentially a general-purpose programming language with support for the specifics of programs that run on the blockchain. For example, running code on the distributed EVM costs money (Ether), and developers can limit the processing time used by a particular program in order to not run out of money. And every method that’s called on a contract implicitly carries information about the sender (more specifically, their account number) so that the fee for a transaction can be paid by this account; the ID can also be used for authorisation of operations defined in the contract. However, Solidity (and the other Ethereum languages I have seen so far) does not provide first-class support for the typical patterns found in smart contracts that run on the blockchain, i.e., programs where a group of parties collaboratively make decisions and run processes. 
This is interesting, since the community has identified typical “contract patterns” that are a good starting point for reification into language abstractions.</p><h3>An Architecture for Smart Contract Development</h3><p>The following picture shows an overview of how I imagine a Smart Contract development environment will look. Let’s walk through the parts.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*dqB73gSKgTrIL_PWKZUfwQ.png" /></figure><p>The first realisation is that the overall problem can be broken down into the development of contracts and their execution.</p><h4>Contract Execution</h4><p>Considering that blockchain technology exists (and ignoring the energy challenge), execution is almost the simpler problem. Once you have implemented a contract correctly (!), you can deploy it to the blockchain and execute it there, benefiting from the guarantees provided by the blockchain (and maybe also suffering from some of its limitations, such as relatively low throughput, at least for now). Several different blockchain technologies exist; for example, in a business context, it seems that <a href="https://hyperledger.org/">Hyperledger</a> might be(come) more important than Ethereum. It doesn’t have exactly the same properties or guarantees, but that is good: users can choose the properties they need. Notice, however, the exclamation point behind the “correctly” above. That is the crux!</p><p>It is of course necessary that the infrastructure itself provides correctness guarantees. This is why various projects are under way to <a href="https://github.com/pirapira/ethereum-formal-verification-overview">formally verify the virtual machine</a> or enhance the Solidity compiler to support advanced checking through integration with a solver. However, you can still implement the <em>wrong behavior</em> in your contract (which is then correctly executed by a verified VM). 
This is where the right contract development languages and tools come in.</p><h4>Contract Development</h4><p>It’s almost funny how often I have heard statements like, <em>“Well, contracts have to be correct, because, in contrast to other software, money is involved, and you don’t want to lose that.” </em>True, of course, but if you’re the developer of pacemakers, you don’t want to kill people because of bugs in your software. And if you develop satellites, you don’t want one to die from a software bug on day two of the mission. So, ensuring the correctness of a program is relevant outside of smart contracts, too. And we should take a look at what those communities do, and not reinvent the wheel (a gentle hint at the <a href="http://voelter.de/data/books/introToFormalMethodsAndDSLs-1.1.pdf">formal methods booklet</a> :-)). More generally, this means that the development of (correct) contracts must also be supported by the overall toolchain. Here is how I envision this working.</p><p>Contract development should rely on a language stack. At the core of this stack, I expect a functional programming language. Functional languages are useful because they are (relatively) easy to verify, and can also relatively easily support (in-memory) transactions (as I have discussed in <a href="https://medium.com/@markusvoelter/dealing-with-mutable-state-in-kernelf-e0fdec8a489b">this previous post</a>), which is useful for contract-style programs. On top of the functional core, I expect a couple of language extensions that directly support the above-mentioned contract patterns, such as decisions, auctions, agreements and resource allocations. Each of those can be broken down into a whole range of configuration options (think: domain analysis, variability, feature model) that determine the specific behavior. The Executable Multi-Party Contract Language (EMPCL) in the above picture contains language constructs for these typical Smart Contract building blocks. 
It also supports state machines, since most non-trivial contracts are in effect state machines. I will return to this idea below when we look at example code.</p><p>On top of EMPCL, I see languages (or EMPCL-extensions) that are closely aligned with domains such as logistics or finance. Each of those has its own idiomatic contract constructs, and the language extensions should support those directly.</p><p>Why language extensions and not just frameworks or libraries? Because they make it easier to write correct code (once they are stable). Two reasons. First, by using language constructs at an appropriate level of abstraction, many lower-level mistakes cannot be made in the first place. The contract is, to some degree, <a href="http://wiki.c2.com/?CorrectByConstruction">correct-by-construction</a>. In addition, a language that makes semantically relevant things first-class citizens makes verification of the not-correct-by-construction things much easier (again, a hint at the <a href="http://voelter.de/data/books/introToFormalMethodsAndDSLs-1.1.pdf">formal methods booklet</a> :-)); good IDE support can also be provided more easily. It’s also just less code one has to write, so it’s easier to understand and review. Even better, by providing an interpreter, one can interactively play with the contracts and explore their behavior, and write test cases that are executed immediately. Finally, at least for those DSLs that are aligned with particular domains (those above EMPCL), there’s a realistic chance that non-programmer domain experts can read or write the code. 
Really, this is the “usual DSL story” that has been proven to work over and over again, and it will also work here.</p><p>Contracts also have a couple of very specific risks that go beyond <a href="https://en.wikipedia.org/wiki/Functional_safety">functional safety</a>-style verification, resulting from their <a href="https://en.wikipedia.org/wiki/Game_theory">game-theoretic</a> nature, for example <a href="https://en.wikipedia.org/wiki/Sybil_attack">sybil attacks</a> or timing problems. For now, dealing with those is outside of what I am looking at.</p><h3>Prototypical Implementation</h3><p>I have implemented some aspects of this language stack in MPS. In particular, I have started building a language that could become EMPCL. At this point I have not yet implemented verification tools, and I don’t yet translate to any blockchain technology for execution. But some of the core abstractions are available, and this illustrates how and why they are useful. I will explain them in detail below. Make sure you have read the post on <a href="https://medium.com/@markusvoelter/dealing-with-mutable-state-in-kernelf-e0fdec8a489b">Dealing with Mutable State in KernelF</a>, because I rely on it heavily.</p><h4>The Multi-Party Boolean Decision</h4><p>The core abstraction I have implemented is the multi-party boolean decision, i.e., a process which lets a group of parties make a yes/no decision. It’s the simplest process one can think of in the context of contracts. We had performed an <a href="https://github.com/ethereum/langlab/issues/7">initial domain analysis for decisions</a>, and based on this, I have derived the following set of configuration options.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/374/1*FTRSVrpdPAwSZZRaUBmDtA.png" /></figure><p>First: who are the parties involved in making the decision? In the example above, bernd and markus are references to global variables of type party. 
Optionally, the set of parties can be dynamic, which means that, as the process executes, additional parties can be made members of the decision process. If the parties are dynamic, an additional check box shows up that allows support for sealing the parties: once the process is sealed, no new parties can be added anymore. The remaining options concern the actual decision process. The procedure determines how the final decision is made, e.g., by simple majority, by unanimous agreement or by a custom algorithm. The turnout determines whether a minimum number of parties have to actually make a decision. The time limit requires the decision to be made within a given timeframe, and the revokable flag determines whether a party can revoke their decision once they have made it.</p><p>The decisions are an example of a <em>process</em>, i.e., a language construct that can be stimulated by executing <em>commands</em>, and it can be observed by reading <em>values</em>. Based on the configuration of the process, different commands and values are available. Since this is a language extension and not just a library, the IDE knows about these and can provide support, i.e., help ensure some degree of correctness by construction. A few examples are shown below.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*X4QKyaVnkOUZJhtEAHci8A.png" /></figure><p>For example, once a turnout is configured, a non-vote cannot be interpreted as a “no” vote, so the system has to explicitly support voteFor and voteAgainst commands. Similarly, although not shown, the decision value is now an opt&lt;boolean&gt; instead of a boolean because, until the turnout has been achieved, no decision has been taken (and none is returned). Also, all the commands to add parties are only offered if dynamic is selected.</p><h4>Playing with Decisions — the REPL</h4><p>Processes have this nice property of uniform interaction: send in commands, observe values. 
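</p><p>This command/value interaction can be imitated in a plain sketch (hypothetical Python, covering only one configuration: majority procedure, everybody votes, non-revokable, no time limit; the real construct is a declarative language extension):</p>

```python
# Minimal sketch of a multi-party boolean decision process:
# commands go in (vote), values can be observed at any time (decision).
class MultipartyBooleanDecision:
    def __init__(self, parties):
        self.parties = set(parties)
        self.votes = {}             # party -> bool

    # command
    def vote(self, party, in_favour: bool):
        if party not in self.parties:
            raise ValueError(f"{party} is not a party to this decision")
        if party in self.votes:
            raise ValueError("votes are not revokable in this configuration")
        self.votes[party] = in_favour

    # value
    def decision(self):
        """opt<boolean>: None until every party has voted (simple majority)."""
        if set(self.votes) != self.parties:
            return None
        return sum(self.votes.values()) * 2 > len(self.parties)

d = MultipartyBooleanDecision(["bernd", "markus", "anna"])
d.vote("bernd", True)
assert d.decision() is None         # no decision yet: not everybody has voted
d.vote("markus", True)
d.vote("anna", False)
assert d.decision() is True         # 2 of 3 in favour
```

<p>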
The KernelF Read-Eval-Print-Loop, or REPL, has special support for such values through the live() expression. Applied to a process (such as the MultipartyBooleanDecision), it provides code completion for the commands that are currently allowed. In fact, it even looks at the current state of the process and only supports those commands that are <em>allowed at the current execution state</em>. The REPL also supports a nice, readable rendering for the values of processes. For a sequence of steps, it even highlights the changes in blue, so it’s easy to observe how a process evolved as users issue commands. The next screenshot shows a REPL session on a MultipartyBooleanDecision with a dynamic set of parties and sealing enabled. Check out how the internal state changes based on the commands issued.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/864/1*2q_IpCSPSHaH3mFQ2u6Iew.png" /></figure><p>Because all of this is based on a generic, reflective API, one can imagine other UIs. In particular, we will build a “simulator” where there are buttons for each command and UI widgets for each value. This will allow non-programmers to creatively play with a contract and thus better understand how it behaves.</p><p>Of course, one can also script tests and execute them directly in the IDE, based on the same interpreter that also drives the REPL.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/580/1*ywxNiQgHw-homRUxevhtXQ.png" /></figure><h4>Combining Decisions with State Machines</h4><p>The decision described above supports basic, multi-party decisions. Over time, we will also add support for auctions, agreements and other contract patterns. However, a contract will also always have specific behavior that cannot easily be expressed declaratively. But there’s no reason to “fall down” to imperative programming just yet. State machines are a much better abstraction for many of these behaviors, especially when combined with the processes.
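</p>

<p>The combination can be sketched abstractly: the state machine owns one or more decision processes, routes incoming events to them, and transitions based on the values they expose. A minimal, self-contained Python sketch; all names are hypothetical, not the actual MPS languages:</p>

```python
class Decision:
    """Minimal stand-in for a declarative decision process (unanimous)."""
    def __init__(self, parties):
        self.parties, self.votes = set(parties), {}

    def vote(self, party, in_favour):
        assert party in self.parties, "not a party to this decision"
        self.votes[party] = in_favour

    def result(self):
        # None until every party has voted; then unanimous yes/no
        if len(self.votes) < len(self.parties):
            return None
        return all(self.votes.values())

class Contract:
    """State machine coordinating the decision; its API are the events."""
    def __init__(self, parties):
        self.state, self.decision = "deciding", Decision(parties)

    def vote(self, party, in_favour):
        if self.state != "deciding":
            raise RuntimeError("vote not allowed in state " + self.state)
        self.decision.vote(party, in_favour)
        result = self.decision.result()
        if result is not None:
            self.state = "selling" if result else "ended"

    def buy(self, party):
        if self.state != "selling":
            raise RuntimeError("buy not allowed in state " + self.state)
        self.state = "ended"
```

<p>Note how the state machine rejects events that are not allowed in the current state; this is exactly the kind of low-level mistake the combination rules out by construction.</p>

<p>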
Consider the following set of requirements:</p><blockquote>We have a set of products, each of which can potentially be sold. First, a<br>predefined group of stakeholders has to make a decision for each of the<br>products whether it should indeed be sold or not. Everybody has to vote, and the<br>decision is by majority. There is a limited time by which the vote has<br>to have taken place. Once that decision has been made for all products,<br>each product can be sold to somebody; a product can only be sold once,<br>and the price must be the same as or higher than the one specified in the<br>offer, and the offer must not have been sold before. All sales are<br>recorded. Once all sellable products have been successfully sold, the<br>contract terminates.</blockquote><p>This is a realistic, and not totally simplistic, example of a contract. Let’s look at its implementation based on declarative decision processes, state machines and a couple of helper functions. We also need a couple of data types; we start with those, and they should be rather obvious:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/982/1*dzJXQKGqPqFTp6tD1qUEVw.png" /></figure><p>Next, we define a few helper functions; the comments in the code explain what they do.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/666/1*Rks3SSjSZgFC8gt2Z1KLtA.png" /></figure><p>The remaining implementation consists of one decision process and a state machine (with a few more embedded functions). We show the code next, and then discuss it in some detail:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/849/1*B2rL4frXCEoTh1if7NP2xw.png" /></figure><p>The state machine serves as the “top level” interface for the contract. Its API consists of the two events defined in it. The first one, vote, expresses that a party votes for or against (expressed through the Boolean parameter) selling the product with the given ID.
The second event, buy, expresses that a particular party wants to buy a product for a given price.</p><p>The next line is crucial. Remember that we want to make a separate sell/no-sell decision for every product/offer. So the salesDecisions variable is initialized to a map that contains an instance of the ShouldWeSell process associated with each product ID.</p><p>Let’s now move on to the initial state decideOnSelling. If the vote event comes in, we react by checking whether the shouldWeSell flag is true. If so, we retrieve the sales decision process for the respective product and voteFor it. Otherwise we voteAgainst. If, for all sales decisions, the turnout has been achieved, we transition to the selling state.</p><p>In the selling state, we expect the buy event (any other event leads to a failure of the operation). If it comes in, and the decision to sell the particular product has been positive (check out the shouldBeSold function), we retrieve the box that contains the offer and call tryToBuy. Otherwise we ignore the event. Once all to-be-sold products have been sold, we move to the ended state.</p><p>The final piece of the puzzle is the tryToBuy function. If the product/offer has already been sold, it does nothing; otherwise it checks if the price paid by the buyer is greater than or equal to the price stated in the offer, and if so, sets the offer’s sold flag to true and adds the corresponding sale to the list of sales.</p><h4>Another Example</h4><p>This final example is one where several decision processes interact and processes are started dynamically by the coordinating state machine. It’s similar to the (in-)famous <a href="https://en.wikipedia.org/wiki/The_DAO_(organization)">DAO</a> in the sense that the set of decision makers changes dynamically. In particular, the requirements are:</p><blockquote>We’re an online community that has to continuously maintain a (selling)<br>decision; it can be revoked or granted over time.
The group of<br>individuals, called the deciders, can vote (and revoke) this sales<br>decision. The vote has to be unanimous. In addition, new people<br>can be voted into the group of deciders. The existing deciders vote for<br>new candidates, by simple majority, but with a time limit. Once voted<br>into the group of deciders, the new member can participate in the<br>sell/no-sell decision. Multiple member approval processes can go on at<br>the same time. While a member request is pending, the sales decision cannot<br>be changed.</blockquote><p>At the core are two decision processes. Sale runs “forever” and manages the group decision. Its membership can change over time. The other process, AccessControl, manages the join request and the voting for a potential new guy. It is started, dynamically, for each join request. The key is that whenever a join request for a new guy finishes successfully, its join process is terminated, and the new guy is added to the parties of the Sale process. Check out the code, as well as the REPL session below.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/764/1*IKXFdZjDT54ht6c5cYN8Zg.png" /></figure><p>The REPL session is a bit longish, but it does illustrate the point.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/713/1*Gk8Rq3id8_uNZGrK3o_B7w.png" /></figure><h3>Wrap up and Outlook</h3><p>It should be obvious from these examples that the ability to declaratively specify contracts based on a mix of state machines and a (growing) set of declarative decision, auction, agreement and resource allocation processes provides a solid foundation for efficiently implementing Smart Contracts. Many low-level mistakes cannot be made, and stakeholders can experiment with the contract by playing with it in the REPL and a future simulator.
Tests can be written and executed.</p><p>It’s probably worth mentioning that the whole contract (state machines and decision processes) is transactional (as discussed in that <a href="https://medium.com/@markusvoelter/dealing-with-mutable-state-in-kernelf-e0fdec8a489b">previous post</a>). So if anything goes wrong (e.g., an event is posted when there is no transition that handles it), the machine fails and the transaction, if one has been started before, is rolled back.</p><p>In the future, we will implement functional verification of contracts based on an integration with model checkers and SMT solvers; but don’t expect a post on that too soon, this is quite a bit of work.</p><p>Finally, we also need a deployment story. We are working on a Java generator for KernelF and all the rest, so a plain Java (Enterprise) deployment is not too far away. We are also working on a generator to Ethereum to exploit its non-functional properties as well. But again, these things will take some time. My goal with this work, and this post, was to play with better <em>abstractions</em> for Smart Contracts; let me know if I succeeded.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=54533a3a503a" width="1" height="1" alt=""><hr><p><a href="https://languageengineering.io/a-smart-contract-development-stack-54533a3a503a">A Smart Contract Development Stack</a> was originally published in <a href="https://languageengineering.io">language engineering</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Safety and Security from a Language Engineering Perspective]]></title>
            <link>https://languageengineering.io/safety-and-security-from-a-language-engineering-perspective-24bb5fc92409?source=rss----20d877a4bb8d---4</link>
            <guid isPermaLink="false">https://medium.com/p/24bb5fc92409</guid>
            <category><![CDATA[safety]]></category>
            <category><![CDATA[correct-by-construction]]></category>
            <category><![CDATA[program-analysis]]></category>
            <category><![CDATA[security]]></category>
            <category><![CDATA[dsl]]></category>
            <dc:creator><![CDATA[Markus Voelter]]></dc:creator>
            <pubDate>Thu, 09 Nov 2017 16:18:56 GMT</pubDate>
            <atom:updated>2017-11-09T16:21:42.966Z</atom:updated>
<content:encoded><![CDATA[<h4>A brief heads-up on today’s WJAX talk</h4><p>When I started my writing here on Medium, I planned to write original stuff (as I did in all my posts so far), but also to “reuse” other stuff I’ve been producing for other media. This is the first post in this second style; it’s shorter, because it references other material, slides in this case. I gave a talk at the WJAX conference today with Bernd Kolb called <a href="http://voelter.de/data/presentations/SaferSoftwareThroughAbAndAn-PRINT.pdf">Safer Software Through Better Abstractions and Static Analysis</a>.</p><p>A few years ago, Gary McGraw published a book called <a href="http://www.swsec.com/">Software Security</a>, where he looks at flaws in the implementation of a software system that can be exploited maliciously. Traditionally, security is seen more like a process issue (education/awareness, reviews, pen testing) or a matter of architecture (authentication, encryption, DMZs, runtime monitoring). And of course, both of those are very relevant. But bugs in the implementation are also a problem as we know from loads of security exploits in SSL libraries (<a href="https://en.wikipedia.org/wiki/Heartbleed">Heartbleed</a>, <a href="https://www.synopsys.com/blogs/software-security/understanding-apple-goto-fail-vulnerability-2/">goto fail</a>) or blockchain contracts (the <a href="https://medium.com/@MyPaoG/explaining-the-dao-exploit-for-beginners-in-solidity-80ee84f0d470">DAO</a> and several other more recent ones). Many other examples exist. I liked Gary’s book, and we expand on this perspective by looking at language extensions that prevent security problems constructively, and at other extensions that make analysis or review simpler.
Of course, many of the ideas are “the usual language engineering stuff”, but applying them to security is relevant, I think.</p><p><a href="http://voelter.de/data/presentations/SaferSoftwareThroughAbAndAn-PRINT.pdf">Check out the slides here</a>; I add a few more comments below.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/957/1*6tFBBVVtj16pdwpZNku3mw.png" /><figcaption>Looks like one cannot use images as <a href="http://voelter.de/data/presentations/SaferSoftwareThroughAbAndAn-PRINT.pdf">links</a> on Medium…</figcaption></figure><p>Here are some of the core ideas:</p><ul><li><strong>Security, Safety and Robustness cannot be separated. </strong>In some sense, security issues are maliciously exploited robustness issues, while safety issues are just “unfortunate” ones. You also hear sentences like “this exploit constitutes a massive safety risk”, further illustrating the relationship. Yes, there are some particular security risks as well (e.g., making sure key material cannot be read from a memory image). But there’s a lot of overlap.</li><li><strong>By using better languages, many low level errors are avoided.</strong> Things like mbeddr’s statemachines, a first-class extension for gotofail-style error handling or wiping of the stack when leaving a scope (for key material) can be supported by the language directly.</li><li><strong>Advanced type systems,</strong> such as those that support option types (to avoid null dereferencing), number ranges (to avoid overflows) or tagging (to track tainted data) are relatively easy to build and <strong>are a quick win</strong>.</li><li><strong>Program verification techniques</strong>, such as <a href="http://voelter.de/data/books/introToFormalMethodsAndDSLs-1.0.pdf">SMT solving or model checking</a>, <strong>can find non-trivial problems</strong>. Sure, trying to use those for the high-hanging fruits is very non-trivial.
But the low hanging ones are easier to reach and should be harvested.</li><li><strong>High Quality Tests are crucial.</strong> Measure coverage, generate test cases, use mutation testing. The tools exist. Learn them!</li><li><strong>Better abstractions and notations</strong>, such as tables, state machines, or mathematical formulas are much easier to read than “code”. They <strong>make review easier</strong>, and thus, make the code more trustworthy.</li><li><strong>Make programs simulatable by stakeholders</strong>, because, again, this helps program understanding and thus reduces the likelihood of unintended behaviors.</li></ul><p>Just to repeat: we are not suggesting that “classical” security techniques like penetration testing are not needed or not useful. But we do think the stuff in these slides is very relevant as well.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=24bb5fc92409" width="1" height="1" alt=""><hr><p><a href="https://languageengineering.io/safety-and-security-from-a-language-engineering-perspective-24bb5fc92409">Safety and Security from a Language Engineering Perspective</a> was originally published in <a href="https://languageengineering.io">language engineering</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Some Concepts in Functional Languages]]></title>
            <link>https://languageengineering.io/some-concepts-in-functional-languages-550c683a0ffa?source=rss----20d877a4bb8d---4</link>
            <guid isPermaLink="false">https://medium.com/p/550c683a0ffa</guid>
            <category><![CDATA[effects-tracking]]></category>
            <category><![CDATA[domain-specific-languages]]></category>
            <category><![CDATA[program-analysis]]></category>
            <category><![CDATA[functional-programming]]></category>
            <dc:creator><![CDATA[Markus Voelter]]></dc:creator>
            <pubDate>Thu, 26 Oct 2017 11:01:02 GMT</pubDate>
            <atom:updated>2017-10-26T11:01:01.946Z</atom:updated>
<content:encoded><![CDATA[<h4>Purity, Idempotency, Cacheability, Effects, Tracking</h4><p>I am relatively new to functional programming. To learn about it, I am building this functional language <a href="http://voelter.de/data/pub/kernelf-reference.pdf">KernelF</a> — actually, I/we am/are building it because we need an embeddable, extensible functional language as the core of our MPS-based DSLs, but building it also does help me learn functional programming.</p><p>The core of KernelF is pure. This means that there are no effects: no (global, observable) state is ever modified. This is nice because it makes KernelF programs easily analyzable. However, in the end, a program that has no effects does not do anything useful; it only heats up the CPU (as I heard <a href="https://en.wikipedia.org/wiki/Simon_Peyton_Jones">Simon Peyton Jones</a> once say). So at some point, you do need effects. The question is: how do you integrate effects into a functional language in a way that does not destroy the benefits of a pure functional language in the first place?</p><p>The general answer here is <a href="https://en.wikipedia.org/wiki/Monad_(functional_programming)">monads</a> and <a href="https://scholar.google.de/scholar?hl=de&amp;as_sdt=0%2C5&amp;q=algebraic+effects&amp;btnG=">algebraic effects</a>. Yes, I know these words, and I could explain intuitively what they mean. But I won’t. Considering that monads are explained only after three fourths of a 500-page <a href="https://github.com/hmemcpy/milewski-ctfp-pdf">book on category theory</a>, any explanation of mine must necessarily be simplistic. Or wrong. However, in the context of thinking about effects, I did have to clarify a couple of terms around functions. In this post I will explain what I understood.
Take it with a grain of salt :-)</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*QiHQ5Kik0dCMxAkDaSnMIw.png" /><figcaption>Hong Kong at Night</figcaption></figure><h3>Purity</h3><p>A pure function is one that only depends on its arguments; it has no access to any other mutable data (it may access constant global state). The body can only compute with the arguments, so invoking the function several times with the same arguments must necessarily produce the same result. Pure functions are what we know from math.</p><p>For this to be true, the function can also not have access to sources of data, such as a random number generator, a network socket or IO. In addition, successive “same” return values must be indistinguishable. For example, calling add(3, 4) three times may technically return three different instances of an object that represents 7. However, the client program must not be able to distinguish them. Finally, the function must also not have any additional outputs beyond what is returned; in other words, it must not modify the state of the world.</p><h3>Cacheable/Memoizable</h3><p>If a function always returns the same result (or the technically different objects are not semantically different), then subsequent calls to a function with the same argument values do not have to be reexecuted; the result from the first call can be cached. Whether caching makes sense depends on a tradeoff between the effort for reexecution, the memory needs for caching and the performance of the cache lookup. Generally, the more work a function performs, the more useful it is to cache the results and avoid reexecution.</p><p>Again, value semantics are essential here, because the caller must not be able to tell the difference between whether the function is cached or reexecuted. If the cache returns a previously computed value, then, of course, that value must not have been modified in the meantime.
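</p>

<p>A minimal illustration of such caching (memoization) in Python; the cost trade-off discussed above is ignored here, and the cache is unbounded:</p>

```python
import functools

call_count = 0                      # counts actual executions of the body

@functools.lru_cache(maxsize=None)  # memoization: only safe for pure functions
def slow_add(a, b):
    global call_count
    call_count += 1
    return a + b

slow_add(3, 4)  # executes the body
slow_add(3, 4)  # same arguments: served from the cache, body not run again
# call_count is now 1, not 2
```

<p>The caller cannot tell the difference, precisely because the function is pure.</p>

<p>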
This requires immutability (“value semantics”) of the data returned by functions. This is the reason why functional programming and immutable data (in the form of <a href="https://en.wikipedia.org/wiki/Persistent_data_structure">persistent data structures</a>) usually appear together.</p><h3>Effects</h3><p>An effect is a change to the world. For example, a function might print a string to the console. While you could consider a console as another data structure that might have an undoLastOutput method, effects that actually affect the real world really cannot be undone. To again quote from a talk by Simon Peyton Jones, “you cannot undo a rocket launch” :-)</p><p>So let’s dig into this a bit deeper. A function with a (world modifying) effect cannot be cached. Because then the cached invocations would not perform the effect, despite otherwise returning the same result. So if we imagine a runtime system that decides by itself, based on runtime profiling, whether a particular function should be cached or not, it must not ever cache a function with an effect. This is one major reason why we want to make it explicit that a function (potentially) has an effect (note that an imperative programming language implicitly assumes everything to have an effect and thus can cache nothing — unless it performs non-trivial analyses to figure it out anyway). Similarly, a function with a world-reading effect can take the (changing) data from the world as part of its computation; thus, subsequent invocations can return different results. No more purity, no caching.</p><h3>Idempotency</h3><p>Consider the following non-pure function:</p><pre><strong>fun</strong> createNewCustomer(id: ID, data: CustomerData) {<br>  <strong>if</strong> (!database.hasKey(id)) <br>    <strong>then</strong> database.store(id, data) <br>    <strong>else</strong> <strong>false</strong><br>}</pre><p>Here, the database is the world (a global variable), the call to store is the effect. 
Notice how the implementation is defensive: if the customer, as identified by its ID, already exists, the function does nothing. So you can call it any number of times without messing up the global state. A function that behaves this way is called idempotent. Idempotency is very useful in distributed systems where remote calls/packets might get lost. If the called function is idempotent, you can always re-call if you’re in doubt whether the first invocation succeeded or not. I am not so sure where and how I would exploit idempotency in a functional language.</p><h3>Different Kinds of Effects</h3><p>So, are all effects equal? Consider a function that adds two values and also <br>outputs a log value:</p><pre><strong>fun</strong> add(a: <strong>int</strong>, b: <strong>int</strong>) {<br>  <strong>log</strong> &quot;adding &quot; + a + &quot; and &quot; + b<br>  a + b<br>}</pre><p>Is this function pure? You could argue no, because it has an effect of appending to the log. On the other hand you could say yes, because the log output is not part of the world that is visible to your program; presumably you cannot read from the log, so change of that part of the world is irrelevant. In some sense, the log is “observing” the program’s execution rather than being something that the program modifies intentionally. If we cached this function, then the log output would only be written during the first (non-cached) execution. If you switched off caching, the log would be written for every invocation. However, this might be exactly the point of the log in the first place: find out if caching works. So in that sense, logging is not an effect (you care about).</p><p>Here is another example:</p><pre><strong>fun</strong> add(a: <strong>int</strong>, b: <strong>int</strong>) <strong>pre</strong> programMode == active {<br>  a + b<br>}</pre><p>In this example, we read from the global state programMode in the precondition. 
So even if the function body is never executed (because the precondition fails), we <em>do</em> execute the precondition. We obviously do not want to allow modifying global state in the precondition, but we do want to allow read access, because, presumably, the purpose of the precondition might be to only execute the function if the world is in a particular state. Even if we cached the execution (because the inputs are the same as in a previous invocation), we still want to execute the precondition (and maybe throw an exception and stop program execution if the precondition is not met).</p><p>Summing up, we have to distinguish between different kinds of effects. Expressions that modify a global data structure that is “observing” in its nature do not constitute an effect. And we have to distinguish between reading mutable state and modifying it.</p><p>A more elaborate effect system would also distinguish between the different parts of the world an effect concerns: IO, network, global memory, whatever; the list is potentially endless. A user would have to be able to define their own kinds of effects. I have a vague notion of how this could be implemented, but I don’t quite see the reason for it. If anybody can tell me what I would do with this information, I am all ears.</p><h3>Totality</h3><p>A total function is one that terminates with a valid result for all inputs. A function that does not terminate is said to diverge. The above example of the precondition that throws if it is not met is an example of divergence. There are functional languages that are total (where, I guess, the compiler has to somehow show termination even in case of recursion), but the details of this are beyond my current understanding.</p><h3>Effect Tracking</h3><p>As mentioned before, a key idea of making effects explicit is to be able to (easily) analyze a program to find out where effects happen, and where they don’t.
Parts of programs that have no effects can be cached, reexecuted at will and even parallelised. Because all the inputs are explicit, it is also relatively easy to implement a reactive system where computations are retriggered when an input changes. We will return to this particular use case in a future post.</p><p>So how do you track effects? Let us consider the following code:</p><pre><strong>record</strong> Person {<br>  name: <strong>string</strong><br>  age: <strong>int</strong><br>}</pre><pre><strong>fun</strong> perhapsStoreData(p: Person) {<br>  <strong>if</strong> p.age &gt; 10 <strong>then</strong> personDB.store(p) <strong>else</strong> <strong>false</strong><br>}</pre><pre>perhapsStoreData(Person(&quot;Markus&quot;, 43)) // 1<br>perhapsStoreData(Person(&quot;Peter&quot;, 5))   // 2</pre><p>Let&#39;s assume that personDB.store is an operation that has a modify effect. The question is: does the function perhapsStoreData have an effect or not? As its name suggests, it depends on the particular p passed as an argument. So of the two calls to perhapsStoreData, the first one has an effect, the second one does not.</p><p>This realisation leads to two different styles of effect tracking. In the first one, you perform a data flow analysis of the code along the call graph. It would find out that call number 1 has an effect, and call number 2 does not. A less precise alternative analysis would simply look at the code of a function, see if any of its children has an effect (true in the case of perhapsStoreData), and then determine that the function (potentially) has an effect.</p><p>In the latter case you associate the fact that the function (potentially) has an effect with the function, for example, as part of the type (the return type of perhapsStoreData could be effect&lt;boolean&gt;) or you maintain the information separately (you can imagine it as a kind of second return type).
In the former case, the function itself does not say anything about effects or not; only a function <em>call</em>, where you know the argument values (potentially through more analysis), gets an effect associated with it. Whether you take the data flow and values into account is a precision property of the analysis. The trade-offs associated with this and other precision properties are discussed in section 4 of the <a href="http://voelter.de/data/books/introToFormalMethodsAndDSLs-1.0.pdf">formal methods booklet</a>.</p><h3>Effect Tracking in KernelF</h3><p>In <a href="http://voelter.de/data/pub/kernelf-reference.pdf">KernelF</a>, we use the less precise alternative to keep things simple. We do not mangle the effects with the type, but maintain it separately. By default, an Expression has no effect. If an expression does have an effect, it implements the IMayHaveEffect interface. This interface has a method that returns an EffectDescriptor that has the flags modifiesState and readsState. We propagate effects up along the program hierarchy (a BlockExpression has an effect if any of its child expressions has an effect), as well as along specific edges (a FunctionCall derives its own effect descriptor from the effect descriptor of the called function). The code optionally highlights effects on functions and function calls with the /R, /M and /RM annotations (see below).</p><p>A second interface IMayAllowEffect controls at which program locations effects are allowed. 
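</p>

<p>The propagation scheme just described can be sketched in a few lines of Python; the class and flag names loosely follow the text, but this is a simplified illustration, not the actual MPS implementation:</p>

```python
class EffectDescriptor:
    # mirrors the readsState/modifiesState flags described above
    def __init__(self, reads=False, modifies=False):
        self.reads, self.modifies = reads, modifies

    def merged_with(self, other):
        return EffectDescriptor(self.reads or other.reads,
                                self.modifies or other.modifies)

    def annotation(self):
        # renders the /R, /M and /RM style of annotation
        tag = ("R" if self.reads else "") + ("M" if self.modifies else "")
        return "/" + tag if tag else ""

class Expression:
    # by default an expression has no effect; effects propagate up
    # the program hierarchy from the children
    children = ()

    def own_effect(self):
        return EffectDescriptor()

    def effect(self):
        d = self.own_effect()
        for child in self.children:
            d = d.merged_with(child.effect())
        return d

class ReadGlobal(Expression):
    def own_effect(self):
        return EffectDescriptor(reads=True)

class StoreCall(Expression):
    def own_effect(self):
        return EffectDescriptor(modifies=True)

class BlockExpression(Expression):
    def __init__(self, *children):
        self.children = children
```

<p>In this sketch, a block containing a store call would be annotated /M, and one that also reads global state /RM; the IMayAllowEffect locations mentioned above would then check these descriptors.</p>

<p>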
Based on the interactions of these two interfaces, the IDE can report errors relating to effects:</p><ul><li>The init value of a global constant can have a read effect, but not a modifies effect:</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/520/1*S5Dyyryrmv0-aBNBGYJzaw.png" /></figure><ul><li>The actions in a state machine must have a modifies effect (“action” kinda suggests that)</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/550/1*j7gzmoBfS7B9B6Z0odVkrA.png" /></figure><ul><li>In a block expression, an expression that is not assigned to a value and is not the last one in the block must have some kind of effect. Otherwise its value gets lost and the expression should be avoided.</li></ul><h3>Wrap Up</h3><p>I hope this clarifies some of the terminology around functions. I am pretty sure I have missed something, so please reply with corrections and additions! In the next post I will discuss how we integrate mutable state as well as transactions into the (functional) KernelF language.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=550c683a0ffa" width="1" height="1" alt=""><hr><p><a href="https://languageengineering.io/some-concepts-in-functional-languages-550c683a0ffa">Some Concepts in Functional Languages</a> was originally published in <a href="https://languageengineering.io">language engineering</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Thoughts on Declarativeness]]></title>
            <link>https://languageengineering.io/thoughts-on-declarativeness-fc4cfd4f1832?source=rss----20d877a4bb8d---4</link>
            <guid isPermaLink="false">https://medium.com/p/fc4cfd4f1832</guid>
            <category><![CDATA[design-language]]></category>
            <category><![CDATA[declarative-programming]]></category>
            <category><![CDATA[formal-methods]]></category>
            <category><![CDATA[programming]]></category>
            <category><![CDATA[domain-specific-languages]]></category>
            <dc:creator><![CDATA[Markus Voelter]]></dc:creator>
            <pubDate>Mon, 16 Oct 2017 07:41:01 GMT</pubDate>
            <atom:updated>2017-10-16T07:41:01.291Z</atom:updated>
<content:encoded><![CDATA[<h4>What it means, when to use it, and how to shape it</h4><p>DSLs are supposed to be “declarative”. People say this, but it is not clear what they mean by it. So let us unpack this a little bit. What does it mean for a language to be declarative? Here is the <a href="https://en.wikipedia.org/wiki/Declarative_programming">definition</a> from Wikipedia:</p><blockquote>In <a href="https://en.wikipedia.org/wiki/Computer_science">computer science</a>, <strong>declarative programming</strong> is a <a href="https://en.wikipedia.org/wiki/Programming_paradigm">programming paradigm</a> — a style of building the structure and elements of computer programs — that expresses the logic of a <a href="https://en.wikipedia.org/wiki/Computation">computation</a> without describing its <a href="https://en.wikipedia.org/wiki/Control_flow">control flow</a>.</blockquote><p>Ok, this is useful. But then, is a pure functional language declarative? After all, it has trivial control flow (the call graph is the control flow graph) and no side-effects. If so, what is the difference between functional and declarative? Wikipedia continues to say</p><blockquote>[describe] <em>what</em> the program must accomplish in terms of the <a href="https://en.wikipedia.org/wiki/Problem_domain">problem domain</a>, rather than describe <em>how</em> to accomplish it as a sequence of the programming language <a href="https://en.wikipedia.org/wiki/Language_primitive">primitive</a>s.</blockquote><p>This is essentially the definition of a (good) DSL. So are all DSLs declarative?</p><p>Another definition I have seen points out that a declarative program is one that does not even have a predefined execution order (a functional program has this).
In other words, there is some kind of “engine” that processes the program by finding out an efficient way of “working with the program” to find a solution.</p><h3>Solvers &amp; Constraint-based Programming</h3><p>Solvers are examples of such engines. They take as input a couple of equations with free variables, and try to find value assignments to these variables that satisfy all the equations. For mathematical equations, we know the problem from school:</p><pre>Eq1:   2 * x == 3 * y <br>Eq2:   x + 2 == 5<br>Eq3:       3 == y + 1</pre><p>A solution for this set of equations is x := 3, y := 2. Solvers that solve this kind of equation are called SMT solvers. SMT stands for “Satisfiability Modulo Theories”; the “Theories” part relates to the set of abstractions they can work with. For example, in addition to arithmetic, they can also work with other “theories”, such as logic or collections. Modern SMT solvers, such as <a href="https://github.com/Z3Prover/z3">Z3</a>, can work with huge sets of equations and large numbers of free variables, while still solving them in very little time (sub-second). For more details on the use of solvers in DSLs, and lots of related work, see Section 5 in the <a href="http://voelter.de/data/books/introToFormalMethodsAndDSLs-1.0.pdf">Formal Methods booklet</a>.</p><p>Note that these equations do not imply a direction — they can be solved left to right and right to left, because they only specify constraints on the variables. This is why this approach to programming is also called constraint-based programming.</p><p>It is possible to encode structure/graphs as equations (also called relations). Once we have done this, we can “solve for structure”, i.e., we can find structures that satisfy the other constraints expressed in the equations. For example, one can express the data flow graph as a set of relations over the program nodes and then perform dataflow analyses, ideally incrementally.
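</p><p>To make the constraints-in/assignment-out contract concrete, here is a deliberately naive sketch in Python. A real SMT solver such as Z3 replaces the brute-force search with vastly smarter algorithms; the function names and the bounded search domain here are invented for the illustration.</p>

```python
# Brute-force "solver": constraints in, a satisfying assignment out.
# Real SMT solvers such as Z3 use far smarter algorithms, but the
# input/output contract is the same.
from itertools import product

def solve(constraints, variables, domain):
    """Return the first assignment of domain values to the variables
    that satisfies every constraint, or None if there is none."""
    for values in product(domain, repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(c(env) for c in constraints):
            return env
    return None

# The three equations from the text, as predicates over an environment.
constraints = [
    lambda e: 2 * e["x"] == 3 * e["y"],  # Eq1: 2 * x == 3 * y
    lambda e: e["x"] + 2 == 5,           # Eq2: x + 2 == 5
    lambda e: 3 == e["y"] + 1,           # Eq3: 3 == y + 1
]

solution = solve(constraints, ["x", "y"], range(-10, 11))
print(solution)  # {'x': 3, 'y': 2}
```

<p>Note that nothing in the predicates says which variable to compute from which — the directionlessness discussed below is visible even in this toy version.</p><p>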
The <a href="http://voelter.de/data/pub/ase2016-inca.pdf">INCA paper</a> by Tamas Szabo, Sebastian Erdweg and yours truly explains how to do this.</p><p>Such constraints can also represent some notion of cost, for example, the needed bandwidth for a network connection. In this case, a solver is used iteratively. Czarnecki et al. <a href="https://cs.uwaterloo.ca/~vanbeek/Publications/cai17b.pdf">perform multi-objective optimisations</a> for structural and variable models. A hello-world example is also in the <a href="http://voelter.de/data/books/introToFormalMethodsAndDSLs-1.0.pdf">Formal Methods booklet</a>.</p><p>The constraints can also be used to express structural limitations (such as visibility rules in program trees). We will return to this later.</p><p>This is all very nice. But solvers have three problems. First, encoding non-trivial problems as equations can be tough. This is especially true for structures. Second, depending on the number, size and other particular properties of the equations, the required memory and/or performance can be a problem — it does get big and slow quickly, again, especially for structures. And finally, because the engine uses all kinds of heuristics and advanced algorithms to find solutions efficiently, debugging can be a real nightmare.</p><h3>State Machines and Model Checking</h3><p>Are state machines declarative? Well, I don’t know. They do encode control flow (in particular, as reactions to outside events). They also change global state (which is where their name comes from). And that state potentially evolves differently, depending on the order (and timing, in timed state-machines) of incoming events. So this pretty much makes state machines as non-declarative as you can get.</p><p>But is this really a useful definition? One reason why people use state machines is that they can be analyzed really well.
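</p><p>To hint at why: when states and transitions are explicit, first-class data, a safety check becomes a plain graph search. The little machine below — a toy door controller — and its “analysis” (simple reachability) are invented for this example; real model checkers verify far richer temporal properties.</p>

```python
from collections import deque

# A toy door controller, invented for this example.
# Transitions are explicit, first-class data: (state, event) -> next state.
transitions = {
    ("locked", "unlock"): "closed",
    ("closed", "lock"):   "locked",
    ("closed", "open"):   "opened",
    ("opened", "close"):  "closed",
}

def reachable(initial):
    """All states reachable from `initial`: a plain breadth-first
    search over the transition relation."""
    seen, queue = {initial}, deque([initial])
    while queue:
        state = queue.popleft()
        for (src, _event), dst in transitions.items():
            if src == state and dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return seen

# A (trivial) safety property: no undefined "jammed" state is reachable.
states = reachable("locked")
assert "jammed" not in states
print(sorted(states))  # ['closed', 'locked', 'opened']
```

<p>This only works because the state space is first class; the same check is hopeless against state hidden in arbitrary procedural code.</p><p>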
The whole field of temporal logic and model checking (see Section 6 in the <a href="http://voelter.de/data/books/introToFormalMethodsAndDSLs-1.0.pdf">Formal Methods booklet</a>) is about verifying properties on state machines. The reason why this works is that they make state and the way it evolves explicit using first-class language constructs. I will discuss model checking in a future post.</p><h3>Alternative Definition of Declarativeness</h3><p>Let me propose another definition of declarativeness. It might sound overly pragmatic, but it is nonetheless useful:</p><blockquote><strong>Declarativeness</strong>: A declarative language is one that makes those analyses simple/feasible that are required for the use cases of the language.</blockquote><p>This reemphasises a core idea underlying the <a href="http://voelter.de/data/books/introToFormalMethodsAndDSLs-1.0.pdf">Formal Methods booklet</a>: you should consciously design a language in a way that makes analyses simpler; usually by making the right things first class. In the rest of the post I show two examples of meta languages (i.e., languages used in language definition) that illustrate the point.</p><h3>Scopes</h3><p>In a language workbench like MPS, programs are represented as graphs. The containment structure is a tree, but there are cross-references as well. For example, a Method contains a Statement, a Statement might contain a VariableRef (a particular kind of Expression), which in turn <em>references</em> a Variable defined further up in the method. The language structure only talks about concepts: a VariableRef has a reference var that points to a Variable. But it does not specify which variables. Scopes are used to define the set of valid reference targets (Variables in this example). 
They are a way of implementing visibility rules in a language.</p><h4>Scopes as Functions</h4><p>A scope can be seen as a function with the following signature:</p><pre><strong>fun</strong> scope(<strong>node</strong>&lt;T&gt; n, Reference r): <strong>set</strong>&lt;U&gt;</pre><p>In our example, T would be VariableRef, r would represent the var reference, and U would be Variable. A naive implementation of a scope can be procedural or functional code that returns, for example, the set of all Variables in the current method (let’s ignore order and nested blocks for now) as well as the GlobalVariables in the current program. The example below encodes the particular reference for which we define the scope in the name of the function:</p><pre><strong>fun</strong> scope_VariableRef_var(node&lt;VariableRef&gt; n): <strong>set</strong>&lt;Variable&gt; = {<br>  n.ancestor&lt;Method&gt;.descendants&lt;Variable&gt; <strong>union</strong> <br>    n.ancestor&lt;Program&gt;.descendants&lt;GlobalVariable&gt;<br>}</pre><p>So far so good. Whenever the user presses Ctrl-Space to see the set of valid reference targets (or when an existing reference is validated by the type checker), the system can simply call this function and use the returned set for display in the menu (or for testing if the current target is in the set).</p><h4>Creating Targets</h4><p>Let us now introduce an additional requirement. Let’s say the user wants to reference something that does not yet exist. Since a projectional editor like MPS establishes references eagerly (as opposed to lazily resolved names), the target element <em>must actually exist</em> for us to be able to establish the reference. We can’t write down a method name (with the intention to call it), and then later fill in the method; the reference cannot be entered! To still support top-down programming, we have to be able to implicitly create missing targets.
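</p><p>Before going further, here is the naive scope function sketched in ordinary code. The toy AST classes are invented for this sketch (they are not the MPS API), and the ancestor navigation is elided by passing the enclosing Method and Program in directly:</p>

```python
from dataclasses import dataclass, field

# A hypothetical toy AST -- invented for this sketch, not the MPS API.
@dataclass
class Variable:
    name: str

@dataclass
class GlobalVariable:
    name: str

@dataclass
class Method:
    variables: list = field(default_factory=list)

@dataclass
class Program:
    globals_: list = field(default_factory=list)

def scope_variable_ref(method, program):
    """Valid targets for a VariableRef: the Variables of the enclosing
    Method united with the GlobalVariables of the Program."""
    return {v.name for v in method.variables} | {g.name for g in program.globals_}

m = Method(variables=[Variable("x"), Variable("y")])
p = Program(globals_=[GlobalVariable("g")])
print(sorted(scope_variable_ref(m, p)))  # ['g', 'x', 'y']
```

<p>As plain forward-evaluated code this is easy to run, but — as the rest of this section argues — hard to run “backwards”.</p><p>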
So if a user presses Ctrl-Space in the context of a VariableRef, in addition to choosing from existing targets (computed by the scope), the tool should provide one-click actions for creating new targets (through a quick fix or a special part in the code completion menu).</p><p>Here is the catch: the one-click target creators should create <em>valid</em> targets. So they have to somehow find out where in the program which kinds of targets would be valid, and only propose those. Extracting this information from the scope definition above (or even worse, one written in a procedural language) is really quite hard.</p><h4>Solver-based Solution</h4><p>In some sense, we have to “forward-evaluate” the scope to find the places where, if a node existed there, it would be a valid target for the reference. Solvers can do just this. If we formulate the program structure and the scopes as constraints, transform the current program to a set of equations and then ask the solver to solve them (in the right way), this would solve the problem quite elegantly. People have worked on this; see, for example, Friedrich Steimann’s work (<a href="https://www.fernuni-hagen.de/ps/veroeffentlichungen/dissertation_ulke.shtml">here</a>, <a href="https://www.fernuni-hagen.de/ps/veroeffentlichungen/SoftwareSystemModelling.shtml">here</a> and in particular, <a href="http://voelter.de/data/pub/robustProjectionEditing.pdf">here</a>). However, the performance and memory requirements, as well as the (from most developers’ perspectives) non-intuitive way of specifying the constraints, make this approach challenging in practice. Hundreds of MB or even GBs of memory for relatively small programs are common.
Currently, I cannot see how to use this in practice.</p><h4>Structure-based Declarativeness</h4><p>Consider the following scope definition:</p><pre><strong>scope</strong> <strong>for</strong> VariableRef::var {<br>  <strong>navigate</strong> <br>    <strong>container</strong> Method::statements<br>         <strong>path</strong> node.ancestor&lt;Method&gt;<br>  <strong>navigate</strong> <br>    <strong>container</strong> Program::contents<br>         <strong>path</strong> node.ancestor&lt;Program&gt;<br>           <strong>of</strong> GlobalVariable  <br>}</pre><p>This has the following structure and semantics. A scope defines which concept and which reference it is for (here: var in VariableRef). It then contains multiple separate definitions of where targets can be found. Each one specifies a container location (the statements collection in a Method or the contents of a Program) as well as a path expression describing how to get there from the perspective of the current node. The execution algorithm is as follows: for each navigate block, execute the path expression from the current node to get to the container nodes. Then grab all the nodes in the specified container slot (statements or contents), but only select those that are of the compatible concept, or those explicitly specified by the of clause.</p><p>For our second use case, the creation of missing targets, we don’t have to perform any magic: we create an action for each of the navigate blocks (so in the example, we’d get two, as expected). To execute the action, we execute <em>exactly the same path expression </em>to lead us to the container nodes, just as when we evaluate the scopes. We then create the new nodes in the slot defined by the container.</p><p>For this to work we don’t need a solver; we do not have to “forward-evaluate” (or solve) anything.
The path expressions can be as complicated as they want to be; we can filter or compute anything we want …</p><pre><strong>path</strong> node.ancestor&lt;Method&gt;.<strong>where</strong>(<br>        <strong>it</strong>.statements.size &lt; 10 &amp;&amp; !<strong>it</strong>.name.endsWith(“X”))</pre><p>… because we always only evaluate it as a functional program. And there is no performance problem: we just “execute” code. There are, of course, limitations. We cannot express more complex structural constraints. But this was not the goal. Below is a screenshot of a working example of a somewhat more complex set of scoping rules:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/684/1*PJuVbzmd2b5bt_QYClQJQg.png" /></figure><p>Even debugging such scope definitions is not a real problem, because, again, there is no real advanced magic going on: it is a mix of function evaluation and some specific (but straightforward) debug support for the declarative parts. I will revisit debugging of these and other DSLs in a future post.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*5Rv_8oSldWBFrfhAqXHLFg.png" /><figcaption>The obligatory useless picture :-)</figcaption></figure><h3>Type Systems</h3><p>Type systems are prime candidates for declarativeness. In fact, MPS itself uses type equations and a solver to compute types. Let me give an example.</p><h4>MPS Type System DSL</h4><p>Imagine you want to compute the type of the following list:</p><pre><strong>val</strong> l = list(1, 2, 23.3)</pre><p>The list contains ints and floats, and in most languages, the resulting type would be list&lt;float&gt; because float is a supertype of int. In other words, the resulting type is a list type of the least common supertype of the elements of the list.
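</p><p>Computed directly, the least common supertype is just a fold over the element types. The two-type lattice below (int as a subtype of float) and all names are stand-ins invented for this sketch:</p>

```python
from functools import reduce

# A hypothetical two-type lattice: int <: float. Each type maps to its
# direct supertype (None at the top).
direct_super = {"int": "float", "float": None}

def supertype_chain(t):
    """t and all of its supertypes, most specific first."""
    chain = []
    while t is not None:
        chain.append(t)
        t = direct_super[t]
    return chain

def common_super(a, b):
    """Least common supertype of two types: the first entry in a's
    chain that also appears in b's chain."""
    bs = supertype_chain(b)
    return next(t for t in supertype_chain(a) if t in bs)

def list_literal_type(element_types):
    """Fold common_super over the element types; wrap in a list type."""
    return "list<" + reduce(common_super, element_types) + ">"

print(list_literal_type(["int", "int", "float"]))  # list<float>
```

<p>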
Here is how MPS expresses this rule:</p><pre><strong>typing</strong> <strong>rule</strong> <strong>for</strong> ListLiteral <strong>as</strong> l {<br>  <strong>var</strong> T;<br>  <strong>foreach</strong> e <strong>in</strong> l.elements {<br>    T :&gt;=: <strong>typeof</strong>(e)<br>  }<br>  <strong>typeof</strong>(l) :==: &lt;ListType(baseType: T)&gt;<br>}</pre><p>This code first declares a type variable T. It then iterates over all elements in the list literal and <em>adds an equation</em> to the set of equations that expresses that T must be the same type or a supertype (:&gt;=:) of the type of the element e we are currently iterating over. At the end of the foreach we have created as many equations as there are elements in the list literal, all of them expressing a constraint on the variable T. Next we create another equation that expresses that the type of the list literal l (the thing we are trying to find a type for) must be the same type as (:==:) a ListType whose base type is T. All of these equations are then fed to a solver, which tries to find a solution for the value of T.</p><p>Another example. Consider a language with type inference.
When you declare a variable, it can have</p><ul><li>Only a name and a type; the type of the variable is then the explicitly given type (<strong>var</strong> x: <strong>int</strong>).</li><li>Only a name and an init value; in this case the type of the variable is inferred from the init expression (<strong>var</strong> x = 10).</li><li>A name, type and an init expression; the type is the explicitly given one, but the init expression’s type must be the same or a subtype of the explicitly given one (<strong>var</strong> x: <strong>int</strong> = 10).</li></ul><p>The typing rule that expresses this is given here:</p><pre><strong>typing</strong> <strong>rule</strong> <strong>for</strong> Variable <strong>as</strong> v {<br>  <strong>if</strong> v.type != <strong>null</strong> <strong>then</strong> <strong>typeof</strong>(v) :==: <strong>typeof</strong>(v.type)<br>                    <strong>else</strong> <strong>typeof</strong>(v) :==: <strong>typeof</strong>(v.init)<br>  <strong>if</strong> v.init != <strong>null</strong> <strong>then typeof</strong>(v.init) :&lt;=: <strong>typeof</strong>(v)<br>}</pre><p>The MPS type system language is quite powerful. For meaningfully-sized programs it performs well, mostly because it is updated incrementally. MPS tracks changes to the program and then selectively adds and removes equations from the set of equations considered by the solver. The solving itself is also incremental.</p><p>The nice thing about the MPS type system DSL is — surprise! — its declarativeness. MPS supports language extension and composition, and the type system fits in neatly: an extension language just adds additional equations (constraints) to the set of type equations considered by the solver. There’s no fuss with execution order or extension points or anything.</p><p>However, it has one major weak spot: debugging. If your type system does not work, there is a debugger. All it does, essentially, is to visualise the solver state.
If you don’t understand in detail what the solver does, this is rather useless. I have heard rumours that there is somebody in JetBrains’ MPS team who understands the debugger, but I haven’t met the guy yet :-)</p><h4>Type Systems expressed through Functions</h4><p>A much simpler way of expressing type systems is functions: you can imagine every language concept to essentially have a function typeof that returns its type. In that function you explicitly call the typeof functions of other nodes. You define the order explicitly, usually bottom-up! It runs as a simple functional program, with good performance, and debugging is easy. There are no constraints, no execution-order independence, and it is a bit harder to extend (because of the execution order). Many language workbenches, including Xtext, use this approach.</p><h4>Structure-based Declarativeness</h4><p>Let us now look at the following typing rule for the variable declaration:</p><pre><strong>typing</strong> <strong>rule</strong> <strong>for</strong> Variable <strong>as</strong> v = <br>  <strong>primary</strong> v.type :&gt;=: <strong>secondary</strong> v.init</pre><p>Here is its semantics: if a v.type is given, then it becomes the result type; the v.type wins, it is the primary. If no v.type is given, then the secondary wins, the v.init in this case. If both are given, the primary still wins, but the secondary must be the same or a subtype of the primary.</p><p>For the list literal example, the typing rule is:</p><pre><strong>typing</strong> <strong>rule</strong> <strong>for</strong> ListLiteral <strong>as</strong> l = <br>  <strong>create</strong> ListType[baseType: <strong>commonsuper</strong>(l.elements)]</pre><p>It is so obvious, I don’t even have to explain what it does.</p><p>Both of these are much shorter, more expressive, and can be evaluated without a solver.
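</p><p>To see what “evaluated without a solver” means, here is a sketch of how the primary/secondary rule for Variable might execute as a plain function. The is_subtype lattice and all names are invented for the example:</p>

```python
def is_subtype(sub, sup):
    # Hypothetical minimal lattice: every type is a subtype of itself,
    # and int is a subtype of float.
    return sub == sup or (sub == "int" and sup == "float")

def variable_type(declared, init):
    """primary v.type :>=: secondary v.init -- the declared type wins
    when present; an init type must then be the same or a subtype."""
    if declared is not None:
        if init is not None and not is_subtype(init, declared):
            raise TypeError(init + " is not a subtype of " + declared)
        return declared
    return init  # no declared type: infer from the initializer

print(variable_type("float", "int"))  # float  (var x: float = 10)
print(variable_type(None, "int"))     # int    (var x = 10)
```

<p>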
The reason for this is that we have created direct, first-class abstractions for the relevant semantics — the language is declarative!</p><p>Debugging, by the way, is straightforward: the debugger can simply “explain” what is going on: v.type is null, so taking v.init: float.</p><p>There is of course an obvious drawback as well: you have to implement more, and more specific, language abstractions, essentially one for every typing idiom you want to support (or you provide a fallback onto regular functional programming). However, we expect there to be fewer than 20 such idioms to express the vast majority of relevant type systems. Implementing those with a tool like MPS is not a big deal — certainly more feasible than implementing a fast solver!</p><h3>Wrap up</h3><p>In this post I have illustrated what I mean by declarative: a language where the analyses I am interested in are expressed first-class. I have explained the idea based on two meta languages (for scopes and type systems). Those, by the way, have been taken from a new set of language definition DSLs we are working on. Stay tuned; you will read much more about those in the future :-)</p><hr><p><a href="https://languageengineering.io/thoughts-on-declarativeness-fc4cfd4f1832">Thoughts on Declarativeness</a> was originally published in <a href="https://languageengineering.io">language engineering</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
    </channel>
</rss>