About this blog

Hi, I'm Peter Bex, a Scheme and free software enthusiast from the Netherlands. See my user page on the CHICKEN wiki or my git server for some of my projects.

The 3 most recent posts (archive) Atom feed

The European Commission has posted a "call for evidence" on open source for digital sovereignty. This seeks feedback from the public on how to reduce its dependency on software from non-EU companies through Free and Open Source Software (FOSS).

This is my response, with proper formatting (the web form replies all seem to have gotten their spaces collapsed) and for future reference.

The added value of FOSS

In times where international relations are tense, it is wise to invest in digital sovereignty. For example, recently there was a controversy surrounding the International Criminal Court losing access to e-mail hosted by Microsoft, a US company, for political reasons.

A year earlier, a faulty CrowdStrike update caused the largest IT outage in history. This was an accident, but it was a good reminder of the power that rests in foreign hands. We have to consider the possibility of a foreign government pressuring a company to issue a malicious update on purpose. This update could target only specific countries.

Bringing essential infrastructure into EU hands makes sense. But why does this have to be FOSS? For instance, the CrowdStrike incident could also have happened with FOSS.

With FOSS, one does not have to trust a single company to maintain high code quality and security. Independent security researchers and programmers will be looking at this code with a fresh perspective. It is also an industry truism that FOSS code tends to be of higher quality, simply because releasing bad code is too embarrassing.

FOSS also reduces vendor lock-in. One can switch vendors and keep using the same product when for example the vendor:

  • goes bankrupt,
  • drops support for the product,
  • drastically increases prices,
  • decides on a different direction for the product than the user wants,
  • or gets acquired by a foreign company.

Therefore, FOSS brings sovereignty by not being at the mercy of a single vendor.

Public sector and consultancies

The EU can set a good example by starting in the public sector: government EU organisations and those of the member states, as well as semi-government organisations like universities and libraries. Closed source software still reigns supreme there. Only "established" companies may apply to tenders. These often employ professionals certified in proprietary tech. This encourages vendor lock-in. The existing dependency ensures lock-in for future projects, as compatibility is often a key requirement.

These same vendors are ruthless and have repeatedly sabotaged FOSS migrations. Microsoft was involved in multiple bribery scandals in the Netherlands, Romania, Italy and Hungary, for example. There have also been allegations of illegal deals that were never investigated, such as with the LiMux project in Munich.

How the EU can help:

  • Fully commit to FOSS. Set a date by which all software used by the public sector must be FOSS and running on hardware within the EU, at fully EU-owned companies. No compromises, no excuses and no easy outs - those were the bane of previous efforts.
  • Map out missing requirements and pay EU consultancy firms to improve FOSS where it is lacking. This will also make said software more attractive for large private organisations that provide essential services in the EU.

Concrete examples:

  • Many EU and member state institutes rely on American services for hosting or securing their e-mail. E-mail software is a complete commodity, for which there are good European alternatives, based on FOSS. It should be easy to switch.
  • Workstations for public servants typically run on Windows and use Microsoft Office. Switch these to a proven open operating system like Linux and an office suite like LibreOffice.

Education and mind share

In schools, informatics is typically taught using proprietary software. This is often cloud software. Schools do not have the expertise or funds to run their own servers. Therefore, they use the easy option that teachers are familiar with: "free" online offerings from US Big Tech. Network effects ensure deeper entrenchment. Big Tech offers steep discounts for educational licenses for these exact reasons.

Vocational schools focus on proprietary tech most used in industry. This goes beyond IT studies. For example, statistics and psychology courses use SPSS over PSPP or R. Mathematics and physics courses use MATLAB over GNU Octave. Engineering courses use AutoCAD instead of FreeCAD or LibreCAD.

A focus on the impact of tech choices in education could change the situation from the ground up. In high school, there could be a place (e.g. in civic education class) to focus on the impact of tech choices on society. This goes beyond domestic versus foreign "cloud" hosting and open versus proprietary code. For example, studies show that social media can have harmful effects on mental well-being, societal cohesion and even democracy.

How the EU can help:

  • Provide funding for course material, and/or create a certification programme for suitable course material to wean schools off of Big Tech software.
  • Start an education campaign aimed at the broader public in order to explain why closed software and the non-EU cloud are harmful. For example, it could focus on concrete issues that affect anyone like data protection, privacy and resistance against "enshittification" such as unwanted ads, price hikes and feature removal.
  • For the existing work force, the EU can fund training in open alternatives so that people feel confident with these alternatives. Such training should include a theoretical component to discuss the benefits of using open alternatives to ensure people are fully on board.

Existing FOSS companies and economic situation

The EU has plenty of FOSS businesses already. A handful of examples: SUSE was one of the first companies to provide FOSS server and desktop operating systems for the enterprise. Tuta and Proton Mail provide innovative secure e-mail solutions. Nextcloud offers cloud-based content collaboration tools. GitLab and Codeberg offer code hosting platforms.

These companies are innovative and profitable, but small in the global market place. Competitors from the US benefit from economies of scale. The initial US market is a large country with a single language and minimal legislation. This allows for quick domestic growth followed by global expansion. The EU market is more fragmented so it is harder to gain a foothold, requiring more up front investment to e.g. support the languages spoken in the EU.

Venture capital is also less likely to invest in the EU because of stricter legislation. Because FOSS solutions give competing companies a chance to offer the product, the returns on investment are lower than with proprietary software where a single company has a monopoly on the software.

Some EU companies have realised that this legislation is an asset: it allows for differentiation from US-based offerings. EU software can compete in the global market place on its own merits.

How the EU can help:

  • Promote tech sovereignty to countries across the world. Start with countries who are not formally allied to the US. This could help EU companies to expand into the global market.
  • Help EU companies become more well-known by organising trade shows exhibiting only FOSS EU companies.
  • Provide funding to organisations like the FSF Europe to run awareness campaigns about FOSS alternatives.
  • Perhaps controversial: heavily tax proprietary, non-EU software or provide tax breaks for FOSS EU software to level the playing field.
  • Even more controversially: prevent foreign-owned companies from operating data centers in the EU. Make it as hard as possible for them to offer high-speed cloud software here. These data centers are already unpopular, as they use precious water and land, and they only make foreign companies more powerful.

Conclusion

The reasons for dependency on foreign proprietary solutions are systemic. The causes are various: from inertia and ignorance to market effects and bribery. The solutions must be equally systemic: from education to policy and funding, all points must be attacked in order to succeed. This is the only way we can get rid of our dependency on non-EU software.


I feel a change is happening in how people produce and (want to) consume software, and I want to give my two cents on the matter.

It has become more mainstream to see people critical of "Big Tech". Enshittification has become a familiar term even outside the geek community. Obnoxious "AI" features that nobody asked for get crammed into products. Software that spies on its users is awfully common. Software updates have started crippling existing features, or have deliberately stopped being available, so more new devices can be sold. Finally, it is increasingly common to get obnoxious ads shoved in your face, even in software you have already paid for.

In short, it has become hard to really trust software. It often does not act in the user's best interest. At the same time, we are entrusting software with more and more of our lives.

Thankfully, new projects are springing up which are using a different governance model. Instead of a for-profit commercial business, there is a non-profit backing them. Some examples of more or less popular projects:

Some of these are older projects, but there seems to be something in the air that is causing more projects to move to non-profit governance, and for people to choose these.

As I was preparing this article, I saw an announcement that Ghostty now has a non-profit organisation behind it. At the same time, I see more reports from developers leaving GitHub for Codeberg, and in the mainstream more and more people are switching to Signal.

Why free and open source software is not enough

From a user perspective, free and open source software (FOSS) has advantages over proprietary software. For instance, you can study the code to see what it does. This alone can deter manufacturers from putting in user-hostile features. You can also remove or change what you dislike or add features you would like to see. If you are unable to code, you can usually find someone else to do it for you.

Unfortunately, this is not enough. Simply having the ability to see and change the code does not help when the program is a web service. Network effects will ensure that the "main instance" is the only viable place to use this; you have all your data there, and all your friends are there. And hosting the software yourself is hard for non-technical people. Even highly technical people often find it too much of a hassle.

Also, code can be very complex! Often, only the team behind it can realistically develop it further. This means you can run it yourself, but are still dependent on the manufacturer for the direction of the product. This is how you get, for example, AI features in GitLab and ads in Ubuntu Linux. One can technically remove or disable those features, but it is hard to keep such a modified version (a fork) up to date with the manufacturer's more desirable changes.

The reason is that the companies creating these products are still motivated by profit and increasing shareholder value. As long as the product still provides (enough) value, users will put up with misfeatures. The (perceived) cost of switching is too high.

Non-profit is not a panacea

Let us say a non-profit is behind the software. It is available under a 100% FOSS license. Then there are still ways things can go downhill. I think this happens most commonly if the funding is not in order.

For example, Mozilla is often criticised for receiving funding from Google. In return, it uses Google as the default search. To make it less dependent on Google, Mozilla acquired Pocket and integrated it into the browser. It also added ads on the home screen. Both of these actions have also been criticised. I do not want to pick on Mozilla (I use Firefox every day). It has clearly been struggling to make ends meet in a way that is consistent with its goals and values.

I think the biggest risk factor is (ironically) if the non-profit does not have a sustainable business model and has to rely on funding from other groups. This can compromise the vision, like in Mozilla's case. For web software, the obvious business model is a SaaS platform that offers the software. This allows the non-profit to make money from the convenience of not having to administer it yourself.

There is another, probably even better, way to ensure the non-profit will make good decisions. If the organisation is democratically led and open for anyone to become a member like Codeberg e.V. is, it can be steered by the very users it serves. This means there is no top-down leadership that may make questionable decisions. Many thanks to Technomancy for pointing this out.

What about volunteer driven efforts?

Ah, good old volunteer driven FOSS. Personally, I prefer using such software in general. There is no profit motive in sight and the developers are just scratching their own itch. Nobody is focused on growth and attracting more customers. Instead, the software does only what it has to do with a minimum of fuss.

I love that aspect, but it is also a problem. Developers often do not care about ease of use for beginners. Software like this is often a power tool for power users, with lots of sharp edges. Perfect for developers, not so much for the general public.

More importantly, volunteer driven FOSS has other limits. Developer burn-out happens more than we would like to admit, and for-profit companies tend to strip-mine the commons.

There are some solutions available for volunteer-driven projects. For example, Clojurists Together, thanks.dev, the Apache Foundation, the Software Freedom Conservancy and NLnet all financially support volunteer-driven projects. But it is not easy to apply to these, and volunteer-driven projects are often simply not organised in a way that allows them to receive money.

Conclusion

With a non-profit organisation employing the maintainers of a project, there is more guarantee of continuity. It also can ensure that the "boring" but important work gets done. Good interface design, documentation, customer support. All that good stuff. If there are paying users, I expect that you get some of the benefits of corporate-driven software and less of the drawbacks.

That is why I believe these types of projects will be the go-to source for sustainable, trustworthy software for end-users. I think it is important to increase awareness about such projects. They offer alternatives to Big Tech software that are palatable to non-technical users.


NOTE: This is another guest post by Felix Winkelmann, the founder and one of the current maintainers of CHICKEN Scheme.

Introduction

Hi! This post is about a new project of mine, called "CRUNCH", a compiler for a statically typed subset of the programming language Scheme, specifically, the R7RS (small) standard.

The compiler runs on top of the CHICKEN Scheme system and produces portable C99 that can then be compiled and executed on any platform that has a decent C compiler.

So, why another Scheme implementation, considering that there already exists such a large number of interpreters and compilers for this language? What motivated me was the emergence of the PreScheme restoration project, a modernisation of "PreScheme", a statically typed compiler for Scheme that is used in the Scheme48 implementation. The original PreScheme was embedded into S48 and was used to generate the virtual machine that is targeted by the latter system. Andrew Whatson courageously started a project to port PreScheme to modern R7RS Scheme systems (PreScheme is written in Scheme, of course) with the intention of extending it and keeping the quite sophisticated and interesting compiler alive.

The announcement of the project and some of the reactions that it spawned made me realize that there seems to be a genuine demand for a statically typed high-performance compiler for Scheme (even if just for a subset) that would close a gap in the spectrum of Scheme systems currently available.

There are compilers and interpreters for all sorts of platforms, ranging from tiny, minimal interpreters to state-of-the-art compilers, targeting about every imaginable computer system. But most Schemes need relatively complex runtime systems, have numerous dependencies, or have slow performance, which is simply due to the powerful semantics of the language: dynamic typing, automatic memory management (garbage collection), first class continuations, etc. which all have a cost in terms of overhead.

What is needed is a small, portable compiler that generates more or less "natural" C code with minimal dependencies and runtime system that supports at least the basic constructs of the language and that puts an emphasis on producing efficient code, even if some of the more powerful features of Scheme are not available. Such a system would be perfect for writing games, virtual machines, or performance-sensitive libraries for other programs where you still want to use a high-level language to master the task of implementing complex algorithms, while keeping as close to C/C++ as possible. Another use is as a tool to write bare-metal code for embedded systems, device drivers and kernels for operating systems.

There are some high-performance compilers like Bigloo or Stalin. But the former still needs a non-trivial runtime-system and the latter is brittle and not actively maintained. Also, one doesn't necessarily need support for the full Scheme language and if one is willing to drop the requirement of dynamic typing, a lot of performance can be gained while still having a relatively simple compiler implementation. Even without continuations, dynamic typing, the full numeric tower and general tail call optimization, the powerful metaprogramming facilities of Scheme and the clear and simple syntax make it a useful notation for many uses that require a high level of abstraction. Using type inference mostly avoids having to annotate a source program with type information and thus allows creating code which still is to a large part standard Scheme code that can (with a little care) be tested on a normal Scheme system before compiling it to more efficient native code.

History

There was a previous extension for CHICKEN, also called "crunch", that compiled to C++, used a somewhat improvised type-inferencing algorithm and was severely restricted. It was used to allow embedding statically typed code into normal CHICKEN Scheme programs. The new CRUNCH picks up this specific way of use, but is a complete reimplementation that targets C99, has a more sophisticated type system, offers some powerful optimizations and has the option to create standalone programs or separately compilable C modules.

Installation

CRUNCH is only available for the new major release of CHICKEN (version 6). You will need to build and install a development snapshot containing the sources of this release, which is still unofficial and under development:

 $ wget https://code.call-cc.org/dev-snapshots/2024/12/09/chicken-6.0.0pre1.tar.gz
 $ tar xfz chicken-6.0.0pre1.tar.gz
 $ cd chicken-6.0.0pre1
 $ ./configure --prefix <install location>
 $ make
 $ make install
 $ <install location>/bin/chicken-install -test crunch

CHICKEN has minimal dependencies (a C compiler, sh(1) and GNU make(1)), so don't hesitate to give it a try.

Basic Operation and Usage

CRUNCH can be used as a batch compiler, translating Scheme to standalone C programs, or at compile time for embedded fragments of Scheme code, automatically creating the necessary glue to use the compiled code from CHICKEN Scheme. The compiler itself is also exposed as a library function, making various scenarios possible where you want to programmatically convert Scheme into native code.

There are four modes of using CRUNCH:

1. Embedding:

;; embed compiled code into Scheme (called using the foreign function interface):
(import crunch)
(crunch
  (define (stuff arg) ...) )
(stuff 123)

2. Standalone:

 $ cat hello.scm
 (define (main) (display "Hello world\n"))
 $ chicken-crunch hello.scm -o hello.c
 $ cc hello.c $(chicken-crunch -cflags -libs)
 $ ./a.out

3. Wrap compiled code in Scheme stubs to use it from CHICKEN:

 $ cat fast-stuff.scm
 (module fast-stuff (do-something)
   (import (scheme base))
   (define (do-something) ...))

 $ cat use-fast-stuff.scm
 (import fast-stuff)
 (do-something)

 $ chicken-crunch -emit-wrappers wrap.scm -J fast-stuff.scm -o fast-stuff.c
 $ csc -s wrap.scm fast-stuff.c -o wrap.so
 $ csc use-fast-stuff.scm -o a.out

4. Using CRUNCH as a library:

#;1> (import (crunch compiler))
#;2> (crunch
       '(begin (define (main) (display "Hello world\n")))
       '(output-file "out.c") )

Module system and integration into CHICKEN

CRUNCH uses the module system and syntactic metaprogramming facilities of CHICKEN. Syntax defined in CHICKEN modules can be used in CRUNCH code and vice versa. CRUNCHed code can produce "import libraries", as in CHICKEN, to provide separate compilation of modules.

Modules compiled by CRUNCH may only export procedures and a standalone program is expected to export a procedure called main. This simplifies interfacing to C and makes callbacks from C into Scheme straightforward.

As in PreScheme, toplevel code is evaluated at compile time. Most assigned values can be accessed in compiled code.

;; build a table of sine values at compile time
(define sines
  (list->f64vector
    (list-tabulate 360
      (lambda (n) (sin (/ (* n π) 180))) ) ) )

Restrictions

A number of significant restrictions apply to Scheme code compiled with CRUNCH:

  • No support for multiple values
  • No support for first class continuations
  • Tail calls can only be optimized into loops for local procedure calls or calls that can be inlined
  • Closures (procedures capturing free variables) are not supported
  • Procedures cannot have a "rest" argument
  • Imported global variables cannot be modified
  • Currently only 2-argument arithmetic and comparison operators are supported
  • It must be possible to eliminate all free variables via inlining and lambda-lifting

This list looks quite severe, but it should be noted that a large amount of idiomatic Scheme code can still be compiled that way. Also, CRUNCH does not attempt to be a perfect replacement for a traditional Scheme system; it merely tries to provide an efficient programming system for domains where performance and interoperability with native code are of high importance.

Datums are restricted to the following types:

  • basic types: integer, float, complex, boolean, char, pointer
  • procedure types
  • strings
  • vectors of any of the basic types, and vectors for specific numeric types
  • structs and unions

Note the absence of pairs, lists and symbols. Structures and unions are representations of the equivalent C object and can be passed by value or by pointer.

The Runtime System

The runtime system required to run compiled code is minimal and contained in a single C header file. CRUNCH supports UNICODE and the code for UNICODE-aware case conversions and some other non-trivial operations is provided in a separate C file. UNICODE support is optional and can be disabled.

No garbage collector is needed. Non-atomic data like strings and vectors are managed using reference counting without any precautions taken to avoid circular data, which is something that is unlikely to happen by accident with the data types currently supported.
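
To make the reference counting idea concrete, here is a minimal hand-written C sketch of a counted vector of doubles. It is not taken from the CRUNCH runtime: the struct layout and the names f64vector_new, f64vector_retain and f64vector_release are hypothetical and only illustrate the general technique.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical layout: the reference count lives next to the data. */
typedef struct {
    int refcount;    /* number of live references */
    size_t length;   /* number of elements */
    double *items;   /* element storage */
} f64vector;

/* Allocate a zero-filled vector owned by one reference; NULL on failure. */
static f64vector *f64vector_new(size_t length) {
    f64vector *v = malloc(sizeof *v);
    if (v == NULL) return NULL;
    v->items = calloc(length ? length : 1, sizeof *v->items);
    if (v->items == NULL) { free(v); return NULL; }
    v->refcount = 1;
    v->length = length;
    return v;
}

/* Take an additional reference to the same vector. */
static f64vector *f64vector_retain(f64vector *v) {
    assert(v != NULL && v->refcount > 0);
    v->refcount++;
    return v;
}

/* Drop one reference; the storage is freed when the last one goes away.
   Returns 1 if the vector was freed, 0 otherwise. */
static int f64vector_release(f64vector *v) {
    assert(v != NULL && v->refcount > 0);
    if (--v->refcount > 0) return 0;
    free(v->items);
    free(v);
    return 1;
}
```

Since a vector of doubles cannot hold a reference to another object, a cycle cannot form here, which matches the remark above that circular data is unlikely with the currently supported types.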

Optimizations

CRUNCH provides a small number of powerful optimizations to ensure decent performance and to allow more or less idiomatic Scheme code to be compiled. The type system is not fully polymorphic, but allows overloading of many standard procedures to handle generic operations that accept a number of different argument types. Additionally, a "monomorphization" optimization is provided that clones user procedures that are called with different argument types. Standard procedures that accept procedures are often expanded inline which further increases the opportunities for inlining of procedure calls - this reduces the chance of having "free" variables, which the compiler must be able to eliminate as it doesn't support closures. Aggressively moving lexically bound variables to toplevel (making them globals) can further reduce the amount of free variables.

Procedures that are called only once are inlined at the call site ("integrated"). Fully general inlining is not supported; we leave that to the C compiler. Integrated procedures that call themselves recursively in tail position are turned into loops.
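
As an illustration of what turning a tail self-call into a loop means, the following C sketch is a hand translation of a small accumulator-style procedure. The procedure sum-to and the C shape are assumptions made up for this example; the code CRUNCH actually emits will look different.

```c
#include <assert.h>

/* Scheme source being modelled (hypothetical example):
   (define (sum-to n acc)
     (if (= n 0) acc (sum-to (- n 1) (+ acc n))))  */
static long sum_to(long n, long acc) {
    assert(n >= 0);                /* the recursion only terminates for n >= 0 */
    for (;;) {
        if (n == 0) return acc;
        long next_acc = acc + n;   /* evaluate both new arguments first ... */
        long next_n = n - 1;
        acc = next_acc;            /* ... then rebind them, as a call would */
        n = next_n;
    }
}
```

The tail call becomes a jump back to the top of the loop with the parameters rebound, so the stack does not grow no matter how large n is.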

A crucial transformation to eliminate free variables is "lambda lifting", which passes free variables as extra arguments to procedures that do not escape and whose argument list can be modified by the compiler without interfering with user code:

(let ((x 123))
  ; ... more code ...
  (define (foo y) (+ x y))
  ; ... more code ...
  (foo 99) )

  ~>

(let ((x 123))
  ; ... more code ...
  (define (foo y x) (+ x y))
  ; ... more code ...
  (foo 99 x) )
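
C has no nested procedures, so the lifted form maps onto it directly. The sketch below is a hand-written C counterpart of the transformed code, with lifted_example as a hypothetical name for the enclosing scope; it is not actual compiler output.

```c
#include <assert.h>

/* foo after lifting: x, formerly a free variable, is now an ordinary parameter. */
static int foo(int y, int x) {
    return x + y;
}

/* Counterpart of the enclosing let: x is a plain local that is passed explicitly. */
static int lifted_example(void) {
    int x = 123;
    int result = foo(99, x);
    assert(result == 222);   /* 123 + 99 */
    return result;
}
```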

Monomorphization duplicates procedures called with arguments of (potentially) different types:

(define (inc x) (+ x 1))
(foo (inc 123) (inc 99.0))

~>

;; a "variant" represents several instantiations of the original procedure
(define inc
  (%variant
    (lambda (x'int) (+ x'int 1))       ; "+" will be specialized to integer
    (lambda (x'float) (+ x'float 1)))) ; ... and here to float
(foo (inc'int 123) (inc'float 99.0))
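
On the C side, the two instantiations would plausibly end up as two separate functions with concrete parameter types. The following is a hand-written sketch under that assumption; the names inc_int and inc_float, the choice of long and double, and the body of foo are all invented for illustration.

```c
#include <assert.h>

/* One C function per instantiation of inc. */
static long inc_int(long x) { return x + 1; }            /* integer "+" */
static double inc_float(double x) { return x + 1.0; }    /* float "+" */

/* Hypothetical stand-in for the caller foo in the Scheme example. */
static double foo(long a, double b) { return (double)a + b; }

static double mono_example(void) {
    double r = foo(inc_int(123), inc_float(99.0));
    assert(r == 224.0);   /* 124 + 100.0 */
    return r;
}
```

Each call site refers to exactly one variant, so no type dispatch is left at run time.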

Certain higher-order primitives are expanded inline:

(vector-for-each
  (lambda (x) ...)
  v )

~>   ; (roughly)

(let loop ((i 0))
  (unless (>= i (vector-length v))
    (let ((x (vector-ref v i))) ... (loop (+ i 1))) ) )
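
In C terms, the expanded loop corresponds to an ordinary counted loop with the body of the lambda placed inline. The sketch below assumes a body that adds each element to a running total; sum_f64 is a hypothetical name and the code is hand-written, not compiler output.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the elements of v; the loop body plays the role of the inlined lambda. */
static double sum_f64(const double *v, size_t length) {
    assert(v != NULL || length == 0);
    double total = 0.0;
    for (size_t i = 0; i < length; i++) {
        double x = v[i];   /* the element bound to the lambda parameter x */
        total += x;        /* the lambda body, with no call involved */
    }
    return total;
}
```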

A final pass removes unused variables and procedure arguments and code that has no side effects and has unused results.

Together these transformations can get you far enough to write relatively complex Scheme programs while ensuring the generated C code is tight, and with a little effort, easy to understand (in case you need to verify the translation) and (hopefully) does what it is intended to do.

Performance

Code compiled with CRUNCH should be equivalent to a straightforward translation of the Scheme code to C. Scalar values are neither tagged nor boxed and are represented with the most fitting underlying C type. There is no extra overhead introduced by the translation, with the following exceptions:

  • Vector and string accesses perform bounds checks (these can be disabled)
  • Using vectors and strings will add some reference counting overhead
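
To give an idea of the cost of a bounds check, here is a small C sketch of checked element access. It uses the standard assert macro, which is compiled out when NDEBUG is defined; this stands in for whatever switch CRUNCH itself provides, which is not named here. The function names f64_ref and f64_set are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Checked read: one comparison per access unless NDEBUG is defined. */
static double f64_ref(const double *items, size_t length, size_t i) {
    assert(i < length);
    return items[i];
}

/* Checked write, with the same single comparison. */
static void f64_set(double *items, size_t length, size_t i, double value) {
    assert(i < length);
    items[i] = value;
}
```

Compiling with -DNDEBUG removes both checks, leaving plain array accesses.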

If you study the generated code, you will encounter many useless variable assignments and some unused values in statement position; the C compiler removes these. Unexported procedures are declared static, so the C compiler can very often inline them as well, leading to little or no overhead.

The Debugger

For analyzing type errors, a static debugger with a graphical user interface is included. When the -debug option is given, a Tcl/Tk script is invoked in a subprocess that shows the internal node tree and can be used to examine the transformed code and the types of sub-expressions, together with the corresponding source code line (if available). Should the compilation abort with an error, the shown node tree is the state of the program at the point where the error occurred.

Differences to PreScheme

CRUNCH is inspired by and very similar to PreScheme, but has a number of noteworthy differences. CRUNCH tries to be as conformant to R7RS (small) as possible and handles UNICODE characters and strings. It also is tightly integrated into CHICKEN, allowing nearly seamless embedding of high-performance code sections. Macros and top-level code can take full advantage of the full CHICKEN Scheme language and its ecosystem of extension libraries.

PreScheme supports multiple values, while CRUNCH currently does not.

PreScheme uses explicit allocation and deallocation for compound data objects, while CRUNCH utilizes reference counting, removing the need to manually clean up resources.

I'm not too familiar with the PreScheme compiler itself, but I assume it provides more sophisticated optimizations, as it does convert to Static Single Assignment form (SSA), so it is to be expected that the effort to optimise the code is quite high. On the other hand, modern C compilers already provide a multitude of powerful optimizations, so it is not clear how many advantages lower-level optimizations will bring.

Future Plans

There is a lot of room for improvement. Support for multiple values would be nice, and not too hard to implement, but will need to follow a convention that should not be too awkward to use on the C side. Also, the support for optional arguments is currently quite weak; the ability to specify default values is something that needs to be added.

Primitives for many POSIX libc system calls and library functions should be straightforward to use in CRUNCH code, at least the operations provided by the (chicken file posix) module.

What would be particularly nice would be if the compiler detects state machines - mutually recursive procedures that call each other in tail position.
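
To sketch what such a transformation could produce, the C code below folds two mutually tail-recursive procedures into a single loop with an explicit state variable. CRUNCH does not do this today; the even?/odd? pair and all C names are assumptions chosen only to show the idea.

```c
#include <assert.h>
#include <stdbool.h>

/* Scheme source being modelled (hypothetical example):
   (define (even? n) (if (= n 0) #t (odd? (- n 1))))
   (define (odd? n)  (if (= n 0) #f (even? (- n 1))))  */
enum state { STATE_EVEN, STATE_ODD };

/* Each procedure becomes a case; each tail call becomes a state change. */
static bool run_parity_machine(enum state start, unsigned long n) {
    assert(start == STATE_EVEN || start == STATE_ODD);
    enum state s = start;
    for (;;) {
        switch (s) {
        case STATE_EVEN:
            if (n == 0) return true;
            n--;
            s = STATE_ODD;    /* tail call to odd? */
            break;
        case STATE_ODD:
            if (n == 0) return false;
            n--;
            s = STATE_EVEN;   /* tail call to even? */
            break;
        }
    }
}

static bool is_even(unsigned long n) { return run_parity_machine(STATE_EVEN, n); }
static bool is_odd(unsigned long n) { return run_parity_machine(STATE_ODD, n); }
```

The loop uses constant stack space, whereas a naive translation into two C functions calling each other would rely on the C compiler to eliminate the calls.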

Other targets are possible, like GPUs. I don't know anything about that, so if you are interested and think you can contribute, please don't hesitate to contact me.

Disclaimer

CRUNCH is currently alpha-software. It certainly contains numerous bugs and shortcomings that will hopefully be found and corrected as the compiler is used. If you are interested, I invite you to give it a try. Contact me directly or join the #chicken IRC channel on Libera.chat, if you have questions, want to report bugs, if you would like to suggest improvements or if you just want to know more about it.

All feedback is very welcome!

The CRUNCH manual can be found at the CHICKEN wiki, the source code repository is here.


Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution 3.0 License. All code fragments on this site are hereby put in the public domain.