<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by Arjun Dhawan on Medium]]></title>
        <description><![CDATA[Stories by Arjun Dhawan on Medium]]></description>
        <link>https://medium.com/@arjun.dhawan?source=rss-ec31f55f44d9------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/2*eGvSysLFKevarHw-I35fBw.jpeg</url>
            <title>Stories by Arjun Dhawan on Medium</title>
            <link>https://medium.com/@arjun.dhawan?source=rss-ec31f55f44d9------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Wed, 15 Apr 2026 15:09:59 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@arjun.dhawan/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[KafkaGoSaur: a WebAssembly powered Kafka client]]></title>
            <link>https://medium.com/swlh/kafkagosaur-eac3c063388?source=rss-ec31f55f44d9------2</link>
            <guid isPermaLink="false">https://medium.com/p/eac3c063388</guid>
            <category><![CDATA[go]]></category>
            <category><![CDATA[deno]]></category>
            <category><![CDATA[golang]]></category>
            <category><![CDATA[webassembly]]></category>
            <category><![CDATA[kafka]]></category>
            <dc:creator><![CDATA[Arjun Dhawan]]></dc:creator>
            <pubDate>Mon, 14 Mar 2022 08:01:56 GMT</pubDate>
            <atom:updated>2022-03-18T15:47:58.580Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/256/1*5VMoieIVSID_Vl7rpejr2Q.png" /></figure><h4>A new Kafka client for Deno built using WebAssembly</h4><p><a href="https://github.com/arjun-1/kafkagosaur"><em>KafkaGoSaur</em></a><em> is a new Kafka client for Deno built with WebAssembly on top of </em><a href="https://github.com/segmentio/kafka-go"><em>kafka-go</em></a>, <em>the excellent Kafka client library written for Go. This article explains the basic usage of KafkaGoSaur and shines a light on some of its inner workings.</em></p><h4><strong>kafka-go</strong></h4><p><a href="https://go.dev/">Go</a> is a minimal yet powerful language. Its simplicity has driven its adoption in recent years by startups and enterprises alike, as it lets teams build scalable, performant software fast. A useful standard library, modern tooling, and high-quality third-party libraries make it one of the <a href="https://insights.stackoverflow.com/survey/2021#most-loved-dreaded-and-wanted-language-want">most wanted</a> languages to work with.</p><p>One of these third-party libraries is kafka-go, an efficient and simple-to-use Kafka client developed by <a href="https://segment.com/">Segment</a>. It features both a low- and high-level API.</p><h4>Deno</h4><p>Lesser known is <a href="https://deno.land/">Deno</a>, a modern runtime for JavaScript and TypeScript focusing on a great developer experience. Created by Ryan Dahl, it fixes long-standing issues and <a href="https://www.youtube.com/watch?v=M3BM9TB-8yA&amp;feature=youtu.be">regrets</a> that were introduced when he built Node.js. Deno is web-compatible wherever possible, meaning it runs WebAssembly binaries out of the box!</p><h4>WebAssembly</h4><p><a href="https://webassembly.org/">WebAssembly</a> (WASM) is a binary instruction format that serves as a universal compilation target. 
In simple terms, it allows code from almost any language to be run in the browser or any compatible environment like Deno.</p><p>Mix all three technologies by compiling kafka-go to a WebAssembly binary and you get KafkaGoSaur: a new Kafka client for Deno that is ready to go.</p><h3>Usage</h3><h4>Producing</h4><p>Producing a message is simple. Message values are binary encoded and are produced in batches using <a href="https://doc.deno.land/https://deno.land/x/kafkagosaur/writer.ts/~/KafkaWriter#writeMessages">writeMessages</a>:</p><pre>import KafkaGoSaur from &quot;https://deno.land/x/kafkagosaur/mod.ts&quot;;</pre><pre>const kafkaGoSaur = new KafkaGoSaur();<br>const writer = await kafkaGoSaur.createWriter({<br>  broker: &quot;localhost:9092&quot;,<br>  topic: &quot;test-0&quot;,<br>});</pre><pre>const encoder = new TextEncoder();<br>const messages = [{ value: encoder.encode(&quot;Hello!&quot;) }];</pre><pre>await writer.writeMessages(messages);</pre><h4>Consuming</h4><p>Messages are consumed one by one using <a href="https://doc.deno.land/https://deno.land/x/kafkagosaur/reader.ts/~/KafkaReader#readMessage">readMessage</a>:</p><pre>import KafkaGoSaur from &quot;https://deno.land/x/kafkagosaur/mod.ts&quot;;</pre><pre>const kafkaGoSaur = new KafkaGoSaur();<br>const reader = await kafkaGoSaur.createReader({<br>  brokers: [&quot;localhost:9092&quot;],<br>  topic: &quot;test-0&quot;,<br>});<br><br>const message = await reader.readMessage();</pre><h3>WebAssembly</h3><p>Can we use WebAssembly to port kafka-go to any language or runtime other than Deno? In principle, yes. But WebAssembly comes with a limitation stemming from the browser environment. Kafka communicates using <a href="https://en.wikipedia.org/wiki/Transmission_Control_Protocol">TCP</a>, which is not supported by browsers. 
Even though browsers support <a href="https://developer.mozilla.org/en-US/docs/Web/API/WebSockets_API">WebSockets</a>, this web equivalent is not directly supported by Kafka brokers.</p><p>That’s why KafkaGoSaur exposes the TCP functionality of its host, the Deno runtime, to kafka-go. Go exchanges the needed functions with the Deno runtime through the <a href="https://developer.mozilla.org/en-US/docs/Glossary/Global_object">global object</a> using <a href="https://pkg.go.dev/syscall/js">syscall/js</a>.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*9ZGeT_yyaRIsGXcvT1z2_w.png" /><figcaption>Example of functions exchanged in WebAssembly in KafkaGoSaur.</figcaption></figure><p>In essence, the exchange of functions is what happens when a new KafkaGoSaur instance is created. The constructor of KafkaGoSaur runs the WebAssembly binary that makes the API of kafka-go available in Deno.</p><p>KafkaGoSaur can use two different socket implementations for TCP: Deno’s <a href="https://deno.com/deploy/docs/runtime-sockets">Socket API</a> (Deno.connect) or the <a href="https://doc.deno.land/https://deno.land/std/node/net.ts">net module</a> (createConnection) from the Node.js compatibility layer. They are used to construct a DialFunc: a kafka-go function to create a <a href="https://pkg.go.dev/net#Conn">net.Conn</a>. By default, the Node.js implementation is used, but switching is easy. Just specify the one you want to use when creating the reader or writer:</p><pre>const reader = await kafkaGoSaur.createReader({<br>  brokers: [&quot;localhost:9092&quot;],<br>  topic: &quot;test-0&quot;,<br>  dialBackend: DialBackend.Node<br>});</pre><p>There is one limitation for DialBackend.Deno: producing messages is not supported yet. Somewhere in the implementation of Deno.connect there seems to be a bug that causes broken pipe errors. 
The issue is being investigated.</p><h3>Promises and goroutines</h3><p>While Go achieves concurrency through <a href="https://gobyexample.com/goroutines">goroutines</a>, which are multiplexed over multiple threads, concurrency in JavaScript is modeled on a single-threaded <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/EventLoop">event loop</a> using <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise">Promises</a>. That makes concurrency in JavaScript and Go inherently different.</p><p>Promises created in Deno need to be awaited in Go. That is done by sending their resolved values into a <a href="https://go.dev/tour/concurrency/2">channel</a>, which inherently blocks the current goroutine. Any function defined in Go can be invoked from JavaScript by wrapping it using <a href="https://pkg.go.dev/syscall/js#FuncOf">js.FuncOf</a>, turning it into a regular JavaScript value. Invoking the wrapped function in JavaScript affects the execution model in both languages:</p><ol><li>The event loop of the JavaScript runtime gets paused.</li><li>A new goroutine is spawned, executing the Go function.</li><li>The event loop resumes when this function returns.</li></ol><p>But there is a catch. Any other function wrapped using js.FuncOf will be executed on the very same new goroutine. This poses a problem when js.FuncOf wraps a Go function that calls (and awaits) a blocking JavaScript API. An example of such an API would be fetch or a read on a TCP connection. These APIs are <em>asynchronous</em>, meaning their return values resolve not immediately but at some later moment. 
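A minimal TypeScript sketch of this asynchronous behavior, with a hypothetical `blockingWork` function standing in for work (such as a TCP read) done outside the event loop:

```typescript
// Illustrative sketch only, not KafkaGoSaur code. `blockingWork` is a
// hypothetical stand-in for a computation that would otherwise block.
function blockingWork(input: string): string {
  return input.toUpperCase(); // pretend this blocks for a while
}

// Wrapping it so the call returns a Promise immediately: the event loop
// stays free, and the result is delivered on a later event-loop turn,
// much like the extra goroutine described in the text.
function wrapped(input: string): Promise<string> {
  return new Promise((resolve) => {
    // setTimeout(..., 0) defers the work to a future turn of the loop.
    setTimeout(() => resolve(blockingWork(input)), 0);
  });
}

wrapped("hello").then((result) => console.log(result)); // prints "HELLO"
```

The call returns at once with a pending Promise; only when the event loop gets around to the deferred callback does the Promise resolve.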
Asynchronous functions in JavaScript rely on the event loop to process their return value whenever it resolves.</p><p>Thus when the event loop is explicitly paused by the invocation of a wrapped function, execution ends up in a deadlock: the function defined in Go relies on the (never occurring) resumption of the event loop in order to return.</p><p>That’s why it is the responsibility of the caller of js.FuncOf to start a new goroutine to wrap any blocking function. This allows the wrapped Go function to return immediately with a Promise. This immediate return resumes the event loop so it can process the Promise when it resolves. Take a look at the <a href="https://github.com/arjun-1/kafkagosaur/blob/master/src/interop/promise.go">interop</a> package to see how the functions NewPromise and Await respectively wrap blocking functions and await JavaScript Promises in Go.</p><h3>Performance</h3><p>KafkaGoSaur can write in batches nearly as fast as kafka-go, but reading suffers roughly a 50% performance penalty:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/972/1*kGmMKa2EsRYShWhcaweA6g.png" /><figcaption>Read and write performance comparison. Tested on a <a href="https://docs.confluent.io/cloud/current/clusters/cluster-types.html#basic-clusters">Confluent Cloud Basic cluster</a>.</figcaption></figure><p>The <a href="https://doc.deno.land/https://deno.land/x/kafkagosaur@v0.0.6/reader.ts/~/KafkaReader#stats">stats</a> function (backed by its <a href="https://pkg.go.dev/github.com/segmentio/kafka-go#Reader.Stats">Stats</a> counterpart) reports the so-called wait and read times and sheds light on why this happens. The wait time is the time spent waiting for a batch of messages to arrive. The read time is the time it takes to read all the messages from this binary response. Kafka-go spends on average 15 ms waiting for a batch and 160 ms reading its messages. For KafkaGoSaur this is 20 ms waiting and 360 ms reading. 
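Putting those reported figures side by side makes the shape of the penalty clear (a quick back-of-the-envelope comparison using the numbers above):

```typescript
// Per-batch timings reported by the stats function, in milliseconds.
const kafkaGo = { wait: 15, read: 160 };
const kafkaGoSaur = { wait: 20, read: 360 };

const totalGo = kafkaGo.wait + kafkaGo.read;           // 175 ms per batch
const totalSaur = kafkaGoSaur.wait + kafkaGoSaur.read; // 380 ms per batch

// Waiting grows by a third, while reading more than doubles; the read
// step therefore dominates the overall gap.
console.log(totalSaur / totalGo);             // ~2.17x slower per batch
console.log(kafkaGoSaur.read / kafkaGo.read); // 2.25x slower reading
```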
Thus most of the performance penalty is incurred when KafkaGoSaur parses messages from the already fetched batch response.</p><h3>Next steps</h3><p>The exact cause of the read performance degradation still needs to be uncovered. Luckily, the dual low- and high-level API of kafka-go offers a potential solution. If the WebAssembly-compiled function that reads messages from the batch response performs poorly, the same functionality can be reimplemented directly in Deno by making use of the low-level (but still performant) batch fetching.</p><p>Even though <a href="https://github.com/arjun-1/kafkagosaur">KafkaGoSaur</a> is in an early stage of development, your contributions are highly valued and welcomed! Be an early adopter and feel free to ask for new features, report bugs, or submit your code 🙂.</p><hr><p><a href="https://medium.com/swlh/kafkagosaur-eac3c063388">KafkaGoSaur: a WebAssembly powered Kafka client</a> was originally published in <a href="https://medium.com/swlh">The Startup</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Pure and type safe error handling in Akka Streams]]></title>
            <link>https://medium.com/kaizo-engineering/pure-and-type-safe-error-handling-in-akka-streams-8acacf422d45?source=rss-ec31f55f44d9------2</link>
            <guid isPermaLink="false">https://medium.com/p/8acacf422d45</guid>
            <category><![CDATA[scala]]></category>
            <category><![CDATA[akka]]></category>
            <category><![CDATA[akka-streams]]></category>
            <category><![CDATA[programming]]></category>
            <category><![CDATA[functional-programming]]></category>
            <dc:creator><![CDATA[Arjun Dhawan]]></dc:creator>
            <pubDate>Mon, 22 Feb 2021 15:25:49 GMT</pubDate>
            <atom:updated>2021-02-22T15:25:49.272Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/658/1*PVQs4Ofc5yN4Be0K4DBHmg.png" /></figure><h3>Pure and type-safe error handling in Akka Streams</h3><p><em>Want to know how to deal with errors in Akka Streams in a type-safe way, rather than using </em>.recover<em>? You’ve come to the right place!</em></p><p>At Kaizo we process millions of ticket events per day. We subscribe to these events through the Zendesk API. Companies use <a href="https://www.zendesk.com/service/">Zendesk</a> to track, prioritize and solve customer support interactions. When companies use the <a href="https://www.zendesk.com/apps/support/kaizo/">Kaizo app</a> in Zendesk, they can evaluate and improve their team’s performance with unified and actionable real-time insights, QA, and gamification.</p><p>A ticket is the means through which end users communicate with a company&#39;s support agents. Real-time processing of ticket-related events is required to provide fair, engaging, and leading metrics, rather than traditional reporting where you’re always playing catch-up.</p><p>To that end, we have to be <em>reactive</em>. We want to provide our agents with useful insights, always. That means designing around failure and expecting any kind of error to happen while processing an event.</p><h4>The problem</h4><p>We use <a href="https://doc.akka.io/docs/akka/current/stream/index.html">Akka Streams</a> because of its performance characteristics compared to other streaming libraries. In Akka Streams, you traditionally use .recover to deal with failure:</p><pre>Source(1 to 10)<br>  .map(n =&gt;<br>     if (n == 3)<br>       throw new RuntimeException(s&quot;unexpected value: $n&quot;)<br>     else<br>       n.toString<br>  )<br>  .recover {<br>    case e: RuntimeException =&gt; e.getMessage<br>  }</pre><p>As you can see, Akka expects you to deal with errors by throwing exceptions (and therefore using side effects). 
Not only that: the compiler has almost no means of pointing out bugs, due to the lack of type safety. Imagine accidentally removing .recover. Since the type of the resulting expression doesn’t change, bugs like this are impossible for the compiler to catch.</p><p>Our first step in improving this snippet is to <em>purify</em> it. Instead of modeling errors by throwing exceptions, we can use Either, which results in an expression of type Source[Either[RuntimeException, Int], _].</p><pre>Source(1 to 10)<br>  .map(n =&gt;<br>     if (n == 3)<br>       Left(new RuntimeException(s&quot;unexpected value: $n&quot;))<br>     else<br>       Right(n)<br>  )</pre><p>Due to the resulting type, the compiler now forces us to deal with any error at every processing stage. But this also creates another problem: at each stage of transforming the stream, we are now forced to introduce boilerplate in the form of a nested map: .map(_.map(n =&gt; ...)). While other streaming libraries such as <a href="https://zio.dev/docs/datatypes/datatypes_stream">ZIO Streams</a> allow you to conveniently carry around typed errors by virtue of their type definition (Stream[E, A]), Akka Streams is clearly not designed to do this. The solution we use at Kaizo to overcome this issue originates from <a href="https://www.scaladays.org/2018/new-york/schedule/patterns-for-streaming-telemetry-with-akka-streams">Colin Breck’s talk at Scala Days 2018</a>.</p><h4>Solution: use divertTo and collect</h4><p>First of all, we create a Sink specifically for dealing with errors. 
Secondly, we use divertTo and collect to divert any Left value to it:</p><pre>val errorSink: Sink[RuntimeException, NotUsed] =<br>  Flow[RuntimeException]<br>    .log(&quot;Error occurred&quot;)<br>    .to(Sink.ignore)</pre><pre>Source(1 to 10)<br>  .map(n =&gt;<br>     if (n == 3)<br>       Left(new RuntimeException(s&quot;unexpected value: $n&quot;))<br>     else<br>       Right(n)<br>  )<br>  .divertTo(<br>    errorSink.contramap(<br>      _.left.getOrElse(sys.error(&quot;No left value&quot;))<br>    ),<br>    _.isLeft<br>  )<br>  .collect { case Right(a) =&gt; a }</pre><p>Here, contramap extracts the Left out of the Either before sending it to errorSink. The type of the resulting expression is now Source[Int, _], and we can use .map as we are used to. Note that we only use sys.error because the Akka Streams API lacks primitives to express what we are trying to achieve here. In reality, this error can never occur, since diversion happens if and only if _.isLeft.</p><p>Another way to approach this is to have an errorSink with a slightly different definition:</p><pre>val errorSink: Sink[Either[RuntimeException, _], NotUsed] =<br>  Flow[Either[RuntimeException, _]]<br>    .collect { case Left(exception) =&gt; exception }<br>    .log(&quot;Error occurred&quot;)<br>    .to(Sink.ignore)</pre><p>Now we no longer need .contramap (together with the use of .left.getOrElse) before sending errors to the sink. But we are forced to use a nested .map to transform the error of the errorSink. 
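The mechanics of divertTo plus collect can be mimicked outside Akka with plain collections. Here is a small illustrative sketch of the same idea, written in TypeScript for brevity rather than Scala, with a minimal hand-rolled Either type (none of this is Akka code):

```typescript
// A minimal Either encoding, mirroring Scala's Left/Right.
type Either<E, A> = { tag: "left"; error: E } | { tag: "right"; value: A };

const left = <E, A>(error: E): Either<E, A> => ({ tag: "left", error });
const right = <E, A>(value: A): Either<E, A> => ({ tag: "right", value });

// divertLeftTo in miniature: route every Left to an error handler (the
// "error sink") and keep only the Rights on the main path.
function divertLeftTo<E, A>(
  items: Either<E, A>[],
  errorSink: (e: E) => void,
): A[] {
  const out: A[] = [];
  for (const item of items) {
    if (item.tag === "left") errorSink(item.error); // diverted
    else out.push(item.value); // collected
  }
  return out;
}

const results = [1, 2, 3].map((n) =>
  n === 3
    ? left<string, number>(`unexpected value: ${n}`)
    : right<string, number>(n)
);
// Logs the error to the error path, then prints the surviving values.
console.log(divertLeftTo(results, (e) => console.error(e)));
```

The stream version does exactly this, but element by element and with backpressure, which is what divertTo and collect provide.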
And we have a type definition for errorSink that is slightly wider than needed: the errorSink should only deal with errors, but that is not reflected in its type definition.</p><p>We can abstract over this pattern by creating a divertLeftTo function:</p><pre>Source(1 to 10)<br>  .map(n =&gt;<br>     if (n == 3)<br>       Left(new RuntimeException(s&quot;unexpected value: $n&quot;))<br>     else<br>       Right(n.toString)<br>  )<br>  .via(divertLeftTo(errorSink))</pre><pre>def divertLeftTo[E, A](<br>  sink: Sink[E, NotUsed]<br>): Flow[Either[E, A], A, NotUsed] = {<br> <br>  val sinkEither: Sink[Either[E, A], NotUsed] = sink.contramap(<br>    _.left.getOrElse(sys.error(&quot;No left value&quot;))<br>  )<br><br>  def shouldSendToSink(message: Either[E, A]): Boolean =<br>    message.isLeft<br><br>  Flow[Either[E, A]]<br>    .divertTo(sinkEither, shouldSendToSink)<br>    .collect { case Right(a) =&gt; a }<br><br>}</pre><p>Thanks to errorSink we don&#39;t need to entangle error handling with the main logic. Any error gets diverted to this sink and handled separately. That means we don&#39;t need to deal with errors immediately when we extract them from Either. Essentially, this is a data-driven approach to error handling, reaping all the benefits of the compiler doing its type checking.</p><h4>Different ways of handling errors</h4><p>Let’s take a step back here and imagine what kind of errorSink we would want to have. By default (as we have implemented in the example above using Sink.ignore), sending messages to errorSink is effectively the same as skipping a message. 
But one can think of other types of sinks: those that forward messages to a dead-letter queue to be processed again later (when the error contains sufficient information to do so), or sinks that halt the stream completely and shut down the service.</p><p>Skipping a message could make sense for an error that is not (re-)processable at all (think of business constraints). For other types of errors, you’d definitely want to reprocess your messages. Think of an error that occurs during deserialization, suggesting that the data model has evolved but this service has not yet been updated.</p><p>The following example showcases a Sink that can skip or halt the stream completely:</p><pre>val errorSink: Sink[MyError, NotUsed] =<br>  Flow[MyError]<br>    .takeWhile(shouldSkip)<br>    .log(&quot;skipping&quot;)<br>    .to(Sink.ignore)</pre><pre>def shouldSkip(e: MyError): Boolean =<br>  e match {<br>    case _: InvalidUserId =&gt; true<br>    case e =&gt;<br>      log.error(s&quot;terminating - $e&quot;)<br>      false<br>  }</pre><h4>Special case: committing on skipped messages using Kafka</h4><p>We use <a href="https://kafka.apache.org/">Apache Kafka</a> as our streaming platform, and use <a href="https://doc.akka.io/docs/alpakka-kafka/current/index.html">Alpakka Kafka</a> to integrate with Scala. Imagine reading from a topic, and skipping certain events that we cannot process using only .collect:</p><pre>Consumer<br>  .sourceWithOffsetContext[String, String](consumerSettings, topics)<br>  .map(record =&gt; deserializeAs[MyEvent](record.value))<br>  .collect { case Right(event) =&gt; event }<br>  .toMat(Committer.sinkWithOffsetContext(committerSettings))(Keep.none)<br>  .run()</pre><pre>def deserializeAs[T](message: String): Either[DeserializationError, T]</pre><p>Can you spot the bug here?</p><p>We are not committing the skipped messages! 
This means we would re-read these messages if the service happened to restart while skipping occurs. Luckily, committing skipped messages using divertLeftTo is a breeze:</p><pre>val errorSink: Sink[DeserializationError, NotUsed] =<br>  Flow[DeserializationError]<br>    .log(&quot;skipping&quot;)<br>    .toMat(Committer.sinkWithOffsetContext(committerSettings))(Keep.none)</pre><pre>Consumer<br>  .sourceWithOffsetContext[String, String](consumerSettings, topics)<br>  .map(record =&gt; deserializeAs[MyEvent](record.value))<br>  .via(divertLeftTo(errorSink))<br>  .toMat(Committer.sinkWithOffsetContext(committerSettings))(Keep.none)<br>  .run()</pre><pre>def deserializeAs[T](message: String): Either[DeserializationError, T]</pre><p>Thanks for reading! If you enjoyed this story, follow our publication to stay tuned for more stories.</p><p>Interested in joining Kaizo? We are hiring (Scala) software engineers and data engineers! Check the <a href="https://kaizo.recruitee.com/">recruitment page</a> for our open positions.</p><hr><p><a href="https://medium.com/kaizo-engineering/pure-and-type-safe-error-handling-in-akka-streams-8acacf422d45">Pure and type safe error handling in Akka Streams</a> was originally published in <a href="https://medium.com/kaizo-engineering">Kaizo Engineering</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Combining Purely Functional Property Based and Docker Integration Tests in ZIO]]></title>
            <link>https://medium.com/swlh/combining-purely-functional-property-based-and-docker-integration-tests-in-zio-6a826c5e7e19?source=rss-ec31f55f44d9------2</link>
            <guid isPermaLink="false">https://medium.com/p/6a826c5e7e19</guid>
            <category><![CDATA[docker]]></category>
            <category><![CDATA[integration-testing]]></category>
            <category><![CDATA[scala]]></category>
            <category><![CDATA[zio]]></category>
            <category><![CDATA[property-based-testing]]></category>
            <dc:creator><![CDATA[Arjun Dhawan]]></dc:creator>
            <pubDate>Mon, 10 Aug 2020 06:35:47 GMT</pubDate>
            <atom:updated>2020-08-10T20:07:14.254Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/835/1*P3GYF9TgsFUG9koa3U7Prg.png" /></figure><p>Turning modules into law-abiding citizens 😇</p><p><em>This article explains how to elegantly spin up Docker for property-based tests and clean up test cases, making use of </em><a href="https://zio.dev/"><em>ZIO</em></a><em>’s </em>ZManaged<em>.</em></p><h3>Introduction</h3><h4>Laws</h4><p>Property-based testing verifies the correctness of programs in terms of <em>laws</em>: properties that arise from the domain and should always hold, also known as <em>invariants</em>. This reasoning about code in terms of expected and general behaviors is intuitive and leaves no edge case unturned.</p><p>While traditional testing requires you to manually construct test cases yourself, property-based tests randomly generate test cases (including edge cases!) for you. This greatly reduces human error in test development and makes tests rely less on discipline and more on automation. It reinforces the principle that a test should actually give confidence about your code, rather than being a ceremonial procedure.</p><p>These laws can range from the very simple:</p><blockquote>Semigroup instance stringSemigroup should be associative.</blockquote><p>to the more complex; for a DocumentClient module</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/573385736643ae14112e00f3722dcf31/href">https://medium.com/media/573385736643ae14112e00f3722dcf31/href</a></iframe><p>there could be a law</p><blockquote>Function call deleteDoc(docId: String) should delete the document with id docId at the remote document store.</blockquote><h4>External Interactions</h4><p>While the former law can be tested so easily that it doesn’t warrant an article, the latter is not so straightforward. 
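The simple law above can even be checked with a hand-rolled property test. A minimal sketch, written in TypeScript for brevity rather than Scala, with no testing library and random strings generated inline:

```typescript
// Hand-rolled property test for the associativity law of string
// concatenation (the stringSemigroup example above).
const combine = (a: string, b: string): string => a + b;

// A tiny generator of random strings, including the empty-string edge case.
function randomString(): string {
  const length = Math.floor(Math.random() * 10); // 0..9, may be empty
  let s = "";
  for (let i = 0; i < length; i++) {
    s += String.fromCharCode(97 + Math.floor(Math.random() * 26));
  }
  return s;
}

// The law: combine(a, combine(b, c)) === combine(combine(a, b), c),
// checked against many generated cases instead of a few hand-picked ones.
for (let i = 0; i < 1000; i++) {
  const [a, b, c] = [randomString(), randomString(), randomString()];
  if (combine(a, combine(b, c)) !== combine(combine(a, b), c)) {
    throw new Error(`associativity violated for ${a}, ${b}, ${c}`);
  }
}
console.log("associativity held for 1000 generated cases");
```

A real property-testing library adds shrinking and reproducible seeds on top of this basic generate-and-check loop.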
Function deleteDoc requires an external interaction with the remote document store.</p><p>The external interaction could be mocked, allowing the law to become <em>unit tested</em>. If you’re lucky, the mock you just defined happens to coincide with the actual implementation of the document store. The deleteDoc function will be deployed and do its work as expected, always. Or, in a more likely scenario, your mock will differ from the real implementation, allowing mistakes in deleteDoc to go unnoticed during development. Another example is <a href="http://www.h2database.com/">H2</a>: while it might be similar enough to the actual DB system you use in 80% of the cases, it will for sure differ in behavior when dealing with, for example, PostGIS types or other more obscure features.<br>And that is not the only problem: the mocks we usually define are stateless, meaning that the only interaction we can test is that some particular function of the mock was called in some particular way. These kinds of mocks usually result in meaningless, non-intuitive expectations.</p><p>Instead, the test could interact with the actual, deployed (development) instance of the remote document store, making it an <em>integration test</em>. This ensures that the deleteDoc function works as intended when deployed, but there are performance considerations: property-based tests generate many, many test cases, which would result in spikes of network traffic and increased load on the document store. The test cases would also need to be cleaned up afterwards. When this is a responsibility of the test itself, the document store could accidentally be filled with garbage. 
Sometimes the cleaning of test cases is easy, as it is for PostgreSQL, where a <em>rollback transactor</em> can be used (see for example the <a href="https://tpolecat.github.io/doobie/docs/14-Managing-Connections.html#customizing-transactors">doobie setup</a>).</p><p>The solution explored in this article is to integrate the test not with the deployed instance of the document store, but with a throwaway Docker container spun up when running the integration test. The advantages are that the instance is thrown away after the tests run, and that performance is good, since the Docker instance resides on the same machine where the tests are executed. It used to be the case that using such tests in a CI led to the bad practice of having Docker in Docker, meaning such tests could only be executed on the developer’s machine. Nowadays CIs such as Azure DevOps allow pipelines to <a href="https://docs.microsoft.com/en-us/azure/devops/pipelines/agents/agents">directly run on the VM</a>, making it no issue to spin up a Docker container inside the CI.</p><p>We should note there is another solution, though not applicable here, where services’ endpoints are written as a pure description or <em>algebra</em>, from which it is possible to derive clients, service routes, OpenAPI documentation, etc., thus rendering tests obsolete. An example is <a href="https://endpoints4s.github.io/">endpoints4s</a>.</p><h3>Implementation</h3><h4>Testcontainers-scala</h4><p><a href="https://github.com/testcontainers/testcontainers-scala">Testcontainers-scala</a> is a wrapper for <a href="https://www.testcontainers.org/">Testcontainers</a>, which supports lightweight throwaway Docker instances for tests. Let’s say our document store is actually a Couchbase NoSQL database. We can easily set up a container for it in our tests:</p><pre>val c = CouchbaseContainer()</pre><p>In this case we chose a CouchbaseContainer, but in reality you can choose any kind of Docker image through GenericContainer. 
The container exposes start and stop methods, as well as the fields getHost and getFirstMappedPort, which hold the hostname and the randomized port on which the container starts. These fields are important, as they allow us to initialize the DocumentClient with the proper configuration (DocumentClientConfig). These fields can only be accessed <strong>after</strong> starting and <strong>before </strong>stopping the container; any access outside of this span results in an exception. We can say that the configuration on which DocumentClient depends has a life cycle, which can be modeled using ZManaged:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/dfc04a2505fe7a1843e31f67fc21c7d7/href">https://medium.com/media/dfc04a2505fe7a1843e31f67fc21c7d7/href</a></iframe><p>Using documentClientConfig we can easily construct a DocumentClient Layer within the life cycle of the container:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/80fcbb7afeddb791ba5cf8d868cb3991/href">https://medium.com/media/80fcbb7afeddb791ba5cf8d868cb3991/href</a></iframe><p>Note the @@ TestAspect.Sequential: we want to ensure every test case is cleaned up per test before proceeding to the next. And provideLayerShared ensures that the same documentClient instance (and therefore a single life cycle of the container) is used for the entire test suite. 
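The acquire/release semantics of ZManaged can be pictured as a bracket: acquire the container, hand its configuration to the body, and release no matter what happens. A hypothetical sketch of that shape, written in TypeScript for brevity rather than Scala (the `managed` helper and container values below are illustrative, not the ZIO or Testcontainers API):

```typescript
// A managed resource: acquire produces a value, release cleans it up.
interface Managed<A> {
  acquire: () => Promise<A>;
  release: (a: A) => Promise<void>;
}

// `use` brackets the life cycle: release runs even if the body throws,
// mirroring how ZManaged guarantees the container is always stopped.
async function use<A, B>(m: Managed<A>, body: (a: A) => Promise<B>): Promise<B> {
  const resource = await m.acquire();
  try {
    return await body(resource);
  } finally {
    await m.release(resource);
  }
}

// Hypothetical stand-in for the container's start/stop and config fields.
const events: string[] = [];
const containerConfig: Managed<{ host: string; port: number }> = {
  acquire: async () => {
    events.push("started");
    return { host: "localhost", port: 8091 }; // illustrative values
  },
  release: async () => {
    events.push("stopped");
  },
};

use(containerConfig, async (cfg) => {
  events.push(`tests ran against ${cfg.host}:${cfg.port}`);
}).then(() => console.log(events));
```

In the real setup the acquire step starts the Couchbase container and reads getHost and getFirstMappedPort, and the release step stops it.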
The container starts at the beginning of the test suite and stops at the end.</p><p>Before we proceed to write an actual test, we show how to achieve automated test case setup and cleanup with separation of concerns; we don’t want to deal with setup and cleanup in each and every individual test.</p><h4>Test Case Life Cycle Management</h4><p>Imagine the example of DocumentClient is extended with methods to create and get documents:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/03b382c46cd9946478227278c8f9d62e/href">https://medium.com/media/03b382c46cd9946478227278c8f9d62e/href</a></iframe><p>We can create a ZManaged which creates and gets a document, and deletes it as a cleanup action:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/6dbdf962590efa8fd5c06f8e42e05802/href">https://medium.com/media/6dbdf962590efa8fd5c06f8e42e05802/href</a></iframe><p>This ZManaged can be used to define a TestScenario1, to be used later as a Layer in our tests:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/85cf1b3b77a6c1683ab1977baf412a23/href">https://medium.com/media/85cf1b3b77a6c1683ab1977baf412a23/href</a></iframe><p>That’s all we need! All that remains is writing the test. We want to assert that the delete function returned successfully, as well as that the document is no longer available in the document store:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/4b767481c984055a1322570f88b22be2/href">https://medium.com/media/4b767481c984055a1322570f88b22be2/href</a></iframe><p>If desired, we can easily combine test scenarios. 
Just combine them the same way you would combine layers:</p><pre>.provideSomeLayer(scenario1 ++ scenario2)</pre><p>Each test scenario is automatically cleaned up before the next generated test case, as well as before the next test. We clean up the test cases not to prevent the Docker instance from filling up with garbage (the instance will be thrown away anyway), but to ensure they don’t affect any other test running in the same suite.</p><p>In reality we would check more laws for our DocumentClient module: not just about deletion, but also about getting, creating, updating, copying, etc. of documents. Moreover, the law postulated in the beginning is unlikely to hold, as it would certainly fail for a docId equal to the empty string &quot;&quot;. The solution is to redefine the algebra and the law using <em>refined types.</em></p><h4>Refined</h4><p><a href="https://github.com/fthomas/refined">Refined</a> is a library which allows narrowing down the common types we use every day (like String, List, etc.) to types that directly coincide with specific types of the domain. These could be refinements such as NonEmptyString, strings matching a particular regex, or even lists of a particular length. They allow us to prevent our programs from entering an illegal state. This is guaranteed at compile time, rather than having to deal with those errors at runtime.
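</p><p>To give a flavor of Refined (refineV and the NonEmpty predicate are real library constructs; the DocId alias is our own illustration):</p><pre>import eu.timepit.refined.api.Refined<br>import eu.timepit.refined.collection.NonEmpty<br>import eu.timepit.refined.refineV<br><br>type DocId = String Refined NonEmpty<br><br>// refineV validates at the boundary, returning Either[String, DocId]<br>val parsed: Either[String, DocId] = refineV[NonEmpty](&quot;doc-1&quot;)</pre><p>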
For example, we can enforce at compile time that deleteDoc is never called with an empty docId argument.</p><p>The algebra looks as follows using Refined:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/d0fa72bb33d1632b238503284a73038f/href">https://medium.com/media/d0fa72bb33d1632b238503284a73038f/href</a></iframe><p>At the time of writing there is no interop available yet between Refined and ZIO Test, but we can easily define it ourselves:</p><pre>val anyNonEmptyString = Gen<br>  .string1(Gen.anyUnicodeChar)<br>  .map(refineV[NonEmpty](_).fold(sys.error, identity))</pre><p>And we are ready to start using the new algebra and generator in our tests.</p><h3>Discussion</h3><p>Property-based testing combined with Docker integration testing leads to relatively performant test suites which ensure that modules conform to intuitively defined, understandable behaviors.</p><p>Deciding which parts of our program need property-based tests, integration tests, or unit tests is admittedly more of an art than a strict science. Even though Docker integration tests don’t carry the overhead of shipping data back and forth over the wire, they do carry the initial overhead of spinning up the container. And nothing beats unit tests in terms of performance.</p><p>As a rule of thumb, clients and (persistence) repositories are critical components that usually have clear laws which should be property-based tested, and for which true confidence can only be gained through integration tests. It also makes sense to property-based test type classes defined from scratch (unless they are derived), which can be done through unit testing.</p><p>For other components it might not be straightforward to determine which exact laws they should abide by.
Postulating them requires knowledge of the domain of the problem they solve.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=6a826c5e7e19" width="1" height="1" alt=""><hr><p><a href="https://medium.com/swlh/combining-purely-functional-property-based-and-docker-integration-tests-in-zio-6a826c5e7e19">Combining Purely Functional Property Based and Docker Integration Tests in ZIO</a> was originally published in <a href="https://medium.com/swlh">The Startup</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Composing doobie programs using Cats]]></title>
            <link>https://medium.com/@arjun.dhawan/composing-doobie-programs-5337695fd77b?source=rss-ec31f55f44d9------2</link>
            <guid isPermaLink="false">https://medium.com/p/5337695fd77b</guid>
            <category><![CDATA[scala]]></category>
            <category><![CDATA[monads]]></category>
            <category><![CDATA[cats]]></category>
            <category><![CDATA[doobie]]></category>
            <category><![CDATA[typeclass]]></category>
            <dc:creator><![CDATA[Arjun Dhawan]]></dc:creator>
            <pubDate>Mon, 24 Feb 2020 22:05:20 GMT</pubDate>
            <atom:updated>2020-02-25T07:59:40.429Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/384/1*HcLHp4Gz-jxfahqu79xkWQ.png" /></figure><h4>type classes to the rescue</h4><h4>TL;DR</h4><p>If you want to combine ConnectionIO programs using |+| syntax:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/1cca96d87cb5ac54a44fe9c1e4f8b54e/href">https://medium.com/media/1cca96d87cb5ac54a44fe9c1e4f8b54e/href</a></iframe><p>you need to bring an implicit Semigroup in scope through</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/252380f568d63d2512ec7b27d0974229/href">https://medium.com/media/252380f568d63d2512ec7b27d0974229/href</a></iframe><h3>Introduction: doobie</h3><p><a href="https://tpolecat.github.io/doobie/">Doobie</a> is a functional library for Scala/Cats that allows us to write programs to interact with a database using the JDBC API.</p><h4>ConnectionIO</h4><p>All such programs are described in the form of ConnectionIO. An example of such a program:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/752aa287e4cdd1f0d5c22961c54859e0/href">https://medium.com/media/752aa287e4cdd1f0d5c22961c54859e0/href</a></iframe><p>A nice thing about ConnectionIO is that it forms a Monad: it has flatMap, which enables us to sequence different ConnectionIO programs.</p><h4>Transaction</h4><p>A ConnectionIO has no interpretation in the outside world (it’s a construct with significance only to Doobie) and can therefore not be run directly.
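</p><p>As a minimal sketch (the table and column names here are made up for illustration), such a program could look like:</p><pre>import doobie._<br>import doobie.implicits._<br><br>// Describes an insert; nothing touches the database until a Transactor interprets it<br>val insertProgram1: ConnectionIO[Int] =<br>  sql&quot;INSERT INTO users (name) VALUES ('Alice')&quot;.update.run</pre><p>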
To interpret it to a meaningful effect (let’s say, a ZIO <a href="https://zio.dev/docs/overview/overview_creating_effects">Task</a>) we need a Transactor:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/7fbaa3e9055b73f5fe4e651d6a5a0e63/href">https://medium.com/media/7fbaa3e9055b73f5fe4e651d6a5a0e63/href</a></iframe><p>Here the <em>transaction boundary</em> is put around insertProgram1 and insertProgram2, meaning that if something goes wrong at the database level for either insertProgram1 or insertProgram2, the entire transaction will be rolled back, thus guaranteeing consistency in our database.</p><h3>The problem</h3><p>The construction of a ConnectionIO program itself might be modeled using another effect. Take the case where a UserService needs to make an HTTP call before knowing what to insert:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/50252a47ea740df3cc3b87676a35adf1/href">https://medium.com/media/50252a47ea740df3cc3b87676a35adf1/href</a></iframe><p>See <a href="https://blog.softwaremill.com/from-transactional-to-type-safe-reasonable-transactions-a5019906245e#adec">this other example</a> showing how such nesting can arise. If we are dealing with only one such call, there is no issue transforming this into a Task:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/d40a1e5ce549554dadcc9ba857b4c877/href">https://medium.com/media/d40a1e5ce549554dadcc9ba857b4c877/href</a></iframe><p>But what if we need to perform multiple calls from different services, and want to keep the transaction boundary around the resulting ConnectionIOs?</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/4cda35437fe73626d4077f4ea3d388bb/href">https://medium.com/media/4cda35437fe73626d4077f4ea3d388bb/href</a></iframe><p>A nested for-comprehension?
Yikes!</p><h3>The solution</h3><p>We seek an easy way to combine ConnectionIO programs, and Semigroup is just the right abstraction for that:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/4a5ec2eb6bc85de0273b54152ed12037/href">https://medium.com/media/4a5ec2eb6bc85de0273b54152ed12037/href</a></iframe><p>But how do we create a Semigroup for ConnectionIO? Doobie provides us, out of the box, with a <a href="https://typelevel.org/cats-effect/">cats-effect</a> type class instance for ConnectionIO: Async[ConnectionIO]. <a href="https://typelevel.org/cats-effect/typeclasses/async.html">Async</a> is a Monad, and (being related through the <a href="https://typelevel.org/cats-effect/typeclasses/">type class hierarchy</a>) is actually also an <a href="https://typelevel.org/cats/typeclasses/applicative.html#apply---a-weakened-applicative">Apply</a>, which is a weakened <a href="https://typelevel.org/cats/typeclasses/applicative.html">Applicative</a>. And Apply defines a function that gives us a Semigroup 😊:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/853e4283113603d09632a99ba0d5d12e/href">https://medium.com/media/853e4283113603d09632a99ba0d5d12e/href</a></iframe><p>So we can bring any implicit Semigroup[ConnectionIO[A]] in scope by defining</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/807784fd9d0246493a359450661613a1/href">https://medium.com/media/807784fd9d0246493a359450661613a1/href</a></iframe><p>In this case we are not interested in the return values of our ConnectionIO programs.
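</p><p>Concretely, such an instance can be derived via Apply (cats does provide Apply.semigroup; the implicit definition wrapping it is our own sketch, and exact imports may vary by doobie version):</p><pre>import cats.{Apply, Semigroup}<br>import doobie.ConnectionIO<br>import doobie.implicits._ // Async[ConnectionIO] (hence Apply) is provided by doobie<br><br>implicit def connectionIOSemigroup[A: Semigroup]: Semigroup[ConnectionIO[A]] =<br>  Apply.semigroup[ConnectionIO, A]</pre><p>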
Since ConnectionIO is also a <a href="https://typelevel.org/cats/typeclasses/functor.html">Functor</a>, we can ignore the result value through .void, meaning we can write:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/4023fa4a365a89975a8d009f5ba6ed90/href">https://medium.com/media/4023fa4a365a89975a8d009f5ba6ed90/href</a></iframe><p>and gone is the nested for-comprehension 😊.<br>If our ConnectionIO programs return ADTs which we would like to keep, we can use .widen on Functor to cast to a common supertype:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/ac0280d053da6d1b095d2e8a9fd7a685/href">https://medium.com/media/ac0280d053da6d1b095d2e8a9fd7a685/href</a></iframe><p>Of course we would then need a Semigroup instance for the ADT.</p><h3>Closing thoughts</h3><p>We could also have defined a Semigroup for <a href="https://typelevel.org/cats/datatypes/freemonad.html">Free</a> (which is what ConnectionIO is) instead of Apply. But since Apply is less strict than Free (it has fewer laws to obey), it is preferable to define the Semigroup for Apply so we can model more behaviors.</p><p>Interpreting ConnectionIO as a Semigroup also better conveys our intent: when ‘smashing’ together ConnectionIOs we don’t care about the power to control computations based on the previous result (which is the power that flatMap gives). Instead, we merely want to combine ConnectionIO programs, which exactly fits the semantics of Semigroup.</p><h3>Acknowledgements</h3><p>Special thanks to <a href="https://github.com/Fristi">Mark de Jong</a> for his insights on the subject.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=5337695fd77b" width="1" height="1" alt="">]]></content:encoded>
        </item>
    </channel>
</rss>