<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by Leo Gorodinski on Medium]]></title>
        <description><![CDATA[Stories by Leo Gorodinski on Medium]]></description>
        <link>https://medium.com/@eulerfx?source=rss-a8651abcb80a------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/1*8HgRb3zO86HTxR-JJ7_2Cw.png</url>
            <title>Stories by Leo Gorodinski on Medium</title>
            <link>https://medium.com/@eulerfx?source=rss-a8651abcb80a------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Wed, 20 May 2026 16:03:38 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@eulerfx/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[F# Async Guide]]></title>
            <link>https://medium.com/@eulerfx/f-async-guide-eb3c8a2d180a?source=rss-a8651abcb80a------2</link>
            <guid isPermaLink="false">https://medium.com/p/eb3c8a2d180a</guid>
            <category><![CDATA[functional-programming]]></category>
            <category><![CDATA[concurrency]]></category>
            <category><![CDATA[fsharp]]></category>
            <category><![CDATA[engineering]]></category>
            <dc:creator><![CDATA[Leo Gorodinski]]></dc:creator>
            <pubDate>Tue, 26 Jun 2018 16:16:41 GMT</pubDate>
            <atom:updated>2018-06-26T16:16:41.579Z</atom:updated>
            <content:encoded><![CDATA[<p>This is a usage guide for asynchronous programming in F# using the <a href="https://docs.microsoft.com/en-us/dotnet/fsharp/tutorials/asynchronous-and-concurrent-programming/async"><strong>Async</strong></a> type. The content should be helpful to existing F# <strong>Async</strong> users or those approaching F# concurrency from another programming language, and is complementary to existing material such as <a href="https://fsharpforfunandprofit.com/posts/concurrency-async-and-parallel/">Asynchronous Programming</a> by Scott Wlaschin, <a href="http://tomasp.net/blog/csharp-async-gotchas.aspx/">Async in C# and F#</a> by Tomas Petricek and <a href="https://docs.microsoft.com/en-us/dotnet/fsharp/tutorials/asynchronous-and-concurrent-programming/async">Async Programming in F# on MSDN</a>.</p><h4>Table of Contents</h4><ul><li><a href="https://medium.com/p/eb3c8a2d180a#21d0"><strong>Definition</strong></a><strong> </strong>— the definition of the F# <strong>Async</strong> type, its interaction with the thread pool and then, async workflows.</li><li><a href="https://medium.com/p/eb3c8a2d180a#b897"><strong>Hazards</strong></a> — common programming hazards with F# <strong>Async</strong> and workarounds.</li><li><a href="https://medium.com/p/eb3c8a2d180a#bc3c"><strong>Related Programming Models</strong></a> — relationship to other programming models.</li><li><a href="https://medium.com/p/eb3c8a2d180a#0f8b"><strong>Concepts</strong></a> — narrative on general concepts in concurrency used throughout the post.</li></ul><h3>Definition</h3><p>The F# Async type represents an asynchronous computation. It is similar in concept to <strong>System.Threading.Tasks.Task</strong> in .NET, <strong>java.util.concurrent.Future</strong> in Java, a <strong>goroutine</strong> in Go, <strong>Control.Concurrent.Async</strong> in Haskell, <strong>Event</strong> in Concurrent ML, or <strong>promise</strong> in JavaScript, with some important differences.</p><p>Overall, F# Async serves the following needs:</p><ol><li>It allows for more efficient use of OS threads by preventing the need to <a href="https://stackoverflow.com/questions/3982501/mutex-lock-what-does-blocking-mean">block</a> them when waiting.</li><li>It provides constructs for concurrency and parallelism in addition to sequential computation.</li><li>It indicates that a computation is long-running, or may not be expected to terminate.</li></ol><p>Programmatically, the <strong>Async</strong> type is defined as follows:</p><pre><strong>type</strong> Async&lt;&#39;a&gt; = (&#39;a → unit) → unit</pre><p>In other words, a value of type <strong>Async&lt;&#39;a&gt;</strong> is a function that accepts a <em>callback</em> function of type <strong>&#39;a → unit</strong> and returns <strong>unit</strong>.</p><p>We can derive the <strong>Async</strong> type as follows. Suppose you&#39;ve an operation that transmits and then waits for a response to an HTTP request:</p><pre><strong>let</strong> download (url:string) : string =<br>  <strong>let</strong> client = <strong>new</strong> WebClient()<br>  <strong>let</strong> res = client.DownloadString url<br>  res</pre><p>In this case, the call to <strong>DownloadString</strong> is <em>blocking</em> - the OS thread on which the execution is taking place becomes <em>blocked</em> for the duration of the IO operation. When a thread is blocked, it isn&#39;t directly consuming CPU resources, however it continues to consume stack space, which it needs to resume when the operation completes. These <a href="https://en.wikipedia.org/wiki/Context_switch">context switches</a>, as a thread blocks and then unblocks, are costly. We can make more efficient use of threads and processing resources by using the calling thread to <em>invoke</em> the operation, and when the IO operation completes, send a notification to a callback, on another thread. This can be done as follows:</p><pre><strong>let</strong> downloadCallback <br>  (url:string) <br>  (callback:string → unit) : unit =      <br>  <strong>let</strong> client = <strong>new</strong> WebClient()<br>  client.DownloadStringCompleted <br>  |&gt; Event.add <br>    (<strong>fun</strong> args <strong>→</strong> callback args.Result)<br>  client.DownloadStringAsync url</pre><p>In this case, the call to <strong>downloadCallback</strong> returns immediately, and the provided callback is subscribed to an event that triggers when the invoked operation completes. This allows the callback to be called from a different thread, and allows the calling thread to continue doing useful work rather than remaining blocked. If you squint a little, you can see that the type of <strong>downloadCallback url </strong>is<strong> </strong><strong>(string → unit) → unit</strong> and if we generalize that to a generic type <strong>&#39;a</strong> we end up with the definition of <strong>Async&lt;&#39;a&gt;</strong> above.</p><p>Using the <strong>Async</strong> type, we have the following signature for the operation:</p><pre><strong>val </strong>downloadAsync<strong> : string </strong>→<strong> Async</strong>&lt;<strong>string</strong>&gt;</pre><p>At this point, it is possible to understand why this computation is <em>asynchronous</em>. It is asynchronous because there are two core steps involved — the invocation of the operation and the receipt of the response. Furthermore, we can see how the async type allows us to manage OS threads more efficiently — rather than blocking the calling thread, the calling thread remains free to do other work. We’ll cover this in more detail below.</p><p>The actual implementation of the <strong>Async</strong> type, available on the F# repo, is more involved due to the need to support exceptions, cancellations, a growing stack - some of which are discussed later on. The central &#39;constructor&#39; for an <strong>Async</strong> value is the <strong>Async.FromContinuations</strong> function:</p><pre><strong>Async</strong>.FromContinuations : <br>  ((&#39;a → unit) * <br>   (exn → unit) * <br>   (OperationsCancelledExceptions → unit) → unit) → <strong>Async</strong>&lt;&#39;a&gt;</pre><p>In addition to the successful completion callback <strong>&#39;a</strong> → unit , it takes callbacks (continuations) for errors and cancellations.</p><h3>Thread Pool</h3><p>Rather than managing threads directly, the <strong>Async</strong> type works along with the <a href="https://msdn.microsoft.com/en-us/library/system.threading.threadpool(v=vs.110).aspx">.NET Thread Pool</a> to schedule work. The thread pool maintains a pool of threads, growing and shrinking as needed and provides the following key interface:</p><pre><strong>ThreadPool</strong>.QueueUserWorkItem : (unit → unit) → unit</pre><p>This operation queues an action <strong>unit → unit</strong> to be run on a thread pool thread. The operation of the ThreadPool can be visually depicted as follows:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*XPR7maWLTNanxc_8" /></figure><p>While it is possible to simply start a new thread whenever an action needs to be scheduled, using a thread pool allows thread creation costs as well as context switching costs to be amortized. Rather than blocking threads, threads are kept busy with a queue of work maintained by the thread pool. In the example above, the <strong>DownloadStringCompleted</strong> event might be triggered from a thread pool thread. This approach to scheduling work items is sometimes referred to as <a href="https://en.wikipedia.org/wiki/Green_threads">green threads</a>. The relationship between <strong>Async</strong>computations and OS threads is not one-to-one — more <strong>Async</strong> computations does not automatically result in more threads, and in particular, increasing parallelism isn&#39;t achieved by increasing the number of threads, but rather, by increasing the number of in-flight computations. In effect, the <strong>Async</strong> type encapsulates callbacks and the ThreadPool into a higher-level programming model as described below. With that said, in some cases, tuning the thread count limits on the ThreadPool can improve performance.</p><h3>Async Workflows</h3><p><a href="https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/asynchronous-workflows">F# async workflows</a> provide a syntax that permits expressing sequential workflows in terms of <strong>Async</strong> computations. For example, given the <strong>downloadAsync</strong> operation above, we may want to perform another download based on the result of the first and then perform a transformation on both results:</p><pre><strong>let</strong> callApi (url:string) = <strong>async</strong> {<br>  <strong>let</strong>! data1 = downloadAsync url<br>  <strong>let</strong>! data2 = downloadAsync (url + data1)<br>  <strong>return</strong> data1,data2 }</pre><p>While this workflow is expressed sequentially, the underlying computation runs asynchronously and avoids blocking an OS thread during the processing of <strong>downloadAsync</strong>. This is achieved by translating the workflow syntax into <strong>Bind</strong> and <strong>Return </strong>operations defined on the <strong>Async</strong> type as follows:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/905/0*_XP9fPuY2Yq4YFzi" /></figure><p>The result of the call to <strong>downloadAsync</strong> is passed to <strong>Bind</strong> and the portion of the workflow after the call to <strong>downloadAsync</strong> is passed to <strong>Bind</strong> as the continuation. The <strong>Return</strong> operation takes a value, in this case the pair of <strong>data1</strong> and <strong>data2</strong>, and lifts it into an <strong>Async</strong>value. The async workflow in F# also defines operations to support other control flow constructs - loops, delayed execution and exception handling.</p><p>A chain of bind operations forms a sequential computation. Much of the theory of computation (ie <a href="https://en.wikipedia.org/wiki/Turing_machine">Turing machine</a>, <a href="https://en.wikipedia.org/wiki/Lambda_calculus">lambda calculus</a>) models <a href="https://en.wikipedia.org/wiki/Sequential_algorithm">sequential</a> computation, where steps happen sequentially, one after another. Of course, things wouldn’t be much fun if we were limited to sequential computation. The F# Async type also allows us to also express parallel computations, using the <strong>Async.Parallel : Async&lt;&#39;a&gt;[] → Async&lt;&#39;a[]&gt;</strong> operation, for example. This operation takes an array of async computations and returns a single async computation that will yield their aggregated results, thereby expressing <a href="https://en.wikipedia.org/wiki/Fork%E2%80%93join_model">fork-join parallelism</a>. The &quot;fork&quot; part is the starting of the provided computations, and the &quot;join&quot; part is awaiting their results. Another way to express parallelism is with the <strong>Async.StartChild : Async&lt;&#39;a&gt; → Async&lt;Async&lt;&#39;a&gt;&gt;</strong> operation. This operation starts a computation and returns a &#39;handle&#39; to a computation that can be used to rendezvous with the result at a later time. This makes it possible to start multiple computations to be run in parallel, but still cleanly gather their results without any low level threading constructs in play. This in turn can be used to implement an operation such as <strong>Async.Parallel : Async&lt;&#39;a&gt; * Async&lt;&#39;b&gt; → Async&lt;&#39;a * &#39;b&gt;</strong>. This operation can also be implemented using the sequential workflow with the <strong>Bind</strong> operation; however the provided computations would run sequentially rather than in parallel, changing the semantics significantly.</p><h3>Hazards</h3><p>There are several hazards to programming F# Async. Some are already covered in Tomas Petricek’s excellent <a href="http://tomasp.net/blog/csharp-async-gotchas.aspx/">Async in C# and F# Asynchronous gotchas in C#</a> article, but we discuss a few more here.</p><h3>Async.RunSynchronously</h3><p>The <strong>Async.RunSynchronously : Async&lt;&#39;a&gt; → &#39;a</strong> operation provides a way to commence and then obtain the result from an async computation. The name of the operation is deliberately made cumbersome to type because it must be used judiciously. F# novices or those new to functional programming in general often struggle with as they seek to access the value produced by a computation and end up &quot;cheating&quot; by calling <strong>Async.RunSynchronously</strong>. Ideally, <strong>Async.RunSynchronously</strong> would only be invoked once for the entire program and passed an async computation representing the program. Most importantly, calls to <strong>Async.RunSynchronously</strong> from within loops should be avoided. The reason for this is that <strong>Async.RunSynchronously</strong> is implemented by blocking the calling thread until the async computation completes. This in effect undoes much of the benefit of using the <strong>Async</strong> type in the first place, but is necessary in order for the async computation to take effect. If the call is made only once for the entire program, only one thread remains blocked waiting for the program to complete, which of course is fine. Frequent calls to <strong>Async.RunSynchronously</strong> however don&#39;t play well with the .NET ThreadPool. Blocking threads will pressure the ThreadPool to create more threads, eventually causing it to reach its limits, inducing a high number of context switches and wasted stacks. Instead of calling <strong>Async.RunSynchronously</strong>, either use async workflows or operations on the <strong>Async</strong> type such as <strong>async.Bind</strong> to access the produced value.</p><p>See also: <a href="http://joeduffyblog.com/2015/11/19/asynchronous-everything/">Asynchronous Everything</a> by Joe Duffy</p><h4>Summary</h4><ul><li>Avoid calling <strong>Async.RunSynchronously</strong> except for at the entry point for the executable.</li></ul><h3>Async.Start</h3><p>The <strong>Async.Start : Async&lt;&#39;a&gt; → unit</strong> operation starts an async computation without waiting for the result by scheduling the computation on a ThreadPool thread. This operation is what actually puts the async computation chain constructed by calls to operations as defined above into motion. As described, the call to <strong>Async.RunSynchronously</strong> is implemented by starting a computation which stores its result in a wait handle on which the calling thread waits. It is akin to forking a thread. The related operations <strong>Async.StartChild : Async&lt;&#39;a&gt; → Async&lt;Async&lt;&#39;a&gt;&gt;</strong> and <strong>Async.StartChildAsTask</strong> also start an operation without awaiting the result, however they also return a handle making it possible to await the result. Care should be taken with these operations because they can result in overly non-deterministic executions. It may cause too many operations to be running in parallel, potentially degrading performance. Moreover, exceptions raised by computations passed to <strong>Async.Start</strong> aren&#39;t propagated to the caller and are easily overlooked. In fact, it should rarely be needed to make use of <strong>Async.Start</strong> in application code. Instead, favor calls to <strong>Async.Parallel</strong> or <strong>Async.ParallelThrottled</strong> for expressing parallelism.</p><p>For example, suppose you’ve a sequence of async computations that need to be run. One way to run them is to iterate the sequence, starting each computation with <strong>Async.Start</strong>. However, this:</p><ol><li>May cause more than the desired number of computations to be run in parallel.</li><li>Doesn’t provide a way to await the completion of the sequence and</li><li>Leaves exceptions thrown by individual computations unhandled.</li></ol><pre>// a sequence of computations<br><strong>let</strong> comps : <strong>Async</strong>&lt;unit&gt; seq = ...</pre><pre>// start each computation<br>// do not await the results<br>comps |&gt; Seq.iter <strong>Async</strong>.Start</pre><p>Instead, it is possible to run the computations in parallel using a call to Async.Parallel which will address the aforementioned issues:</p><pre>// run computations in parallel, <br>// await the results, exceptions<br>// escalated to the caller<br><strong>do</strong>! comps |&gt; <strong>Async</strong>.Parallel</pre><p>Another way a need to call <strong>Async.Start</strong> may come up is to start a background process of some sort. For example, a program may have a health check or reporting process to be run along the core logic. If however this background process is run using <strong>Async.Start</strong>, exceptions raised by the background process may be left unhandled, preventing the program from reporting its health.</p><pre><strong>val</strong> coreProcess : <strong>Async</strong>&lt;unit&gt;<br><strong>val</strong> backgroundProcess : <strong>Async</strong>&lt;unit&gt;<br><strong>Async</strong>.Start backgroundProcess<br><strong>Async</strong>.RunSynchronously coreProcess</pre><p>If this is undesirable, the fate of the background process should be tied to the fate of the core logic of the program using <strong>Async.Parallel : Async&lt;&#39;a&gt; → Async&lt;&#39;b&gt; → Async&lt;&#39;a * &#39;b&gt; </strong>:</p><pre><strong>Async</strong>.Parallel <br>  [coreProcess; backgroundProcess]</pre><p>With the alternative approach, if exceptions raised by the background process should be discarded without causing the program to crash, this can be done explicitly by catching the exceptions and logging as appropriate.</p><h4>Summary</h4><ul><li>Consider using a higher-level construct before using <strong>Async.Start</strong>.</li><li>Determine whether exceptions raised by computations started with <strong>Async.Start</strong> should affect the calling computation.</li><li>Be sure to propagate a <strong>CancellationToken</strong> to <strong>Async.Start</strong> if applicable.</li></ul><h3>Async.Parallel</h3><p>As described above, <strong>Async.Parallel</strong> is a way to express fork-join parallelism. However, an important consideration when using this operation is the number of input computations provided. If the number of input computations is too high, then the call to <strong>Async.Parallel</strong> may create too much contention for both memory and IO resulting in performance degradation. Additionally, if the sequence of computations is unbounded, the call to <strong>Async.Parallel</strong> will run out of memory before starting any of the computations because internally, it allocates an array to store the result of each computation. Instead, consider using either <strong>Async.ParallelThrottled : int → Async&lt;&#39;a&gt;[] → Async&lt;&#39;a[]&gt;</strong> or <strong>Async.ParallelThrottledIgnore : int → Async&lt;unit&gt; seq → Async&lt;int&gt;</strong>. The former is like <strong>Async.Parallel</strong> except it bounds the degree of parallelism, and the latter also bounds parallelism, but doesn&#39;t store the result of computations, only the count of the number completed, making it possible to use with unbounded sequences of computations. Care must be taken to tune for the appropriate degree of parallelism, especially for IO bound computations where there aren&#39;t rules of thumb such as for CPU bound computations (ie a thread per core). The best value may depend on the nature of the computations and may even change over time. An even more ideal scheduler would automatically control the degree of parallelism with a strategy to either maximize throughput or minimize latency.</p><p>The <strong>Async.ParallelThrottledIgnore</strong> operation can be implemented as follows:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/6184f2387b912a6c74e4a4af8c8844e7/href">https://medium.com/media/6184f2387b912a6c74e4a4af8c8844e7/href</a></iframe><h4>Summary</h4><ul><li>Ensure that the number of input computations passed to <strong>Async.Parallel</strong> is bounded.</li><li>Consider using a throttled variant as described above to reduce contention.</li><li>Consider using a non-Async based parallelization mechanism for compute-bound computations which don’t use Async.</li></ul><h3>Compute-Bound Computations</h3><p>While it is possible to express parallelism with <strong>Async</strong>, as described in the previous section, using this approach for compute-bound computations may not be the most efficient. A compute-bound computation is one where a majority of time is spent on computational tasks rather than awaiting IO operations. In these cases, it is better to use something like <a href="https://msdn.microsoft.com/en-us/library/ff963552.aspx">Parallel.For</a> or PLinq to take advantage of parallelism. This method avoids the overhead involved in the <strong>Async</strong> continuation mechanism. However, it is important to note that if a compute-bound operation does make an IO request, using <strong>Async.RunSynchronously</strong> to await it will cause blocking and may reduce performance over using <strong>Async.Parallel</strong>.</p><h3>MailboxProcessor</h3><p>As described above, the <strong>MailboxProcessor</strong> (MBP) provides an actor-based concurrent programing model. However, for most applications, this model is fairly low-level and requires considerable care to avoid common pitfalls. The MBP is best suited for implementing higher-level library constructs, but it should be avoided in domain code for reasons described below. One of the most common hazards with the MBP is that it is easy to overlook exceptions thrown by the processing computations. These exceptions are published on the <strong>Error</strong> event, however this event needs to be explicitly subscribed to in order to observe the errors. Even if the error is caught, it may not be clear how to proceed as the context is lost. Next, the <strong>PostWithAsyncReply</strong> operation together with the <strong>AsyncReplyChannel</strong> type do not provide a way to propagate exceptions, forcing users to express exceptions using an explicit <strong>Result</strong> value or by using a <strong>TaskCompletionSource</strong> instead.</p><p>For example:</p><pre><strong>let</strong> <strong>rec</strong> proc mbp = <strong>async</strong> {<br>  <strong>let</strong>! (data,replyCh) = mbp.Receive ()<br>  <strong>let</strong>! result = .... // logic<br>  replyCh.Reply result<br>  <strong>return</strong>! proc mbp }</pre><pre><strong>let</strong> mbp = <br>  MailboxProcessor.Start proc</pre><pre><strong>let</strong> handle (data:string) : <strong>Async</strong>&lt;string&gt; =<br>  mbp.PostAndAsyncReply <br>    (<strong>fun</strong> replyCh -&gt; data,replyCh)</pre><p>Here, if the processing logic throws an exception, the caller in <strong>handle</strong> will be suspended indefinitely and the exception will be swallowed. Moreover, the <strong>MailboxProcessor</strong> will halt and be unable to process any additional messages. One might instead expect the exception to be escalated to the caller, and for the <strong>MailboxProcessor</strong> to continue processing. This can be done by explicitly catching exceptions inside of the processing loop and then propagating to the caller, either using an explicit <strong>Result</strong> value or through a <strong>TaskCompletionSource</strong> rather than an <strong>AsyncReplyChannel</strong>. For example:</p><pre><strong>let</strong> postAndAwaitResult <br>  (mbp:<strong>MailboxProcessor</strong>&lt;&#39;a&gt;) <br>  (f:<strong>TaskCompletionSource</strong>&lt;&#39;b&gt; <strong>→</strong> &#39;a) = <strong>async</strong> {<br>    <strong>let</strong> ivar = <strong>TaskCompletionSource</strong>&lt;_&gt;()<br>    mbp.Post (f ivar)<br>    <strong>return</strong>! ivar.Task |&gt; <strong>Async</strong>.AwaitTask }</pre><pre><strong>let</strong> <strong>rec</strong> proc mbp = <strong>async</strong> {<br>  <strong>let</strong>! (data,ivar) = mbp.Receive ()<br>  <strong>try<br>    let</strong>! result = ....<br>    ivar.SetResult result<br>  <strong>with</strong> ex -&gt;<br>    ivar.SetException ex<br>    <strong>return</strong>! proc mbp }</pre><pre><strong>let</strong> mbp = <strong>MailboxProcessor</strong>.Start proc</pre><pre>// exceptions will be escalated <br>// to the caller<br><strong>let</strong> handle (data:string) : <strong>Async</strong>&lt;string&gt; =<br>  postAndAwaitResult mbp <br>    (<strong>fun</strong> ivar <strong>→</strong> data,ivar)</pre><p>Another thing to keep in mind with MBP is that the mailbox is unbounded and therefore, has the potential to overflow. In the context of a producer-consumer scenario, the producer may produce messages at a higher rate than the consumer is able to consume them, resulting in an unstable system. An explicit backpressure mechanism is needed to coordinate the consumer and the producer for preventing overflow. One way to do this is using the <a href="https://github.com/jet/kafunk/blob/master/src/kafunk/Utility/BoundedMb.fs">BoundedMb</a> type which places a bound on the number of messages in the mailbox. If the bound is reached, the <strong>BoundedMb</strong> exerts back-pressure on the producer.</p><p>Beyond these nuances with exceptions and back-pressure, the <strong>MailboxProcessor</strong> programming model can lead to needless layers of indirection. In the example above, if the desired outcome is to invoke the processing logic, it is much more reasonable to simply invoke the logic directly rather than routing through the MBP. Of course the MBP can do more than simply forwarding messages, but if more complex behaviors behaviors are required, it is better to encapsulate these behaviors in a reusable data structure.</p><p>Examples of higher-level async structures that can be implemented with MBP are:</p><ul><li><a href="https://github.com/jet/kafunk/blob/master/src/kafunk/Utility/MVar.fs">MVar</a> — a serialized variable with lazy initialization, akin to a <strong>ref</strong> but with support for serialized, async-based mutation. Beware of deadlocks when mutating!</li><li><a href="https://github.com/jet/kafunk/blob/master/src/kafunk/Utility/SVar.fs">SVar</a> — like <strong>MVar</strong> but with an additional tap operation which returns an <strong>AsyncSeq</strong> of values stored.</li><li><a href="http://t0yv0.blogspot.com/2011/12/making-async-5x-faster.html">Channel</a> — synchronizes a producer and a consumer of a message. Similar in spirit to channels in Go and Concurrent ML, however without support for selective communication.</li><li><a href="https://github.com/jet/kafunk/blob/master/src/kafunk/Utility/BoundedMb.fs">BoundedMb</a> — a bounded mailbox, similar in functionality to <strong>BlockingCollection</strong>, however using <strong>Async</strong> to represent waiting. This is an effective way to include back-pressure for produce-consumer scenarios.</li><li><a href="https://github.com/fsprojects/FSharpx.Async/blob/master/src/FSharpx.Async/BatchProcessingAgent.fs">BatchProcessingAgent</a> — a buffer which forms and publishes batches of publishes messages.</li></ul><p>In many cases, it is better to rely on these data structures rather than implementing a custom MBP for a domain-specific use-case.</p><p>Another way to approach this programming model is to turn the processing logic “inside out” using <strong>AsyncSeq</strong>. First, we repurpose the MBP to act as solely as a mailbox:</p><pre><strong>let</strong> mbp : <strong>MailboxProcessor</strong>&lt;&#39;a&gt; = <br>  <strong>MailboxProcessor</strong>.Start <br>    (<strong>fun</strong> _ -&gt; async.Return())</pre><p>Then we represent the incoming messages as a stream using <strong>AsyncSeq</strong>:</p><pre><strong>let</strong> stream : <strong>AsyncSeq</strong>&lt;&#39;a&gt; = <br>  <strong>AsyncSeq</strong>.replicateInfiniteAsync mbp.Receive</pre><p>Now we can publish messages to the mailbox asynchronously, and consume the resulting <strong>AsyncSeq</strong> explicitly. This allows us to use existing operations on <strong>AsyncSeq</strong> to filter, transform and buffer the messages, it allows us to merge the stream with other streams, and represents the process explicitly as an <strong>Async</strong> operation such that we can join it with other operations:</p><pre><strong>let</strong> proc : <strong>Async</strong>&lt;unit&gt; =<br>  stream<br>  |&gt; <strong>AsyncSeq</strong>.bufferByTimeAndCount 100 100<br>  |&gt; <strong>AsyncSeq</strong>.iterAsync processBatch</pre><p>This approach makes the processing logic explicit and provides a more convenient way to handle exceptions.</p><h4>Summary</h4><ul><li>Beware of exceptions raised by processing logic used inside a <strong>MailboxProcessor</strong>.</li><li>Consider using <strong>TaskCompletionSource</strong> rather than <strong>AsyncReplyChannel</strong> to signal from within a <strong>MailboxProcessor</strong>, particularly when exceptions may be raised.</li><li>Consider using or implementing a higher-level component rather than using a <strong>MailboxProcessor</strong> for domain-specific code.</li></ul><h3>CancellationToken</h3><p>A <strong>CancellationToken</strong> is used to cancel computations in response to cancellation requests that are external to the computation itself. Several of the operations on <strong>Async</strong>, such as <strong>Async.Start</strong> and <strong>Async.RunSynchronously</strong>, are parameterized with an optional <strong>CancellationToken</strong>, such that if a cancellation is requested on that token, the computation can be notified, allowing it to terminate. There are many reasons to cancel a computation. One of the most common is to impose a timeout on a computation. More generally, the reason could be as a response to new information, invalidating the inflight computation. Care must be taken to ensure that a computation will actually respond to a cancellation request. In many cases, this is done automatically by machinery inside <strong>Async</strong> itself. For example, before each <strong>async.Bind</strong> is invoked, the cancellation token is checked. Also, calls to <strong>Async.Sleep</strong> will be cancelled as expected. However, if an async computation has a prolonged compute-bound section, the cancellation token must be checked manually.</p><p>Each async computation is bound to a <strong>CancellationToken</strong> and is accessible with <strong>Async.CancellationToken : Async&lt;CancellationToken&gt;</strong>. If a token isn&#39;t provided explicitly as described above, <strong>Async.DefaultCancellationToken</strong> is used. The default cancellation token can be cancelled by calling <strong>Async.CancelDefaultToken</strong>, however this will signal a cancellation for all computations bound to this token. To explicitly bind an async computation to a token, the token can be passed along with the computation to <strong>Async.Start</strong> or other operations.</p><p>As a convenience:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/a5ef3b97b99babb6b46b1d8dac7a9b40/href">https://medium.com/media/a5ef3b97b99babb6b46b1d8dac7a9b40/href</a></iframe><p>Note how in this case, the argument <strong>CancellationToken</strong> is linked with the ambient <strong>CancellationToken</strong>, and the linked token is passed to <strong>Async.Start</strong>. As a result, the computation will be cancelled in response to either the argument <strong>CancellationToken</strong> or in response to the ambient <strong>CancellationToken</strong>. This may not be desired in all cases.</p><p>Cancellation tokens are not a first-class concept within the <strong>Async</strong> type and require special treatment. In some cases, it is possible to use a first-class <em>selective communication</em> mechanism, or at least a best-effort attempt. What would it mean for cancellation to be first-class? A cancellation token establishes a race between two computations: the core computation at hand and the computation that represents a cancellation. For example, a timeout can be viewed as a race between a computation and a timer.</p><p>More generally, we can implement cancellations using the <strong>Async.Choice : Async&lt;&#39;a option&gt; seq → Async&lt;&#39;a option&gt;</strong> operation. Given a sequence of input computations, this operation will start all of them, return the result of the first one to complete, and cancel the others. However, cancellation is a best-effort attempt, and therefore, does not represent true selective communication. For example, if we apply <strong>Async.Choice</strong> to the <strong>Receive</strong> operations on two <strong>MailboxProcessor</strong> instances, the message received from the second one of the two to complete will be lost. A more elaborate synchronization mechanism is required to implement true selective communication wherein the message remains in the second mailbox.</p><h4>Summary</h4><ul><li>Be explicit about propagating cancellation tokens when calling <strong>Async.Start</strong> and related operations accepting a cancellation token.</li><li>Avoid calling <strong>Async.CancelDefaultToken </strong>to avoid interference with unrelated computations.</li><li>Be sure to extract and reference the ambient cancellation token via <strong>Async.CancellationToken</strong> when an computation has an extensive compute-bound section to ensure that it is properly cancelled.</li><li>Consider using <strong>Async.Choice</strong> in scenarios requiring first-class flow control.</li><li>Take note of the issue when using <strong>Async.AwaitTask</strong> on cancelled <strong>Task</strong> instances as described in the next section.</li></ul><h3>Async.AwaitTask</h3><p>The <strong>Async.AwaitTask : Task&lt;&#39;a&gt; → Async&lt;&#39;a&gt;</strong> operation translates a <strong>Task</strong> value to an <strong>Async</strong> value. Many asynchronous operations in the .NET Framework return <strong>Task</strong> and this operation is used to map them to <strong>Async</strong>. In versions of F# prior to 4.1, the implementation of <strong>Async.AwaitTask</strong> had a bug wherein cancellations to <strong>Task</strong> computations would be lost, resulting in indefinitely suspended <strong>Async</strong> computations. This would lead to difficult to find bugs in the program. Many have encountered this when using <a href="https://msdn.microsoft.com/en-us/library/system.net.http.httpclient(v=vs.118).aspx">HttpClient</a> from F#. Indefinitely suspended <strong>Async</strong> computations are a broader hazard discussed next.</p><p>Another hazard involving <strong>Task</strong> and <strong>Async</strong> is in attempting to use selective communication among them. For example, suppose you&#39;ve a component such as a <strong>Socket</strong> or state representing a node&#39;s view of a cluster. We can represent the state of this component using a <strong>TaskCompletionSource</strong> which is set to the <strong>Completed</strong> state when the component is closed, or to the <strong>Faulted</strong> state when the component fails. Suppose also that you&#39;ve component-dependent <strong>Async</strong> operations, such as sends and receives. We&#39;d like to cancel an in-flight operation whenever the component is closed or faulted, so that they can be retried on a new component instance. This calls for selective communication - we&#39;d like to select between awaiting the completion of an operation or the closing of a resource. More precisely, we&#39;re looking for a function of type <strong>chooseTaskOrAsync : Task&lt;&#39;a&gt; → Async&lt;&#39;a&gt; → Async&lt;&#39;a&gt;</strong> where the first argument would correspond to the component state and the second to the operation. If the component is closed, we&#39;d like to raise an exception, and to do that, we could use <strong>Task.ContinueWith</strong>. However, since for each instance of a component we might have a large number of component-dependent operations, we&#39;d add a large number of continuations to the <strong>Task</strong> corresponding to the component. If those continuations aren&#39;t properly cleaned up, we end up with a memory leak. The <strong>Task.WhenAny</strong> operation on the other hand ensures that orphaned continuations are properly cleaned up and allows us to avoid a memory leak.</p><h4>Summary</h4><ul><li>Ensure that you’re using a correct implementation of <strong>Async.AwaitTask</strong> to await <strong>Task</strong> instances which may be cancelled.</li></ul><h3>Indefinite Suspension</h3><p>Nothing in the <strong>Async</strong> type ensures that the computation terminates. It is possible to impose a timeout, as described in the previous section, but this isn&#39;t done automatically. As a result, it is quite possible to end up with an async computation that never terminates, causing an indefinite suspension in the program. On the one hand, this accurately depicts the nature of asynchrony, but on the other hand, it can lead to some adventurous bug hunting.</p><p>A helpful operation to impose timeouts is as follows:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/7e616249457c5635bc42d27bbfa45e66/href">https://medium.com/media/7e616249457c5635bc42d27bbfa45e66/href</a></iframe><p>This operation can be applied onto top-level handler functions where it isn’t certain whether internal operations take care of timeouts, but where there is an evident upper bound on the time the operation should require. Of course some computations are deliberately non-terminating, such as a heartbeating process, for example. In this case, timeouts aren’t needed, and it may be helpful to explicitly signal this fact by returning a constructor-less <strong>Void</strong> type from the computation.</p><h4>Summary</h4><ul><li>Consider imposing a limit on the duration of an async computation.</li><li>Take care to propagate all forms of completion for an async computation, including errors and cancellations.</li></ul><h3>Laziness</h3><p>While F# is, by default, eagerly evaluated, <strong>Async</strong> computations are <a href="https://en.wikipedia.org/wiki/Lazy_evaluation">lazy</a>, albeit with important exceptions. Laziness implies that simply having a reference to an <strong>Async</strong> computation does not imply that that computation is running. This is in contrast to <strong>Task</strong>, for example, which usually represents a computation which is already running. In addition, unlike lazy evaluation in languages like Haskell, <strong>Async</strong> computations are not memoized, which means they will be reevaluated each time they are run. This is again in contrast to <strong>Task</strong>, which is idempotent - once it completes, the produced value is memoized. The lazy nature of <strong>Async</strong> is evident through the <strong>async.Delay : (unit → Async&lt;&#39;a&gt;) → Async&lt;&#39;a&gt; </strong>operation which takes a function producing an async value, and represents it as an async value. The function will be evaluated each time the <strong>Async</strong> computation is evaluated. The <strong>Delay</strong> operation is used as part of a syntactic transformation of an async workflow, making everything inside an async block lazy. However, it is also possible to explicitly memoize an Async computation and it is impossible to determine whether a given async computation is memoized or not. For example, an Async computation can be memoized by using a <strong>TaskCompletionSource</strong> to store its result:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/7ebfe954b4bac2e96804ef17f441e046/href">https://medium.com/media/7ebfe954b4bac2e96804ef17f441e046/href</a></iframe><p>Another example where an <strong>Async</strong> computation is a already in flight is the result of the <strong>Async.StartChild : Async&lt;&#39;a&gt; → Async&lt;Async&lt;&#39;a&gt;&gt;</strong> operation. When the outer <strong>Async</strong> computation is bound, the input computation is started, and the inner <strong>Async </strong>computation is a handle to the started computation, which when bound, awaits its result. Awaiting the inner computation multiple times does not reevaluate the input computation.</p><p>The (mostly) lazy nature of Async can lead to unexpected results. For example, suppose you want to run two Async computations in parallel, and be notified when the first one completes, but also be able to retrieve the result of the second computation once it completes. Using the <strong>Async.choose</strong> operation as defined above would cause the second computation to be cancelled. If the calling code were to await its result, the computation would be reevaluated. Instead, the following operation might be better suited to this task:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/3dddfd61c6a2a77400ff053f55c52a1a/href">https://medium.com/media/3dddfd61c6a2a77400ff053f55c52a1a/href</a></iframe><p>The <strong>Async.race</strong> operation explicitly memoized the result of the second computation. We can compare this with the <strong>Task.WhenAny</strong> operation which will also returns the first computation to complete, however the other computations are not cancelled and can still be awaited by the caller.</p><h3>Thread Local Storage</h3><p>As described in the Thread Pool section, async computations aren’t bound to specific threads, and a given workflow may execute across several thread pool threads throughout its lifecycle. As such, the <a href="https://docs.microsoft.com/en-us/dotnet/standard/threading/thread-local-storage-thread-relative-static-fields-and-data-slots">Thread Local Storage (TLS)</a> mechanism can’t be used to store contextual data for a workflow. However, cross-cutting concerns often require a notion of workflow-local storage, for example to store a tracing context. Even though this mechanism isn’t provided out of the box, it is possible to implement it explicitly by building a workflow for the following type:</p><pre><strong>type</strong> Context = <strong>Dictionary</strong>&lt;string, obj&gt;</pre><pre><strong>// </strong>An async computation explicitly<br>// depending on a context<br><strong>type</strong> AsyncEnv&lt;&#39;a&gt; = <strong>Context</strong> <strong>→</strong> <strong>Async</strong>&lt;&#39;a&gt;</pre><p>This type can be treated in the same way as the existing <strong>Async</strong> type by implementing a computation workflow, however it can also provide operations for reading and writing into the context. In fact, the existing <strong>Async</strong> type already stores the ambient <strong>CancellationToken</strong> in its context and it should be possible to extend the implementation to support arbitrary data items. Note that workflow context should be used judiciously as it can lead to unexpected results and leaks.</p><h4>Summary</h4><ul><li>Don’t rely on thread-local storage from within Async computations.</li><li>If you need workflow-local storage, consider implementing a extended Async computation workflow.</li></ul><h3>Related Programming Models</h3><p>In this section, we compare the <strong>Async</strong> type to similar concepts in .NET and other programming languages.</p><h4>.NET System.Threading.Tasks.Task</h4><p>The <a href="https://msdn.microsoft.com/en-us/library/dd321424(v=vs.110).aspx">System.Threading.Tasks.Task</a> type in the .NET Framework serves a very similar purpose to <strong>Async</strong>. It also represents a computation that eventually produces a value. <strong>Async</strong> has operations to map to and from <strong>Task</strong>. However, there are some important differences. First, a <strong>Task</strong> is idempotent (monotonic): once it produces a value, the task is completed and will no longer perform additional computation. <strong>Async</strong> on the other hand can be evaluated many times. It is possible to cache the result of an <strong>Async</strong> computation, however this must be done explicitly. Second, in most cases, a <strong>Task</strong> represents an in-progress computation, whereas an <strong>Async</strong> represents a computation which must be explicitly evaluated. The <strong>Task.ContinueWith</strong> operation is similar to <strong>async.Bind</strong> - it binds a continuation to the result of the computation. Since <strong>Task</strong> is monotonic and idempotent, it is important to note that <strong>Task.ContinueWith</strong> adds the continuations to a list in the target computation, whereas <strong>async.Bind</strong> returns a copy of the workflow which will be reevaluated. As a shoutout to the monad people, <strong>Task.ContinueWith</strong> is actually the comonadic <strong>extend</strong> operation, whereas <strong>async.Bind</strong> is the monadic <strong>bind</strong> operation. <strong>Task</strong> has the additional <strong>Unwrap</strong> operation corresponding to the monadic <strong>join</strong>. It is possible to map between <strong>Async</strong> and <strong>Task</strong> using the <strong>Async.StartAsTask</strong> and <strong>Async.AwaitTask</strong> operations. In F# this is commonly done to interact with existing C# libraries, or to take advantage of <strong>Task</strong> in scenarios where it is a better fit.</p><h4>Java java.util.concurrent.Future</h4><p>The <a href="https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/Future.html">Future</a> type in Java is essentially the same as the <strong>Task</strong> type above.</p><h4>Akka</h4><p><a href="https://akka.io/">Akka</a> is an actor framework for the JVM. It is heavily inspired by Erlang, and in addition to the actor model itself, provides facilities for routing, fault tolerance and distribution. As described in the MailboxProcessor section, the actor model is too low-level for many use-cases, making it <a href="http://pchiusano.blogspot.com/2013/09/actors-are-overly-nondeterminstic.html">easy</a> to make mistakes. To that end, Akka also provides a <a href="https://doc.akka.io/docs/akka/2.5/futures.html">Future</a> type to express request-reply interactions. The <a href="https://www.nuget.org/packages/Akka.FSharp/">FSharp.Akka</a> library is a wrapper for the Akka.net port of Akka.</p><h4>Go Goroutine</h4><p>A <a href="https://gobyexample.com/goroutines">Goroutine</a> is very similar to F# Async. The Go concurrency model is heavily inspired by CSP, and in addition to goroutines, it includes <a href="https://gobyexample.com/channels">channels</a>. A channel is a junction across which goroutines can exchange messages. The <a href="https://gobyexample.com/select">select</a> statement provides selective communication amongst channels. Note that selective communication is not an entirely trivial concept.</p><h4>JavaScript Promise</h4><p>A JavaScript <a href="https://developers.google.com/web/fundamentals/primers/promises">promise</a> is essentially the same thing as <strong>Task</strong> and <strong>Future</strong>, and also similar to <strong>Async</strong>. NodeJS users are familiar with the pain of callback-style programming, and JavaScript promises adapt it to the more convenient sequential flow control style.</p><h4>Haskell Control.Concurrent.Async</h4><p>The Haskell <a href="https://hackage.haskell.org/package/async-2.2.1/docs/Control-Concurrent-Async.html">Async</a> type is a thin layer atop the IO monad and is very similar to the F# Async type. There are additional constructs in the Control.Concurrent namespace, such as <strong>MVar</strong>, <strong>IVar</strong> and <strong>Chan</strong>. <strong>IVar</strong> is essentially <strong>TaskCompletionSource</strong> and <strong>MVar</strong> is described above in the <strong>MailboxProcessor</strong> section. <strong>Chan</strong> is similar to channels in Go and Concurrent ML. In addition, Haskell has other concurrent programming models such as Software Transactional Memory (STM) and <a href="https://hackage.haskell.org/package/transactional-events">Transactional Events</a>. Simon Marlow&#39;s book <a href="https://simonmar.github.io/pages/pcph.html">Parallel and Concurrent Programming in Haskell</a> offers a wealth of information on concurrent programming in Haskell.</p><h4>Concurrent ML</h4><p><a href="https://en.wikipedia.org/wiki/Concurrent_ML">Concurrent ML</a> is a concurrency library for the ML programming language. The <strong>Event</strong> construct is very similar to F# Async, however at closer inspection it supports a richer set of operations. In particular, Event and the accompanying Channel construct in ML support selective communication. Selective communication forms a proper disjunction between computations, committing to one and ensuring the other is not committed to. Hopac is an implementation of Concurrent ML in F#, with a vast array of operations and types. In essence, it is an implementation of the pi-calculus.</p><h4>Joinads</h4><p><a href="http://tomasp.net/academic/papers/joinads/">Joinads</a> is a research extension of F# based on the <a href="https://en.wikipedia.org/wiki/Join-calculus">join-calculus</a> programming model. Joinads also include a syntactic construct extending the existing <strong>match</strong> syntax in F#, allowing the expression of join patterns among multiple channels. This provides a richer and more convenient set of synchronization mechanisms beyond F# Async - in particular, selective communication. <a href="https://fslang.uservoice.com/forums/245727-f-language/suggestions/5663965-integrate-joinads-extension">With any luck</a>, the programming model will make it into the core F# language at some point.</p><h4>Hopac</h4><p><a href="https://github.com/Hopac/Hopac">Hopac</a> is an implementation of Concurrent ML in F#. It provides a much richer set of operations than the F# Async type, in particular for selective communication. It is also more efficient than F# Async or Task for many workloads. In addition, the library is accompanied by a <a href="https://github.com/Hopac/Hopac/blob/master/Docs/Programming.md">wealth of documentation</a> which is useful for programmers in any language.</p><h4>Clojure Async</h4><p>F# Async is similar to and is <a href="http://clojure.com/blog/2013/06/28/clojure-core-async-channels.html">motivated</a> by many of the same reasons that <a href="https://github.com/clojure/core.async/">Clojure Async</a> is.</p><h3>Concepts</h3><p>This section is a narrative on concepts of concurrent and parallel programming used throughout the post.</p><h3>Concurrency &amp; Parallelism</h3><p>Concurrency refers to the absence of ordering information among events. In other words, given two events, if we don’t know which came first, we call the events <em>concurrent</em>. Furthermore, even if we impose a total order on the events in the system, operations, consisting of an invocation and completion event, are regarded as concurrent when they overlap. Even though one operation may start before the other, overlap in their spans makes the ordering between operations a partial order. Concurrent programming refers to programming in the face of absence of ordering information among some subset of events in the system. Various <a href="https://en.wikipedia.org/wiki/Concurrency_(computer_science)">models of concurrency</a> have been developed in order to better understand the semantics of concurrency and/or to provide a programming model suited to concurrent domains. We shall discuss a few of these models and relate them to F#.</p><p>One model of concurrency from the <a href="https://en.wikipedia.org/wiki/Process_calculus">process calculi</a> family is called<a href="https://en.wikipedia.org/wiki/Communicating_sequential_processes"> Communicating Sequential Processes </a>(CSP). CSP models a concurrent system as a collection of independent, sequential processes (i.e. threads) which interact at explicit junctions. An interaction event is a point of synchronization between processes, allowing the exchange of information. Another model of concurrency is the <a href="https://en.wikipedia.org/wiki/Actor_model">actor model</a> wherein actors, which are sequential threads of control, are a core computational primitive. Both processes in CSP and actors in the actor model interact using explicit message passing, rather than through shared memory, such as in the <a href="https://en.wikipedia.org/wiki/Parallel_random-access_machine">PRAM</a> model. Note however that this distinction between shared memory and message passing becomes blurred since interactions with shared memory can also be modeled using message passing. Indeed, it takes a non-negligible amount of time to send a read request across the memory bus, and moreover, modern memory systems rely on <a href="https://en.wikipedia.org/wiki/Cache_coherence">cache coherence</a> protocols in order to provide consistent guarantees. Both CSP and the actor model are notable because they’ve been very influential in the design of programming models for concurrency. The actor model is well known through the Erlang programming language, or the Akka actor framework on the JVM. CSP influenced the Concurrent ML programming model as well as the concurrency model in Go.</p><p>In .NET, we’ve the <a href="https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/threading/thread-synchronization">fundamental synchronization primitives</a> which include locks, synchronization events, wait handles, interlocked operations, etc. A lock or mutex, for example, facilitates interaction among threads by delimiting a section of code — called the critical section — that can only be accessed by one thread at a time, providing <a href="https://en.wikipedia.org/wiki/Mutual_exclusion">mutual exclusion</a>. Multiple threads can execute a critical section, but just one at a time, which makes it much easier to reason about memory access and mutation. Synchronization events also facilitate interaction among threads by allowing one thread to wait on a signal from another thread or process. Interlocked operations are essentially locks at the hardware level. The introduction of <a href="https://docs.microsoft.com/en-us/dotnet/standard/collections/thread-safe/">concurrent collections</a> in .NET provided access to the higher-level <a href="https://en.wikipedia.org/wiki/Producer%E2%80%93consumer_problem">producer-consumer</a> pattern. The <a href="https://msdn.microsoft.com/en-us/library/dd449174(v=vs.110).aspx">TaskCompletionSource</a> type is similar to a synchronization event, however the signal can be accompanied by data, and waiting is expressed using the <a href="https://msdn.microsoft.com/en-us/library/dd321424(v=vs.110).aspx">Task</a> type.</p><p>In F# we also have the <strong>MailboxProcessor</strong> (MBP) which, as alluded by the name consists of a mailbox and a processor. The mailbox can be posted to and received from, and the processor is a thread of control interacting with the mailbox. Semantically, the <strong>MailboxProcessor</strong> can be associated to the actor model of concurrency, though typical actor model implementations (such as <a href="https://getakka.net/">Akka.NET</a>) are accompanied by support for distribution as well as a range of facilities for routing and fault-tolerance. The MBP manages concurrency by (FIFO) ordering messages posted to the mailbox. The thread of control processes a single message at a time without any need to consider parallelism in the implementation as only a single message is processed from the queue at any point in time. MBPs are particularly useful for implementing higher-level constructs such as producer-consumer queues, buffers, channels, etc.</p><p>Concurrency and parallelism are related notions and are often used interchangeably. However, upon a closer inspection, their relationship is more of a duality. Parallelism is the idea of launching operations to be run in parallel. This in turn results in events, generated by those operations, which are concurrent, because ordering information is absent. Concurrency, on the other hand, typically refers to synchronization among concurrent events. Speaking loosely, parallelism generates disorder and concurrency synchronizes it. As an example, the <strong>Async.Parallel</strong> operation involves both - it first parallelizes the input computations, but then it synchronizes the parallel computations into a single converged result.</p><h3>Asynchronous &amp; Synchronous</h3><p>The Async type is so called because it enables controlled use of <a href="https://en.wikipedia.org/wiki/Asynchrony_(computer_programming)">asynchrony</a> by decoupling the invocation of an action from the handing of its result, while retaining sequential flow control. Asynchrony allows for more efficient use of threads, as well as for expression of parallelism and concurrency. A related notion is that of an <a href="https://www.quora.com/What-is-an-asynchronous-network-in-the-context-of-distributed-systems">asynchronous network</a> wherein there is no bound on message transmission delay. The underlying substrate is that of asynchrony — the event that represents a message being transmitted is decoupled from the event representing receipt or completion, resulting in temporal decoupling. However, complete asynchrony wouldn’t be of much use without synchronization. In terms of events, synchronization is the act of combining multiple events into one. For example, an interaction between two processes can be represented by two events, one at each process. In the theory of concurrency this is known as <a href="https://en.wikipedia.org/w/index.php?title=Synchronous_rendezvous&amp;redirect=no">synchronous rendezvous</a>. In .NET, <strong>TaskCompletionSource</strong> is a way to implement a form of rendezvous between threads, with one thread waiting for a value and another signaling the value. In Go and Hopac, for example, channels are used as a rendezvous mechanism. It should be noted that synchronization requires coordination among participants. This can be costly in the context of a single process and even more so across network boundaries. As such, systems should be designed to be asynchronous to the extent possible, but with principled use of synchronization where it is required, keeping locality in mind.</p><h3>Selective Communication</h3><p>Selective communication is a concept involving channels, as seen in Go, Haskell, Concurrent ML, and F# Hopac. Selective communication is the idea of selecting a message from a set of channels, picking the first one to produce a message, while leaving the others intact. A critical component of selective communication is that only one channel is picked and received from, with the others left intact. Simply invoking a receive operation from multiple channels in parallel doesn’t quite do the trick since it may cause multiple channels to dequeue a message where only one will be received by the caller. F# Async doesn’t provide a selective communication mechanism out of the box. More broadly in .NET, we’ve the <strong>BlockingCollection.TakeFromAny</strong> operation, but of course <strong>BlockingCollection</strong> uses blocking as its synchronization mechanism. The need for selective communication is quite common. Whenever a choice needs to be made among a set of possible events, there&#39;s a need for selective communication. In this sense, selective communication is the dual to parallelism. However, selective communication is typically implemented in ad-hoc ways; in .NET it is usually done using <strong>CancellationToken</strong>.</p><p>See also: <a href="https://github.com/Hopac/Hopac/blob/master/Docs/Programming.md">The Hopac Programming Manual</a>.</p><h3>Acknowledgements</h3><p>Thanks to <a href="https://github.com/gusty">Gustavo Leon</a>, <a href="https://github.com/eiriktsarpalis">Eirik Tsarpalis</a>, <a href="https://github.com/bartelink">Ruben Bartelink</a> and many others at Jet for comments, edits, suggestions.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=eb3c8a2d180a" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Scaling Event-Sourcing at Jet]]></title>
            <link>https://medium.com/@eulerfx/scaling-event-sourcing-at-jet-9c873cac33b8?source=rss-a8651abcb80a------2</link>
            <guid isPermaLink="false">https://medium.com/p/9c873cac33b8</guid>
            <category><![CDATA[event-sourcing]]></category>
            <category><![CDATA[database]]></category>
            <category><![CDATA[distributed-systems]]></category>
            <category><![CDATA[distributed-tracing]]></category>
            <dc:creator><![CDATA[Leo Gorodinski]]></dc:creator>
            <pubDate>Tue, 24 Oct 2017 17:45:51 GMT</pubDate>
            <atom:updated>2017-10-27T13:10:26.938Z</atom:updated>
            <content:encoded><![CDATA[<p>At Jet, we’ve been using event-sourcing since the very beginning and learned some lessons along the way. There are several dimensions along which we had to scale our event-sourcing platform. The one which most teams using event-sourcing have to overcome early on is scaling reads — as streams increase in size it becomes prohibitive to read the entire stream to perform an operation. Another dimension of scaling is redundancy — in order to function continuously, the platform needs to tolerate not only failures of individual machines in a data center, but failures of an entire data center. The projection system needs to be scaled to support a growing number of consumers with varying workloads. Meanwhile, as the number of moving parts increases, it becomes essential to verify safety and liveness guarantees advertised by the system. Of course in time, the benefits of a highly modular architecture afforded by event-sourcing start to weigh on our ability to obtain accurate pictures of system states. To that end, we need a tracing platform which, in addition to request-reply, must support tracing of asynchronous interactions. All things considered, the challenges of operating an event-sourcing platform are noteworthy, but its sound foundational principles continue to pay dividends as we evolve.</p><h3>Event Sourcing</h3><p>There are a few definitions of event sourcing floating around. <a href="https://martinfowler.com/eaaDev/EventSourcing.html">Martin Fowler’s</a> is perhaps the most cited one and it states that:</p><blockquote>Event sourcing is a paradigm where changes to application state are recorded as a series of events.</blockquote><p>To make this more concrete, it is helpful to model applications using <a href="https://en.wikipedia.org/wiki/Input/output_automaton">IO automata</a>. An IO automaton is defined as a set of states, a special starting state, a set of input events, a set of output events and a transition function taking pairs of state and input event to pairs of state and output event:</p><ul><li><strong>State </strong>| — a set of states.</li><li><strong>S</strong>∅ | — a starting state.</li><li><strong>Input </strong>| — a set of inputs.</li><li><strong>Output</strong> | — a set of outputs.</li><li><strong>τ </strong>:<strong> State </strong>×<strong> Input </strong>→<strong> State </strong>×<strong> Output</strong> | — a transition function.</li></ul><p>The hosting service manages state as well as interaction with input and output mediums. During an <em>operation</em>, the service receives an input event from the input medium, retrieves state corresponding to the input, executes the transition function, persists the resulting state and sends the output event to the output medium. A service typically manages multiple state machines concurrently— one for each entity (<a href="https://martinfowler.com/bliki/DDD_Aggregate.html">aggregate</a>) in the system.</p><blockquote>For example, consider the shopping cart system. <strong>State</strong> correspond to states of individual shopping carts, consisting of a list of items, prices and promo code information. <strong>Input</strong> corresponds to requests to perform actions on the cart, such as adding items, or checking out. <strong>Output </strong>corresponds to changes in the cart, such as items being added or removed. Finally, the transition function <strong>τ </strong>encodes the logic for handling requests on a given a cart.</blockquote><p>Event-sourcing makes the observation that rather than persisting state, we can persist the output events. An instance of state can be reconstituted by running a <a href="https://en.wikipedia.org/wiki/Fold_(higher-order_function)">fold</a> over past outputs using a <em>delta</em> function:</p><pre><strong>Δ : State </strong>×<strong> Event </strong>→<strong> State </strong>|—<strong> </strong>defines how an event changes state.</pre><p>We assign <em>sequence numbers</em> to output events and define the sequence number of an instance of state as the sequence number of the last event used to derive it. The transition function defined above can be factored into a delta function and an <em>execute</em> function:</p><pre><strong>ε : State </strong>× <strong>Input </strong>→ <strong>Event </strong>|— takes inputs to outputs at a state.</pre><p>In order to run an event-sourced service, we need a storage mechanism that can store event <em>streams</em> for each entity in the system. These capabilities can be summarized as follows:</p><pre><strong>get : StreamId </strong>×<strong> SN → Events </strong>|— returns events in a stream.</pre><pre><strong>add : StreamId </strong>×<strong> SN </strong>×<strong> Events → Ack </strong>|— adds new events to a stream.</pre><p>The <strong>get</strong> operation returns the set of events in a stream starting at the specified sequence number <strong>SN</strong>. The <strong>add</strong> operation appends a set of events to a stream at a specified sequence number. If the sequence number does not match the stream an error is returned — <a href="https://en.wikipedia.org/wiki/Optimistic_concurrency_control">optimistic concurrency control</a>.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/509/1*0L-acJa6KEgxh3y95z4YxA.png" /><figcaption>Figure 1: Event stream index.</figcaption></figure><p>In addition, the event store should also provide access to the <em>log</em> of all events in a collection:</p><pre><strong>log : LSN</strong> <strong>→ Events </strong>|— returns all events in a partition.</pre><p><strong>LSN</strong> herein refers to a logical point in the log of <em>all</em> events in a collection, and it may be a sequence number, or a more complex structure such as a vector if the log is partitioned. The <em>log</em> enables service orchestration — downstream services can perform operations in response to events in an upstream system, or the upstream system itself can be replicated — <a href="https://en.wikipedia.org/wiki/State_machine_replication">state-machine replication</a>. Moreover, the <em>log</em> allows for communication to be consistent with respect to the state — events are used to reconstitute state and notify downstream systems of changes in the upstream system. Without a log, care must be taken to prevent missed communications, or communications with respect to uncommitted states.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/236/1*1Y1ivYhfkk5SuoHo6iEvtA.png" /><figcaption>Figure 2: Event log.</figcaption></figure><p>The collection of <em>streams</em> depicted in Figure 1 is an index of the <em>log</em> depicted in Figure 2.</p><p>One data store that has these capabilities is <a href="https://eventstore.org/">EventStore</a> and it is used for many systems at Jet. With this definition of event-sourcing in mind, we can characterize the different ways that we’ve scaled our event-sourcing platform.</p><h3>Scaling Reads</h3><p>Recall that the <strong>get</strong> function, defined above, returns the events in a stream starting at a specified sequence number. Throughout the lifetime of a system, streams can get arbitrarily large, eventually making reads of the entire stream prohibitive during an operation. A common way to scale reads with event-sourcing is using a technique called a <a href="https://cqrs.files.wordpress.com/2010/11/cqrs_documents.pdf">rolling snapshot</a>. A snapshot captures the state of a stream at a particular point in time and is constructed using the <em>delta</em> function defined above. Then, only events occurring after this point in time need to be read in order to reconstitute the last known state.</p><blockquote>Snapshotting is not to be confused with distributed snapshots using a <a href="https://en.wikipedia.org/wiki/Snapshot_algorithm">snapshot algorithm</a>— a different, albeit somewhat related notion. A snapshot algorithm approximates a global state of a distributed system, and is most often used for asserting stable properties, such as termination or deadlock.</blockquote><p>Snapshots can be managed in a few different ways. A service can persist a snapshot when performing operations. Alternatively, snapshots can be generated by a downstream services consuming the <em>log</em>. Snapshots can be generated for every event, or based on an interval. Snapshots can be stored in another stream, or in an entirely different data store. (Snapshots can also be stored alongside events in a stream, however this requires an ability to read streams backwards, couples the snapshotting interval to the read access pattern, and interleaves state — a particular interpretation of events — with the events themselves). Before an operation is performed, a snapshot can be read, followed by a read of any remaining events to reconstitute the latest state. Alternatively, the operation can be performed speculatively with respect to the retrieved state, relying on the optimistic control of the event-store to ensure consistency of the underlying stream. By regarding state snapshots as an optimization mechanism rather than a core storage pattern we had flexibility in terms of how reads could be scaled, all while retaining the entire history of events.</p><h3>Scaling Projections</h3><p>In the context of event-sourcing, a projection refers to a running fold of events. In essence, a projection embodies a state-machine whose inputs are events from the <em>log</em>. Its output events may form another stream, or the projection may be used solely for computing a state. A projection may rely on state, or it may be stateless. One type of projection is a filter — it forms a stream of events matching a predicate. Another is a transformer — it either enriches events, or translates them into another form. Since projections are state-machines, they can also perform aggregations and joins.</p><pre><strong>π </strong>: <strong>State</strong> × <strong>Event</strong> → <strong>State</strong> × <strong>Output</strong></pre><p>The EventStore projection system is quite handy and has several built-in filters, such as the stream prefix filter just described, an event-type filter and a projection running custom JavaScript. An issue with EventStore projections is that they haven’t worked well on a cluster. As such, the first step to scaling the projection system was running projections on a replica EventStore instance, downstream from the cluster. This instance could run as a single node and its sole purpose would be to generate and distribute projections. An asynchronous replication service would consume the <em>log</em> to populate the projection node.</p><p>Another issue using EventStore for projections is that its <em>log</em> isn’t partitioned, and as such, the single reader thread becomes a bottleneck. To scale the projection system, we introduced <a href="https://kafka.apache.org/">Kafka</a> as the distribution medium. A service executes projection state machines, and emits outputs to Kafka topics. This service can run filtering projections as defined above, but it can also run more complex transformations. For example, a projection can be defined to translate between internal and external contracts of a system. Stream snapshots can also be computed using a projection.</p><p>Kafka serves well as a distribution medium, however we don’t rely on it as a source of truth. The projected topics have a retention policy, and architecturally, the projection system is designed to tolerate failures in the Kafka cluster, either by reading the upstream event store, awaiting rehydration of the cluster or failing over to another region as described next.</p><h3>Geo-Replication</h3><p>Another dimension of scaling is redundancy — continuous operation becomes increasingly critical as the business grows. Redundancy of storage systems is typically achieved using <a href="https://en.wikipedia.org/wiki/State_machine_replication">state-machine replication</a> wherein data is replicated across a cluster, tolerating failures of some number of machines. It is quite common for database products to support clustering within a data center. It is much less common for database products to support clustering both within a datacenter and <em>between</em> data centers. This isn’t simply a matter of growing a cluster to include nodes across different data centers — the latency differences between a LAN and a WAN must be taken into account and reflected in the replication protocol.</p><p>EventStore runs as a cluster using a synchronous replication protocol which ensures consistency, or more precisely <a href="https://en.wikipedia.org/wiki/Linearizability">linearizability</a>, among a quorum of nodes. A clustered mode of operation is essential in a cloud environment where individual VMs are routinely restarted for maintenance. EventStore however does not support cross datacenter clustering, which we’ve had implement as a bolt-on component. Since EventStore exposes the underlying <em>log</em>, this was possible.</p><p>The bolt-on design augmented a single-region system with asynchronous cross-region replication. Since cross-region replication is asynchronous, there is a possibility of data loss, which was taken as acceptable in during regional failures. However, the regional failover and fail-back processes still need to take the system through consistent states.</p><p>Consider a system with two regions — a primary and a secondary. Each region contains a cluster, and there is an asynchronous replication channel from the primary to the secondary. The primary region accepts all writes. The secondary region can’t accept writes — this would result in conflicts — but its <em>log</em> can be consumed by downstream systems, including a projection system. During a failure in the primary region, the secondary region can be turned into a primary, re-routing all writes to it. At this point, the system can continue to operate, though possibly in a compromised state. For example, a failure to the secondary region cannot be tolerated. Moreover, some downstream systems may only operate in the primary region and must therefore await its recovery.</p><p>In order to fail-back and recover the primary region the asynchronous replication channel must be reversed and directed into a suitable replica. The logs between the primary and secondary regions may have diverged and conflicts would result if replication is reversed into the original primary. A suitable replica can be obtained by restoring a backup of the secondary region in the primary region and then reversing the asynchronous replication channel.</p><p>A more graceful way to achieve the fail-back is to extend the chain to replicate from the secondary region back to the primary region, but into a 3rd replica cluster. This makes it possible to fail-back to the 3rd replica in the primary region — turning the tail into a head. Meanwhile, a new tail can be bootstrapped resulting in a continuous rotation of the chain. This design provides a tradeoff between the costs of operating a 3rd cluster and recovery time.</p><blockquote>In essence, this design is akin to <a href="https://www.cs.cornell.edu/home/rvr/papers/OSDI04.pdf">chain replication</a>. In chain replication, nodes are organized into linearly ordered chains, wherein a head node accepts writes, which are propagated across the chain, the last node of which is the tail node. Reads can be served by any node in the chain, depending on recency and availability needs. Reads in our case are reads of the log performed by the projection system.</blockquote><p>The following diagram depicts the architecture of the Jet event-sourcing platform:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*d88qGuWsdPRotLH3UBRFIg.png" /><figcaption>Figure 3: Jet Event-Sourcing Platform</figcaption></figure><p>The diagram illustrates the chain paradigm described earlier. The head of the chain is in the primary region and it accepts all writes and reads used for writes. The secondary region hosts the middle of the chain, and an F# service asynchronously replicates evens from the head. A third replica is again hosted in the primary region. The projection system is situated downstream from a replica in each region, and because it is asynchronous by nature, it doesn’t need to consume the <em>log</em> of the head node. In the diagram, the projection system emits outputs to Kafka, however it can just as well emit outputs to another system. Moreover, we can rely on <a href="https://docs.confluent.io/current/streams/index.html">Kafka’s streaming component</a> to form downstream systems.</p><p>A nice property of this architecture is that it allows downstream systems to inherit geo-replication capabilities from the event store. For example, Kafka is not a geo-replicated system, and while <a href="https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=27846330">tools exist </a>to make it so, it is much easier to reason about the system if the source-of-truth itself is geo-replicated. Moreover, rather than geo-replicating each downstream component using its own replication mechanism, all components can piggy back on a single platform. In addition to Kafka, we’ve used this approach to add geo-replication to several other systems, including SQL databases, ElasticSearch clusters, Redis caches, etc.</p><h3>Consistency Verification</h3><p>In an asynchronous system, independent parts operate independently, which also means they fail independently. The regional replication system and the projection system are both asynchronous systems, and we needed a way to monitor their consistency with respect to the upstream event store. We did this by building an out-of-band verification system, which would compare a <em>log</em> to a downstream system. One configuration of the system compares EventStore logs, and another compares an EventStore log to a Kafka topic. The system checks to make sure that:</p><ul><li>All applicable events are transmitted</li><li>That they’re transmitted in the correct order</li><li>And with an expected latency</li></ul><p>This verification system helped us find bugs in our bolt-on projection and replication systems, in EventStore as well as our Kafka client <a href="https://github.com/jet/kafunk">Kafunk</a>. Moreover, it provides continuous monitoring of safety and liveness properties.</p><h3>Distributed Tracing</h3><p>Understanding and debugging systems involving multiple nodes is difficult. Understanding and debugging systems involving multiple nodes and asynchronous interactions is even more difficult. As a result, distributed tracing in an event-sourced system is particularly important, and unlike many existing tracing platforms, it must support tracing of asynchronous interactions. The verification system described above provides a degree of confidence in the system. However, it doesn’t provide the level granularity and scope sufficient for all scenarios. For example, we may wish to inspect the handling of a particular external request across various systems. The verification system can tell us that all events are suitably replicated, but it doesn’t record information about particular <em>traces</em>. A trace is a collection of events associated with a key, and the events denote domain-specific system state changes. The trace key is a unique id, typically generated at a system boundary, and propagates across communication mediums in accordance with the tracing protocol. The tracing system collects and indexes tracing events.</p><h3>Ongoing Work</h3><ul><li><strong>Cool Storage </strong>— while we’ve scaled reads as described above, the issue of ever-growing streams remains. A cool storage mechanism archives older events into cheaper storage mediums.</li><li><strong>Projections using Azure functions</strong> — the projection system can reference <a href="https://azure.microsoft.com/en-us/services/functions/">Azure functions</a> to support execution of arbitrary logic. While care must be taken to ensure that the resulting system is well-behaved, we can expand the scope to allow declarative definitions of microservices</li><li><strong>Event-Sourcing Engine</strong> — while we’ve gotten quite far with EventStore, we’ve set out to build a replacement event-sourcing data store to continue to meet our scaling demands. With this data store, we’re looking to have built-in support for geo-replication.</li><li><strong>Causally-consistent Geo-replication</strong> — as noted above, geo-replication is asynchronous and therefore susceptible to data loss. For some operations, we would like to synchronously replicate events before acknowledgement. This would provide <a href="https://en.wikipedia.org/wiki/Causal_consistency">causal consistency</a> with respect to individual streams across regions.</li></ul><h3>Conclusion</h3><p>Event-sourcing is founded on sound principles, and while there certainly are challenges to building such a platform — as evidenced herein — the benefits outweigh the risks. A notable benefit for the systems engineer is the stability of the architecture —it is possible to scale individual components without changing the core. Teams can build their systems autonomously, but also integrate seamlessly when required. From a theoretical standpoint, the log at the heart of event-sourcing allows disparate components to reach <a href="https://medium.com/@eulerfx/universality-of-consensus-feceead50641">consensus</a> in a <a href="https://en.wikipedia.org/wiki/Non-blocking_algorithm">non-blocking manner</a>. Moving forward, we will continue to enhance the event-sourcing platform to continue meeting the demands of a world-class shopping experience at Jet.</p><h3>Acknowledgements</h3><p>The event-sourcing platform was made possible by efforts of many individuals across several teams at Jet.</p><p><strong>Contributors</strong>: <a href="https://twitter.com/coledutcher">Cole Dutcher</a>, <a href="https://www.linkedin.com/in/andrew-duch-31727421/">Andrew Duch</a>, <a href="https://twitter.com/egerhardess">Erich Ess</a>, <a href="https://twitter.com/eulerfx">Lev Gorodinski</a>, <a href="https://twitter.com/ScottHavens">Scott Havens</a>, <a href="https://twitter.com/Mike_Hanrahan">Mike Hanrahan</a>, <a href="https://twitter.com/wiredsis">Gina Maini</a>, <a href="https://twitter.com/3xplus1">Brian Mitchell</a>, <a href="https://www.google.com/url?sa=t&amp;rct=j&amp;q=&amp;esrc=s&amp;source=web&amp;cd=1&amp;cad=rja&amp;uact=8&amp;ved=0ahUKEwjzt_299YbXAhUJ9YMKHSrEAhoQFggnMAA&amp;url=https%3A%2F%2Fwww.linkedin.com%2Fin%2Fjohn-turek-0660634&amp;usg=AOvVaw2VGPNWONiqB2C6QX7v8UW4">John Turek</a>, <a href="https://twitter.com/IBootstrap">Ido Samuelson</a>.</p><h3>We’re Hiring</h3><p>We’re hiring — if you’d like to get involved in some of these efforts, reach out to <a href="https://jet.com/careers/technology">Jet Technology Careers</a> or to me directly.</p><h3>References</h3><ul><li><a href="https://martinfowler.com/eaaDev/EventSourcing.html">Event-Sourcing</a> — Martin Fowler’s definition of event-sourcing.</li><li><a href="https://eventstore.org/">EventStore</a> — The open-source, functional database with Complex Event Processing in JavaScript.</li><li><a href="https://cqrs.files.wordpress.com/2010/11/cqrs_documents.pdf">CQRS Documents</a> — CQRS/Event-sourcing article by Greg Young.</li><li><a href="https://kafka.apache.org/">Apache Kafka</a></li><li><a href="https://www.cs.cornell.edu/home/rvr/papers/OSDI04.pdf">Chain Replication for Supporting High Throughput and Availability</a> — introduces the chain replication protocol.</li><li><a href="https://groups.csail.mit.edu/tds/papers/Lynch/ioa-leavens.pdf">Using I/O Automata for Developing Distributed Systems</a> — the IO automaton formalism introduced by Stephen Garland and Nancy Lynch.</li><li><a href="https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying">The Log: What every software engineer should know about real-time data’s unifying abstraction</a></li><li><a href="https://www.confluent.io/blog/bottled-water-real-time-integration-of-postgresql-and-kafka/">Bottled Water: Real-time integration of PostgreSQL and Kafka</a> — demonstrates the use of a log in a data store to integrate with Kafka.</li><li><a href="https://research.google.com/pubs/pub36356.html">Dapper, a Large-Scale Distributed Systems Tracing Infrastructure</a> — Google’s distributed tracing platform.</li><li><a href="https://github.com/jet/kafunk">Kafunk</a> — F# Kafka client.</li><li><a href="https://github.com/eulerfx/DDDInventoryItemFSharp">Event-sourcing with F# and EventStore</a>.</li><li><a href="https://medium.com/@eulerfx/universality-of-consensus-feceead50641">Universality of Consensus</a> — for details on the relationship between event-sourcing and consensus.</li></ul><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=9c873cac33b8" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Universality of Consensus]]></title>
            <link>https://medium.com/@eulerfx/universality-of-consensus-feceead50641?source=rss-a8651abcb80a------2</link>
            <guid isPermaLink="false">https://medium.com/p/feceead50641</guid>
            <category><![CDATA[synchronization]]></category>
            <category><![CDATA[consensus]]></category>
            <category><![CDATA[programming]]></category>
            <category><![CDATA[distributed-systems]]></category>
            <category><![CDATA[concurrency]]></category>
            <dc:creator><![CDATA[Leo Gorodinski]]></dc:creator>
            <pubDate>Wed, 18 Oct 2017 12:14:43 GMT</pubDate>
            <atom:updated>2017-10-18T12:14:43.709Z</atom:updated>
            <content:encoded><![CDATA[<p>Consensus is a fundamental problem in distributed computing and in this post we will see exactly why that is the case. In the spirit of modularity, we investigate the essence of distributed computation and seek fundamental building blocks with which we can compose larger systems. A consensus object is one such building block forming the core of a <em>universal construction</em> which provides a linearizable wait-free implementation of any other object given its sequential specification. In what follows, we shall define the notions of wait-freedom, linearizability and consensus. Furthermore, we discuss sequential specifications and what it means for one object to implement another. That consensus is at the heart of a universal construction is a reasonably natural notion. After all, the difficulty of distributed computation comes from the absence of common knowledge about the system and consensus gives us just that — common knowledge. To make our intuitions more precise, we begin by establishing a system model on which we base our assertions.</p><h3>System Model</h3><p>The system consists of <em>objects</em> accessed concurrently by <em>processes, </em>and both can be modeled by <a href="https://en.wikipedia.org/wiki/Input/output_automaton">IO automata</a>. In essence, an IO automaton is a state machine, consisting of a set of states and a set of events depicting transitions between states. A process is a sequential thread of control and its events model interactions with objects — an <em>invoke</em> event corresponds to the invocation of an operation on an object and a <em>receive</em> event corresponds to the receipt of a response from an object. An object is a data structure shared by processes and its events model invocations of <em>operations</em> by processes — an <em>invoke</em> event corresponds to the invocation of an operation and a <em>respond</em> event corresponds to the response. An operation is therefore delimited by two events — the invocation and the response. The invoke event on a process is an output event — an outgoing communication, while the invoke event on an object is an input event — an incoming communication. Similarly, the response event on a process in an input event, and the response event on an object is an output event. Notice how an invoke event on a process and object are output and input events respectively, forming a symmetric matching pair when they refer to the same process and object. Automata can be composed by matching their respective events. The states of the resulting composite automaton are tuples of states of the constituent automata, and the set of events is the union of events of the constituent automata. For example, a queue is an object with two operations: <em>enqueue</em> which given an object, adds it to the queue, and <em>dequeue</em> which returns the object at the head of the queue, or a null value if the queue is empty. A queue may be accessed concurrently by multiple processes which together form a concurrent system.</p><h4>Histories</h4><p>A collection of events resulting from execution of processes and objects in a system is called a <em>history</em>. Events at individual automata are totally ordered, which means that given any pair of events we can determine which came first. We can extend this total order of events to a partial order on operations. An operation <strong>O</strong> comes before operation <strong>P</strong> if the response event to <strong>O</strong> is received before the invocation event of <strong>P</strong>. The reason this ordering is partial is because some operation may overlap. That is to say, operations may be <em>concurrent</em>. Recall that we referred to a process as a sequential thread of control. We can formalize this notion using histories. Given a history <strong>H</strong>, a sub-history belonging to a specific automaton <strong>A</strong> is denoted <strong>H|A</strong>. A history of events for a process <strong>H|P</strong> consists of alternating, matching invocation-response pairs. In other words, a process consists of a totally ordered set of operations. A totally ordered history of operations is called a <em>sequential history</em>. The difficulty in implementing concurrent objects is that in general, histories are not sequential and operations from different processes can overlap, resulting in a possibility of ill-defined states. For example, the dequeue operation on a queue might first check if the queue is empty and if it isn’t return the item at the head of the queue, otherwise return null. However, if a process invokes the dequeue operation while a dequeue operation from another process is still in progress, they could be dequeue the same item, violating the invariant of a queue.</p><h4>Sequential Specifications</h4><p>A sequential specification is a specification of an object assuming a sequential history. Under these circumstances, we can regard an object as a state machine with a transition function <strong>δ</strong>, such that <strong>δ(s,op(args)) — </strong>given a state and an invocation of an operation — returns a pair consisting of a new state <strong>s’</strong> and a value <strong>res</strong> to return to the calling process. The sequential specification can then be defined as a set of pre-conditions and post-conditions on object states before and after the execution of operations. For example, we might say that the state of a queue object doesn’t contain an item before an enqueue operation, and does contain the enqueued item after an enqueue operation. Sequential specifications serve as a recipe for implementing an object in a universal construction.</p><h3>Wait Freedom</h3><p>A simple way to make a concurrent implementation of an object is through the use of locks. Locks or mutexes enclose critical sections wherein only a single process can execute at a given time. This allows the programmer to use sequential reasoning to assert the correctness of an algorithm. The use of locks however isn’t ideal — a faulty process can stall or arbitrarily delay the execution of other processes. The absence of locks in an implementation is called <em>lock-freedom</em>. An implementation is lock free if at least one thread is guaranteed to make progress. Use of a lock makes this guarantee impossible — if two threads are contending for a resource, and the one who acquires it first stalls, the second thread will be deadlocked. Wait-freedom is a stronger progress condition which guarantees that <em>each</em> process can make progress in a finite number of steps regardless of the behavior of other processes. Thus any wait-free implementation is also lock-free, but not necessarily so the other way around. In what follows, we will consider wait-free implementations of an object based on another object. Progress conditions are a particularly important consideration in a distributed system, where independent failures are more common and the scheduler may consist of multiple disparate components.</p><blockquote>Wait-freedom is also an important notion for theoretical reasons because it applies to all processes and it is independent of the process scheduler — so long as processes are scheduled, wait-freedom guarantees progress for all processes. By contrast, lock-freedom only guarantees progress for some process. And a lock-based procedure relies on the scheduler to provide starvation-freedom, by ensuring that processes eventually leave the critical section. Wait-freedom is therefore starvation-freedom in presence of failures.</blockquote><h3>Linearizability</h3><p>In order to be able to transfer the correctness of a sequential system to a concurrent system, we must define a correctness condition for concurrent objects. <em>Linearizabilty</em> is such a correctness condition and it states that a concurrent object is linearizable if it all possible histories involving that object and any number of processes can be linearized. A <em>linearization</em> of a concurrent history <strong>H</strong> is a valid sequential history <strong>S</strong> wherein the partial order on operations imposed by <strong>H</strong> is respected by <strong>S</strong>. Therefore, <strong>S</strong> completes the partial order defined by <strong>H</strong>, but agrees with <strong>H</strong> on orderings that <strong>H</strong> does define. Linearizabilty is useful because it allows us to transfer the reasoning about a sequential system to reasoning about a concurrent system, significantly simplifying analysis. Moreover, linearizabilty is a <em>local property</em> in that if a system consists of linearizable objects, the system is itself linearizable. In other words, linearizabilty composes.</p><p>Figure 1 bellow depicts two processes interacting with an object. The solid disks represent events, histories are horizontal lines, operations are delimited by invoke and respond events and form matching pairs on process and object histories. The interaction involves two operations, and the operations overlap. Even though we may impose a total order on events across all three histories, the two operations are concurrent. There are therefore two linearizations of this history — one where operation 1 happens first and another where operation 2 happens first.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Hd_9NNYP8TaCKaqmyatHJg.png" /><figcaption>Figure 1. Space-time diagram with processes, objects, events and operations.</figcaption></figure><h3>Consensus Objects</h3><p>A consensus object is a concurrent object with a single operation <strong>propose</strong> which behaves in accordance with the following sequential specification:</p><pre><strong>let</strong> propose value <strong>=</strong><br>  <strong>if</strong> state <strong>=</strong> <strong>null</strong> <strong>then</strong><br>    state <strong>&lt;-</strong> value<br>  state</pre><p>A consensus object must adhere to the following conditions:</p><ul><li><strong>Consistency</strong>: all processes invoking the consensus object receive the same value.</li><li><strong>Integrity</strong>: the value returned by the consensus object is a value proposed by some process.</li><li><strong>Termination</strong>: the operation is wait-free.</li></ul><p>The first two conditions are safety properties — they assert that consensus does what one would expect it to do. The last condition is a liveness property — it asserts that the operation completes and tolerates failures in participating processes. The transition function <strong>δ </strong>in the sequential specification of a consensus object would take the state as the first argument, and the proposed value as the second argument and execute in accordance with the code above.</p><h3>Consensus Numbers</h3><p>To show that there is a wait-free implementation of an object using another, we can map the objects to their <em>consensus number</em> and compare the numbers. A consensus number is the number of processes for which an object can solve consensus. If one object <strong>X</strong> can solve consensus for an equal to or greater than number of processes than object <strong>Y</strong>, then <strong>X</strong> can be used to implement <strong>Y</strong>. For example, a <a href="http://www.cs.yale.edu/homes/aspnes/pinewiki/AsynchronousSharedMemory.html">shared register</a> cannot be used to solve consensus for even two processes. An <a href="https://en.wikipedia.org/wiki/Test-and-set">test-and-set</a> object can be used to solve consensus for at most two processes. A <a href="https://en.wikipedia.org/wiki/Compare-and-swap">compare-and-exchange</a> object on the other can be used to solve consensus for any number of processes — its consensus number is <strong>∞</strong>. In this way, a hierarchy is formed enumerating objects by their synchronization power:</p><pre>|----------------|------------------|<br>| <strong>Object</strong>         | <strong>Consensus Number</strong> |<br>|-----------------------------------|<br>| MRSW register  | 1                | <br>| test-and-set   | 2                |<br>| cas            | ∞                |<br>| queue w/ peek  | ∞                |<br>|----------------|------------------|</pre><p>Consensus numbers were introduced by <a href="http://cs.brown.edu/~mph/">Maurice Herlihy</a> in <a href="https://cs.brown.edu/~mph/Herlihy91/p124-herlihy.pdf">Wait-Free Synchronization</a>. It is a natural progression from his earlier work on <a href="https://cs.brown.edu/~mph/HerlihyW90/p463-herlihy.pdf">Linearizability</a>. A central result proved in the paper is the following theorem:</p><blockquote>If [object] X has consensus number n, and Y has consensus number m &lt; n, then there exists no wait-free implementation of X by Y in a system of more than m processes.</blockquote><p>The proof proceeds by contradiction — it assumes the existence of an implementation, then represents the implementation as a composite IO automaton and finally shows the contradiction. This theorem is significant because it mathematically defines the <em>implementable-by</em> relation. Herein, it allows us to establish that a universal construction can wait-free implement any other object.</p><h3>Universal Constructions</h3><p>An object is <em>universal</em> if it can implement any other object given a sequential specification, and this implementation is both wait-free and linearizable. Such implementations are called <em>universal constructions</em>. One such object is the consensus object defined above. An implementation may also make use of any number of <a href="http://www.cs.yale.edu/homes/aspnes/pinewiki/AsynchronousSharedMemory.html">shared read/write registers</a>.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*HkvxkH4QKf8pwmwgE40JRg.png" /><figcaption>Figure 2. A visual depiction of a universal construction.</figcaption></figure><p>The following is a universal construction due to <a href="http://www.irisa.fr/prive/raynal/">Michel Raynal</a> and it is based on the <a href="https://en.wikipedia.org/wiki/State_machine_replication">state-machine replication paradigm</a>. Each process maintains a copy of the constructed object, and uses consensus object to keep the copies consistent. The construction consists of two parts— the operation and a background helper process. The operation assigns the proposal and waits for the background process to assign a result. The background process runs in a loop, checking if a proposal has been assigned. If it has, then it sends that proposal to a consensus object which returns the first received proposal for that round. The proposal itself consists of the operation and its arguments, and the proposing process. The background process then executes the state machine transition function on the latest state and the proposed operation. Each process keeps a local copy of the state and the consensus protocol ensures that operations are applied to local states in the same order. As such, all processes have a common view of the object. If the proposal that was returned by the consensus object is from the calling process, we assign the result so that it operation can return it to the caller.</p><p>The operation is defined as follows:</p><pre><strong>1. let</strong> perform args <strong>=</strong><br><strong>2.</strong>  result[i] <strong>&lt;-</strong> null<br><strong>3.</strong>  proposal[i] <strong>&lt;-</strong> (&quot;op(args)&quot;,i)<br><strong>4.</strong>  <strong>wait</strong> (result[i] <strong>!=</strong> null)<br><strong>5.</strong>  result[i]</pre><p>The local variables <strong>result[i]</strong> and<strong> proposal[i]</strong> refer respectively to the result and proposal of process <strong>i. </strong>On line 3 process <strong>i</strong> stores its proposal which is proposed by the background process bellow.</p><blockquote>The operation on line 3 is quoted to indicate that we’re referring to a representation of an operation rather than an invocation. In programming language, this can be implemented in a few different ways, such as using reflection, <a href="https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/code-quotations">quotations as in F#</a>, or using a <a href="https://hackage.haskell.org/package/free-4.12.4/docs/Control-Monad-Free.html">free monad</a>.</blockquote><p>The background process:</p><pre><strong>1. while</strong> <strong>true</strong> <strong>do</strong><br><strong>2.</strong>  <strong>if</strong> proposal[i] <strong>!=</strong> <strong>null</strong> <strong>then</strong><br><strong>3.</strong>    k[i] <strong>&lt;-</strong> k[i] <strong>+</strong> 1<br><strong>4.</strong>    exec[i] <strong>&lt;-</strong> CONS[k[i]].propose(proposal[i])<br><strong>5.</strong>    (s[i],res) <strong>&lt;-</strong> <strong>δ</strong>(s[i],exec[i].op)<br><strong>6.</strong>    <strong>if</strong> (i <strong>=</strong> exec[i].proc) <strong>then</strong> <br><strong>7.</strong>      proposal[i] <strong>&lt;-</strong> <strong>null</strong><br><strong>8.</strong>      result[i] <strong>&lt;-</strong> res</pre><p>The local variable <strong>k[i]</strong> refers to the execution round for process <strong>i</strong>, <strong>CONS</strong> refers to the shared consensus object indexed by the round. The local variable <strong>exec</strong> stores the result of the proposal. The pair <strong>s</strong> and <strong>res</strong> store the state and the result of the state machine transition function <strong>δ</strong> applied to the proposed operation and the current state.</p><p>The problem with the construction above is that it is not wait-free. Depending on the scheduler, the background processing loop may never terminate. We can fathom an adverse schedule wherein a given process never “wins” consensus, thereby creating an unbounded delay. In order to make the construction wait-free we have to make a few adjustments. To account for the possibility of failures of other processes, each process can keep an array of sequence numbers representing the last operation applied by every other process. Only if sequence numbers of other processes are increasing — as communicated by a shared register — do they participate in proposals. Therefore, if a process fails, its proposals no longer count and can’t prevent other processes from making progress.</p><p>The operation is defined much in the same way as before, however each process stores the last operation it applied in a shared register <strong>LAST_OP</strong> and keeps track of the sequence number of the operations applied across other processes using the local variable <strong>last_sn</strong>.</p><pre><strong>1. let</strong> perform args <strong>=</strong><br><strong>2.</strong>   result[i] <strong>&lt;-</strong> <strong>null</strong><br><strong>3.</strong>   LAST_OP[i] <strong>&lt;-</strong> (&quot;op(param)&quot;,last_sn[i][i] + 1)<br><strong>4.</strong>   <strong>wait</strong> (result[i] <strong>!=</strong> <strong>null</strong>)<br><strong>5.</strong>   result[i]</pre><p>The background process constructs a proposal by collecting operations executed by processes since the last round. The local variable <strong>last_sn</strong> keeps track of the operations applied across all other processes. Then, for each pair of operation and process in the decided proposal, the background task executes the transition function <strong>δ</strong> and increments <strong>last_sn</strong> for the corresponding process. If the process is the calling process we assign the result as before.</p><pre><strong> 1. while true do</strong><br> <strong>2.</strong>   prop[i] = []<br> <strong>3.</strong>   <strong>for</strong> j <strong>in</strong> [1..n] <strong>do</strong><br> <strong>4.</strong>     <strong>if</strong> (LAST_OP[j].sn <strong>&gt;</strong> last_sn[i][j]) <strong>then</strong><br> <strong>5.</strong>       prop[i] <strong>+=</strong> (LAST_OP[j].op,j)<br> <strong>6.</strong>   <strong>if</strong> (prop[i] <strong>!=</strong> []) <strong>then</strong><br> <strong>7.</strong>     k[i] <strong>&lt;-</strong> k[i] + 1<br> <strong>8.</strong>     exec[i] <strong>&lt;-</strong> CONS[k[i]].propose(prop[i])<br> <strong>9.</strong>     <strong>for</strong> r = 1 <strong>to</strong> <strong>|</strong>exec[i]<strong>|</strong> <strong>do</strong><br><strong>10.</strong>       (s[i],res) <strong>&lt;-</strong> <strong>δ(</strong>s[i],exec[r].op)<br><strong>12.</strong>       <strong>let</strong> j <strong>=</strong> exec[i].proc<br><strong>13.</strong>       last_sn[i][j] <strong>&lt;-</strong> last_sn[j] + 1<br><strong>14.</strong>       <strong>if</strong> (i <strong>=</strong> j) <strong>then</strong> <br><strong>15.</strong>         result[i] <strong>&lt;-</strong> res</pre><p>To prove that this construction is wait-free we must ascertain that for a non-failing process <strong>i</strong>, line 4 in the definition of the operation eventually completes, which means that line 15 in the background task must eventually execute. We claim that there is a decided proposal containing process <strong>i</strong>. If the claim is true, then at some point, the test on line 14 will be true and the desired outcome is achieved. We can prove the claim by considering the loop on line 3. It traverses the entire list of processes and includes the operation of each process in the proposal of the calling process. This means that at some point, the proposals of <em>all</em> processes contain process <strong>i</strong>,<strong> </strong>proving our claim<strong>.</strong></p><h3>Related Work</h3><ul><li><strong>The Glitch Phenomenon </strong>— Leslie Lamport regards consensus to be an intrinsic puzzle of the universe and <a href="https://www.microsoft.com/en-us/research/publication/on-the-glitch-phenomenon/">gave this idea a topological presentation</a>, which of course <a href="https://medium.com/@eulerfx/the-asynchronous-computability-theorem-171e9d7b9423">the author is quite fond of</a>. The idea is perhaps even more fundamental than the consensus object considered herein. Consider an <a href="https://en.wikipedia.org/wiki/Arbiter_(electronics)">arbiter</a> — an electronic device which must decide which input signal from a processor arrived first in order to determine whom to grant access to a shared resource. If the requests arrive within a short duration of each-other, the arbiter may have a hard time deciding. While an engineering solution was found at some point the 70s, the underlying problem endures. Lamport provides a formalism wherein the device is represented as a continuous function between inputs and outputs, where inputs and outputs are themselves time-dependent function spaces (<em>behaviors</em> for those familiar with <a href="http://conal.net/papers/icfp97/">FRP</a>). The continuity of this function representing the device is what ultimately allows for the possibility of an unstable state. It can be shown that the the space of outputs for which a timely decision is made is not connected — only becoming connected if time is allowed to tend to infinity. If we assume that the input space is connected, a continuous mapping between the input space and the output space requires use of a <a href="http://mathworld.wolfram.com/ZeroFunction.html">zero-function</a> — which corresponds to an inability to decide.</li><li><strong>Event-Sourcing —</strong>Among other nice properties, event-sourcing enables state-machine replication. Recall that in the consensus hierarchy described above, a queue with a peek operation has an infinite consensus number. Such a queue is what we get from the log at the heart of an event-sourcing architecture. This log is not just a simple queue, because rather than providing a dequeue operation, it allows consuming processes to traverse the queue in order without mutating the queue itself. As a result, each process can reach a common state without interfering with other processes. With event-sourcing however, it is more typical for this common state to be reached asynchronously, thus foregoing linearizability with respect to the log, but it still maintains order and therefore sequential consistency.</li></ul><h3>Summary</h3><p>We’ve defined a universal construction — a mechanism for providing a linearizable wait-free implementation of any object given its sequential specification. This result exhibits a great degree of generality and demonstrates the fundamental nature of a consensus object at the core of the construction. Intuitively, it makes sense — consensus is fundamental because it is fundamentally lacking in a distributed system where each participant has their own, independent world view. Somewhat ironically, while consensus is universal it is also impossible <a href="https://medium.com/@eulerfx/the-asynchronous-computability-theorem-171e9d7b9423">as we have seen earlier</a>. So what are we to conclude from the impossibility and universality of consensus? It is important to regard consensus as a limiting notion — we can have consensus but with weaker termination properties, or stronger termination properties, but with a weaker notion of consensus. But once we have a consensus object, we can use it to implement any other object, and this implementation will be wait-free outside of any restrictions on wait-freedom imposted by the consensus object itself. Moreover, as we have seen in the aside on event-sourcing, consensus is useful with a weaker consistency condition.</p><h3>References</h3><ul><li><a href="https://www.microsoft.com/en-us/research/publication/make-multiprocessor-computer-correctly-executes-multiprocess-programs/">How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Program</a></li><li><a href="https://cs.brown.edu/~mph/Herlihy91/p124-herlihy.pdf">Wait-Free Synchronization</a></li><li><a href="https://cs.brown.edu/~mph/HerlihyW90/p463-herlihy.pdf">Linearizability: A Correctness Condition for Concurrent Objects</a></li><li><a href="http://www.cs.tau.ac.il/~shanir/progress.pdf">On The Nature of Progress</a></li><li><a href="https://arxiv.org/abs/1704.01154">On The Glitch Phenomenon</a></li><li><a href="http://www.springer.com/us/book/9783642320262">Concurrent Programming: Algorithms, Principles, and Foundations</a></li><li><a href="https://groups.csail.mit.edu/tds/papers/Lynch/ioa-leavens.pdf">Using I/O Automata for Developing Distributed Systems</a></li><li><a href="http://delivery.acm.org/10.1145/80000/72992/p159-plotkin.pdf">Sticky Bits and Universality of Consensus</a></li></ul><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=feceead50641" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[The Asynchronous Computability Theorem]]></title>
            <link>https://medium.com/@eulerfx/the-asynchronous-computability-theorem-171e9d7b9423?source=rss-a8651abcb80a------2</link>
            <guid isPermaLink="false">https://medium.com/p/171e9d7b9423</guid>
            <category><![CDATA[programming]]></category>
            <category><![CDATA[algebraic-topology]]></category>
            <category><![CDATA[concurrency]]></category>
            <category><![CDATA[distributed-systems]]></category>
            <category><![CDATA[topology]]></category>
            <dc:creator><![CDATA[Leo Gorodinski]]></dc:creator>
            <pubDate>Mon, 18 Sep 2017 13:18:20 GMT</pubDate>
            <atom:updated>2017-09-18T13:18:20.774Z</atom:updated>
            <content:encoded><![CDATA[<p>In this post I describe the <a href="http://cs.brown.edu/people/mph/HerlihyS99/p858-herlihy.pdf">Asynchronous Computability Theorem</a>, which uses tools from Algebraic Topology to show whether a task is solvable in a distributed system. The theorem yields quite readily to intuition, however we will take care to make our statements as precise as possible without getting too caught up in the details. Loosely speaking, the theorem states that a task is solvable with a distributed system if its input space maps continuously into its output space. In particular, a continuous mapping fails to exist when the spaces at hand are incompatible with respect to holes within them. The beauty of the theorem lies in that it provides a geometric interpretation of the execution of a distributed protocol, allowing us to draw on our innate intuition for space to tame the combinatorial explosion of states in concurrent computations. In what follows, we shall answer the following questions: What is a task? What does it mean for a task to be solvable with a distributed system? What are input and output spaces? What is a continuous map? Finally, we shall state the Asynchronous Computability theorem and use it to demonstrate the impossibility of consensus.</p><h3><strong>The Impossibility of Consensus</strong></h3><p>To gain intuition for tasks and solvability, we start with a well known impossibility result — <a href="https://groups.csail.mit.edu/tds/papers/Lynch/jacm85.pdf">Impossibility of Distributed Consensus with One Faulty Process</a>. Known colloquially as the FLP result, it states that processes communicating over an asynchronous network can’t all agree on a proposal if even one of the processes fails. An asynchronous network is one where communication delays can be arbitrarily long. The task in this case is consensus. Solvability corresponds to whether there exists a terminating algorithm which the participating processes can run to come to consensus. Since it is impossible to distinguish between a communication delay and a failed process, any consensus algorithm may not be able to terminate if a process fails. More generally, k-set agreement refers to tasks where processes must agree on at most k distinct values. The consensus task in FLP is 1-set agreement. It has been shown that as long as <strong>k &lt; n</strong>, the k-set agreement task is unsolvable with a wait-free protocol <a href="http://dl.acm.org/citation.cfm?id=167122">[8]</a>. That the algorithm is deterministically terminating is a key stipulation of this impossibility result. In the next section, we discuss solutions to consensus wherein the termination property is relaxed.</p><h3><strong>The Alpha &amp; Omega of Consensus</strong></h3><p>In the analysis of distributed algorithms, we make assertions about the correctness of system states — safety properties, and the temporality of system states — liveness properties. Safety ensures that nothing bad happens, and liveness ensures that something eventually happens. The problem revealed by the FLP result is that of liveness — due to arbitrary timing delays, we can’t assert that a consensus algorithm will eventually terminate. Consensus protocols such as Paxos and Raft overcome this limitation through elaborate use of quorums and timeouts, or more generally, failure detectors <a href="http://dl.acm.org/citation.cfm?id=234549">[4]</a>. Indeed, one can decompose the problem of consensus along safety and liveness properties — the alpha and omega of consensus [3]. Alpha captures the safety properties of consensus — agreement, validity, and integrity, whereas omega captures liveness properties — the eventual termination of the algorithm. It is this latter component of consensus that does not admit a wait-free protocol.</p><h3><strong>Wait Freedom</strong></h3><p>Consensus is solvable, but as demonstrated by FLP and the Asynchronous Computability Theorem, it is not <em>wait-free</em> solvable. Wait freedom is a particular type of progress condition:</p><blockquote>No process can be prevented from completing an operation by undetected halting failures of other processes, or by arbitrary variations in their speed. <a href="https://cs.brown.edu/~mph/Herlihy91/p124-herlihy.pdf">[6]</a></blockquote><p>In this way, wait-freedom gives us a type of fault-tolerance particularly important in a distributed system where components fail independently. This is a remarkable global reasoning property that ensures that each non-faulty process of a distributed protocol will make progress regardless of failures of other processes, and do so in a bounded number of steps. It should be noted that the wait freedom property is perhaps too strong — the absence of a wait-free protocol for a task does not mean it can’t be solved with weaker progress guarantees. For example, an achievable progress property for consensus is anon-blocking guarantee which states that eventually progress can be made — albeit slowed arbitrarily by retries in case of collision.</p><h3><strong>System Model</strong></h3><p>The system model used in defining the Asynchronous Computability Theorem is based on <a href="https://en.wikipedia.org/wiki/Input/output_automaton">I/O automata</a>. An I/O automaton is defined as a set of states, events and a transition relation. Individual processes are represented as automatons — Turing machines — with well-defined start and finish events, as well as events corresponding to interaction with a shared memory object. The shared memory object is a linearizable atomic read/write memory, represented as an automaton with events corresponding to read and write operations. Processes have exclusive write access to their memory cell, but may read the contents of memory cells belonging to other processes. A <em>protocol</em> is defined as a set of processes and a shared memory object.</p><blockquote>Interestingly, the methodology used to establish the result in this post has been used to <a href="http://dl.acm.org/citation.cfm?id=277722">unify</a> the synchronous and asynchronous message passing models.</blockquote><h3><strong>Tasks</strong></h3><p>Informally, a task is a problem in a distributed system. Formally, a task is a tuple <strong>⟨I</strong>,<strong>O,Δ⟩</strong> where <strong>I</strong> is a set of input vectors, <strong>O </strong>a set of output vectors and <strong>Δ</strong> a specification relation mapping inputs to outputs. An input vector represents a possible assignment of input values to the processes of a distributed system. Similarly, an output vector represents process outputs at the end of execution. Components of input and output vectors correspond to processes, such that <strong>I[i]</strong> represents the input to the i-th process and <strong>O[i]</strong> its output. If <strong>I[i] = ⊥ </strong>then process <strong>i</strong> does not participate in the execution. If a process <strong>i</strong> fails, then <strong>O[i]=⊥</strong>.<strong> </strong>The task specification relation <strong>Δ</strong> maps input vectors to sets of allowable output vectors.</p><p>Take for example the binary consensus task. Each process starts with an input value of <strong>0</strong> or <strong>1 </strong>and at the end of execution, chooses <strong>0</strong> or <strong>1</strong> such that they all agree and the chosen value was some processes’s input. A task specification for a two process binary consensus task is shown in Figure 1 bellow:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/bffd697988186f079a04ba87bb606c27/href">https://medium.com/media/bffd697988186f079a04ba87bb606c27/href</a></iframe><h3><strong>Solvability</strong></h3><p>Recall that a protocol is a collection of processes interacting through shared memory to solve a task. A protocol <em>solves</em> a task if 1) no process takes a infinite number of steps to finish, and 2) the output value of a process is consistent with the task specification <strong>Δ</strong>. A protocol <em>wait-free solves</em> a task if it solves the task even if all but one of the processes fail.</p><h3><strong>Topological Definitions</strong></h3><p>Topology is a branch of mathematics which studies continuous transformations of space. It abstracts the concept of the familiar Euclidean space to spaces that are considerably more general. In particular, a <a href="https://en.wikipedia.org/wiki/Topological_space">topological space</a> need not impose a notion of distance between its points, while still retaining a rigorous treatment of continuity and connectedness. The points in a topological space can represent points of a Euclidean space, but they can also represent functions, other topological spaces, and in our present discussion, configurations of a distributed system.</p><p>Connectedness refers to a space being “all in one piece”, and allows us to formally classify spaces according to the number of “holes” they contain. A familiar example is the correspondence of a bagel and a coffee mug — topologically speaking, these spaces are equivalent because one can be continuously transformed into the other. On the other hand, a bagel is incompatible with a solid ball because the former contains a hole. If we represent the state of distributed system as a particular type of topological space, we can characterize computability as compatibility between spaces. In the next section, we will introduce the concept of a <em>simplex</em> — a member of a topological space known as a <em>simplicial complex</em>. We will then represent a configuration of a distributed system of n + 1 processes as an n-dimensional simplex, a set of possible configurations as a simplicial complex, and wait-free solvability as a continuous mapping between an input complex and an output complex. Despite the generality of topological spaces, we can still appeal to our geometric intuitions in order to visualize their properties. Indeed, every topological space <strong>X</strong> has a <em>geometric realization</em> |<strong>X</strong>| which regards the space as a subset of a high-dimensional Euclidean space.</p><h3><strong>Simplexes</strong></h3><p>A simplex is a generalization of a triangle to arbitrary dimensions. A triangle is a 2-simplex, a 1-simplex is a line segment, a 0-simplex a point, a 3-simplex a tetrahedron, etc. Simplices themselves are built from lower dimensional simplices called <em>(proper) faces</em> of the simplex. For example, a triangle can be formed by gluing together line segments at the appropriate ends. A line segment can be formed by gluing its two endpoints.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/898/1*Pg_IU_LiESGl3D9ksrARSw.png" /><figcaption>Figure 2: a 0-simplex, 1-simplex, 2-simplex, 3-simplex.</figcaption></figure><p>More precisely, a simplex consists of a set of vertices which are <em>affinely independent</em>. In essence, this means that the vertices are distinct such that if they were represented as vectors, there is no pair where one vector can be scaled into the other. In case of a 2-simplex, this means that there doesn’t exist a line segment on which all three vertices lie.</p><p>In distributed system, a simplex can be used to represent a configuration — an assignment of states to each process. For example, a triangle can represent a system of three processes, each vertex corresponding to a particular state of each process. The dimension of a simplex corresponds to the notion of a process <em>participating</em> in a protocol. If a process fails, it ceases to participate, and the resulting simplex has a smaller dimension.</p><h3><strong>Simplicial Complexes</strong></h3><p>A simplicial complex is a collection of simplices <strong>K</strong>, such that every face of every simplex in <strong>K</strong> is also a simplex of <strong>K</strong>, and every intersection of a pair of simplexes is also a simplex of <strong>K</strong>. Due to the second property, the formation of a complex does not result in any new simplices. A simplicial complex thereby forms a topological space, with simplexes as the points.</p><p>For our purposes, a simplicial complex corresponds to a set of possible configurations of a distributed system. We take individual configurations and connect them at their intersections. In this way, we obtain a geometric interpretation of similar system states — similar configurations are those which intersect at their boundaries. Moreover, unlike graph-theoretic models, we have the dimension of the intersection as the <em>degree</em> of similarity between states.</p><p>The following diagram depicts a 2 dimensional complex — a 2-complex — consisting of two simplices — <strong>⟨P0, Q1, R2⟩</strong> and <strong>⟨P3, Q1, R2⟩</strong>. These in turn consist of 1-simplices such as <strong>⟨P0, Q1⟩ </strong>and 0-simplices such as <strong>⟨P0⟩</strong>. The intersection of the 2-simplices is the 1-simplex <strong>⟨Q1, R2⟩ </strong>and represents the fact that the two system states are similar — the only differ in the state of process <strong>P</strong>.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/832/1*UuU7UrY_r2vbhfX5If9VlQ.png" /><figcaption>Figure 3: a complex for a 2 state system with 3 processes. [Credit 1]</figcaption></figure><p>The simplex <strong>⟨P0, Q1, R2⟩ </strong>can represent a configuration of three processes, <strong>P</strong>, <strong>Q</strong> and <strong>R</strong> with local states <strong>0</strong>,<strong>1</strong> and <strong>2</strong> respectively. The simplex <strong>⟨P3, Q1, R2⟩ </strong>represents a configuration different only in that <strong>P</strong> has local state <strong>3</strong>.</p><h3><strong>Simplicial Maps</strong></h3><p>A <em>simplicial map</em> between complexes <strong>K</strong> and <strong>L</strong> carries vertexes of <strong>K</strong> to vertexes of <strong>L</strong> and also simplexes of <strong>K</strong> to simplexes of <strong>L</strong>. In other words, if a set of vertexes span a simplex in <strong>K</strong>, their image under a simplicial map will also span a simplex. A simplicial map can however <em>collapse</em> a simplex if the dimension of the simplex under the map is zero. A simplicial map that does not collapse any simplices it is called <em>non-collapsing</em>. We shall make use of simplicial maps two both formalize the notion of associating processes with vertexes and to represent transitions between system states.</p><h3><strong>Chromatic Complexes</strong></h3><p>A <em>coloring</em> of a n-dimensional complex <strong>K</strong> is a non-collapsing simplicial map from <strong>K</strong> to an n-dimensional simplex. The vertexes of the target simplex correspond to distinct colors, and a coloring is thus a labeling of the vertexes such that no two neighboring vertexes have the same color. Take note of the non-collapsing property of the simplicial map at play— if two neighboring vertexes in the complex were to map to the same vertex in the simplex — or in other words, share a color — it would result in the collapse of the simplex formed by the vertexes in the complex. A complex <strong>K</strong> together with a coloring <strong>χ</strong> is called a <em>chromatic complex </em>(<strong>K</strong>,<strong>χ</strong>). A simplicial map is <em>color-preserving</em> if the color of a source vertex is the same as the color of the destination vertex. Coloring provides a mechanism to associate the vertexes of a complex to individual processes. Color-preserving maps represent state transitions wherein the processes maintain their identity.</p><p>We can now provide a topological rendering to the formal definition of tasks described above as triples <strong>⟨I</strong>,<strong>O,Δ⟩. </strong>We represent an input vector <strong>i ∈ I</strong> as a simplex <strong>S(i)</strong> whose vertices are pairs of processes and their values. An output vector <strong>o ∈ O</strong> is represented as the simplex <strong>S(o) </strong>accordingly <strong>. </strong>The topological task specification is set of all pairs (<strong>S(i), S(o)) </strong>wherein <strong>Δ</strong> holds.</p><h3><strong>Subdivisions</strong></h3><p>A <em>subdivision</em> of a complex <strong>K</strong> is a complex <strong>σ(K) </strong>such that:</p><ul><li>Each simplex of <strong>σ(K) </strong>is contained in a simplex in<strong> K</strong></li><li>Each simplex of <strong>K</strong> can be formed by combining simplices in <strong>σ(K)</strong></li></ul><p>A subdivision can be thought of as a triangulation of a complex. A <em>carrier</em> of a simplex <strong>S </strong>in a subdivision <strong>σ(K), </strong>denoted by <em>carrier</em>(<strong>S</strong>,<strong>K</strong>), is the unique smallest simplex <strong>T</strong> in the original complex <strong>K</strong> such that <strong>S</strong> is contained in <strong>T</strong>. A <em>chromatic subdivision</em> is a subdivision of a chromatic complex (<strong>K</strong>,<strong>χ</strong>) such that <strong>σ(K)</strong> is a subdivision of <strong>K</strong>, and for all <strong>S</strong> in <strong>σ(K), </strong>its set of colors in the subdivision is a subset of the colors of its carrier. A chromatic subdivision is therefore a subdivision wherein all constituent simplices are assigned a color in accordance with the colors assigned to their carriers. In Figure 4, the 2-simplex shaded orange on the right hand side of the diagram is a simplex of a chromatic complex. The chromatic subdivision of this complex is shown on the left. On the left is also shown a 2-simplex in the chromatic subdivision. The shaded 2-simplex on the right is its carrier. Moreover, the colors of the vertexes of the 2-simplex on the right are a taken from colors of its carrier.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/914/1*aj1u2QHClKEbP9Q61-HJxw.png" /><figcaption>Figure 4: a simplex in a subdivision (left) and its carrier (right). [Credit 1]</figcaption></figure><p>Chromatic subdivisions model the <em>execution</em> of a protocol. If a protocol starts with a given input simplex, a round of execution subdivides the simplex, such that the subdivision represents the reachable states of the system after the round. These reachable states form the <em>protocol complex</em> which reflects the structure of the protocol.</p><p>To illustrate the unfolding of a protocol complex, consider the following 2-process task. The two processes, <strong>P</strong> and <strong>Q</strong> have inputs <strong>p</strong> and <strong>q</strong> respectively, represented by the input simplex in Figure 5. The processes interact using a shared memory object with two cells — one for each process. Each process first writes its input to shared memory and then scans the entire memory object. The processes run concurrently, but there are only three possible executions:</p><ol><li>If <strong>P</strong> reads before <strong>Q</strong> writes then <strong>P</strong>’s view is <strong>(p,⊥) </strong>and <strong>Q</strong>’s is <strong>(p,q)</strong>.</li><li>If both <strong>P</strong> and <strong>Q</strong> read after both have written then both have view <strong>(p,q)</strong>.</li><li>If <strong>Q</strong> reads before <strong>P</strong> writes then <strong>Q</strong>’s view is <strong>(⊥,q) </strong>and P’s is <strong>(p,q)</strong>.</li></ol><p>Each executon is represented as a simplex in the diagram bellow:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/428/1*rnq9mwD8ECKtK6FfKnklUQ.png" /><figcaption>Figure 5: protocol complex for a simple protocol. [Credit 1]</figcaption></figure><p>Execution 1 is the 1-simplex <strong>⟨P(p,⊥),Q(p,q)⟩ </strong>at the bottom of Figure 5, execution 2 is <strong>⟨P(p,q),Q(p,q)⟩ </strong>and execution 3 is <strong>⟨P(p,q),Q(⊥,q)⟩</strong>. These three simplexes form the <em>protocol complex</em>. If we consider the original input simplex, then the protocol complex is a subdivision, dividing the input simplex into three pieces. Note also how the vertex <strong>P(p,q) </strong>is an intersection of two 1-simplexes. This is a geometric representation of the fact that in this state, <strong>P</strong> can’t tell between executions 2 and 3.</p><h3><strong>Connectivity</strong></h3><p>A graph is connected if there is a path between every pair of vertices. Similarly, a simplicial complex is 0-connected if there is a path between every pair of vertices. More generally, a simplex is 1-connected if any loop can be continuously deformed to a point, a ball for a 2-connected space, and so on. This generalization can be visualized with the following diagram:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/634/1*aG6GqjV8giOYznFDgZdKTw.png" /><figcaption>Figure 6: left — contractible loop, right — not contractable. [Credit 1]</figcaption></figure><p>The loop at the top of Figure 5 can indeed be contracted to a point. On the other hand, the loop at the bottom cannot be contracted to a point because of the “hole” in the way. Therefore, while the shape in Figure 5 above is 0-connected, it isn’t 1-connected. We can also view connectivity from a combinatorial perspective — if <strong>K</strong> and <strong>L</strong> are n-connected complexes such that their intersection <strong>K</strong> ∩ <strong>L</strong> is (n-1) connected, then their union <strong>K ∪ L</strong> is n-connected. Thus we can infer connectivity of a composite complex via connectivity of its constituents.</p><p>Connectivity has the pleasant property that if a complex is n-connected, the complex with a dimension removed — a faulty process — is (n-1) connected, thereby extending all assertions to configurations where some processed have faulted. A complex is disconnected if it fails to be connected. For example, a complex representing the final states of a system, can be disconnected if certain configuration are not admitted under the task specification, creating “holes” in the complex. These holes, in turn, present a challenge to constructing a simplicial map to this complex. In the context of distributed systems, connectivity maps to the notion of <em>reachability</em> of system states. A particularly notable property of a wait-free protocol is that the protocol complex for any input n-simplex is n-connected <a href="http://people.csail.mit.edu/shanir/publications/HS-topology.pdf">[1]</a>.</p><h3><strong>The Asynchronous Computability Theorem</strong></h3><p>The definitions provided so far should suffice to state the Asynchronous Computability Theorem:</p><blockquote><em>A task </em><strong><em>⟨I</em></strong><em>,</em><strong><em>O,Δ⟩</em></strong><em> has a wait-free protocol if and only if there exists a chromatic subdivision </em><strong><em>σ</em></strong><em> of </em><strong><em>I</em></strong><em> and a color-preserving simplicial map</em></blockquote><blockquote><strong><em>μ</em></strong><em> : </em><strong><em>σ</em></strong><em>(</em><strong><em>I</em></strong><em>) → </em><strong><em>O</em></strong></blockquote><blockquote><em>such that for each vertex </em><strong><em>s</em></strong><em> </em><strong><em>∈</em></strong><em> </em><strong><em>σ</em></strong><em>(</em><strong><em>I</em></strong><em>), </em><strong><em>μ(s) </em></strong><em>∈</em><strong><em> Δ</em></strong><em>(carrier(</em><strong><em>s</em></strong><em>,</em><strong><em>I</em></strong><em>)).</em></blockquote><p>The task tuple consists of an input complex, output complex and a task specification determining valid outputs. The subdivision reflects the execution of the protocol. The existence of the map, and specifically a simplicial map is what governs the possibility of a wait-free solution. The beauty of this theorem lies in that it gives a purely static, topological approach to problems which are typically modeled operationally as computations unfolding in time.</p><p>The theorem can be visualized using Figure 7 bellow:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/956/1*pSoVu5tcycF_yp5uUhx5VA.png" /><figcaption>Figure 7: The Asynchronous Computability Theorem. [Credit 1]</figcaption></figure><p>On the left is an input complex a 3-simplex (tetrahedron). We have a subdivision of the input complex and one simplex in the subdivision is shaded orange. We also have an output complex — a torus (this actually represents an output complex for a <a href="http://www.cs.yale.edu/homes/aspnes/pinewiki/Renaming.html">renaming task</a>). The simplicial map <strong>μ </strong>takes a simplex in the subdivision of the input complex to a simplex in the output complex. Moreover, this simplex in the output complex is a member of the sub-complex which represents the acceptable outputs of the task under <strong>Δ</strong> given the input simplex containing the subdivision (colored white).</p><h3>Impossibility of Consensus, Topologically</h3><p>We can now demonstrate the impossibility of consensus topologically, using the Asynchronous Computability Theorem. We begin by illustrating the challenges presented by consensus in a geometric way. Suppose we have two processes <strong>P</strong> and <strong>Q</strong> which are given inputs <strong>0</strong> or <strong>1</strong>. A state of a system where process <strong>P</strong> has input <strong>0</strong> and only process <strong>P</strong> executes, is represented as a vertex labeled <strong>P0</strong>. A configuration wherein both <strong>P</strong> and <strong>Q</strong> participate, starting with inputs both equal to <strong>0</strong> is represented as a 1-simplex <strong>⟨P0,Q0⟩</strong>. The decision of the protocol in both of these configurations is simple — it always decides <strong>0 </strong>since there are no alternatives to choose from. The situation gets more complicated when the processes have distinct inputs. According to the task, they must both decide one value or the other. The following diagram shows the input and output complexes for this example:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/580/1*C3533H7Dc-2akeWBQU-2hA.png" /><figcaption>Figure 8: Complexes for 2-process consensus. [Credit 1]</figcaption></figure><p>The input complex for the 2-process binary consensus task consists of four vertexes, corresponding to the pairs of process and input value. The four edges correspond to the four possible 2-process configurations. The output complex captures the fact that the processes must decide on the same value. The output complex is disconnected and therefore incompatible with the input complex which is connected. Since subdivisions and simplicial maps preserve connectivity, invocation of the Asynchronous Computability Theorem leads us to a contradiction — if we suppose such a map <strong>μ</strong> exists, it cannot be simplicial.</p><p>Given this intuition, we can depict the impossibility of consensus in a more rigorous manner. In fact, we shall make a stronger statement by showing the impossibility of k-set agreement described above. Recall that k-set agreement is a task wherein the participating processes are allowed to select up to k distinct values. If k is less than the number of processes, then this task is impossible to solve wait-free. Consensus is 1-set agreement. To express this result using the Asynchronous Computability Theorem, we make use of a standard result from algebraic topology known as <em>Sperner’s Lemma</em>.</p><h4>Sperner’s Lemma</h4><p><a href="https://en.wikipedia.org/wiki/Sperner%27s_lemma">Sperner’s Lemma</a> is a combinatorial analog of the <a href="https://en.wikipedia.org/wiki/Brouwer_fixed-point_theorem">Brouwer Fixed Point Theorem</a>:</p><blockquote><em>Let </em><strong><em>σ(S)</em></strong><em> be a subdivision of an n-simplex </em><strong><em>S</em></strong><em>. If </em><strong><em>f : σ(S)</em></strong><em> → </em><strong><em>S</em></strong><em> is a map sending each vertex of </em><strong><em>σ(S)</em></strong><em> to a vertex in its carrier, then there is at least one n-simplex </em><strong><em>T</em></strong><em> in </em><strong><em>σ(S)</em></strong><em> such that for each vertex </em><strong><em>v</em></strong><em> of </em><strong><em>T</em></strong><em>, the </em><strong><em>f(v)</em></strong><em> are distinct.</em></blockquote><p>A good way to think of Sperner’s lemma is in terms of subdivisions as discussed earlier. Take the chromatic subdivision shown on the left of Figure 4. Sperner’s lemma states that there always exists a simplex in this subdivision who’s vertexes are all distinct colors. This simplex with distinctly colored vertexes is the “fixed point” of the subdivision, and the lemma states that such a fixed point always exists. The following diagram is an example:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/498/1*8TlWE5-HAtnauaNlqnVSrQ.png" /><figcaption>Figure 9: Sperner’s lemma. [Credit: Wikipedia]</figcaption></figure><p>The vertexes of the outer 2-simplex are colored red, green and blue. The 2-simplexes in a subdivision are colored with colors in this set, and the 2-simplexes shaded gray make use of all three colors. Sperner’s lemma guarantees that at least one such simplex exists in the subdivision.</p><p>Sperner’s lemma applies to impossibility of consensus in the following way. Consider a configuration with three processes each with a distinct input value. Suppose also that k is 2 — we can allow up to two distinct output values. We can represent this configuration as a 2-simplex <strong>I</strong> colored with three distinct colors and labeled with three distinct values. The output complex <strong>O</strong> consists of simplexes labeled with up to two distinct values. Assume by way of contradiction that a protocol does exist. This means there is a subdivision <strong>σ</strong>(<strong>I</strong>) and a simplicial map <strong>μ</strong> : <strong>σ</strong>(<strong>I</strong>) → <strong>O</strong>.<strong> </strong>This map satisfies the conditions of Sperner’s lemma, and therefore carries some simplex in the subdivision to an output simplex labeled with all three values. However, the output complex has no such simplex — we would thus map to a “hole” in the output complex. Note that when using Sperner’s lemma we considered the labeling of vertexes not only with the process it represents, but also with the local state of the process.</p><h3><strong>Proof Sketch</strong></h3><p>For the complete details of the proof, refer to [1] where Herlihy and Shavit prove both directions — that the conditions of the theorem are necessary and sufficient.</p><p>Necessity states that a decision task has a wait-free protocol <em>only if</em> there is a subdivision and simplicial map as required by the theorem. Informally, the proof of necessity is based on the notion of a protocol complex described earlier. Recall that a protocol complex represents the possible executions of a protocol. It is shown that a protocol complex <strong>P(I)</strong> corresponds to a subdivision of the input complex, and that the simplicial map is induced by a final decision transition in the protocol called the <em>decision map</em>. The simplicial map <strong>μ</strong> then expressed as a composition of <strong>φ :</strong> <strong>σ(I) </strong><em>→</em> <strong>P(I)</strong>, the map defining the protocol complex,<strong> </strong>and <strong>δ : P(I) </strong><em>→</em><strong> O</strong>, the decision map. The proof proceeds by stating and proving lemmas about connectivity, which as we discussed earlier corresponds to reachability in a distributed system.</p><p>Sufficiency states that <em>if</em> we have such a subdivision and simplicial map, <em>then</em> the protocol wait-free solves the task. This is done by providing a construction of an algorithm for any task satisfying the conditions of the theorem. They proceed by reducing the problem to approximate agreement which is solved by the participating set protocol <a href="http://dl.acm.org/citation.cfm?id=164056">[10]</a> and invoking the <a href="https://en.wikipedia.org/wiki/Simplicial_approximation_theorem">Simplicial Approximation Theorem</a>, which essentially states that a solution can be attained after a sufficient number of execution rounds. Interestingly, this line of reasoning leads to results about the complexity of asynchronous computations [12].</p><h3><strong>Related Work</strong></h3><ul><li>The <a href="http://bloom-lang.net/calm/">CALM Principle</a> states that if a program is expressed using monotonic logic — or loosely speaking logic where you can’t revoke past statements — then it can be made eventually consistent in a distributed system in a way that does not require coordination. So, a protocol refuting an earlier claim necessitates coordination amongst the involved processes. With the Asynchronous Computability Theorem in mind, one would expect that protocols expressed using monotonic logic are able to solve their tasks in a wait-free manner.</li><li><a href="http://www.vldb.org/pvldb/vol8/p185-bailis.pdf">Coordination Avoidance in Database Systems</a> presents necessary and sufficient conditions for safe, coordination-free execution. Similarly, <a href="http://www.vldb.org/pvldb/vol7/p181-bailis.pdf">Highly Available Transactions: Virtues and Limitations</a> discusses consistency guarantees which can be provided in a highly-available manner. If we express an application consistency requirement as a task using the framework presented herein, it should be possible to asses wait-free solvability of this task using the Asynchronous Computability Theorem. If the task is wait-free solvable, it should be possible to implement it without coordination.</li><li>In our discussion of connectivity, the combinatorial connectivity theorem had the property that allowed us to infer connectivity of complexes based on their constituent complexes. This notion of crafting global observations from local ones is captured by the <a href="https://en.wikipedia.org/wiki/Gluing_axiom">gluing axiom</a> which forms the foundation of mathematical structures called <a href="https://en.wikipedia.org/wiki/Sheaf_(mathematics)">sheaves</a>. Using sheaves to provide a semantics for distributed systems is attempted in <a href="http://www.sciencedirect.com/science/article/pii/S1571066108005264">Sheaves, Objects, and Distributed Systems</a>.</li><li>Geometric renderings of distributed systems have been around for quite a while. Indeed one can regard <a href="https://en.wikipedia.org/wiki/Minkowski_space">Minkowski space</a> — the mathematical formulation of <a href="https://en.wikipedia.org/wiki/Special_relativity">Special Relativity</a> — as a <a href="http://courses.csail.mit.edu/6.852/01/papers/VirtTime_GlobState.pdf">model for a distributed system</a>. See also: <a href="http://dl.acm.org/citation.cfm?id=99625">Modeling Concurrency with Geometry</a>, <a href="http://www-home.math.uwo.ca/~jardine/papers/lectures/Dagstuhl3.pdf">Homotopy Theory and Concurrency,</a> and <a href="https://pdfs.semanticscholar.org/2cae/af7f3722ecbf5d830ce13b42572672c3da51.pdf">Interaction Categories</a>. All make use of tools for understanding space as a means for understanding concurrency. Perhaps if we look at <a href="https://en.wikipedia.org/wiki/Homotopy_type_theory">Homotopy Type Type</a> theory as the <a href="http://Homotopy type theory: the logic of space">logic of space</a>, we can inch <a href="https://golem.ph.utexas.edu/category/2011/04/higher_categories_for_concurre.html">closer</a> toward semantic foundations for concurrency and distributed systems.</li><li>Perhaps by computing the <a href="https://jeremykun.com/2013/01/12/the-fundamental-group-a-primer/">Fundamental Group</a> of a topological space we can automate the reasoning required to determine whether a task is solvable, and allow tasks to adapt dynamically to their environment.</li></ul><h3>Acknowledgements</h3><p>Most of the theoretical content and diagrams in this post are taken from the <a href="http://cs.brown.edu/people/mph/HerlihyS99/p858-herlihy.pdf">The Topological Structure of Asynchronous Computability</a> by M. Herlihy and N. Shavit.</p><h3><strong>References</strong></h3><p>[1] <a href="http://cs.brown.edu/people/mph/HerlihyS99/p858-herlihy.pdf">The Topological Structure of Asynchronous Computability</a></p><p>[2] <a href="http://Impossibility of Distributed Consensus with One Faulty Process">Impossibility of Distributed Consensus with One Faulty Process</a></p><p>[3] <a href="https://infoscience.epfl.ch/record/89695/files/bxl046.pdf">The Alpha of Indulgent Consensus</a></p><p>[4] <a href="http://dl.acm.org/citation.cfm?id=234549">The weakest failure detector for solving consensus</a></p><p>[5] <a href="https://cs.brown.edu/~mph/HerlihyW90/p463-herlihy.pdf">Linearizability: A Correctness Condition for Concurrent Objects</a></p><p>[6] <a href="https://cs.brown.edu/~mph/Herlihy91/p124-herlihy.pdf">Wait-free Synchronization</a></p><p>[7] <a href="https://www.amazon.com/Concurrent-Programming-Algorithms-Principles-Foundations/dp/3642446159/">Concurrent Programming: Algorithms, Principles, and Foundations</a></p><p>[8] <a href="http://dl.acm.org/citation.cfm?id=167122">Wait-free k-set agreement is impossible: the topology of public knowledge</a></p><p>[9] <a href="http://www.sciencedirect.com/science/article/pii/S1571066108005264">Sheaves, Objects, and Distributed Systems</a></p><p>[10] <a href="http://dl.acm.org/citation.cfm?id=164056">Immediate atomic snapshots and fast renaming</a></p><p>[11] <a href="http://dl.acm.org/citation.cfm?id=277722">Unifying synchronous and asynchronous message-passing models</a></p><p>[12] <a href="https://groups.csail.mit.edu/tds/papers/Hoest/HS-podc_final.pdf">Towards a Topological Characterization of Asynchronous Complexity</a></p><p>[13] <a href="http://www.springer.com/us/book/9780387943275">Algebraic Topology</a></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=171e9d7b9423" width="1" height="1" alt="">]]></content:encoded>
        </item>
    </channel>
</rss>