WWDC 2025

I recently found my bag from WWDC 2000. It’s stunning to realize that was 25 years ago.

A photo of a satchel from Apple's World Wide Developer Conference held in 2000.
The satchel I got from WWDC 2000, held in San Jose, CA just prior to my birthday.

This year, I’m looking forward to WWDC from a different perspective – as a full-time Apple employee. I contracted with Apple in 2020 for roughly 18 months. I had an opportunity to switch to full time then, but it didn’t quite align at the time. Fast forward a few years, and I ran across a position that was perfectly aligned for me, so I jumped on it – and it worked out. As of March 2025, I’m employed by Apple in the Open Source Program Office. That has got to be the coolest part – not only am I working for Apple, but almost everything I work on aligns with open source as well. Like I said, perfect for me.

As I’m writing this, it’s the Saturday before WWDC 2025, and I’m as excited as ever for what’s coming. I’m delighted with the swift.org redesign and its new tutorial for Swift on Server (The “getting started” link from Cloud Services on the swift.org website). And the neat tidbit? The tutorial, and the sample code for it, are open source: https://github.com/swiftlang/swift-server-todos-tutorial.

Looking back 25 years ago – before I picked up that bag – I was living in Columbia, MO. I worked as a staff member at the university in a central computing department, and was both trepidatious and hopeful for what Apple had in store as WWDC came around. WWDC was still quite small then – it hadn’t moved to the Moscone in San Francisco yet, or moved back. Steve Jobs had returned to Apple a couple of years earlier, and my hopes were high. This was also about the time of the transition from Mac OS 9 to Mac OS X, and the overwhelming changes that came with it.

Later that year, I chose to take a flyer (leave of absence, technically) from the university, move to Seattle, live with some friends, and try out new opportunities. I’m glad I did, as Seattle has been great for both my wife and me, and we’ve been here since. That was right before the “dot.com” bust, but we weathered that – and the extractive economic insanity / recession in 2008 as well.

With the passing of Bill Atkinson, I’m also remembering the inspiration and excitement from years before due to HyperCard. It was one of the first tools that felt like an honest-to-god superpower. A lever that was “long enough to move the world”. I had similar feelings about the Cocoa frameworks and Objective-C. So many more amazing flowers of possibility have bloomed since then – web browsers and JavaScript, the iPhone and iOS, and more recently Swift and SwiftUI.

I wasn’t all that sure about Swift in its earliest years, but by the transition to Swift 4 it was very interesting. I think there’s a lot more potential there, and so many more things that Swift can enable; ideas it can power. I never imagined that I’d be working so closely with a programming language, as I spent most of my career working on (or with) backend and infrastructure services. I appreciate Swift for what it enables – and maybe more so for the people in the Swift community.

For good measure, I want to be clear that this blog still – as ever – represents my own voice, and my sometimes flawed ideas, expectations, explorations, or whatever. I don’t speak for my employer, or anyone else, here – never did.

Code Spelunking in DocC

Heads up: this post is a technical deep dive into the code of DocC, the Swift language documentation system. Not that my content doesn’t tend to be heavily technical, but this goes even further than usual.

The Setup

While I was working on some documentation for the snippets feature in DocC, I ran into an issue with the mechanism to preview documentation. As soon as I added a snippet to an example project, the documentation would fail to preview about half the time. The command I use is:

swift package --disable-sandbox preview-documentation --target MyTarget

When I first started debugging this, I wasn’t sure what caused the issue. I opened a bug in swift-docc-plugin (spoiler: the bug wasn’t in swift-docc-plugin), thinking at first that it was always failing. As it turns out, it wasn’t always failing – my luck was just poor, and the issue intermittent. I had several commits in one of my side projects that added snippets, which I used to work through my documentation of the feature. In order to write up the issue with reasonable reproduction steps, I created a series of commands to verify the behavior I saw. The flow is pretty simple:

  1. clone the example project that illustrates the problem
  2. go to the commit that shows it working
  3. invoke the preview
  4. switch to the commit that shows it failing
  5. invoke the preview

At this point, I didn’t realize that the issue was intermittent, so I iterated back and forth between commits, cleaning the .build directory to see if that made a difference, and then ultimately noticed a change in behavior. At one point where I expected it to fail, it worked. Ah, glorious: a heisenbug. At least now I knew that I’d have to repeat the process multiple times to get the issue to show. With that in mind, I was able to nail down the change in my project that started to illustrate the issue – it was when I added the first snippet.

There’s another project (the exemplar, really) that hosts snippets in its documentation – swift-markdown – that *never* exhibited this problem. That was a real head scratcher. But I did have a reliable reproduction, so I focused on that.

When I work on an intermittent bug, I try to get a debugger attached to the code that’s behaving badly. Because this was invoked through a SwiftPM plugin, I had my work cut out for me. Command plugins are separate executables that Swift Package Manager invokes internally, which makes it obnoxious to get a debugger attached. You can’t easily do it directly from within Xcode, because Xcode isn’t launching the executable. There’s a conversation about how to wrangle debugging a SwiftPM plugin on the Swift Forums that covers some of this. The way I resolved it this time was to put a long sleep() call in the code of the plugin, run it through SwiftPM, use the terminal to hunt down the process ID that SwiftPM invoked, and attach the debugger to that ID. This is kind of a nasty manual process, so I used sleep(30) – I’m just not that fast at wrangling all the tools for this. I managed to get attached… and then realized I didn’t need to.
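For reference, the shape of that sleep trick looks something like the following – a sketch of an illustrative command plugin, not the actual swift-docc-plugin source, and pgrep/lldb are just one way to find and attach to the process:

import Foundation
import PackagePlugin

@main
struct DebuggableCommandPlugin: CommandPlugin {
    func performCommand(context: PluginContext, arguments: [String]) async throws {
        // Pause long enough to hunt down the plugin's process ID from a terminal
        // (for example with pgrep -lf) and attach with lldb -p <pid>.
        sleep(30)
        // ... the plugin's real work would continue here
    }
}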

While I was looking at the process list through the terminal to get the process ID, I spotted that the process in question (the plugin) was invoking yet ANOTHER process in turn. I actually knew this previously, and just plain forgot. The swift package preview-documentation command is a light wrapper around docc’s preview command. While I wasted some time with the plugin, this made debugging significantly less painful. I could invoke an example using the docc binary directly. And yeah, it moved the target for what had the bug – it wasn’t in swift-docc-plugin.

Debugging preview in DocC

I closed the issue in the plugin and opened a new issue in swift-docc, summarizing what I’d learned and how to reproduce the issue. It was the end of a day when I got to this point, so I left things alone and came back the next morning. When I opened the issue, I verified the issue using the version of docc released with the Swift toolchain. In my case – that meant the version included in the toolchain that ships with Xcode 16.1.

When I jumped back in, I intended to verify that the same issue exhibited with the latest code – against the main branch. The issue form asks for issues to be verified against main to confirm they haven’t already been resolved. There were also some comments in the issue – David referenced some other pending work that resolved some flaky tests, which – at a guess – might have an impact. So I buckled down to use the latest development branch of docc and repeat the process to verify the issue.

One of the quirks of verifying this issue is that docc is a separate project from the JavaScript single-page browser app (swift-docc-render) that displays the content. When you’re running docc from the main branch, it doesn’t know where that content lives – you need to tell it. Fortunately, that’s pretty easy. You set a specific environment variable and docc uses that to know where to load the content.

With that in place, and the example process invocation from my debugging the prior day, I had a way to run this directly. In the terminal, it looks something like:

export DOCC_HTML_DIR=/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/share/docc/render/

/Users/heckj/src/swift-project/swift-docc/.build/debug/docc preview \
/Users/heckj/src/Voxels/Sources/Voxels/Documentation.docc \
--emit-lmdb-index \
--fallback-display-name Voxels \
--fallback-bundle-identifier Voxels \
--additional-symbol-graph-dir /Users/heckj/src/Voxels/.build/plugins/Swift-DocC\ Preview/outputs/.build/symbol-graphs/unified-symbol-graphs/Voxels-7 \
--output-path /Users/heckj/src/Voxels/.build/plugins/Swift-DocC\ Preview/outputs/Voxels.doccarchive

I prefer to use Xcode when debugging and fortunately that’s not too hard to arrange. In order to set it up, I opened the Package.swift file of the docc project with Xcode. It sets up the targets for you, and with a package like this, tends to default to the package target. I shifted the platform I was building for to “My Mac”, and the target to the docc executable. With those set, I opened “edit scheme” for that combination.

Xcode lets you set environment variables and pass arguments to the executable when you invoke “run”. That’s perfect for what I was doing – working to easily reproduce the issue where I could debug it.

The scheme editor in Xcode for the run mode, that shows arguments set and an environment variable to make it easier to debug docc.

I set the DOCC_HTML_DIR environment variable and set up the arguments from my example. One thing I had not caught when I first did this was that the path in one of the arguments included a space. Once I realized there was a space, and run wasn’t working, I added a \ character to escape the space in the name (within “Swift-DocC Preview/outputs/”). With that in place, I was able to run the code and see the results, as well as run the debugger. The issue was, indeed, repeating itself with the main branch.

Once I had that set up, checking the pull request that David mentioned in the issue was a piece of cake. I’ve had a poor time with Xcode handling changing a git branch underneath it, so I closed Xcode and updated the branch, using the gh executable as a helper. When you look at a pull request on GitHub, the CODE button provides you with a command line snippet that you can copy and paste to get it on your local machine. In this case:

gh pr checkout 1070

Once it was checked out, I re-opened Xcode, and it was just a matter of running a few more times. Fortunately, the scheme settings that Xcode uses when you tweak run arguments aren’t generally overwritten when you switch to another branch. I had some hopes this might solve the issue, but they were dashed pretty quickly. With 5 runs, I was able to verify that the code update didn’t make a difference in my example.

Verifying beyond swift-docc-plugin

Since I didn’t get the quick win with the pull request, it was time to dig further. Switching back to the main branch, I took it from the top, starting by looking at how code gets executed within docc.

The entrance point for docc (https://github.com/swiftlang/swift-docc/blob/main/Sources/docc/main.swift) very quickly leads to a Swift Argument Parser setup (https://github.com/swiftlang/swift-docc/blob/main/Sources/SwiftDocCUtilities/Docc.swift). It’s quick to see that it’s broken up into subcommands, one of which is preview. Finding the code for that subcommand is less than obvious just scanning the folders and files, but command-clicking in Xcode gets you right there: https://github.com/swiftlang/swift-docc/blob/main/Sources/SwiftDocCUtilities/ArgumentParsing/Subcommands/Preview.swift. The preview subcommand code, in turn, uses a PreviewAction that has a perform() function where the “work gets done”. The gist of which is:

  • run a convert action on the project
  • spin up a local HTML server and host the content that was just converted
  • display the details to view that server

When I first ran into this issue, what I saw from invoking a preview was an output that was missing the name of the module when it displayed the preview. I was reasonably familiar with what the contents of that directory should look like, so I ran it multiple times and captured a copy of the directory structure that it built when it worked, and another when it didn’t. Comparing the two, the difference was that the top-level module for my example project just wasn’t appearing. The directories included all the files for the symbols in my module, just not the module itself.

With that knowledge in hand, when I got to this “convert first, then display” setup, I knew the path to search down was the convert action. I also knew it had something to do with the top-level module, since all the symbols were there – only the module itself was missing from the output.

Spelunking into convert

If I had to pick a “heart of DocC”, it would be this conversion process.

The high-level workflow takes in two different kinds of data – one or more symbol graphs and a documentation catalog – and assembles a documentation archive from them. The result isn’t plain files with HTML inside them. Instead it’s a collection of the data that represent those symbols (in JSON) that can be rendered – into HTML, or really any other target. The rendering happens with a different bit of code (that’s the swift-docc-render project that I mentioned).

Symbol graphs are generated by the compiler (or other code, really – but generally by the compiler). But symbol graphs alone don’t have all the details in a form that’s easy to collect and render. The relationships between symbols, the type of each symbol, and so on, get cleaned and re-arranged in the convert process. It also mixes in the writing and resources that you provide in the docc catalog. This lets DocC override or add in content, as well as provide things that don’t exist in the raw code symbols, such as articles and tutorials.

The code in ConvertAction is fairly complex, as there’s a bit of abstraction there that makes it a little harder to parse. It abstracts the producer and the consumer of data, and has additional bits to support tracking documentation coverage, capturing diagnostics (for issues found while mixing the files together) so they can be played back to tooling, and other options, such as building an index. All this is encapsulated in the _perform method. That method, in turn, runs this bit of code:

conversionProblems = try ConvertActionConverter.convert(
  bundle: bundle,
  context: context,
  outputConsumer: outputConsumer,
  sourceRepository: sourceRepository,
  emitDigest: emitDigest,
  documentationCoverageOptions: documentationCoverageOptions
)

While ConvertActionConverter is a jump around the project code, it’s encapsulated pretty well. It’s fairly straightforward to read and understand what’s happening. There’s an inner function with a lot of comments in the flow of that method that made it harder for me to track what was happening, and where the function boundaries were. Once I realized what it was, I read around it to again look for that “where’s the core work happening”.

The heart of the convert function is:

let entity = try context.entity(with: identifier)

guard let renderNode = converter.renderNode(for: entity) else {
    // No render node was produced for this entity, so just skip it.
    return
}
                    
try outputConsumer.consume(renderNode: renderNode)

This code is wrapped inside a call to context.knownPages.concurrentPerform that iterates through the known pages. I wasn’t sure where it might be dropping the top-level module, so I started with good old-fashioned printf debugging. That also exposed a bunch of new types to explore and learn about.

I started off with a breakpoint on the bit of code that sets the entity from the identifier (the ResolvedTopicReference). Pretty quickly I realized there were a LOT of these, even in my smaller sample project, and stepping through each iteration was kind of horrible. To work around this, I reverted to a variation on printf debugging. I started adding in code to see what was happening, and more specifically to look for what I was after – the node in the end result that represented the top-level module. My first printf debugging worked on printing the name (a String) of the entity.

if entity.name.plainText == "Voxels" {
    print("FOUND IT!: \(entity.name)")
}

The first run through just printed them all – and generated something just over 500 lines, so I took a bit of time and looked through them all. Sure enough, somewhere in the middle of that list (they’re not processed in alphabetical order) was the top-level module name that I was looking for.

I started tracking how and where that got created, and its properties set. I expanded my exploration code to only track what was in knownPages to see what it was providing.

for page in context.knownPages {
    //print(page.absoluteString)
    if page.url.absoluteString == "doc://Voxels/documentation/Voxels" {
        print("PING")
    }
}

The knownPages is a computed property, filtering what’s stored in the context:

public var knownPages: [ResolvedTopicReference] {
    return topicGraph.nodes.values
        .filter { !$0.isVirtual && $0.kind.isPage }
        .map { $0.reference }
}

What I didn’t track at the time, and found a bit later, was that the filter statement turned out to be important. I didn’t fully understand the details of what made up a “resolved topic node”, and didn’t know what it meant to be “virtual” or not. While kind being page was fairly obvious, virtual can have a lot of meanings and implications.

In either case, when it was working correctly, the known pages included the URL, and when it wasn’t, the page didn’t appear to exist. Because of that, I knew the issue was somewhere in the execution flow prior to where I was looking. I read back to see where things get set up and initialized.

The convert action sets up the context with the following code:

let context = try DocumentationContext(bundle: bundle, dataProvider: dataProvider, diagnosticEngine: diagnosticEngine, configuration: configuration)

The DocumentationContext initializer is pretty lean, deferring the more complex setup to an internal register function. I continued to trace that back further, and the register function uses and ultimately references another type: SymbolGraphLoader.

I was repeating my printf debugging by dropping in that bit of code that looked for the URL I wanted to find, earlier and earlier, making sure it was there (or not) as I went. As I was getting to the registerSymbols function on DocumentationContext, I realized the data types didn’t include the URL. I needed to understand what was beneath. I started off looking for name on the underlying types, and quickly found a surprise – it was always there, even when the URL wasn’t.

That’s when I clued in to the filter added on knownPages. I realized the key difference wasn’t whether the node existed, but how it was set up. The node always existed, and when the code was working, the isVirtual property was false. When it failed, isVirtual was true.

What the hell is isVirtual, what’s it mean, and where does it come from?

I was a bit confused and frustrated at this point. It wasn’t that I couldn’t find a comment spelling out that isVirtual meant it shouldn’t be rendered – it was that I just didn’t understand the implications, where it all came from, what it meant across all those contexts, and why it was needed. Turns out I didn’t really need to know all that detail, but since it was what was different, I wanted to understand.

I took a bit of time to look at the raw JSON of a symbol graph file, and found that isVirtual comes from the compiler itself, and is carried through, for most symbol nodes. In that same process, I also realized that the symbol graph from the compiler did not have a symbol for the module itself. So something in the code that was loading the symbol graphs was adding a node for the module and setting its isVirtual value – sometimes incorrectly. I continued to have this hypothesis that it was some wacky race condition in the code that hadn’t been spotted. And it sort of was, but at the algorithm level, not the code-threading level.

As a side note, isVirtual in a documentation node fundamentally means “don’t render this”. The idea being that it’s only there to link things together – relationships, overlays, etc.

SymbolGraphLoader and SymbolKit

The symbol graph loader takes in one or more symbol graph files, merges them all together, cleans them up a smidge, and creates the nodes needed to represent the higher level connections and relationships. While I still didn’t fully grok the isVirtual property and its implications, I knew that how it was being set for the module level node was what I cared about.

When I was looking for that code, I found the following:

private static func moduleNameFor(_ symbolGraph: SymbolGraph, at url: URL) -> (String, Bool)

I hadn’t yet joined the dots to see where it was set up, but knowing whether something was a “main” symbol graph or not sounded promising. I kept that in mind and kept digging, finding where the loader was collecting the symbol graphs, annotating them, and merging them. The SymbolGraphLoader fed them all into an instance of GraphCollector. And that code is from a different library: SymbolKit.

Scanning through that code, I found the same function: moduleNameFor. Same parameters and outputs – a public symbol in SymbolKit and a private one in DocC. I’m guessing it started in Swift-DocC, and was later extracted out into the library. The end result was identical logic in two places, so I made a note to clean that up later.

The GraphCollector turned out to be key. It holds the source for the data that’s used to determine the top-level module node.

The mergeSymbolGraph method in the graph collector pulls everything together. Within the collector, the data about the graphs is stored in dictionaries keyed by the name of the module. In addition to providing a unified graph by module name, it also keeps track of each module it loads, and marks it as a primary module or an extension module. The “is it the primary module graph” setting uses the logic in moduleNameFor.

In this function, you provide a loaded symbol graph and its URL, and it returns the name of the module described in the graph and whether it is a primary module. The key logic that makes this determination is the following line:

let isMainSymbolGraph = !url.lastPathComponent.contains("@")

The presumption when this was written was that all symbol graphs would come from the compiler, and the ones that extend an existing symbol graph would have an @ symbol in the name. Snippets blow up that assumption. The snippet-extractor code, which creates the symbol graph from snippets, names the symbol graph file YourModule-snippets.symbols.json. Because the name doesn’t include an @ symbol, snippet graph files were being regarded as “primary” graphs.

Back in the DocumentationContext, there’s an extension on SymbolGraphLoader that provides the URL for the module: mainModuleURL – and this is where the flaw exhibits. The extension’s method uses first on the list of graphLocations from the collector to get the primary module, which assumes there’s only one. When more than one exists, it returns a non-deterministic result. Sometimes it was returning the one referenced from the snippet symbol graph, and other times it was the “right” symbol graph.
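To make the failure mode concrete, here’s a minimal sketch of the shape of the problem using stand-in types (not the actual SymbolKit code): two graphs for the same module both pass the filename check, and first hands back whichever one happens to be, well, first.

import Foundation

struct GraphLocation {
    let url: URL
    let isMainSymbolGraph: Bool
}

// Both files pass the "no @ in the filename" check, so both get marked as a
// primary graph for the Voxels module.
let graphLocations: [GraphLocation] = [
    .init(url: URL(fileURLWithPath: "Voxels.symbols.json"), isMainSymbolGraph: true),
    .init(url: URL(fileURLWithPath: "Voxels-snippets.symbols.json"), isMainSymbolGraph: true),
]

// The order the graphs get loaded and merged isn't guaranteed, so which entry
// comes back here can change from run to run - sometimes it's the snippet graph.
let mainModuleURL = graphLocations.first(where: { $0.isMainSymbolGraph })?.url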

There isn’t a direct path to easily find where and how that mainModuleURL result is used to set up the URL. The line of code that pulls this detail:

let fileURL = symbolGraphLoader.mainModuleURL(forModule: moduleName)

uses that later to mix together the topic graph:

addSymbolsToTopicGraph(symbolGraph: unifiedSymbolGraph, url: fileURL, symbolReferences: symbolReferences, moduleReference: moduleReference)

This is what ultimately mixes the isVirtual property into the topic graph, and that’s what sets the module to isVirtual – assuming there’s only one primary graph and using the “first” one it grabs.

Fixing the issue

The work above took place over the course of 3 days and resulted in 3 issues reported, one each to swift-docc-plugin, swift-docc, and swift-docc-symbolkit. The first of those I closed as soon as I realized it had nothing to do with the issue.

I stripped back out my printf debugging code, and in the end there were two relatively small changes that I made into pull requests – one for the fix in SymbolKit, and a supplement that just cleaned things up a bit in DocC.

I’ve proposed a solution that changes the logic in SymbolKit to support the fundamental assumption that “there can be only one” main symbol graph. I did think about trying to represent the snippets as a different type in the collector (other than primary and extension), but I spotted a number of other places in the code that had that idea heavily built in. They also leveraged .first() to get at the primary module, if it existed. Since snippets were added a couple of years ago, and this hadn’t been identified and debugged, I wasn’t sure what was expected for the function returning the name and processing of snippet symbol graphs. I opted for a change that tweaks the logic in SymbolKit, adding an inspection of the isVirtual property in the metadata of the module in the symbol graph in addition to verifying there isn’t an @ symbol in the filename.
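The direction of that change looks roughly like the following – a paraphrased sketch of the logic, not the exact code in the pull request, and assuming the module metadata exposes isVirtual the way I remember it:

import Foundation
import SymbolKit

// Treat a graph as "main" only when the filename has no "@" AND the module's
// own metadata doesn't mark it as virtual (which the snippet graphs do).
func isMainSymbolGraph(_ symbolGraph: SymbolGraph, at url: URL) -> Bool {
    !url.lastPathComponent.contains("@") && !symbolGraph.module.isVirtual
}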

I also opened a supplemental PR in docc to de-duplicate that logic and keep it all in one place. The main issue (the comments in which are a summarized, short form of this post) looks like it’ll be fully resolved by the change in SymbolKit. But since I’d done the digging, and noticed the duplication, I figured it wouldn’t hurt to help clean things up a bit.

My Favorite Swift 6 feature: static library compilation on Linux

There is a lot of great stuff coming in the Swift programming language. I love the focus and effort on validating data-race safety, which is probably the feature set that I’ll spend the most time with. But my favorite new tidbit? Swift 6 now supports a Linux SDK and the ability to compile a stand-alone, statically linked binary.

The real detail for all this is in the blog post on Swift.org: Getting Started with the Static Linux SDK. It’s a capability that makes deploying server-side applications to Linux much easier. I’ve been looking forward to this capability for quite a while.

Statically linked binaries are a standard Go language feature. To me, that was a huge enabler of Go sweeping the cloud-services open-source space over the past decade (mostly with the wave of Kubernetes). I hope that after Swift 6 is fully released, it helps serve the same purpose.

Beyond using Swift for server-side apps, there’s a whole realm of expansion on where and how you can use Swift. The other place that really calls it out this year is the project’s fantastic push “into the small” with an explicit “Embedded Swift” mode – a strict subset of the features that make it both possible, and effective, to take advantage of the Swift language safety features while deploying to extremely constrained compute – microcontrollers. Watch the video from this year’s WWDC: Go small with Embedded Swift, to get more details if that sounds interesting.

Class 5 Geomagnetic Storm

Images from adjacent to downtown Seattle (meaning a LOT of light pollution), from 11:20 to 11:50pm local time, May 10th.

Most of this was nearly impossible to see in this color with the naked eye. They seemed like wispy clouds, and only the very brightest tints of red or green would start to hint against the sky. Use an iPhone camera though…

Designing a Swift library with data-race safety

I cut an initial release (0.1.0-alpha) of the library automerge-repo-swift. A supplemental library to Automerge-swift, it adds background networking for sync and storage capabilities. The library extends code I initially created in the Automerge demo app (MeetingNotes), which was common enough to warrant its own library. While I was extracting those pieces, I leaned into the same general pattern that was used in the JavaScript library automerge-repo, which provides largely the same functionality for Automerge in JavaScript. I borrowed the public API structure, as well as compatibility and implementation details for the Automerge sync protocol. One of my goals while assembling this new library was to build it fully compliant with Swift’s data-race safety – meaning that it compiles without warnings when I use the Swift compiler’s strict-concurrency mode.

There were some notable challenges in coming up to speed with the concepts of isolation and sendability. In addition to learning the concepts, how to apply them is an open question. Not many Swift developers have embraced strict concurrency and talked about the trade-offs or implications for choices. Because of that, I feel that there’s relatively little available knowledge to understand the trade-offs to make when you protect mutable state. This post shares some of the stumbling blocks I hit, choices I made, and lessons I’ve learned. My hope is that it helps other developers facing a similar challenge.

Framing the problem

The way I try to learn and apply new knowledge to solve these kinds of “new-fangled” problems is to first work out how to think about the problem. I’ve not come up with a good way to ask other people how to do that. I think when I frame the problem with good first principles in mind, the trade-offs in solutions become easier to understand. Sometimes the answers are even self-evident.

The foremost principle in strict concurrency is “protect your mutable state”. The compiler warnings give you feedback about potential hazards and data races. In Swift, protecting that state uses a concept of an “isolation domain”. My layman’s take on isolation is “how can the compiler verify that only one thread is accessing this bit of data at a time”. There are some places where the compiler infers the state of isolation, and some of that is still changing as we progress towards Swift 6. When you’re writing code, the compiler knows what is isolated (and non-isolated) – either by itself or based on what you annotated. When the compiler infers an isolation domain, that detail is not (yet?) easily exposed to developers. It really only shows up when there’s a mismatch between your assumptions and what the compiler thinks, and it issues a strict-concurrency warning.

Sendability is the second key concept. In my layman’s terms again, something that is sendable is safe to cross over thread boundaries. With Swift 5.10, the compiler has enough knowledge of types to be able to make guarantees about what is safe, and what isn’t.
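A tiny illustration of my own (not code from the library), with strict concurrency checking turned on: a class with unprotected mutable state isn’t Sendable, so letting it cross into a concurrently-executing Task closure is exactly the kind of hazard the compiler flags.

final class Counter {   // unprotected mutable state, so not Sendable
    var value = 0
}

func kickOffWork() {
    let counter = Counter()
    Task {
        // Strict concurrency checking flags this capture: counter is a
        // non-Sendable type crossing into a @Sendable (concurrent) closure.
        counter.value += 1
    }
    print(counter.value)   // ...while the original context can still touch it
}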

The first thing I did was lean heavily into making anything and everything Sendable. In hindsight, that was a bit of a mistake. Not disastrous, but I made a lot more work for myself. Not everything needs to be sendable. Taking advantage of isolation, it is fine – sometimes notably more efficient and easier to reason about – to have and use non-sendable types within an isolation domain. More on that in a bit.

My key to framing up the problem was to think in terms of making explicit choices about what data should be in an isolation region along with how I want to pass information from one isolation domain to another. Any types I pass (generally) need to be Sendable, and anything that stays within an isolation domain doesn’t. For this library, I have a lot of mutable state: networking connections, updates from users, and a state machine coordinating it all. All of it is needed so a repository can store and synchronize Automerge documents. Automerge documents themselves are Sendable (I had that in place well before starting this work). I made the Automerge documents sendable by wrapping access and updates to anything mutable within a serial dispatch queue. (This was also needed because the core Automerge library – a Rust library accessed through FFI – was not safe for multi-threaded use).
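The general shape of that wrapping pattern looks something like this – a sketch with a stand-in type, not the actual Automerge-swift Document: every read and write of the mutable state funnels through a serial queue, and the type gets marked @unchecked Sendable because the compiler can’t verify that discipline on its own.

import Foundation

final class SyncedDocument: @unchecked Sendable {
    private let queue = DispatchQueue(label: "synceddocument.serial")
    private var storage: [String: String] = [:]   // stand-in for the FFI-backed state

    func value(forKey key: String) -> String? {
        queue.sync { storage[key] }
    }

    func update(_ value: String, forKey key: String) {
        queue.sync { storage[key] = value }
    }
}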

Choosing Isolation

I knew I wanted to make at least one explicit isolation domain, so the first question was “actor or isolated class?” Honestly, I’m still not sure I understand all the tradeoffs. Without knowing what the effect would be to start off with, I decided to pick “let’s use actors everywhere” and see how it goes. Some of the method calls in the design of the Automerge repository were easily and obviously async, so that seemed like a good first cut. I made the top-level repo an actor, and then I kept making any internal type that had mutable state also be its own actor. That included a storage subsystem and a network subsystem, both of which I built to let someone else provide the network or storage provider external to this project. To support external plugins that work with this library, I created protocols for the storage and network providers, as well as one that the network providers use to talk back to the repository.

The downside of that choice was two-fold – first setting things up, then interacting with it from within a SwiftUI app. Because I made every-darn-thing an actor, I had to await a response, which meant a lot of potential suspension points in my code. That also propagated upward, implying that even setup needed to be done within an async context. Sometimes that’s easy to arrange, but other times it ends up being a complete pain in the butt. More specifically, quite a few of the current Apple-provided frameworks don’t have or provide a clear path to integrate async setup hooks. The server-side Swift world has a lovely “set up and run” mechanism (swift-service-lifecycle) it is adopting, but Apple hasn’t provided a similar concept in the frameworks it provides. The one that bites me most frequently is the SwiftUI app and document-based app lifecycle, which are all synchronous.

Initialization Challenges

Making the individual actors – Repo and the two network providers I created – initializable with synchronous calls wasn’t too bad. The stumbling block I hit (that I still don’t have a great solution to) was when I wanted to add and activate the network providers to a repository. To arrange that, I’m currently using a detached Task that I kick off in the SwiftUI App’s initializer:

public let repo = Repo(sharePolicy: .agreeable)
public let websocket = WebSocketProvider()
public let peerToPeer = PeerToPeerProvider(
    PeerToPeerProviderConfiguration(
        passcode: "AutomergeMeetingNotes",
        reconnectOnError: true,
        autoconnect: false
    )
)

@main
struct MeetingNotesApp: App {
    var body: some Scene {
        DocumentGroup {
            MeetingNotesDocument()
        } editor: { file in
            MeetingNotesDocumentView(document: file.document)
        }
        .commands {
            CommandGroup(replacing: CommandGroupPlacement.toolbar) {
            }
        }
    }

    init() {
        Task {
            await repo.addNetworkAdapter(adapter: websocket)
            await repo.addNetworkAdapter(adapter: peerToPeer)
        }
    }
}

Swift Async Algorithms

One of the lessons I’ve learned is that if you find yourself stashing a number of actors into an array, and you’re used to interacting with them using functional methods (filter, compactMap, etc), you need to deal with the asynchronous access. The standard library built-in functional methods are all synchronous. Because of that, you can only access non-isolated properties on the actors. For me, that meant working with non-mutable state that I set up during actor initialization.

The second path (and I went there) was to take on a dependency on swift-async-algorithms, and use its async variations of the functional methods. They let you “await” results for anything that needs to cross isolation boundaries. And because it took me an embarrassingly long time to figure it out: if you have an array of actors, the way to get to an AsyncSequence of them is to use the async property on the array after you’ve imported swift-async-algorithms. For example, something like the following snippet:

let arrayOfActors: [YourActorType] = []
// The .async property (from swift-async-algorithms) wraps the array as an AsyncSequence,
// so the predicate can await actor-isolated state (someFlag is a stand-in here):
let filteredResults = arrayOfActors.async.filter { await $0.someFlag }

Rethinking the isolation choice

That was my first version of this library. I got it functional, then turned around and tore it apart again. In making everything an actor, I was making LOTS of little isolation regions that the code had to hop between. With all the suspension points, that meant a lot of possible re-ordering of what was running. I had to be extraordinarily careful not to assume a copy of some state I’d nabbed earlier was still the same after the await. (I still have to be, but it was a more prominent issue with lots of actors.) All of this boils down to being aware of actor re-entrancy, and when it might invalidate something.

I knew that I wanted at least one isolation region (the repository). I also wanted to keep mutable state in separate types to preserve a separation of duties. One particular class highlighted my problems – a wrapper around NWConnection that tracks additional state with it and handles the Automerge sync protocol. It was getting really darned inconvenient with the large number of await suspension points.

I slowly clued in that it would be a lot easier if that were all synchronous – and there was no reason it couldn’t be. In my ideal world, I’d have the type Repo (my top-level repository) as a non-global actor, and isolate any classes it used to the same isolation zone as that one, non-global, actor. I think that’s a capability that’s coming, or at least I wasn’t sure how to arrange it today with Swift 5.10. Instead I opted to make a single global actor for the library and switch what I previously set up as actors to classes isolated to that global actor.
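A simplified sketch of where that landed – the real definitions in the library have more to them, and the class and its contents here are illustrative:

@globalActor public actor AutomergeRepo {
    public static let shared = AutomergeRepo()
}

@AutomergeRepo
final class NetworkConnectionState {
    private var activePeers: [String] = []

    // Synchronous - no await needed when called from anything else that is
    // isolated to the same AutomergeRepo global actor.
    func register(peer: String) {
        activePeers.append(peer)
    }
}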

That change let me simplify quite a bit, notably when dealing with the state of connections within a network adapter. What surprised me was how few warnings the switch from actor to isolated class produced. The warnings were mostly that calls had dropped back to synchronous, and no longer needed await. That was quick to fix up; the change to isolated classes was much faster and easier than I anticipated. After I made the initial changes, I went through the various initializers and associated configuration calls to make more of it explicitly synchronous. The end result was more code that could be set up (initialized) without an async context. And finally, I updated how I handled the networking so that as I needed to track state, I didn’t absolutely have to use the async-algorithms library.

A single global actor?

A bit of a side note: I thought about making Repo a global actor, but I prefer not to demand a singleton-style library for its usage. That choice made it much easier to host multiple repositories when it came time to run functional tests with a mock in-memory network, or integration tests with the actual providers. I’m still a slight bit concerned that I might be adding to a long-term potential proliferation of global actors from libraries – but it seems like the best solution at the moment. I’d love it if I could do something that indicated “all these things need a single isolation domain, and you – developer – are responsible for providing one that fits your needs”. I’m not sure that kind of concept is even on the table for future work.

Recipes for solving these problems

If you weren’t already aware of it, Matt Massicotte created a GitHub repository called ConcurrencyRecipes. This is a gemstone of knowledge, hints, and possible solutions. I leaned into it again and again while building (and rebuilding) this library. One of the “convert it to async” challenges I encountered was providing an async interface to my own peer-to-peer network protocol. I built the protocol using the Network framework (based partially on Apple’s sample code), which is all synchronous code and callbacks. At a high level, I wanted it to act similarly to URLSessionWebSocketTask. The gist being that a connection has an async send() and an async receive() for sending and receiving messages on the connection. With an async send and receive, you can readily assemble several different patterns of access.

To get there, I used a combination of CheckedContinuation (both the throwing and non-throwing variations) to work with what NWConnection provided. I wish those were better documented; how to properly use those APIs is opaque, but that is a digression for another time. I’m particularly happy with how my code worked out, including adding a method on the PeerConnection class that used structured concurrency to handle a timeout mechanism.
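To give a feel for the continuation side of it, here’s a minimal sketch (not the library’s actual PeerConnection code) of wrapping NWConnection’s callback-based receive in an async method with a throwing continuation:

import Foundation
import Network

final class ConnectionWrapper {
    let connection: NWConnection
    init(connection: NWConnection) { self.connection = connection }

    func receiveRawMessage() async throws -> Data {
        try await withCheckedThrowingContinuation { continuation in
            connection.receiveMessage { content, _, _, error in
                if let error {
                    continuation.resume(throwing: error)
                } else {
                    continuation.resume(returning: content ?? Data())
                }
            }
        }
    }
}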

Racing tasks with structured concurrency

One of the harder warnings for me to understand was related to racing concurrent tasks in order to create an async method with a “timeout”. I stashed a pattern for how to do this in my notebook with references to Beyond the basics of structured concurrency from WWDC23.

If the async task returns a value, you can set it up something like this (this is from PeerToPeerConnection.swift):

let msg = try await withThrowingTaskGroup(of: SyncV1Msg.self) { group in
    group.addTask {
        // retrieve the next message
        try await self.receiveSingleMessage()
    }

    group.addTask {
        // Race against the receive call with a continuous timer
        try await Task.sleep(for: explicitTimeout)
        throw SyncV1Msg.Errors.Timeout()
    }

    guard let msg = try await group.next() else {
        throw CancellationError()
    }
    // cancel all ongoing tasks (the websocket receive request, in this case)
    group.cancelAll()
    return msg
}

There’s a niftier version available in Swift 5.9 (which I didn’t use) for when you don’t care about the return value:

func run() async throws {
    try await withThrowingDiscardingTaskGroup { group in
        for cook in staff.keys {
            group.addTask { try await cook.handleShift() }
        }

        group.addTask { // keep the restaurant going until closing time
            try await Task.sleep(for: shiftDuration)
            throw TimeToCloseError()
        }
    }
}

With the Swift 5.10 compiler, my direct use of this displayed a warning:

warning: passing argument of non-sendable type 'inout ThrowingTaskGroup<SyncV1Msg, any Error>' outside of global actor 'AutomergeRepo'-isolated context may introduce data races

guard let msg = try await group.next() else {
                          ^

I didn’t really understand the core of this warning, so I asked on the Swift forums. VNS (on the forums) had run into the same issue and helped explain it:

It’s because withTaskGroup accepts a non-Sendable closure, which means the closure has to be isolated to whatever context it was formed in. If your test() function is nonisolated, it means the closure is nonisolated, so calling group.waitForAll() doesn’t cross an isolation boundary.

The workaround to handle the combination of non-sendable closures and TaskGroup is to make the async method that runs this code nonisolated. In the context I was using it, the class that contains this method is isolated to a global actor, so it’s inheriting that context. By switching the method to be explicitly non-isolated, the compiler doesn’t complain about group being isolated to that global actor.
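Put together, the fix looks something like this self-contained sketch (LibraryActor stands in for the library’s real global actor, and the message handling is simplified down to strings):

import Foundation

@globalActor actor LibraryActor {
    static let shared = LibraryActor()
}

@LibraryActor
final class Connection {
    func receiveSingle() async throws -> String { "message" }   // stand-in for the real receive

    // Explicitly nonisolated: the task group's non-Sendable closure is formed
    // in a nonisolated context, so calling group.next() no longer crosses into
    // the global actor's isolation and the warning goes away.
    nonisolated func receiveWithTimeout(_ timeout: Duration) async throws -> String {
        try await withThrowingTaskGroup(of: String.self) { group in
            group.addTask { try await self.receiveSingle() }
            group.addTask {
                try await Task.sleep(for: timeout)
                throw CancellationError()
            }
            guard let msg = try await group.next() else {
                throw CancellationError()
            }
            group.cancelAll()
            return msg
        }
    }
}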

Sharing information back to SwiftUI

These components have all sorts of interesting internal state, some of which I wanted to export. For example, to provide information from the network providers to make a user interface (in SwiftUI). I want to be able to choose to connect to endpoints, to share what endpoints might be available (from the NWBrowser embedded in the peer to peer network provider), and so forth.

I first tried to lean into AsyncStreams. While they make a great local queue for a single point-to-point connection, I found they were far less useful for generally making a firehose of data that SwiftUI knows how to read and react to. While I tried to use all the latest techniques, to handle this part I went to my old friend Combine. Some people insist that Combine is dead and dying – but boy it works. And most delightfully, you can have any number of endpoints pick up and subscribe to a shared publisher, which was perfect for my use case. Top that off with SwiftUI having great support to receive streams of data from Combine, and it was an easy choice.

I ended up using Combine publishers to make a few feeds of data from the PeerToPeerProvider. They share information about what other peers are available, the current state of the listener (that accepts connections) and the browser (that looks for peers), and lastly a publisher that provides information about active peer-to-peer connections. I feel that worked out extremely well. It worked so well that I made an internal publisher (not exposed via the public API) for tests to get events and state updates from within a repository.
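A rough sketch of that pattern (the type and property names here are illustrative, not the library’s public API): the provider side exposes a shared publisher, and any number of SwiftUI views can subscribe to it with onReceive.

import Combine
import Foundation
import SwiftUI

final class PeerFeed {
    // Any number of subscribers can pick up peer updates from this one subject.
    let availablePeers = PassthroughSubject<[String], Never>()
}

struct PeerListView: View {
    let feed: PeerFeed
    @State private var peers: [String] = []

    var body: some View {
        List(peers, id: \.self) { peer in
            Text(peer)
        }
        .onReceive(feed.availablePeers.receive(on: DispatchQueue.main)) { latest in
            peers = latest
        }
    }
}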

Integration Testing

It’s remarkably hard to usefully unit test network providers. Instead of unit testing, I made a separate Swift project for the purposes of running integration tests. It sits in its own directory in the git repository and references automerge-repo-swift as a local dependency. A side effect is that it let me add in all sorts of wacky dependencies that were handy for the integration testing, but that I really didn’t want exposed and transitive for the main package. I wish that Swift packages had a means to identify test-only dependencies that didn’t propagate to other packages for situations like this. Ah well, my solution was a separate sub-project.

Testing using the Combine publisher worked well, although it took a little digging to figure out the correct way to set up and use expectations with async XCTest. It feels a bit exhausting to assemble the expectations and fulfillment calls, but it’s quite possible to get working. If you want to see this in operation, take a look at P2P+explicitConnect.swift. I started to look at potentially using the upcoming swift-testing, but with limited Swift 5.10 support, I decided to hold off for now. If it makes asynchronous testing easier down the road, I may well adopt it quickly after its initial release.

The one quirky place that I ran into with that API setup was that expectation.fulfill() gets cranky with you if you call it more than once. My publisher wasn’t quite so constrained with state updates, so I ended up cobbling together a boolean latch variable in a sink when I didn’t have a sufficiently constrained closure.
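A rough sketch of that latch (with a stand-in publisher and states, not the actual integration test): fulfill the expectation exactly once, even if the publisher delivers the interesting state more than once.

import Combine
import XCTest

final class ConnectStateTests: XCTestCase {
    func testReachesConnectedState() async throws {
        let subject = PassthroughSubject<String, Never>()   // stand-in for the library's publisher
        let connected = expectation(description: "saw a connected state")
        var fulfilled = false                               // the boolean latch

        let cancellable = subject.sink { state in
            if state == "connected", !fulfilled {
                fulfilled = true
                connected.fulfill()
            }
        }

        subject.send("connecting")
        subject.send("connected")
        subject.send("connected")   // a repeated update no longer double-fulfills

        await fulfillment(of: [connected], timeout: 1)
        cancellable.cancel()
    }
}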

The other quirk in integration testing is that while it works beautifully on a local machine, I had trouble getting it to work in CI (using GitHub Actions). Part of the issue is that the current swift test defaults to running all possible tests at once, in parallel. Especially for integration testing of peer-to-peer networking, that meant a lot of network listeners, and browsers, getting shoved together at once on the local network. I wrote a script to list out the tests and run them one at a time. Even breaking it down like that didn’t consistently get through CI. I also tried higher wait times (120 seconds) on the expectations. When I run them locally, most of those tests take about 5 seconds each.

The test that was a real challenge was the cross-platform one. Automerge-repo has a sample sync server (NodeJS, using Automerge through WASM). I created a docker container for it, and my cross-platform integration test pushes and pulls documents to an instance that I can run in Docker. Well… Docker isn’t available for macOS runners, so that’s out for GitHub Actions. I have a script that spins up a local docker instance, and I added a check into the WebSocket network provider test – if it couldn’t find a local instance to work against, it skips the test.

Final Takeaways

Starting with a plan for isolating state made the choices of how and what I used a bit easier, and reaching for global-actor constrained classes made synchronous use of those classes much easier. For me, this mostly played out in better (synchronous) initializers and dealing with collections using functional programming patterns.

I hope there’s some planning/thinking in SwiftUI to update or extend the app structure to accommodate async hooks for things like setup and initialization (FB9221398). That should make it easier for a developer to run an async initializer and verify that it didn’t fail, before continuing into the normal app lifecycle. Likewise, I hope that the document-based APIs gain an async context to work with documents to likewise handle asynchronous tasks (FB12243722). Both of these spots are very awkward places for me.

Once you shift to using asynchronous calls, it can have a ripple effect in your code. If you’re looking at converting existing code, start at the “top” and work down. That helped me make sure there weren’t secondary complications with that choice (such as a need for an async initializer).

Better yet, step back and take the time to identify where mutable state exists. Group it together as best you can, and review how you’re interacting with it, and in what isolation region. In the case of things that need to be available to SwiftUI, you can likely isolate methods appropriately (*cough* MainActor *cough*). Then make the parts you need to pass between isolation domains Sendable. Recognize that in some cases, it may be fine to do the equivalent of “here was the state at some recent moment, if you might want to react to that”. There are several places where I pass back a summary snapshot of mutable state to SwiftUI to use in UI elements.

And do yourself a favor and keep Matt’s Concurrency Recipes on speed-dial.

Before I finished this post, I listened to episode 43 of the Swift Package Index podcast. It’s a great episode, with Holly Borla, compiler geek and manager of the Swift language team, on as a guest to talk about Swift 6. A tidbit she shared was that they are creating a Swift 6 migration guide, to be published on the swift.org website. Something to look forward to, in addition to Matt’s collection of recipes!

Distributed Tracing with Testing on iOS and macOS

This weekend I was frustrated with my debugging, and just not up to digging in and carefully, meticulously analyzing what was happening. So … I took a left turn (at Albuquerque) and decided to explore an older idea to see if it was interesting and/or useful. My challenging debugging was all about network code, for a collaborative, peer-to-peer sharing thing; more about that effort some other time.

A bit of back story

A number of years ago when I was working with a solar energy manufacturer, I was living and breathing events, APIs, and very distributed systems, sometimes running over crap network connections. One of the experiments I did (that worked out extremely well) was to enable distributed tracing across all the software components, collecting and analyzing traces to support integration testing. Distributed tracing, and the now-popular CNCF OpenTelemetry project, weren’t a big thing then, but they were around – kind of getting started. The folks (Yuri Shkuro, at least) at Uber had released Jaeger, an open-source trace collector with web-based visualization, which was enough to get started. I wrote about that work back in 2019 (that post still gets some recurring traffic from search engines, although it’s pretty dated now and not entirely useful).

We spun up our services, enabled tracing, and ran integration tests on the whole system, after which we had the traces available for visual review. It was useful enough that we ended up evolving it so that a single developer could stand up most of their pieces locally (with a sufficiently beefy machine), and capture and view the traces locally. That provided a great feedback loop, as they could see performance and flows in the system while they were developing fixes, updates, and features. I wanted to see, this time with an iOS/macOS focused library, how far I could get trying to replicate that idea (time boxed to the weekend).

The Experiment!

I’ve been loosely following the server-side Swift distributed tracing efforts since they started, and it looked pretty clear that I could use that work directly. Moritz Lang publishes swift-otel, which is a Swift-native library with concurrency support. With his examples, it was super quick to hack into my test setup. The library is set up to run with service-lifecycle pieces over SwiftNIO, so there’s a pile of dependencies that come in with it. I’d be a little hesitant to add that to my library, but for an integration-test setup, I’m totally good with it. There were some quirks to using it with XCTest, most of which I hacked around by shoving the tracer setup into a global actor and exposing an idempotent bootstrap call. With that in place, I added explicit traces into my tests, and then started adding more and more, including into my library, and could see the results in a locally running instance of Jaeger (running Jaeger using Docker).
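The bootstrap shim boils down to something like this sketch (simplified – in the real setup the closure wires up swift-otel’s tracer and the service-lifecycle pieces before handing things to swift-distributed-tracing):

@globalActor actor TracingBootstrap {
    static let shared = TracingBootstrap()
    private var bootstrapped = false

    // Safe to call from every test's setUp - only the first call runs the setup.
    func bootstrapOnce(_ setup: @Sendable () throws -> Void) rethrows {
        guard !bootstrapped else { return }
        try setup()
        bootstrapped = true
    }
}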

Some Results

The following image is an overview of the traces generated by a single test (testCreate):

Overview of the traces generated by the single testCreate test, viewed in Jaeger.

The code I’m working with is all pushing events over web sockets, so inside of the individual spans (which are async closures in my test) I’ve dropped in some span events, one of which is shown in detail below:

Detail of one of the span events recorded inside a test span.

In a lot of respects, this is akin to dropping in os_signposts that you might view in Instruments, but it’s external to Xcode infrastructure. Don’t get me wrong, I love Instruments and what it does – it’s been amazing and really the gold standard in tooling for me for years – but I was curious how far this approach would get me.
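For reference, dropping a span event into one of those test spans looks something like this (using the swift-distributed-tracing API as I remember its shape; the operation and event names are made up):

import Tracing

func exerciseSync() async throws {
    try await withSpan("sync document") { span in
        span.addEvent(SpanEvent(name: "sent sync request"))
        try await Task.sleep(for: .milliseconds(10))   // stand-in for the real test work
        span.addEvent(SpanEvent(name: "received sync response"))
    }
}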

Choices and Challenges

Using something like this in production – with live-running iOS or macOS apps – would be another great end-to-end scenario. More so if the infrastructure your app was working from also used tracing. There’s a separate tracing project at CNCF – OpenTelemetry Swift – that looks oriented towards doing just that. I seriously considered using it, but I didn’t see a way to use that package to instrument my library and not bring in the whole pile of dependencies. With the swift-distributed-tracing library, it’s an easy (and small) dependency add – and you only need to take the hit of the extra dependencies when you want to use the tracing.

And I’ll just “casually” mention that if you pair this with server-side swift efforts, the Hummingbird project has support for distributed tracing currently built in. I expect Vapor support isn’t too far off, and it’s a continued focus to add more distributed tracing support for a number of prevalent server-side swift libraries over this coming summer.

See for Yourself (under construction/YMMV/etc)

I’ve tossed up my hack-job of a wrapper for tracing during testing with iOS and macOS – DistributedTracer, if you want to experiment with this kind of thing yourself. Feel free to use it, although if you’re amazed with the results – ALL credit should go to Moritz, the contributors to his package and the contributors to swift-distributed-tracing, since they did the heavy lifting. The swift-otel library itself is undergoing some major API surface changes – so if you go looking, I worked from the current main branch rather than the latest release. Moritz shared with me that while the API was not completely solid yet, this is more of the pattern he wants to expose for an upcoming 1.0 release.

Onward from here

I might push the DistributedTracer package further in the future. I think there’s real potential there, but it is not without pitfalls. Some of the challenges stem from constantly exporting data from an iOS app, so there’s a privacy (and privacy manifest) bit that needs to be seriously considered. There are also challenges with collecting enough data (but not too much), related choices in sampling so that it aligns with traces generated from infrastructure, as well as how to reliably transfer it from device to an endpoint. Nothing that can’t be overcome, but it’s not a small amount of work either.

Weekend hacking complete, I’m calling this a successful experiment. Okay, now back to actually debugging my library…

Embedding a privacy manifest into an XCFramework

During WWDC 2023, Apple presented a number of developer-impacting privacy updates. One of the updates, introducing the concept of a privacy manifest, has a direct impact on the work I’ve been doing making the CRDT library Automerge available on Apple platforms. The two relevant sessions from WWDC 2023:

  • Get Started with Privacy Manifests (video) (notes)
  • Verify app dependencies with digital signatures (video) (notes)

During the sessions, the presenter shared that somewhere in the coming year (2024) Apple would start requiring privacy manifests in signed XCFrameworks. There was little concrete detail available then, and I’ve been waiting since for more information on how to comply. I expected documentation at least, and was hoping for an update in Xcode – specifically the xcodebuild command – to add an option that accepted a path to a manifest and included it appropriately. So far, nothing from Apple on that front.

About a week ago I decided to use a DTS ticket to get assistance on how to (properly) add a privacy manifest to an XCFramework (and filed feedback: FB13626419). I hope that something is planned to make this easier, or at a minimum to document the process, since it now appears to be an active requirement for new apps submitted to the App Store. I highly doubt we’ll see anything between now and WWDC at this point. With any luck, we’ll see something this June (WWDC 24).

I have a hypothesis that, with the updates to enable signed binary dependencies, there could be “something coming” around a software bill-of-materials manifest. My over-active imagination sees hints of that in what Swift records in Package.resolved, which the proposed new approach to Swift testing seems to be starting to take advantage of. It would make a lot of sense to support better verification and clearer knowledge of what you’re including in your apps, or depending on for your libraries (and that’s extremely useful metadata for testing validation).

In the meantime, if you’re Creating an XCFramework and trying to figure out how to comply with Apple’s requests for embedded privacy manifests, hopefully this article helps you get there. As I mentioned at the top of this post, this is based on my open source work in Automerge-swift. I’m including the library and XCFramework (and showing it off) in a demo application, and I just finished working through the process of getting the archives validated and pushed to App Store Connect (with macOS and iOS deliverables). To be very clear, the person I worked with at DTS was critical to getting this sorted, and super helpful. Without their information I would have been wandering blindly for months trying to get it right. All credit to them for the assistance.

The gist of what needs to be done lines up with Apple’s general platform conventions for placing resources into bundles (detailed at Placing Content in a Bundle). The resource in this case is the file PrivacyInfo.xcprivacy, and the general pattern plays out as:

  • iOS and iOS simulator: place the resource at the root for that platform
  • macOS and Mac Catalyst: place the resource in a directory structure /Versions/A/Resources/

The additional quirk in this case is that with an XCFramework created from platform-specific static libraries, you also need to put that directory structure underneath the directory that acts as the platform signifier. (An example is shown below to illustrate this. I know it’s not super clear; I don’t have the words to correctly describe these layers of the directory structure.)

I do this with a bash script that copies the privacy manifest into the place relevant for each platform target. In the case of automerge-swift, we compile to support iOS, the iOS simulators (on x86 and arm architectures), macOS (on x86 and arm architectures), and Mac Catalyst (on x86 and arm architectures).

Once the files are copied into place, I code sign the bundle:

codesign --timestamp -v --sign "...my developer id..." ${FRAMEWORK_NAME}.xcframework

After that, I compress it down using ditto and compute the SHA256 checksum. That checksum is used as the validation hash for a URL reference in a Package.swift. (If you want to see the scripts, have at it – they’re on GitHub. They split at the end into two variants – one for CI that doesn’t sign, and one for release that does.)
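For illustration, here’s roughly what that URL reference looks like in a Package.swift. The URL and checksum below are placeholders rather than the real Automerge values – the checksum is whatever your SHA256 step (or swift package compute-checksum) reports for the compressed archive:

// swift-tools-version:5.9
import PackageDescription

let package = Package(
    name: "Automerge",
    products: [
        .library(name: "Automerge", targets: ["Automerge"]),
    ],
    targets: [
        .target(name: "Automerge", dependencies: ["automergeFFI"]),
        // The signed, ditto-compressed XCFramework, referenced by URL + checksum.
        .binaryTarget(
            name: "automergeFFI",
            url: "https://example.com/automergeFFI.xcframework.zip",  // placeholder URL
            checksum: "replace-with-the-sha256-checksum-of-the-archive"
        ),
    ]
)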

Seeing the layout of the relevant files in an XCFramework was the most helpful piece in assembling this, so let me share the directory structure of mine. The example below, automergeFFI.xcframework, hopefully shows you the details without flooding you with extraneous files; it skips the header and code-signature-specific files:

automergeFFI.xcframework/
Info.plist
_CodeSignature/

macos-arm64_x86_64/Headers
macos-arm64_x86_64/libuniffi_automerge.a
macos-arm64_x86_64/Versions/A/Resources/PrivacyInfo.xcprivacy

ios-arm64_x86_64-simulator/Headers
ios-arm64_x86_64-simulator/libuniffi_automerge.a
ios-arm64_x86_64-simulator/PrivacyInfo.xcprivacy

ios-arm64_x86_64-maccatalyst/Headers
ios-arm64_x86_64-maccatalyst/libuniffi_automerge.a
ios-arm64_x86_64-maccatalyst/Versions/A/Resources/PrivacyInfo.xcprivacy

ios-arm64/Headers
ios-arm64/libuniffi_automerge.a
ios-arm64/PrivacyInfo.xcprivacy

With this in place, signed and embedded as a normal dependency through Xcode, both the iOS demo app and the macOS demo app passed the pre-flight validation and moved on through to TestFlight.

A week on with a Vision Pro

There are excellent reviews of the Vision Pro “out there”; this post isn’t meant as another. It’s a record of my first experiences, thoughts, and scribbled notes for future me to look back on after a few iterations of the product.

I had been planning on getting a Vision Pro since it was first rumored. I put away funds from contracts and gigs, and when the time came and it was available for order, I still had sticker shock. When I bought one, I didn’t skimp, but I didn’t blow it out either. My goal is to learn this product – how it works and how to work with it – and to write apps that work beautifully on it. When the available-to-developers-only head-strap extension was announced, I grabbed it too. My only prior experience with a headset was an Oculus (now Meta) Quest 2, which was fun and illustrative – but I couldn’t use it for more than a few hours before nausea would start to catch up with me.

Right off, the visual clarity of the Vision Pro blew me away. The displays are mind-bogglingly good, and the 3D effect is instantly crisp and clear. I found myself exploring the nooks and corners of the product that first evening, without a hint of nausea that I’d feared might happen. The two and a half hours of battery life came quickly.

Beyond the stunning visuals, I wanted to really understand and use the interaction model. From the API, I know it supports both indirect and direct interaction using hand-tracking. Most of the examples and interactions I had at the start were “indirect” – meaning that where I looked is where actions would trigger (or not) when I tapped my fingers together. It’s intuitive, easy to get started with very quickly, and (sometimes too) easy to forget it’s a control and accidentally invoke it.

In early window managers on desktop computers, there was a pattern of usage called “focus follows mouse” (which Apple pushed hard to move away from). The idea was that whichever window your mouse cursor was over was where keyboard input was directed. The indirect interaction mode on the Vision Pro is that on steroids, and it takes some getting used to. In several cases, I found myself looking away from a control while wanting to continue using it, with messy results – activating other buttons, and so on.

Most of the apps (even iOS apps “just” running on Vision Pro) worked flawlessly and easily, and refreshingly didn’t feel as out of place as iOS-designed apps feel on an iPad (looking at you, Instagram). One of the most useful visual affordances is a slight sheen that the OS plays over areas that are clearly buttons or targeted controls, which makes for a wonderful feedback loop so that you know you’re looking at the right control. The gaze tracking is astoundingly good – so much better than I thought it would be – but it still needs some space for grace. iOS default distances mostly work, although in a densely packed field of controls I’d want just a touch more space between them myself. After wearing the device for a couple of hours, I’d find the tracking not as crisp and I’d have a bit more error. Apps that eschew accessible buttons for random visuals and tap targets are deeply annoying on the Vision Pro: you get no feedback affordances to let you know whether you’re on target or not. (D&D Beyond… I’ve got to say, you’ve got some WORK to do.)

Targeting actions (or not) gets even more complicated when you’re looking at touchable targets in a web browser. Video players in general are a bit of a tar pit in terms of useful controls and feedback. YouTube’s video player was better than some of the others, but web pages in general were a notable challenge – especially the ones flooded with ads, pop-overs, and shit moving around and “catching your eye”. That term becomes far more literal and relevant when some side movement shifts your gaze and triggers an errant click, and now you’re looking at some *%&$!!# video ad that you want nothing to do with.

In a win for my potential productivity, you can have windows everywhere. The currently-narrowish field of view constrains it: you have to move your head – instead of just glancing – to see some side windows. It’s a huge break from the “do one thing at a time” metaphor that didn’t exist on macOS, pervades iOS, and lives in some level of Dante’s Inferno on iPadOS. I can see a path to being more productive with the visionOS “spatial computer” than I ever would be with an iPad. The real kicker for me (not yet explored) will be text selection – and specifically selecting a subrange of a bit of text. That use case is absolutely dreadful in Safari on iOS. For example, try to select the portion of the URL after the host name in the Safari address bar. That seemingly simple task is a huge linchpin to my ability to work productively.

The weight and battery life of this first product release are definitely suboptimal. Easily survivable for me, but sometimes annoying. Given the outstanding technology that’s packed into this device, it’s not surprising. The headset sometimes feels like it’s slipping down my face, or I need to lift and reset it a bit to make it comfortable. For wearing the device over an hour or so while sitting upright, I definitely prefer to use the over-the-head strap – and I don’t give a shit what my hair looks like.

Speaking of caring what I look like – I despise the “persona” feature and won’t be using it. It’s straight into the gaping canyon of uncanny valley. I went through the process to set one up and took a look at it. I tried to be dispassionate about it, but ultimately fled in horror and don’t want a damn thing to do with it. I don’t even want to deal with FaceTime if that’s the only option. I’d far prefer to use one of those stylized Memoji, or be able to provide my own 3D animation puppet that was mapped to my facial expressions. I can make a more meaningful connection to a stylized image or puppet than I can to the necrotic apparition of the current Persona.

And a weird quirk: I have a very mobile and expressive face, and can raise and lower either eyebrow easily. I use that a lot in my facial expressions. The FaceTime facial expression tracking can’t clue in to that – it’s either both eyebrows or none at all. While I’m impressed it can read anything about my eyebrows while I’m wearing the Vision Pro, that’s a deal killer for representing my facial expressions.

Jumping back to something more positive – in terms of consuming media, the Vision Pro is a killer device right where it is now. The whole space of viewing and watching photos and video is amazing. The panoramas I’ve collected while traveling are everything I hoped for. The immersive 180° videos made me want to learn how to make some of those, and the stereoscopic images and video (smaller field of view, but same gist) are wonderful. It’s a potent upgrade to the clicking wheels of the 3D View-Master from my childhood. Just watching a movie was amazing – either small and convenient to the side, or huge in the field of view, at my control – with a truly impressive “immersive theater” mode that’s really effective. It’s definitely a solo experience in that respect – I can’t share watching a movie cuddled up on the couch – but even with the high price point, the video (and audio) quality of the Vision Pro makes a massive theater out of the tightest cubby. In that respect, the current Vision Pro is a very comparable value to a large home theater.

Add on the environments (I’m digging Mt Hood a lot) – with slightly variable weather and environmental acoustics, day and night transitions – it’s a tremendous break. I’d love to author a few of those. A sort of crazy, dynamic stage/set design problem with a mix of lighting, sounds, supportive visual effects, and the high definition photography to backdrop it all. I was familiar with the concept from the Quest, but the production quality in the Vision Pro is miles ahead, so much more inviting because of that.

I looked at my M1 MacBook Pro, tapped on the connect button, and instantly loved it. The screen on the laptop blanked out, replaced by a much larger, high-resolution floating display above it. I need to rearrange my workspace to really work this angle, as it’s a bit tight for a Vision Pro. Where I work currently, there are overhead pieces nearby that impinge on the upper visual space, prompting warnings and visual intrusions to keep me from hitting anything while I’m looking around. Using the trackpad on the Mac as a pointer within the Vision Pro is effective, and the keyboard is amazing. Without a laptop nearby, I’d need (or want) at least a keyboard connected – the pop-up keyboard can get the job done (using either direct or indirect interaction), but it’s horrible for anything beyond a few words.

I have a PS5 controller that I paired with my iPad for playing games, and later paired with the Mac to navigate the Vision Pro simulator in Xcode. I haven’t paired it with the Vision Pro itself, but that’s something I’d really like to try – especially for a game. For the “immerse you in an amazing world” games that I enjoy, I can imagine the result. With the impressive results of the immersive environments, there’s a “something” there that I’d like to see. Something from Rockstar, Ubisoft, Hello Games, or the Sony or Microsoft studios. No idea if that’ll appear as something streamed from a console, or running locally – but the possibilities are huge if they leverage the high visual production values that the Vision Pro provides. I’m especially curious what Disney and Epic Games might do together – an expansion or side-track from their virtual sets, creating environments and scenes that physically couldn’t otherwise exist – and then interacting within them. I’m sure they’re thinking about the same thing. (Hey, No Man’s Sky – I’m ready over here!)

As a wrap-up, my head’s been flooded with ideas for apps that lean into the capabilities of the Vision Pro. Most are of the “wouldn’t it be cool!” variety; a few are insanely outlandish and would take a huge team of both artists and developers to assemble. Of the ones that aren’t so completely insane, the common theme is the visualization and presentation of information. A large part of my earlier career was more operationally focused: understanding large, distributed systems, managing services running on them, and debugging things when “shit went wrong” (such as a DC bus bar in a data center exploding when a water leak dripped on it and shorted it out, scattering copper droplets everywhere). I believe there’s real potential benefit in seeing information with another dimension added to it, especially when you want to look at what would classically be exposed as a chart, but with values that change over time. There’s a whole crazy world of software debugging and performance analysis, distributed tracing, and correlation with logging and metrics, all of which would benefit from making it easier to quickly identify failures and resolve them.

I really want to push what’s available now in a volume 3D view. That’s the most heavily constrained 3D representation in visionOS today, primarily to keep apps from knowing where you’re gazing, as a matter of privacy. Rendering and updating 3D visualizations in a volume lets you “place” it anywhere nearby, change your position around it, and ideally interact with it to explore the information. I think that’s my first real target to explore.
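As a starting point, here’s a minimal sketch of a visionOS volumetric window hosting a RealityView – the app name, window size, and placeholder sphere are all mine, just to show the shape of the API rather than any real visualization:

import SwiftUI
import RealityKit

@main
struct VisualizationSketchApp: App {
    var body: some Scene {
        // A bounded 3D volume you can place nearby and move around.
        WindowGroup(id: "data-volume") {
            DataVolumeView()
        }
        .windowStyle(.volumetric)
        .defaultSize(width: 0.5, height: 0.5, depth: 0.5, in: .meters)
    }
}

struct DataVolumeView: View {
    var body: some View {
        RealityView { content in
            // Placeholder entity; a real visualization would generate entities
            // from the data being explored and update them over time.
            let sphere = ModelEntity(
                mesh: .generateSphere(radius: 0.05),
                materials: [SimpleMaterial(color: .blue, isMetallic: false)]
            )
            content.add(sphere)
        }
    }
}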

I am curious where the overlap with WebGL will appear, and how that presents within the visionOS spatial repertoire. I haven’t yet explored that avenue, but it’s intriguing, especially for the data visualization use case.

Unicode strings are always harder than you think

I recently released an update to the Swift language bindings for Automerge (0.5.7), which has a couple of great updates. My favorite part of that update was the work to enable WebAssembly compilation support, mostly because I learned an incredible amount about swift-wasm and fixed a few misconceptions that I’d held for unfortunately too long. The other big thing was a fix to how I’d overlaid Swift strings onto Automerge text – an issue that had been standing longer than I realized, in how the bindings deal with Unicode strings.

To that note, let me introduce you to my new best friend for testing this sort of thing:

🇬🇧👨‍👨‍👧‍👦😀

This little string is a gem, in that its character glyphs are composed of varying numbers of Unicode scalars: 2, 7, and 1, respectively. The real issue is that I mistook Automerge’s integer indexing. I originally thought it indexed UTF-8 code units, when in fact it indexes Unicode scalars – and there’s a BIG difference in results when you start trying to delete pieces of a string thinking one is the other.
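A quick illustration of why that string is such a good test case – the different views of a Swift String disagree wildly about how “long” it is:

let tricky = "🇬🇧👨‍👨‍👧‍👦😀"

// Characters (grapheme clusters), Unicode scalars, and UTF-8 code units:
print(tricky.count)                 // 3  – three user-perceived characters
print(tricky.unicodeScalars.count)  // 10 – 2 + 7 + 1 scalars
print(tricky.utf8.count)            // 37 – the bytes needed to encode it as UTF-8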

When the core library updated to 0.5.6, one of the pieces added was an updateText() method, which takes an updated value, computes the differences between the strings, and applies the relevant changes for you – all of which you previously had to compute yourself. I’d been using CollectionDifference in Swift, a lovely part of the standard library, to compute the diffs – but as soon as you hit some of those composed-unicode-scalar characters, it all fell apart. The worst part was that I thought it was working – I even had tests with emoji, I just didn’t know enough (or dig far enough) to pick the right ones to verify things earlier.
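Conceptually, the fix amounts to diffing at the Unicode scalar level so the resulting offsets line up with a scalar-indexed text API – roughly like this sketch (the strings and printout here are illustrative, not the actual Automerge-swift implementation):

let before = "Hello 😀"
let after  = "Hello 🇬🇧😀"

// Diff over Unicode scalars, not Characters, so offsets match an API that
// indexes text by scalar position.
let scalarDiff = Array(after.unicodeScalars).difference(from: Array(before.unicodeScalars))
for change in scalarDiff {
    switch change {
    case let .insert(offset, scalar, _):
        print("insert \(scalar) at scalar offset \(offset)")
    case let .remove(offset, scalar, _):
        print("remove \(scalar) at scalar offset \(offset)")
    }
}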

Fortunately, Miguel (an iOS developer in Spain) pointed out the mistake in a pull request that added tests illustrating the problem, including that lovely little string combination above. Thanks to Miguel’s patience – and pull request – the bugs were highlighted, and I’m very pleased to have them fixed, released, and best of all – a better understanding of unicode scalars and Swift strings.

When you’re working in Swift alone, it’s never been a topic I needed to really know – the APIs do a beautiful job of hiding the details. While it’s sometimes a real pain to manage String.Index and distances between them for selections, the “String as a sequence of Characters” model has really served me well. It becomes an issue when you’re jumping into foreign libraries that don’t have full-Unicode string support built in from the ground up, so I’m glad I’ve learned the details.

Thanks Miguel!

Questions about the data to create LLMs for embeddings

Simon Willison has a fantastic article about using LLM embeddings in his October blog post: Embeddings: What they are and why they matter. The article is great – a perfect introduction – but I’ve been struggling to find the next steps. I’ve been aware of embeddings for a while, and there’s a specific use case I have: full-text, multi-lingual search. Most full-text search (FTS) algorithms embedded in products today are highly language-specific, and pre-date these transformer large-language models built by slamming together massive data collections.
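To make the use case concrete: once an embedding model has turned your documents and your query into vectors, the search side is mostly ranking by similarity. A minimal sketch, with helper names of my own and no ties to any particular model:

// Cosine similarity between two embedding vectors of equal length.
func cosineSimilarity(_ a: [Double], _ b: [Double]) -> Double {
    precondition(a.count == b.count, "embedding vectors must have the same dimension")
    let dot = zip(a, b).reduce(0.0) { $0 + $1.0 * $1.1 }
    let magnitudeA = a.reduce(0.0) { $0 + $1 * $1 }.squareRoot()
    let magnitudeB = b.reduce(0.0) { $0 + $1 * $1 }.squareRoot()
    guard magnitudeA > 0, magnitudeB > 0 else { return 0 }
    return dot / (magnitudeA * magnitudeB)
}

// Rank documents (already embedded) against an embedded query.
func rank(query: [Double], documents: [(id: String, embedding: [Double])]) -> [(id: String, score: Double)] {
    documents
        .map { (id: $0.id, score: cosineSimilarity(query, $0.embedding)) }
        .sorted { $0.score > $1.score }
}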

There are a few out there, a number of them easily available, and from my ad-hoc experiments they’re darned good and could likely fit the bill. Except… for so many of them, there are two key problems I haven’t been able to sort out. The first is where the data that trained the model came from. This is the biggest problem child for me – not because the results aren’t amazing and effective, but because I couldn’t put together even a rudimentary open-source project without having some confidence in the provenance of how everything came together. The last thing I want is for an open-source project to run deeply afoul of copyright claims. From what I’ve found in my research so far, this problem is endemic to LLMs – with OpenAI and others carefully keeping “their data” close to their chest, both because the size is outrageous to catalog and because they want to protect their proprietary secrets. Well, and there’s a ton of pending lawsuits and legal arguments about whether training an LLM can be considered fair use of even clearly copyrighted content.

The second problem is the size of the model and its performance. I can subjectively tell that smaller models perform “less effectively” than larger models, but I’ve yet to come up with any reasonable way to quantify that loss. It’s hard enough to quantify search relevance and rankings – it’s SO subjective that it effectively becomes data-intensive to get a reasonable statistical sample for evaluation. Adding in the variable of differently sized LLMs to use for search embeddings just adds to it.

With the monster models hosted by OpenAI, I suspect that data management for training, and updating, those models will be key going forward. It’s clear enough they’re being trained off “content on the Internet” – but more and more of that content is now being generated by LLMs – both images and text. The _very_ last thing you’d want to do is train an LLM on another LLM’s generated (hallucinated) content; it would, I suspect, seriously dilute the encoded knowledge and data. It seems like carefully curated data sets are the key going forward.

If anyone reading my blog is aware of a “clean data sourced LLM”, even one based only on the English language, I’d love to know about it. Ideally I’d find something multi-lingual, but I think that data collection and curation would be as much (or more) work than any consumption of the data itself – something that would require the resources of an academic institution or corporation, rather than what an individual could pull together. Or at least it feels pretty damn overwhelming to me.