<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hexops' devlog</title><link>https://devlog.hexops.org/</link><description>Recent content on Hexops' devlog</description><generator>Hugo -- gohugo.io</generator><language>en-us</language><lastBuildDate>Thu, 26 Mar 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://devlog.hexops.org/feed.xml" rel="self" type="application/rss+xml"/><item><title>Announcing pkgmirror: self-host your own Zig mirror</title><link>https://devlog.hexops.org/2026/announcing-pkgmirror/</link><pubDate>Thu, 26 Mar 2026 00:00:00 +0000</pubDate><guid>https://devlog.hexops.com/2026/announcing-pkgmirror/</guid><description>&lt;p>&lt;a href="https://code.hexops.org/hexops/pkgmirror">pkgmirror&lt;/a> is an open-source, self-hosted Zig toolchain and package mirror service so your builds no longer rely on third-party services or Microsoft GitHub.&lt;/p>
&lt;p>&lt;a href="https://machengine.org">Mach&lt;/a> now hosts its own pkgmirror at &lt;a href="https://pkg.hexops.org">pkg.hexops.org&lt;/a>.&lt;/p>
&lt;h2 id="features">Features&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Zig toolchain mirroring&lt;/strong> - acts as a caching reverse proxy for Zig toolchain downloads from ziglang.org&lt;/li>
&lt;li>&lt;strong>Package mirroring (optional)&lt;/strong> - instead of writing GitHub URLs in your project&amp;rsquo;s &lt;code>build.zig.zon&lt;/code>, use your own server and have it mirror GitHub, Codeberg, or your project host of choice. No more Microsoft-induced downtime harming your builds.&lt;/li>
&lt;li>&lt;strong>Artifact mirroring (optional)&lt;/strong> - mirror prebuilt binaries and CI artifacts the same exact way.&lt;/li>
&lt;li>&lt;strong>Nominated Zig support&lt;/strong> - support for Mach&amp;rsquo;s &lt;a href="https://machengine.org/docs/nominated-zig/">nominated Zig versions&lt;/a>: something less of a moving target than nightly Zig, but more frequently updated than Zig stable releases.&lt;/li>
&lt;li>&lt;strong>Proactive cache warming&lt;/strong> - automatically fetches all stable and nominated Zig versions, for each OS/arch variant, so that your mirror is ready to go.&lt;/li>
&lt;li>&lt;strong>Automatic LetsEncrypt support&lt;/strong> - Server can invoke acme.sh for you, no reverse proxy needed!&lt;/li>
&lt;li>&lt;strong>Single binary&lt;/strong> - written in Zig, ~9 MB, super fast - no runtime dependencies beyond libc and acme.sh (optional)&lt;/li>
&lt;/ul>
&lt;h2 id="why-run-a-zig-toolchain-mirror">Why run a Zig toolchain mirror?&lt;/h2>
&lt;p>The Zig community maintains a &lt;a href="https://ziglang.org/download/community-mirrors/">list of community mirrors&lt;/a> for toolchain downloads which are used throughout the community.&lt;/p>
&lt;p>For example, &lt;a href="https://codeberg.org/mlugg/setup-zig">setup-zig&lt;/a> is a GitHub Action that installs Zig and caches your build directory between runs; it cycles through the Zig mirror list automatically to reduce load on ziglang.org.&lt;/p>
&lt;p>&lt;a href="https://github.com/marler8997/anyzig">anyzig&lt;/a>, which is probably the nicest way to seamlessly switch between Zig versions in your Zig projects, uses Zig mirrors as well.&lt;/p>
&lt;p>Note: minisig verification is used to ensure you got a genuine ziglang.org binary, and not something malicious - these tools handle that verification for you automatically.&lt;/p>
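&lt;p>If you ever download a toolchain tarball from a mirror by hand, you can run the same check yourself. A rough sketch, assuming the &lt;code>minisign&lt;/code> CLI is installed, that the mirror serves a &lt;code>.minisig&lt;/code> file next to each tarball, and that &lt;code>ZIG_PUBKEY&lt;/code> holds the public key you copied from the ziglang.org download page (the function and variable names are made up for illustration):&lt;/p>
&lt;pre tabindex="0">&lt;code># Sketch: verify a Zig tarball downloaded from any mirror.
# ZIG_PUBKEY is an assumption: set it to the minisign public key
# published on the ziglang.org download page.
verify_zig_tarball() {
  tarball="$1"
  sig="$tarball.minisig"
  if [ ! -f "$tarball" ] || [ ! -f "$sig" ]; then
    echo "missing $tarball or $sig"
    return 2
  fi
  # -V verifies, -m names the file, -x the signature, -P the public key.
  minisign -V -m "$tarball" -x "$sig" -P "$ZIG_PUBKEY"
}

# Usage: verify_zig_tarball path/to/downloaded-zig-tarball.tar.xz
&lt;/code>&lt;/pre>
&lt;p>The function returns non-zero if either file is missing or if &lt;code>minisign&lt;/code> rejects the signature, so it can gate an install script.&lt;/p>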
&lt;p>&lt;a href="https://code.hexops.org/hexops/pkgmirror">pkgmirror&lt;/a> is a Zig toolchain mirror server, written in Zig, so you can host a mirror of your own and contribute!&lt;/p>
&lt;h2 id="zig-package-mirrors">Zig package mirrors&lt;/h2>
&lt;p>&lt;a href="https://code.hexops.org/hexops/pkgmirror">pkgmirror&lt;/a> goes beyond just simple Zig toolchain mirroring and allows mirroring &lt;em>Zig packages&lt;/em> and binary artifacts, too. This is optional.&lt;/p>
&lt;p>In &lt;a href="https://machengine.org/">Mach&lt;/a> for example, it&amp;rsquo;s really important to us that someone can check out an old repository with an old game/app that uses Mach, and still get it building quickly: a key part of this is ensuring that all of Mach&amp;rsquo;s dependencies in &lt;code>build.zig.zon&lt;/code> are hosted through reliable URLs that won&amp;rsquo;t break or disappear over time. Sure, we could link to a GitHub-hosted tarball of one of our libraries - but GitHub repositories get deleted, or Microsoft enshittification of GitHub may occur and &lt;a href="https://code.hexops.org/">we have to move to a self-hosted Forgejo instance, code.hexops.org&lt;/a> (which we did recently) - so we need stable, long-term URLs our dependencies can be hosted at.&lt;/p>
&lt;p>&lt;a href="https://code.hexops.org/hexops/pkgmirror">pkgmirror&lt;/a> allows configuring mirroring of Zig packages (and binary CI artifacts, too) that are hosted on GitHub, Codeberg, etc. - the approach is simple: pkgmirror caches the file on disk for you forever, and gives you a super stable URL to write in your &lt;code>build.zig.zon&lt;/code> file.&lt;/p>
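&lt;p>On the consuming side, pointing a project at a mirror is just a matter of fetching the dependency by its mirror URL. A minimal sketch, assuming a Zig version recent enough to have &lt;code>zig fetch --save&lt;/code>; the host &lt;code>pkg.example.org&lt;/code>, the path layout and the helper name are all hypothetical, so check your own pkgmirror instance for the real URL scheme:&lt;/p>
&lt;pre tabindex="0">&lt;code># Hypothetical mirror base URL; replace with your own pkgmirror host.
MIRROR_BASE="https://pkg.example.org"

# Record a dependency in build.zig.zon using its mirror URL.
# $1 is the path of the tarball on the mirror (the layout is an assumption).
add_mirrored_dep() {
  zig fetch --save "$MIRROR_BASE/$1"
}

# Usage, from inside your project directory:
#   add_mirrored_dep some-package/some-version.tar.gz
&lt;/code>&lt;/pre>
&lt;p>&lt;code>zig fetch --save&lt;/code> writes both the URL and the content hash into &lt;code>build.zig.zon&lt;/code>, so the build fails rather than silently using different bytes if the mirror ever serves something else.&lt;/p>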
&lt;h2 id="not-everyone-should-run-a-mirror">Not everyone should run a mirror&lt;/h2>
&lt;p>Not everyone should run a Zig toolchain or package mirror: it&amp;rsquo;s a serious responsibility, and you need to handle backups and the like, especially if others depend on your mirror being stable.&lt;/p>
&lt;p>But when you do want to host a Zig mirror, &lt;a href="https://code.hexops.org/hexops/pkgmirror">pkgmirror&lt;/a> is now a solid option for that - written in Zig unlike the myriad of other Go-based solutions out there.&lt;/p>
&lt;h2 id="special-thanks">Special thanks&lt;/h2>
&lt;p>&lt;a href="https://code.hexops.org/hexops/pkgmirror">pkgmirror&lt;/a> would not be possible to write in Zig today if not for karlseguin&amp;rsquo;s &lt;a href="https://github.com/karlseguin/http.zig">http.zig&lt;/a> library, and our automatic LetsEncrypt support via acme.sh wouldn&amp;rsquo;t be possible without ianic&amp;rsquo;s excellent &lt;a href="https://github.com/ianic/tls.zig">tls.zig&lt;/a> library providing a TLS 1.3 server.&lt;/p>
&lt;h2 id="thanks">Thanks&lt;/h2>
&lt;ul>
&lt;li>You can now find pkgmirror and Mach itself on our self-hosted Forgejo instance, &lt;a href="https://code.hexops.org">code.hexops.org&lt;/a>&lt;/li>
&lt;li>Join the &lt;a href="https://discord.gg/XNG3NZgCqp">Mach Discord server&lt;/a> to follow development&lt;/li>
&lt;li>Consider &lt;a href="https://github.com/sponsors/emidoots">sponsoring our work&lt;/a> so we can do more of it!&lt;/li>
&lt;/ul></description></item><item><title>Building the DirectX shader compiler better than Microsoft?</title><link>https://devlog.hexops.org/2024/building-the-directx-shader-compiler-better-than-microsoft/</link><pubDate>Fri, 09 Feb 2024 00:00:00 +0000</pubDate><guid>https://devlog.hexops.com/2024/building-the-directx-shader-compiler-better-than-microsoft/</guid><description>&lt;p>This is a &lt;del>story&lt;/del> nightmare about the messy state of Microsoft&amp;rsquo;s DirectX shader compiler, and trying to wrangle it into a nicer experience for game developers. In some respects, we now build the DXC compiler better than how Microsoft does.&lt;/p>
&lt;h2 id="setting-the-stage">Setting the stage&lt;/h2>
&lt;p>For &lt;a href="https://machengine.org">Mach engine&lt;/a> we&amp;rsquo;ve been building an &lt;a href="../mach-v0.3-released/#sysgpu">experimental graphics API called sysgpu&lt;/a> using Zig, aiming to be a &lt;em>successor&lt;/em> and &lt;em>descendant&lt;/em> of WebGPU for native graphics. It will support Metal, Vulkan, Direct3D, and OpenGL backends. As part of this, we need to compile shader programs into something that Direct3D 12 can consume. But what does it consume?&lt;/p>
&lt;h2 id="a-brief-history-lesson">A brief history lesson&lt;/h2>
&lt;p>The DirectX graphics API uses HLSL as its shading language of choice. In the past, with Direct3D 11 and earlier, the HLSL compiler was called &amp;lsquo;FXC&amp;rsquo; (the &amp;lsquo;effects compiler&amp;rsquo;).&lt;/p>
&lt;h3 id="fxc-is-deprecated-dxc-enters-the-scene-with-direct3d-12">FXC is deprecated, DXC enters the scene with Direct3D 12&lt;/h3>
&lt;p>Unfortunately, FXC is notorious among game developers for being slow and for its suboptimal code generation - meaning shaders often both compile slowly and execute slower than they should.&lt;/p>
&lt;p>With the release of Direct3D 12 and Shader Model 6.0 (SM6), Microsoft officially deprecated the FXC compiler distributed as part of the Windows OS in favor of a new compiler called &amp;lsquo;DXC&amp;rsquo; (&amp;lsquo;directx compiler&amp;rsquo;), which exists as a public, Microsoft-official fork of LLVM/Clang v3.7, &lt;a href="https://github.com/microsoft/DirectXShaderCompiler">Microsoft/DirectXShaderCompiler&lt;/a>, along with prebuilt binaries you can download.&lt;/p>
&lt;p>In this Microsoft fork of LLVM, changes are meticulously annotated via &lt;code>// HLSL Change Start&lt;/code> and &lt;code>// HLSL Change End&lt;/code> comments making it clear who owns what code:&lt;/p>
&lt;p>&lt;a class="imglink" href="https://devlog.hexops.org/img/2024/hlsl-change-start">&lt;img src="https://devlog.hexops.org/img/2024/hlsl-change-start.png">&lt;/a>&lt;/p>
&lt;h3 id="what-a-directx-driver-eats-for-breakfast-dxbc-or-dxil">What a DirectX driver eats for breakfast: DXBC or DXIL&lt;/h3>
&lt;p>Although HLSL is the language of choice for Direct3D programming, at the end of the day GPUs under the hood all have different compute architectures and requirements: the compiled binary form of a shader program that an Intel GPU needs is going to be different from what an NVIDIA GPU needs, same goes for AMD.&lt;/p>
&lt;p>Microsoft&amp;rsquo;s role is to provide the nice game developer frontend APIs (like Direct3D, and the HLSL shading language), while working with independent hardware vendors (IHVs) like Intel/AMD/NVIDIA who write the drivers - bridging those nice frontend APIs to whatever is hopefully closest to the hardware manufacturer&amp;rsquo;s instruction set architecture (ISA) under the hood. You can think of it like web browsers making sure JavaScript can run on both a Windows PC and a macOS Apple Silicon device, though graphics developers would spit at the suggested comparison.&lt;/p>
&lt;p>DirectX versions 9-11 had driver manufacturers consuming what is called DXBC (DirectX Byte Code) - game developers would produce DXBC either by compiling their HLSL programs with a CLI tool like &lt;code>fxc.exe&lt;/code>, or at runtime using the &lt;code>d3dcompiler&lt;/code> APIs, and then the driver&amp;rsquo;s job was to take that decently-optimized shader bytecode and turn it into the actual binary that the GPU would run. This bytecode was an undocumented, proprietary format really only shared between Microsoft and GPU driver manufacturers - excluding a few odd-ball Linux developers who cared to reverse engineer it for Proton.&lt;/p>
&lt;p>With the advent of DirectX 12 and Shader Model 6.0, Microsoft aspirationally had intended to create their own standard IR called DXIR, but in 2021 they &lt;a href="https://github.com/microsoft/DirectXShaderCompiler/commit/61c6573842be58a14e1dfc6b1b3def03d39d9988">removed all language suggesting they might do this&lt;/a>. The intent &lt;em>was&lt;/em> for DXIR to be the &amp;lsquo;high level&amp;rsquo;, &amp;lsquo;unoptimized&amp;rsquo; IR form which compilers (think: Rust) could target, and then the DXC compiler could lower DXIR into the optimized DXIL bytecode form, a new &amp;lsquo;low level&amp;rsquo; post-optimization IR format, before handing it off to graphics drivers to muck with as they please before it gets translated to run on the actual hardware.&lt;/p>
&lt;p>Asked about DXIR documentation, a &lt;a href="https://github.com/microsoft/DirectXShaderCompiler/issues/2389#issuecomment-517076643">Microsoft employee&lt;/a> had noted this in 2019:&lt;/p>
&lt;blockquote>
&lt;p>Unfortunately, documentation on the lowering process [from DXIR to DXIL] is mostly non-existent. [&amp;hellip;]&lt;/p>
&lt;p>Oh, and DXIR isn&amp;rsquo;t anything official, but just the first LLVM IR after CodeGen.&lt;/p>
&lt;/blockquote>
&lt;p>As you&amp;rsquo;ll soon see, this theme of &amp;lsquo;there are no docs, just whatever our compiler actually does&amp;rsquo; will become a common pattern.&lt;/p>
&lt;h3 id="dxil">DXIL&lt;/h3>
&lt;p>DXIL (pronunciation?) is the official format that DirectX 12 driver manufacturers consume &lt;em>today&lt;/em>.&lt;/p>
&lt;p>A game developer produces DXIL bytecode using the DXC compiler, which is a fork of LLVM/clang heavily modified to support HLSL compilation, and the DirectX APIs hand that DXIL over to the graphics driver which then converts the IR into their own intermediate languages, performing any secret sauce optimization passes on it, and ultimately boiling down to the actual machine code that will run on the GPU hardware.&lt;/p>
&lt;p>Much like the old bytecode format DXBC which DXIL replaced, it is &lt;em>also&lt;/em> an undocumented bytecode format: specifically, it is LLVM&amp;rsquo;s version 3.7 post-codegen, post-optimization-passes bitcode format. It is undocumented not because nobody wants to document it, but rather because the documentation is literally &amp;lsquo;whatever the Microsoft fork of LLVM v3.7 with all the HLSL changes we made, after CodeGen and optimization passes have occurred, actually emits as LLVM bitcode - plus a small custom container/wrapper file format on top.&amp;rsquo;&lt;/p>
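&lt;p>To make the &amp;lsquo;small custom container&amp;rsquo; part a little more concrete, here is a sketch that dumps a few header fields of a compiled shader blob. The field offsets follow the public &lt;code>DxilContainer.h&lt;/code> header in the DirectXShaderCompiler repository as best as can be told (a &lt;code>DXBC&lt;/code> FourCC, a 16-byte hash digest, a version, the total size, and a part count), so treat them as assumptions, not a spec. The function name is made up, and &lt;code>od&lt;/code> reads integers in host byte order, so this only works on little-endian machines:&lt;/p>
&lt;pre tabindex="0">&lt;code># Sketch: print header fields of a DXIL container file.
# Offsets are assumptions based on DxilContainer.h; od uses host byte
# order, so this assumes a little-endian machine.
dxil_header() {
  f="$1"
  # Bytes 0-3: FourCC, expected to read DXBC.
  magic=$(od -An -c -N4 "$f" | tr -d ' \n')
  # Bytes 4-19: hash digest field, printed as hex.
  digest=$(od -An -tx1 -j4 -N16 "$f" | tr -d ' \n')
  # Bytes 24-27: container size; bytes 28-31: number of parts.
  size=$(od -An -tu4 -j24 -N4 "$f" | tr -d ' \n')
  parts=$(od -An -tu4 -j28 -N4 "$f" | tr -d ' \n')
  echo "magic=$magic digest=$digest size=$size parts=$parts"
}

# Usage: dxil_header shader.dxil
&lt;/code>&lt;/pre>
&lt;p>Everything interesting (the LLVM bitcode itself, signatures, reflection data) lives in the parts that follow this header, which this sketch does not walk.&lt;/p>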
&lt;h3 id="correcting-the-microsoft-fork-of-llvm">Correcting the Microsoft fork of LLVM&lt;/h3>
&lt;p>Microsoft themselves are well aware that a bunch of independent driver manufacturers relying on and expecting to consume a hyper-specific undocumented LLVM bitcode format specifically produced by their fork is, well, less than ideal - and also aware that their fork of LLVM is not super fun to maintain, either. Quoting &lt;a href="https://github.com/microsoft/DirectXShaderCompiler/issues/5773#issuecomment-1735794551">another Microsoft employee&lt;/a> (Sep 2023) who was asked about the potential of adding DirectX 9/10/11 support to the new/better DXC compiler, they stated:&lt;/p>
&lt;blockquote>
&lt;p>DXC&amp;rsquo;s fork of LLVM removed and/or damaged much of the code generation layer and infrastructure [of LLVM]. Given that, supporting DXBC generation in DXC would be a massive task to fix and restore broken LLVM functionality. Due to the large scale of this issue and resource constraints on our team we&amp;rsquo;re not going to address this issue in [the new] DXC [compiler] ever.&lt;/p>
&lt;p>We may support DXBC generation in Clang in the future (we mentioned that in the original proposal to LLVM). That work is unlikely to begin for a few years as our focus will be on supporting DXIL and SPIR-V generation first.&lt;/p>
&lt;/blockquote>
&lt;p>As noted above, in March of 2022, Microsoft had proposed and begun work on &lt;a href="https://discourse.llvm.org/t/rfc-adding-hlsl-and-directx-support-to-clang-llvm/60783">upstreaming HLSL compilation support directly into LLVM/clang proper&lt;/a> - work that is still ongoing today - and involved &lt;em>adding back&lt;/em> legacy LLVM v3.7 bitcode writing support to modern LLVM/clang versions:&lt;/p>
&lt;blockquote>
&lt;p>By isolating as much of the DXIL-specific code as possible into a target we hope to minimize the cost on the community to maintain our legacy bitcode writing support.&lt;/p>
&lt;/blockquote>
&lt;p>i.e. the plan to get away from their fork is to upstream HLSL and DXIL support to LLVM/clang proper.&lt;/p>
&lt;h2 id="the-challenge-for-gamedevs-webgpu-etc">The challenge for gamedevs, WebGPU, etc.&lt;/h2>
&lt;p>Graphics abstraction layers that aim to provide a unified interface to modern graphics APIs like Metal, Direct3D 12, and Vulkan ultimately need to provide a unified shading language as well. If you look today, you&amp;rsquo;ll find most WebGPU implementations which do this have had a goal of &amp;lsquo;in the future we might be able to emit DXIL directly&amp;rsquo;, but in practice none actually do.&lt;/p>
&lt;p>Instead, basically every WebGPU implementation today behaves as follows:&lt;/p>
&lt;ul>
&lt;li>The WGSL textual language first gets translated to HLSL at runtime&lt;/li>
&lt;li>HLSL is compiled into DXBC or DXIL using an HLSL compiler&lt;/li>
&lt;li>The optimized DXBC/DXIL is handed to the graphics driver, which converts it to the various vendor-specific ILs before it finally becomes machine code that runs on the GPU.&lt;/li>
&lt;/ul>
&lt;h3 id="a-quick-detour-spir-v">A quick detour: SPIR-V&lt;/h3>
&lt;p>Vulkan/SPIR-V does much the same as the above. In fact, most drivers cannot assume SPIR-V is optimized at all - though some do, and this varies by mobile/desktop GPUs - and so have more work to perform to get SPIR-V turned into a &lt;em>driver-compiled&lt;/em> native binary.&lt;/p>
&lt;p>Valve has &lt;a href="https://github.com/ValveSoftware/Fossilize">Fossilize&lt;/a> and maintains caches of each specific (GPU, driver version, etc.) pairing along with the &lt;em>actual&lt;/em> driver-compiled binary for a SPIR-V blob, to enable downloading &amp;lsquo;pre-cached shaders&amp;rsquo; from Valve servers ahead of playing games for this reason: so that you don&amp;rsquo;t spend all day waiting for your computer to go brrr compiling and optimizing SPIR-V shaders into actual native code your GPU understands.&lt;/p>
&lt;p>In other words, DXIL is always post-optimization-passes LLVM &lt;em>bitcode&lt;/em>, while SPIR-V may or may not be in an optimized form, and GPU manufacturers write their drivers based on what SPIR-V looks like in the wild - which may or may not be a pre-optimized form. SPIR-V is closer to hardware than a textual shading language, but still very far from native machine code a GPU understands.&lt;/p>
&lt;p>Only Apple&amp;rsquo;s Metal graphics API supports compiling directly to the actual target hardware&amp;rsquo;s native binary format (thanks to that iron fist they hold over their hardware, I guess.)&lt;/p>
&lt;h2 id="to-use-dxcompilerdll-or-not">To use dxcompiler.dll or not?&lt;/h2>
&lt;p>Since WGSL-&amp;gt;HLSL-&amp;gt;DXIL is happening at runtime, WebGPU runtimes are faced with a challenge: do we use the new DXC HLSL compiler, or the old, officially deprecated FXC compiler which has worse performance and codegen quality? On the surface, this hardly sounds like a difficult choice!&lt;/p>
&lt;p>However, despite this, many indie devs and game engines choose to use FXC by default. &lt;a href="https://docs.rs/bevy/latest/i686-pc-windows-msvc/bevy/render/settings/enum.Dx12Compiler.html">Bevy game engine&amp;rsquo;s documentation&lt;/a> puts it really well:&lt;/p>
&lt;blockquote>
&lt;p>The Fxc compiler (default) is old, slow and unmaintained. However, it doesn’t require any additional .dlls to be shipped with the application.&lt;/p>
&lt;p>The Dxc compiler is new, fast and maintained. However, it requires both &lt;code>dxcompiler.dll&lt;/code> and &lt;code>dxil.dll&lt;/code> to be shipped with the application. These files can be downloaded from &lt;a href="https://github.com/microsoft/DirectXShaderCompiler/releases">https://github.com/microsoft/DirectXShaderCompiler/releases&lt;/a>.&lt;/p>
&lt;/blockquote>
&lt;p>As a result, much software defaults to the old, slow and unmaintained compiler. And it&amp;rsquo;s not just Bevy: &lt;code>wgpu&lt;/code> Rust users, Dawn WebGPU users, etc. are all faced with this same question. It&amp;rsquo;s likely one of the reasons WebGPU does not support Shader Model 6.0+ functionality today - using the DXC compiler is not so pleasant: it is after all a large, clunky Microsoft fork of a C++ codebase from nearly a decade ago!&lt;/p>
&lt;h2 id="well-why-not-just-statically-link-against-it">Well, why not just statically link against it?&lt;/h2>
&lt;p>You can&amp;rsquo;t.&lt;/p>
&lt;p>Firstly, there is the issue that &lt;a href="https://github.com/microsoft/DirectXShaderCompiler/issues/4766">Microsoft&amp;rsquo;s fork of LLVM doesn&amp;rsquo;t support statically linking&lt;/a>. On the surface, this appears just to be due to some cmake files assuming &lt;code>SHARED&lt;/code> instead of &lt;code>STATIC&lt;/code> when creating libraries, but if you dig into it - as I did - you&amp;rsquo;ll soon find it is &lt;em>much&lt;/em> more involved than that.&lt;/p>
&lt;p>Switching &lt;code>SHARED&lt;/code> to &lt;code>STATIC&lt;/code> everywhere in CMake files will appear to get you a build with ~15 different static libraries to link against (not pleasant compared to just one.) You might think using cmake &lt;code>OBJECT&lt;/code> libraries could solve this, but with this you will quickly encounter an issue where although the cmake files are structured logically as dependants, they actually have implicit dependencies on each other due to the HLSL changes Microsoft made. I am 80% sure you would need to rewrite every cmake file in the repository to support OBJECT libraries. I can say this, because I tried!&lt;/p>
&lt;p>You might be thinking, linking against ~15 static libraries isn&amp;rsquo;t SO bad as long as the final executable is static, right?&lt;/p>
&lt;p>Not so fast - many parts of DXC&amp;rsquo;s COM interface implementation are also explicitly designed to load themselves as a DLL, i.e. to load &lt;code>dxcompiler.dll&lt;/code> and &lt;code>dxil.dll&lt;/code> as dynamic libraries and self-invoke methods.&lt;/p>
&lt;p>OK, we just need to patch the implementation to not call &lt;code>LoadLibraryW&lt;/code> then, basically, right?&lt;/p>
&lt;h2 id="introducing-dxildll---the-proprietary-code-signing-blob-for-directx-shaders">Introducing dxil.dll - the proprietary code signing blob for DirectX shaders&lt;/h2>
&lt;p>If you&amp;rsquo;ve ever built DirectXShaderCompiler from source, you might notice something: dxil.dll doesn&amp;rsquo;t get built. Why? It&amp;rsquo;s distributed in every release on GitHub, both for Windows (x86/arm) and Linux (x86 only).&lt;/p>
&lt;p>Strange, I thought the compiler was supposed to be open source? Well, it wouldn&amp;rsquo;t be the first time&lt;a href="https://github.com/microsoft/win32metadata/issues/766#issuecomment-1150271300">[0]&lt;/a>&lt;a href="https://github.com/microsoft/Azure-Kinect-Sensor-SDK/issues/1521">[1]&lt;/a> I&amp;rsquo;ve encountered a Microsoft &amp;lsquo;open source&amp;rsquo; repository that actually completely depends on some proprietary platform-specific code blobs behind the scenes.&lt;/p>
&lt;p>Incidentally, I stumbled across the &lt;a href="https://microsoft.github.io/DirectX-Specs/d3d/ShaderCache.html">D3D12 Shader Cache API specification&lt;/a> which mentions the existence of this proprietary code signing blob as a &amp;lsquo;good reason for invoking the shader compiler at runtime&amp;rsquo;:&lt;/p>
&lt;blockquote>
&lt;p>D3D12 will only accept signed shaders. That means that if any patching or runtime optimizations are performed, such as constant folding, the shader must be re-validated and re-signed, which is non-trivial.&lt;/p>
&lt;/blockquote>
&lt;p>And in the recent &lt;a href="https://github.com/microsoft/DirectXShaderCompiler/releases/tag/v1.8.2306-preview">&amp;lsquo;preview release&amp;rsquo; for Shader Model 6.8 functionality&lt;/a>, Microsoft notes how they appear to leverage this DLL to restrict new experimental shader functionality:&lt;/p>
&lt;blockquote>
&lt;p>The DXIL signing library (dxil.dll/libdxil.so) is not provided with this preview release. DXIL generated with this compiler targeting Shader Model 6.8 is not final, cannot be validated, and is not supported for distribution or execution on machines not running in developer mode.&lt;/p>
&lt;/blockquote>
&lt;p>In other words: if you do not have dxil.dll, then your shaders will not be signed/validated. If your shaders are not signed/validated, then they cannot run on a Windows machine unless it is running in Developer Mode.&lt;/p>
&lt;h2 id="platform-support-challenges">Platform support challenges&lt;/h2>
&lt;p>For a second, I&amp;rsquo;d like to go back to something I wrote at the start of this article:&lt;/p>
&lt;blockquote>
&lt;p>For &lt;a href="https://machengine.org">Mach engine&lt;/a> [&amp;hellip;] we need to compile shader programs into something that Direct3D 12 can consume.&lt;/p>
&lt;/blockquote>
&lt;p>I&amp;rsquo;d like for us to be able to perform offline shader compilation, and skip out on distributing the heavy DXC dependency, when desired.&lt;/p>
&lt;p>But Microsoft only distributes a copy of dxil.dll for Windows (x86/arm) and Linux (x86). There&amp;rsquo;s no Linux aarch64 binary. There&amp;rsquo;s no macOS binary. In other words, you can&amp;rsquo;t produce builds of your cross-platform game for Windows using offline shader compilation on a Mac, or in your Arm Linux CI pipeline. You need a Windows or x86_64 Linux machine to run the proprietary blob.&lt;/p>
&lt;h2 id="recap">Recap&lt;/h2>
&lt;p>To recap:&lt;/p>
&lt;ul>
&lt;li>We cannot build DXC as a static library, because the decade-old Microsoft fork of LLVM v3.7 has a very messy build system.&lt;/li>
&lt;li>Even if we could, we cannot build DXC as a static library &lt;strong>because of the proprietary code-signing blob&lt;/strong>.&lt;/li>
&lt;li>We cannot compile DirectX HLSL shaders offline on a Mac, or build our cross-platform game in an Arm Linux CI pipeline, because Microsoft doesn&amp;rsquo;t distribute copies of &lt;strong>the proprietary code signing blob&lt;/strong> for those platforms.&lt;/li>
&lt;/ul>
&lt;h2 id="going-deeper">Going deeper&lt;/h2>
&lt;h3 id="un-the-build-system">Un#$@&amp;amp;%*! the build system&lt;/h3>
&lt;p>The first problem I wanted to address was how to actually build this codebase into a single static library.&lt;/p>
&lt;p>After several days of attempting to fix the implicit dependencies that changing the cmake virtual libraries from &lt;code>SHARED&lt;/code> -&amp;gt; &lt;code>OBJECT&lt;/code> surfaces, I gave up. Originally, my intent was to use their existing cmake build system (so as to not diverge from their codebase too much) and just swap the compiler with &lt;code>zig cc&lt;/code> as the build toolchain for cross-compilation.&lt;/p>
&lt;p>After it slowly and painfully became apparent that direction was not going to be &lt;em>any&lt;/em> better than maintaining the entire build system myself, I decided to just bite the bullet and rewrite the entire CMake build system they had, some ~10.5k lines of code, using &lt;code>build.zig&lt;/code> instead. To make things simpler, I chose to build only the two parts we (and others) really care about as consumers of the code: the &lt;code>dxcompiler.dll&lt;/code> library, and &lt;code>dxc.exe&lt;/code> binary for offline compilation / testing. (We&amp;rsquo;ll deal with &lt;code>dxil.dll&lt;/code> later.)&lt;/p>
&lt;p>This resulted in somewhere around &lt;a href="https://github.com/hexops/mach-dxcompiler/blob/bd0cfbe4230133d8d3b50eedf1a0d0c4a00f47d7/build.zig#L1-L956">~1k lines of build.zig logic&lt;/a>, and in practice it&amp;rsquo;s less than that because much of it is just related to running &lt;code>git clone&lt;/code> on the source repository, having the ability for Zig package consumers to use a prebuilt binary instead of building the large C++ library from source, and header/source generation (though we&amp;rsquo;re still not done with that, thanks to llvm-tblgen).&lt;/p>
&lt;h3 id="un-the-dynamic-library-dependency">Un#$@&amp;amp;%*! the dynamic library dependency&lt;/h3>
&lt;p>As mentioned earlier, DXC is written with the expectation that &lt;code>dxcompiler.dll&lt;/code> and &lt;code>dxil.dll&lt;/code> exist. Reading the code, it almost appears as if the COM API implementation invokes the DLL, which then invokes itself dynamically depending on which is available.&lt;/p>
&lt;p>Taking some advice from Microsoft, I got my hands dirty, &lt;em>forked their codebase&lt;/em> and got to work on the actual C++ code. I began annotating my changes with cute &lt;code>// Mach change start&lt;/code> and &lt;code>// Mach change end&lt;/code> comments, to know who owns what code. All of this existing as a choice that I hope will come back to haunt my dreams in the future as much as Microsoft&amp;rsquo;s own choice to underemploy the HLSL team and fork LLVM 3.7 originally.&lt;/p>
&lt;p>I was off to the races: &lt;a href="https://github.com/hexops/DirectXShaderCompiler/blob/4190bb0c90d374c6b4d0b0f2c7b45b604eda24b6/tools/clang/tools/dxcompiler/DXCompiler.cpp#L88">simulating dllmain&lt;/a> entrypoints, &lt;a href="https://github.com/hexops/DirectXShaderCompiler/blob/4190bb0c90d374c6b4d0b0f2c7b45b604eda24b6/tools/clang/tools/dxclib/dxc.cpp#L1258">disabling&lt;/a> the ability to print the compiler version info derived from the dlls, and &lt;a href="https://github.com/hexops/DirectXShaderCompiler/blob/4190bb0c90d374c6b4d0b0f2c7b45b604eda24b6/include/dxc/Support/dxcapi.use.h#L17">emulating dynamic library function pointer loads&lt;/a>.&lt;/p>
&lt;h3 id="un-the-proprietary-code-signing">Un#$@&amp;amp;%*! the proprietary code signing&lt;/h3>
&lt;p>All that was left was that pesky &lt;code>dxil.dll&lt;/code> - what sort of magic might Microsoft be employing in that library to &amp;ldquo;sign shaders&amp;rdquo;? How can they prevent unsigned shaders from running on Windows machines that aren&amp;rsquo;t in developer mode? How are they able to distribute that binary on Linux, too?&lt;/p>
&lt;p>I won&amp;rsquo;t comment on any of those questions, but will say that &lt;a href="https://github.com/hexops/DirectXShaderCompiler/blob/4190bb0c90d374c6b4d0b0f2c7b45b604eda24b6/tools/clang/tools/dxcompiler/MachSiegbertVogtDXCSA.cpp#L178">you&amp;rsquo;ll find dxil.dll is NOT a dependency of mach-dxcompiler in any form&lt;/a>. You can compile an HLSL shader on a macOS machine using mach-dxcompiler, without the proprietary &lt;code>dxil.dll&lt;/code> blob - and end up with a DXIL bytecode file that is byte-for-byte equal to one produced on a standard Windows box. Enjoy!&lt;/p>
&lt;h2 id="results">Results&lt;/h2>
&lt;p>We now have prebuilt, static binaries of the &lt;code>dxcompiler&lt;/code> library, as well as the &lt;code>dxc&lt;/code> CLI &lt;a href="https://github.com/hexops/mach-dxcompiler/releases/tag/2024.02.10%2B2c3635c.1">here&lt;/a>, with zero dependency on the proprietary &lt;code>dxil.dll&lt;/code>. At the time of writing, we have binaries building in our CI pipeline for:&lt;/p>
&lt;ul>
&lt;li>macOS (the first ever in history), both Apple Silicon (aarch64) and Intel (x86_64).&lt;/li>
&lt;li>Linux, including musl and glibc, as well as aarch64 (first ever in history) and x86_64.&lt;/li>
&lt;li>Windows, x86_64 and aarch64, including for MinGW/GNU ABI (first ever in history?)&lt;/li>
&lt;/ul>
&lt;p>Additionally included is a &lt;a href="https://github.com/hexops/mach-dxcompiler/blob/main/src/mach_dxc.h">small C API&lt;/a> the library now exposes, as an alternative to the COM API traditionally required.&lt;/p>
&lt;p>Zig game developers will find the repository also includes a Zig API, see &lt;a href="https://github.com/hexops/mach-dxcompiler/blob/main/src/main.zig">&lt;code>src/main.zig&lt;/code>&lt;/a> tests for usage. By default prebuilt binaries are downloaded/used.&lt;/p>
&lt;p>You can &lt;a href="https://github.com/hexops/mach-dxcompiler">build from source yourself&lt;/a> for any OS/arch with only &lt;code>zig&lt;/code> and &lt;code>git&lt;/code>, just make sure you have &lt;a href="https://machengine.org/about/zig-version/">the right Zig version&lt;/a>:&lt;/p>
&lt;pre tabindex="0">&lt;code>git clone https://github.com/hexops/mach-dxcompiler
cd mach-dxcompiler/
zig build -Dfrom_source -Dtarget=aarch64-macos
zig build -Dfrom_source -Dtarget=x86_64-windows-gnu
zig build -Dfrom_source -Dtarget=x86_64-linux-gnu
&lt;/code>&lt;/pre>&lt;h2 id="caveats">Caveats&lt;/h2>
&lt;p>It&amp;rsquo;s not all roses - there are some drawbacks:&lt;/p>
&lt;ul>
&lt;li>Windows MSVC ABI binaries are currently not building due to a small bug in the C bindings - will fix it quickly if important for you, otherwise at our own pace.&lt;/li>
&lt;li>Linux musl binaries are untested; they build fine, and I&amp;rsquo;d be curious to know if they run fine!&lt;/li>
&lt;li>With Mach engine, we plan to use Zig itself as our shading language, not HLSL, so I do not build SPIRV-output support, sorry! I have no plans to add it.&lt;/li>
&lt;li>No plans to update this to support SM6.7 currently (released very recently), though perhaps in the future.&lt;/li>
&lt;li>LLVM&amp;rsquo;s cmake build system is not trivial, there are some aspects yet-to-be-translated. See &lt;code>generated-include/&lt;/code> for specifics which come from the cmake build system still.&lt;/li>
&lt;li>If you use this, you&amp;rsquo;ll be relying on myself to fix/address any issues. I am the only person working on this, and it exists solely to solve Mach&amp;rsquo;s own problems. If it works for you, great - but there may be a time we find a better path forward for us and it could get deprecated, so keep that in mind.&lt;/li>
&lt;/ul>
&lt;h2 id="on-a-personal-note">On a personal note&lt;/h2>
&lt;p>My name is Emi, I work a normal tech job, and after signing off from work at the end of the day I go online to build &lt;a href="https://machengine.org/">Mach engine&lt;/a>. I&amp;rsquo;ve been dreaming of being able to build a game engine like this for a long time, and I&amp;rsquo;m finally doing it!&lt;/p>
&lt;p>FOSS &lt;a href="https://devlog.hexops.org/2021/increasing-my-contribution-to-zig-to-200-a-month#i-grew-up-playing-linux-games-like-mania-drive">is in my roots&lt;/a>, I believe we should own our tools, they should empower &lt;em>us&lt;/em>-not be part of &lt;a href="https://kristoff.it/blog/the-open-source-game/">the &amp;lsquo;open source&amp;rsquo; game&lt;/a> which is all too prevalent today (even among &amp;lsquo;open source&amp;rsquo; engines.) I &lt;em>need&lt;/em> Mach to genuinely be &lt;a href="https://softwareyoucan.love">software you can love&lt;/a>.&lt;/p>
&lt;p>My dream is one day to live a simple, modest, life earning a living building Mach for everyone and selling high-quality games. Please consider &lt;a href="https://github.com/sponsors/emidoots">sponsoring my work&lt;/a> if you believe in my vision. It means the world to me!&lt;/p>
&lt;h2 id="thanks-for-reading">Thanks for reading&lt;/h2>
&lt;div style="display: flex; flex-direction: row; align-items: center;">
&lt;img align="left" style="max-height: 12.5rem;" src="https://devlog.hexops.org/img/2024/building-the-directx-shader-compiler-better-than-microsoft/img1.png">&lt;/img>
&lt;ul>
&lt;li>Check out &lt;a href="https://machengine.org">machengine.org&lt;/a>&lt;/li>
&lt;li>Consider &lt;a href="https://github.com/sponsors/emidoots">sponsoring development&lt;/a> so we can do more of it!&lt;/li>
&lt;li>Join the &lt;a href="https://discord.gg/XNG3NZgCqp">Mach Discord server&lt;/a>&lt;/li>
&lt;/ul>
&lt;/div></description></item><item><title>Mach v0.3 released - Zig game engine &amp; graphics toolkit</title><link>https://devlog.hexops.org/2024/mach-v0.3-released/</link><pubDate>Fri, 02 Feb 2024 00:00:00 +0000</pubDate><guid>https://devlog.hexops.com/2024/mach-v0.3-released/</guid><description>&lt;p>Mach is a Zig game engine &amp;amp; graphics toolkit for building high-performance &amp;amp; modular games, visualizations, and desktop/mobile apps. &lt;a href="https://machengine.org/">Learn more&lt;/a>&lt;/p>
&lt;p>We are working towards Mach 1, and have just released v0.3 which includes &lt;a href="https://github.com/hexops/mach/milestone/5?closed=1">6 months of work&lt;/a> - here are the highlights!&lt;/p>
&lt;h2 id="coming-soon-intro-to-2d-gamedev-workshop">Coming soon: intro to 2D gamedev workshop&lt;/h2>
&lt;p>The first-ever &lt;strong>intro to 2D gamedev workshop using Mach&lt;/strong> will be hosted at the &lt;a href="https://sycl.it/">Software You Can Love&lt;/a> conference in Milan, Italy, May 14-17. The workshop will use Mach&amp;rsquo;s currently in-development higher level 2D graphics APIs.&lt;/p>
&lt;p>&lt;a class="imglink" href="https://sycl.it/">&lt;img src="https://devlog.hexops.org/img/2024/sycl-workshop.png">&lt;/a>&lt;/p>
&lt;p>If you&amp;rsquo;re interested in Zig or Mach, then &lt;a href="https://sycl.it/agenda/workshops/intro-to-2d-gamedev/">check out the SYCL conference&lt;/a>! It&amp;rsquo;s an amazing experience, a great opportunity to meet a ton of Zig community members, core team members, as well as enjoy some of the best food that Italy has to offer!&lt;/p>
&lt;h2 id="community-highlight-pixi-and-scoopems">Community highlight: Pixi and Scoop&amp;rsquo;ems&lt;/h2>
&lt;p>&lt;a href="https://github.com/foxnne">@foxxne&lt;/a> is an early adopter of &lt;a href="https://machengine.org/core/">Mach core&lt;/a>, largely pushing it to its limits. They make use of Mach&amp;rsquo;s new experimental sysgpu graphics API (which we intend to be a successor/descendant of WebGPU), as well as other libraries like flecs and dear-imgui. They&amp;rsquo;re developing &lt;a href="https://github.com/foxnne/pixi">Pixi&lt;/a> - a pixel art editor:&lt;/p>
&lt;p>&lt;a class="imglink" href="https://devlog.hexops.org/img/2024/pixi1.png">&lt;img src="https://devlog.hexops.org/img/2024/pixi1.png">&lt;/a>
&lt;a class="imglink" href="https://devlog.hexops.org/img/2024/pixi2.png">&lt;img src="https://devlog.hexops.org/img/2024/pixi2.png">&lt;/a>&lt;/p>
&lt;p>And have used it to make games like &lt;a href="https://github.com/foxnne/scoop-ems">Scoop&amp;rsquo;ems&lt;/a>:&lt;/p>
&lt;video loop controls height="4000px">
&lt;source src="https://media.machengine.org/showcase/scoopems.mp4" type="video/mp4">
&lt;/video>
&lt;p>&lt;a href="https://github.com/sponsors/foxnne">@foxxne&lt;/a> is making awesome tools and games in Zig, pushing things to their limits, I encourage watching &lt;a href="https://www.youtube.com/watch?v=7K9Vzcr7vJg">how humble Colton is when speaking about their work&lt;/a>. We are very excited to make Mach support foxxne&amp;rsquo;s projects better in the future, and enable others to build things like this too.&lt;/p>
&lt;p>Please consider &lt;a href="https://github.com/sponsors/foxnne">sponsoring their work on GitHub&lt;/a> - you could be their second-ever sponsor!&lt;/p>
&lt;h2 id="mach-core">Mach core&lt;/h2>
&lt;p>&lt;a href="https://machengine.org/core/">Mach core&lt;/a> aims to provide just a window, input, and truly cross-platform graphics API.&lt;/p>
&lt;p>We think of it as an alternative/competitor to the classic options of SDL+OpenGL, GLFW+Vulkan, etc. Today, it&amp;rsquo;s not quite there yet - it uses GLFW behind the scenes for desktop support, and WebGPU as its graphics API, but we&amp;rsquo;re actively working on making it a genuine competitor written in Zig.&lt;/p>
&lt;p>In this release, it saw general bug fixes - as well as some &lt;a href="https://github.com/hexops/libmach">libmach&lt;/a> development - which aims to provide a C API to both Mach core and engine APIs.&lt;/p>
&lt;h2 id="sysgpu">sysgpu&lt;/h2>
&lt;p>In Mach v0.2, we announced an experiment - that we were working on a WebGPU implementation written in Zig, as an alternative to using Dawn (Google Chrome&amp;rsquo;s WebGPU implementation.) In the past 6 months, this experiment saw an immense amount of development and exceeded our expectations!&lt;/p>
&lt;a class="imglink" href="https://machengine.org/pkg/mach-sysgpu">
&lt;picture>&lt;source media="(prefers-color-scheme: dark)" srcset="https://machengine.org/assets/mach/sysgpu-dark.svg">&lt;img alt="mach-sysgpu" src="https://machengine.org/assets/mach/sysgpu-light.svg" style="height:7rem;margin-top:1rem">&lt;/picture>
&lt;/a>
&lt;p>&lt;a href="https://machengine.org/pkg/mach-sysgpu/">sysgpu&lt;/a> today is a nearly fully-functional WebGPU native implementation (minus browser-level safety checks), thanks to &lt;a href="https://github.com/hexops/mach-sysgpu/graphs/contributors">two amazing contributors&lt;/a>. It has functional D3D12, Vulkan, Metal, and OpenGL backends. It has it&amp;rsquo;s own WGSL shader compiler, and nearly all mach-core examples are runnable using it. We&amp;rsquo;ve even seen real applications (the Pixi pixel editor from foxxne, for example) begin to adopt it.&lt;/p>
&lt;p>As we continued development of it over the past six months, we identified key design tradeoffs where we could differ from WebGPU&amp;rsquo;s API choices and gain a faster, more modern, featureful graphics API. As a result, we&amp;rsquo;ve come to view sysgpu as a leaner and meaner &lt;em>successor and descendant of&lt;/em> WebGPU for native graphics, rather than just another implementation of it. As a result, it builds on the back of WebGPU&amp;rsquo;s design choices, but ultimately has its own distinct API and will not be ABI-compatible.&lt;/p>
&lt;p>We have plans to alleviate some &lt;em>major&lt;/em> pain points of WebGPU, specifically around pipeline creation / descriptor boilerplate, supporting push constants when available via a better API design (not as an extension), a more integrated/seamless approach to binding resources to shaders with type-correctness, and more.&lt;/p>
&lt;p>We are also evaluating using Zig itself as the shading language, instead of WGSL, and are looking to enable fully offline shader compilation as an optional feature.&lt;/p>
&lt;p>sysgpu is still under heavy development; in particular, none of the &amp;lsquo;successor/descendant&amp;rsquo; API design choices noted above have been implemented yet. It is disabled in the v0.3 release by default, and after this release we plan to invest more aggressively in it - so expect more details and specifics to come soon.&lt;/p>
&lt;h2 id="sysaudio">sysaudio&lt;/h2>
&lt;p>As a bit of background, &lt;a href="https://machengine.org/pkg/mach-sysaudio/">mach-sysaudio&lt;/a> started out as Zig bindings to Andrew Kelley&amp;rsquo;s fantastic C library &lt;a href="https://github.com/andrewrk/libsoundio">libsoundio&lt;/a>, but ultimately it grew to stand on its own two feet - becoming a brand new library written in Zig from first-principles and the ground-up to achieve similar goals: providing just low-level audio input/output, nothing else. It saw a good deal of polish in this iteration:&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://github.com/hexops/mach-sysaudio/pull/33">SIMD sample conversion support&lt;/a> - and sample conversion is now explicitly optional (so the default is to work with the driver&amp;rsquo;s active audio format.)&lt;/li>
&lt;li>Major API design improvements&lt;/li>
&lt;li>Fixed issues with microphone/input devices, specifically multi-channel devices on macOS with CoreAudio.&lt;/li>
&lt;li>Fixed &lt;a href="https://github.com/hexops/mach/issues?q=is%3Aissue+sysaudio+is%3Aclosed+milestone%3A%22Mach+0.3%22">various issues&lt;/a> with the WASAPI/Windows backend.&lt;/li>
&lt;/ul>
&lt;h3 id="audio-synthesizer-hack-project">Audio synthesizer hack-project&lt;/h3>
&lt;p>As a quick hack project over the holidays, I leveraged Mach&amp;rsquo;s audio libraries, reading MIDI input from a piano keyboard, synthesizing audio using Zig code / Mach, and playing it back through digital piano speakers. Here are a few vertical videos of me being silly &amp;amp; having fun with it (skip to 2:24 to hear how I think a Ziguana might sound!):&lt;/p>
&lt;iframe width="720" height="480" src="https://www.youtube.com/embed/b8WDjaZC1C8?si=SgojGMwdD_lfncgn" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen>&lt;/iframe>
&lt;h2 id="nominated-zig-versions">Nominated Zig versions&lt;/h2>
&lt;p>Mach has always needed a sweet-spot between stable Zig and nightly Zig, a better balance of latest-and-greatest features and bug fixes, but less of a moving target than nightly Zig. To address this, we formalized how we &lt;a href="https://devlog.hexops.org/2024/announcing-nominated-zig/">nominate Zig versions&lt;/a> for use, enabling others to synchronize their Zig version with the one Mach supports more easily.&lt;/p>
&lt;h2 id="mach-engine-as-a-standard-library-of-modules">Mach engine as a standard library of modules&lt;/h2>
&lt;p>We began &lt;a href="https://machengine.org/engine/stdlib/">documenting&lt;/a> how we view Mach &lt;em>engine&lt;/em> as a standard library of modules for game development, and how we&amp;rsquo;ll enable you to use just the parts you wish. This is a small-but-important step in showcasing how the engine&amp;rsquo;s higher level APIs will be more modular than the monolithic big engines of today.&lt;/p>
&lt;h2 id="entity-component-system">Entity component system&lt;/h2>
&lt;p>The Mach entity component system plays a key role in Mach&amp;rsquo;s modularity; in this iteration it saw numerous polish / bug fix improvements - the ability to actually query entities, a clearer and more concise API, etc. It is still under heavy development, however.&lt;/p>
&lt;h2 id="machmath">mach.math&lt;/h2>
&lt;p>&lt;a href="https://github.com/hexops/mach/tree/main/src/math">mach.math&lt;/a> was introduced to the Mach standard library: a custom math library tailored towards our graphics API conventions, matrix representations, coordinate systems, etc.&lt;/p>
&lt;p>Today it includes many of the basics: vectors, matrices, quaternions - though it is still missing some basic table stakes. It also has ray-triangle intersection, and we intend to expand it to cover more general collision utilities later.&lt;/p>
&lt;p>A new set of &lt;a href="https://machengine.org/engine/math/">math docs&lt;/a> was added to the website with some cute diagrams/visualizations.&lt;/p>
&lt;h2 id="machgfxsprite">mach.gfx.Sprite&lt;/h2>
&lt;p>&lt;a href="https://github.com/hexops/mach/blob/main/src/gfx/Sprite.zig">mach.gfx.Sprite&lt;/a> was introduced, which is the start of a 2D sprite-rendering module. It is largely usable, though we anticipate its API to change a fair amount and are looking to add animated sprite support among other key features.&lt;/p>
&lt;p>It has been useful in letting us test basic rendering of hundreds of thousands of sprites, each as separate entities with their own transformation matrices calculated CPU-side, and get more of an end-to-end feel for how things are looking with our ECS:&lt;/p>
&lt;iframe width="720" height="480" src="https://www.youtube.com/embed/ciuSYf7dcuE?si=SgojGMwdD_lfncgn" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen>&lt;/iframe>
&lt;h2 id="machgfxtext">mach.gfx.Text&lt;/h2>
&lt;p>Development of a basic text rendering module is underway, but not ready for use yet.&lt;/p>
&lt;h2 id="status-of-simple-2d-game-support">Status of &amp;lsquo;simple 2D game&amp;rsquo; support&lt;/h2>
&lt;p>In short, we&amp;rsquo;re still working on it. More to come soon.&lt;/p>
&lt;h2 id="general-project-maintenance">General project maintenance&lt;/h2>
&lt;ul>
&lt;li>Two new examples &lt;a href="https://github.com/hexops/mach-core/tree/main/examples/rgb-quad">rgb-quad&lt;/a> and &lt;a href="https://github.com/hexops/mach-core/tree/main/examples/textured-quad">textured-quad&lt;/a> showing off super basic 2D rendering were added.&lt;/li>
&lt;li>Began to formulate our &lt;a href="https://github.com/hexops/mach/issues/989">hardware support plans&lt;/a>, such as when we will target certain SIMD instruction sets.&lt;/li>
&lt;li>All of our &lt;code>build.zig&lt;/code> scripts went through a great deal of changes and improvements, as Zig&amp;rsquo;s build system and package manager matured greatly.&lt;/li>
&lt;li>Various &lt;a href="https://github.com/hexops/mach/milestone/5?closed=1">other issues&lt;/a> were addressed.&lt;/li>
&lt;/ul>
&lt;h2 id="a-personal-note">A personal note&lt;/h2>
&lt;p>I work a normal tech job, and most days after I sign off from work I go online to build Mach, often like working two jobs. I&amp;rsquo;ve been doing this for a few years now, and dreaming of being able to build Mach for a decade before that.&lt;/p>
&lt;p>FOSS &lt;a href="https://devlog.hexops.org/2021/increasing-my-contribution-to-zig-to-200-a-month#i-grew-up-playing-linux-games-like-mania-drive">is in my roots&lt;/a> and I believe we should own our tools, they should empower &lt;em>us&lt;/em>-not be part of &lt;a href="https://kristoff.it/blog/the-open-source-game/">the &amp;lsquo;open source&amp;rsquo; game&lt;/a> which is all too prevalent today (even among &amp;lsquo;open source&amp;rsquo; engines.) Mach &lt;em>needs&lt;/em> to be for people like you and me-it needs to genuinely be &lt;a href="https://softwareyoucan.love">software you can love&lt;/a>.&lt;/p>
&lt;p>My dream is one day to live a simple, modest life earning a living building Mach for you and creating high-quality games for everyone. Please consider &lt;a href="https://github.com/sponsors/emidoots">sponsoring my work&lt;/a> if you believe in this vision.&lt;/p>
&lt;h2 id="thanks">Thanks&lt;/h2>
&lt;p>Immense thank you to all those who helped make this release possible, to those who contribute regularly or in the past, and those who sponsor development. It means the world!&lt;/p>
&lt;div style="display: flex; flex-direction: row; align-items: center;">
&lt;img align="left" style="max-height: 12.5rem;" src="https://devlog.hexops.org/img/2024/mach-v0.3-released/img1.png">&lt;/img>
&lt;ul>
&lt;li>Join the &lt;a href="https://discord.gg/XNG3NZgCqp">Mach Discord server&lt;/a>&lt;/li>
&lt;li>Check out &lt;a href="https://machengine.org">machengine.org&lt;/a>&lt;/li>
&lt;li>Consider &lt;a href="https://github.com/sponsors/emidoots">sponsoring development&lt;/a> so we can do more of it!&lt;/li>
&lt;/ul>
&lt;/div></description></item><item><title>Announcing Mach nominated Zig versions</title><link>https://devlog.hexops.org/2024/announcing-nominated-zig/</link><pubDate>Sun, 07 Jan 2024 00:00:00 +0000</pubDate><guid>https://devlog.hexops.com/2024/announcing-nominated-zig/</guid><description>&lt;p>Today we&amp;rsquo;re announcing Mach nominated Zig versions, a sweet-spot between stable Zig and nightly Zig which offers a different balance of latest-and-greatest features and fixes, and less of a moving target.&lt;/p>
&lt;p>If you are in the Zig community, you likely fall into one of two categories:&lt;/p>
&lt;ul>
&lt;li>You target Zig nightly, a target which moves every day&lt;/li>
&lt;li>You target Zig stable, which may be released once or twice per year.&lt;/li>
&lt;/ul>
&lt;h2 id="the-challenge-of-using-zig-stable">The challenge of using Zig stable&lt;/h2>
&lt;p>In recent years, Zig stable has about 2 releases per year. There are great benefits to using stable Zig:&lt;/p>
&lt;ul>
&lt;li>It is generally documented how to update/migrate your code to the new Zig version, rather than it being an ad-hoc process of discovery.&lt;/li>
&lt;li>The release is at a point where the Zig core team feels it is solid and ready to go (Zig releases are done when they are done, not usually timed.)&lt;/li>
&lt;li>These versions get nice numbers associated with them, package authors, distributions, etc. can easily target them.&lt;/li>
&lt;/ul>
&lt;p>But, for a language that is developed quite quickly, and has yet to reach v1.0 stability... when you find that an awesome new feature like the package manager, incremental compilation, or something else you care about has just landed in Zig nightly... are you going to wait 6 months or more to get that change?&lt;/p>
&lt;h2 id="the-challenge-of-using-zig-nightly">The challenge of using Zig nightly&lt;/h2>
&lt;p>Zig nightly is an ever-moving target. Although it is often nearly as stable as stable releases, that is not necessarily the case during large refactors - such as the migration to the self-hosted compiler.&lt;/p>
&lt;p>There are benefits to using nightly, though! It means you are testing the latest version of Zig, and your project can exist in a sort of symbiotic relationship with the Zig project where you test new functionality, help provide feedback on it, discover new issues, and have a greater chance of getting them fixed/addressed while that code is on everyone&amp;rsquo;s mind.&lt;/p>
&lt;p>Unlike stable Zig, you&amp;rsquo;re not integrating 6+ months of breaking changes into your codebase all at once (quite painful!) but rather doing so as the breaks happen. Since many others in the Zig community do target nightly Zig, there are often people around who can help you with upgrading your code.&lt;/p>
&lt;p>One major downside, aside from death by a thousand paper cuts, is that everyone targets a different Zig nightly version. Often, it&amp;rsquo;s difficult to coordinate with others and keep your code compatible with theirs.&lt;/p>
&lt;h2 id="the-challenge-mach-has-using-zig-nightly">The challenge Mach has using Zig nightly&lt;/h2>
&lt;p>We use Zig nightly, but we only periodically update the version.&lt;/p>
&lt;p>Updating Mach&amp;rsquo;s Zig version involves updating a dependency tree of &lt;a href="https://github.com/hexops/mach/issues/1135">40+ Zig repositories&lt;/a>, first updating the Zig code itself and testing manually, then updating their CI pipelines, then going and updating anyone who depends on that repository - from the bottom of the tree to the top!&lt;/p>
&lt;p>We&amp;rsquo;ve built some serious automation to help with this process, but it is still painful - and it is also the case that every time we update our Zig version, our users need to do the same: we&amp;rsquo;re not just updating Zig for us, we&amp;rsquo;re updating Zig for every user of Mach.&lt;/p>
&lt;p>This process has worked &lt;em>pretty&lt;/em> well for us, but the frequency of updates has always been ad-hoc, and when we do update it is a bit chaotic.&lt;/p>
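&lt;p>The bottom-to-top update order described above is, in effect, a topological sort of the dependency tree. As a minimal sketch (not Mach&amp;rsquo;s actual automation), the POSIX &lt;code>tsort&lt;/code> utility can compute such an order. The function name and the dependency edges below are hypothetical and do not reflect the real graph:&lt;/p>
&lt;pre tabindex="0">&lt;code># Print an update order for a dependency tree, dependencies first.
# Each input line is "DEPENDENCY DEPENDENT"; tsort emits names so that
# every dependency appears before the repositories that use it.
# Assumption: these edges are illustrative, not Mach's real dependency graph.
update_order() {
    printf '%s\n' \
        "mach-sysgpu mach-core" \
        "mach-sysaudio mach" \
        "mach-core mach" |
        tsort
}

update_order
&lt;/code>&lt;/pre>
&lt;p>Repositories with no ordering between them may be printed in any relative order, so only the dependency-before-dependent guarantee should be relied upon.&lt;/p>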
&lt;h2 id="we-need-our-ecosystem-of-40-packages-to-be-compatible">We need our ecosystem of 40+ packages to be compatible&lt;/h2>
&lt;p>If we haven&amp;rsquo;t updated our Zig version for a bit of time, then we end up with tens of outstanding pull requests by people who maybe don&amp;rsquo;t use &lt;em>all of Mach&lt;/em> but rather just parts of it, and they can become (rightfully!) fairly frustrated that getting their pull-request merged takes us a while.&lt;/p>
&lt;p>We can&amp;rsquo;t just merge a one-off pull request to one repository, we have to update all 40+ to be compatible with the same Zig version.&lt;/p>
&lt;p>&lt;a href="https://devlog.hexops.org/img/2024/so-many-pull-requests.png">&lt;img alt="so many pull requests" src="https://devlog.hexops.org/img/2024/so-many-pull-requests.png">&lt;/a>&lt;/p>
&lt;h2 id="coordinating-outside-mach">Coordinating outside Mach&lt;/h2>
&lt;p>Although Mach provides a lot of libraries, there are still many important aspects of gamedev we do not have yet. Some folks in the Mach community will pull in third-party Zig projects, like those from zig-gamedev, introducing another challenge in ensuring their code works with the same version.&lt;/p>
&lt;h2 id="announcing-mach-nominated-zig-versions">Announcing Mach nominated Zig versions&lt;/h2>
&lt;p>Today we&amp;rsquo;re formalizing the process we&amp;rsquo;ve (generally) been following. This formalization will make it easier for others to understand what we&amp;rsquo;re doing and when, and also make it easier for other projects to align their Zig version with Mach&amp;rsquo;s if they desire.&lt;/p>
&lt;p>Throughout the year (aiming for the 4th day of the month), we will pick the latest Zig nightly version at that time and nominate it for use:&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>When&lt;/th>
&lt;th>What&lt;/th>
&lt;th>🚀 Other notable event&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>January&lt;/td>
&lt;td>&lt;/td>
&lt;td>🚀 Mach version release&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>February&lt;/td>
&lt;td>&lt;/td>
&lt;td>👋 Anticipated influx of new Machanists / Ziguanas&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>March&lt;/td>
&lt;td>⚡ Zig version nominated&lt;/td>
&lt;td>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>April&lt;/td>
&lt;td>&lt;/td>
&lt;td>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>May&lt;/td>
&lt;td>⚡ Zig version nominated&lt;/td>
&lt;td>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>June&lt;/td>
&lt;td>⚡ Zig version nominated&lt;/td>
&lt;td>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>July&lt;/td>
&lt;td>&lt;/td>
&lt;td>🚀 Mach version release&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>August&lt;/td>
&lt;td>&lt;/td>
&lt;td>👋 Anticipated influx of new Machanists / Ziguanas&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>September&lt;/td>
&lt;td>⚡ Zig version nominated&lt;/td>
&lt;td>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>October&lt;/td>
&lt;td>&lt;/td>
&lt;td>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>November&lt;/td>
&lt;td>⚡ Zig version nominated&lt;/td>
&lt;td>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>December&lt;/td>
&lt;td>&lt;/td>
&lt;td>👋 Anticipated influx of new Machanists / Ziguanas&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>These versions will be noted as e.g. &amp;lsquo;2024.1.0-mach&amp;rsquo;, and will correspond to a specific Zig nightly version from that month.&lt;/p>
&lt;p>The exact versions, which nightly version they map to, and the whole process (which is more involved), is &lt;a href="https://machengine.org/about/nominated-zig/">documented in full here&lt;/a>.&lt;/p>
&lt;p>At the time of writing this, you&amp;rsquo;ll see that &lt;a href="https://machengine.org/about/nominated-zig/#202401">2024.1.0-mach&lt;/a> is marked as &amp;lsquo;in progress&amp;rsquo; - we will make sure that the Zig version we intend to nominate is at least compatible with all Mach projects before finalizing the nomination.&lt;/p>
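&lt;p>As a minimal sketch of how tooling might read such a version string, the shell function below splits &lt;code>2024.1.0-mach&lt;/code> into its parts. It assumes the layout is year, month, then a third numeric component, followed by a &lt;code>-mach&lt;/code> suffix; the function and field names are hypothetical, and the documentation linked above remains the authority on the format:&lt;/p>
&lt;pre tabindex="0">&lt;code># Split a nominated Zig version string such as "2024.1.0-mach" into parts.
# Assumption: the layout is YEAR.MONTH.PATCH-mach; only the example string
# comes from the article, the field names are illustrative.
parse_nominated() {
    version="$1"
    case "$version" in
        *-mach) ;;
        *) echo "not a nominated version: $version"; return 1 ;;
    esac
    core="${version%-mach}"   # strip the "-mach" suffix
    year="${core%%.*}"        # text before the first dot
    rest="${core#*.}"         # text after the first dot
    month="${rest%%.*}"
    patch="${rest#*.}"
    echo "year=$year month=$month patch=$patch"
}

parse_nominated "2024.1.0-mach"
&lt;/code>&lt;/pre>
&lt;p>For the example string this prints &lt;code>year=2024 month=1 patch=0&lt;/code>; a string without the &lt;code>-mach&lt;/code> suffix is rejected with a non-zero exit status.&lt;/p>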
&lt;h3 id="a-sweet-spot-between-nightly-and-stable">A sweet spot between nightly and stable&lt;/h3>
&lt;p>Mach&amp;rsquo;s nominated Zig versions provide a different set of tradeoffs; we believe it is a sweet spot between the two extremes of nightly and stable. You can benefit from the changes in Zig 2-3x faster than if you were using stable, and suffer less from the never-ending game of catch-up and incompatibilities between projects that nightly necessarily requires.&lt;/p>
&lt;p>Other projects can target the same Zig version if they wish to be compatible with Mach Zig packages. For example, zig-gamedev is aiming to target the same versions. We encourage other gamedevs using Zig to do the same.&lt;/p>
&lt;p>Projects that target nightly Zig can often be coincidentally compatible, too, since they possibly had a compatible Zig version around the time we nominated a Zig version for use.&lt;/p>
&lt;h2 id="final-thoughts">Final thoughts&lt;/h2>
&lt;p>You can read more about the specifics of everything in the &lt;a href="https://machengine.org/about/nominated-zig">Mach documentation&lt;/a>.&lt;/p>
&lt;p>We&amp;rsquo;re currently working on nominating the first version, which will likely be finalized in the next week or so.&lt;/p>
&lt;h2 id="thanks">Thanks&lt;/h2>
&lt;div style="display: flex; flex-direction: row; align-items: center;">
&lt;img align="left" style="max-height: 12.5rem;" src="https://devlog.hexops.org/img/2024/announcing-nominated-zig/img1.png">&lt;/img>
&lt;ul>
&lt;li>Join the &lt;a href="https://discord.gg/XNG3NZgCqp">Mach Discord server&lt;/a> (check #discuss for this article)&lt;/li>
&lt;li>Check out &lt;a href="https://machengine.org">machengine.org&lt;/a>&lt;/li>
&lt;li>Consider &lt;a href="https://github.com/sponsors/emidoots">sponsoring development&lt;/a> so we can do more of it!&lt;/li>
&lt;/ul>
&lt;/div></description></item><item><title>Mach v0.2 released - Zig game engine &amp; graphics toolkit</title><link>https://devlog.hexops.org/2023/mach-v0.2-released/</link><pubDate>Sat, 12 Aug 2023 00:00:00 +0000</pubDate><guid>https://devlog.hexops.com/2023/mach-v0.2-released/</guid><description>&lt;p>Mach is a Zig game engine &amp;amp; graphics toolkit for building high-performance, truly cross-platform, robust &amp;amp; modular games, visualizations, and desktop/mobile GUI apps. &lt;a href="https://machengine.org/">Learn more&lt;/a>&lt;/p>
&lt;p>We&amp;rsquo;ve been developing Mach for ~2 years; this release includes over a year of work, thousands of commits, and fixes &lt;a href="https://github.com/hexops/mach/milestone/2?closed=1">300 issues&lt;/a>.&lt;/p>
&lt;video height="800px" autoplay loop muted>
&lt;source src="https://media.machengine.org/core/example/deferred-rendering.mp4" type="video/mp4">
&lt;/video>
&lt;h2 id="on-your-machine-in-just-60-seconds">On your machine in just ~60 seconds&lt;/h2>
&lt;p>With &lt;a href="https://machengine.org/about/zig-version/">this Zig nightly&lt;/a> version you can run the above demo on your machine in ~60 seconds:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">git clone https://github.com/hexops/mach-core
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nb">cd&lt;/span> mach-core/
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">zig build run-deferred-rendering
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;a href="https://machengine.org/about/goals/#zero-fuss-installation">Zero system dependencies&lt;/a> to slow you down; only &lt;a href="https://machengine.org/about/zig-version/">zig&lt;/a> is needed, we build and package the few relevant dependencies on our own. &lt;small>&lt;a href="https://machengine.org/about/known-issues/">known issues&lt;/a>&lt;/small>&lt;/p>
&lt;h2 id="engine-and-core-split">Engine and Core split&lt;/h2>
&lt;p>We completely split &lt;em>Mach engine&lt;/em> and &lt;em>Mach core&lt;/em> apart, so that you get to choose your journey and decide if you just want low-level window+input+GPU and nothing else, or prefer to use our higher level engine (which although not ready for use yet, will be very modular itself):&lt;/p>
&lt;p align="center">
&lt;img class="color-auto" src="https://devlog.hexops.org/img/2023/mach-v0.2-released/img1.png" />
&lt;/p>
&lt;h2 id="mach-core">Mach core&lt;/h2>
&lt;picture>
&lt;source media="(prefers-color-scheme: dark)" srcset="https://machengine.org/assets/mach/core-full-dark.svg">
&lt;img alt="mach-core" src="https://machengine.org/assets/mach/core-full-light.svg" style="height: 7rem; margin-top: 1rem;">
&lt;/picture>
&lt;p>Mach core aims to be a &lt;em>truly cross-platform&lt;/em> way to get &lt;em>window+input+GPU, and nothing else.&lt;/em> It supports Linux, Windows, and Mac today, with WebAssembly / browser support in active development, and mobile coming in the future.&lt;/p>
&lt;p>It gives you the power of Vulkan, DirectX, Metal, and modern OpenGL in a &lt;em>single concise graphics API and shader language&lt;/em> - by compiling Google Chrome&amp;rsquo;s WebGPU implementation natively using Zig&amp;rsquo;s build system.&lt;/p>
&lt;p>Seamless multi-threading capabilities are provided, which means that your rendering and input handling are trivially decoupled from one another, you get butter-smooth window resizing, and your render loop and input handling can run at different frequencies. For example, a 60FPS render loop while your application handles keyboard &amp;amp; mouse events at a much faster dynamic rate (as fast as the OS can deliver them.)&lt;/p>
&lt;div style="align-self: center;">
&lt;video autoplay loop muted height="190px">
&lt;source src="https://media.machengine.org/core/example/gen-texture-light.mp4" type="video/mp4">
&lt;/video>
&lt;video autoplay loop muted height="190px">
&lt;source src="https://media.machengine.org/core/example/boids.mp4" type="video/mp4">
&lt;/video>
&lt;video autoplay loop muted height="190px">
&lt;source src="https://media.machengine.org/core/example/textured-cube.mp4" type="video/mp4">
&lt;/video>
&lt;/div>
&lt;p>You can think of Mach core as an alternative to the classic options of SDL, GLFW+OpenGL, etc.&lt;/p>
&lt;p>There are &lt;a href="https://machengine.org/core/examples/">15+ examples in the showcase&lt;/a>, and we&amp;rsquo;re &lt;a href="https://github.com/hexops/mach/issues/858">planning a C API&lt;/a> so it can be used from other languages as well.&lt;/p>
&lt;h2 id="engine-development-has-begun">Engine development has begun&lt;/h2>
&lt;p>&lt;strong>Mach engine is not ready for use yet, but we&amp;rsquo;ve started breaking ground on higher-level engine APIs.&lt;/strong>&lt;/p>
&lt;p>The v0.2 release focuses on deep changes and improvements to our infrastructure, primarily building out the Zig gamedev ecosystem and building foundational packages that we needed for Mach core, the engine, and a game we&amp;rsquo;re starting to build.&lt;/p>
&lt;p>As a result, we&amp;rsquo;ve &lt;em>finally&lt;/em> just broken ground on the engine side of things.&lt;/p>
&lt;img src="https://devlog.hexops.org/img/2023/mach-where-we-are.png">
&lt;h2 id="breaking-up-our-monorepo">Breaking up our monorepo&lt;/h2>
&lt;p>Previously, all of Mach&amp;rsquo;s &lt;a href="https://machengine.org/pkg/">standalone packages&lt;/a> were developed in a single giant monorepo. This was intimidating for new contributors, and we also wanted to better communicate how many standalone Zig gamedev packages we actually provide.&lt;/p>
&lt;p>Today, we&amp;rsquo;re happy to report all standalone packages are now developed in separate repositories and available via the package manager!&lt;/p>
&lt;h2 id="zig-package-manager">Zig package manager&lt;/h2>
&lt;p>We migrated 100% to the self-hosted Zig compiler and the new experimental Zig package manager: every Git submodule has been banished!&lt;/p>
&lt;p>We created &lt;a href="https://pkg.machengine.org/">pkg.machengine.org&lt;/a> - a mirror for downloading Mach packages and Zig downloads as well.&lt;/p>
&lt;h2 id="introducing-wrench-the-machanist">Introducing Wrench the Machanist&lt;/h2>
&lt;img src="https://devlog.hexops.org/img/media/mach/wrench_rocket.svg" style="width: 300px">
&lt;p>Wrench is the Mach engine mascot (artwork contributed by &lt;a href="https://keylajones.me">Keyla Jones&lt;/a>); and also &lt;a href="https://wrench.machengine.org/">our infrastructure automation&lt;/a> tool, written in Go, to help us with various tasks:&lt;/p>
&lt;p>Giving us an overview of our many repositories&amp;rsquo; &lt;a href="https://wrench.machengine.org/projects/">CI statuses&lt;/a> and &lt;a href="https://wrench.machengine.org/pull-requests/">pull requests&lt;/a>:&lt;/p>
&lt;p>&lt;a href="https://github.com/hexops/mach/assets/3173176/62c3e118-faa0-40c7-a663-e9e68e000bbf">&lt;img src="https://github.com/hexops/mach/assets/3173176/62c3e118-faa0-40c7-a663-e9e68e000bbf" style="width: 300px">&lt;/a>&lt;a href="https://devlog.hexops.org/img/2023/mach-v0.2-released/img2.png">&lt;img src="https://devlog.hexops.org/img/2023/mach-v0.2-released/img2.png" style="width: 300px">&lt;/a>&lt;/p>
&lt;p>Sending us &lt;a href="https://github.com/hexops/mach/pull/953">pull requests&lt;/a> to automatically update our CI pipelines to the latest Zig version, and update our &lt;a href="https://github.com/hexops/mach/pull/926">&lt;code>build.zig.zon&lt;/code>&lt;/a> dependencies - in a fully atomic way across all our repositories at once:&lt;/p>
&lt;p>&lt;a href="https://devlog.hexops.org/img/2023/mach-v0.2-released/img3.png">&lt;img src="https://devlog.hexops.org/img/2023/mach-v0.2-released/img3.png" style="width: 300px">&lt;/a>&lt;/p>
&lt;p>&amp;hellip;and much more:&lt;/p>
&lt;ul>
&lt;li>Checking our website and docs for &lt;a href="https://github.com/hexops/mach/issues/931">broken links&lt;/a> and sending us GitHub issues.&lt;/li>
&lt;li>Performing automatic updates of involved dependencies, such as updating our fork of Google Chrome&amp;rsquo;s WebGPU implementation, which involves &lt;a href="https://github.com/hexops/mach-gpu-dawn/pull/22">pull requests&lt;/a> across a few different repositories, pushing branches, running out-of-band commands, and ultimately presenting us with helpful/pretty diffs so we can just do the human work.&lt;/li>
&lt;li>A custom CI job runner system, running on a custom mini server with actual GPUs - so we can do screenshot-based testing of graphical applications in the future.&lt;/li>
&lt;/ul>
&lt;video height="800px" controls>
&lt;source src="https://devlog.hexops.org/img/2023/mach-v0.2-released/img4.mp4" type="video/mp4">
&lt;/video>
&lt;video height="800px" controls>
&lt;source src="https://devlog.hexops.org/img/2023/mach-v0.2-released/img5.mp4" type="video/mp4">
&lt;/video>
&lt;h2 id="mach-gpu-rewritten-for-perfection">mach-gpu: rewritten for perfection&lt;/h2>
&lt;picture>
&lt;source srcset="https://devlog.hexops.org/img/media/gpu/logo_dark.svg" media="(prefers-color-scheme: dark)">
&lt;img style="height: 100px;" src="https://devlog.hexops.org/img/media/gpu/logo_light.svg">
&lt;/picture>
&lt;p>&lt;a href="https://machengine.org/pkg/mach-gpu/">mach-gpu&lt;/a> is the WebGPU interface for Zig, and last year we &lt;a href="../2022/perfecting-webgpu-native/">completely rewrote&lt;/a> it, achieving:&lt;/p>
&lt;ul>
&lt;li>Zero overhead, using comptime interfaces&lt;/li>
&lt;li>100% API coverage&lt;/li>
&lt;li>Default values for 100% of the API (which makes writing descriptors easier, and makes examples look much simpler.)&lt;/li>
&lt;/ul>
&lt;h2 id="audio-development">Audio development&lt;/h2>
&lt;p>Contributor &lt;a href="https://github.com/alichraghi">@alichraghi&lt;/a> has been relentless in pushing our audio capabilities (and more) forward.&lt;/p>
&lt;div style="align-self: center">
&lt;picture>
&lt;source media="(prefers-color-scheme: dark)" srcset="https://machengine.org/assets/mach/flac-full-dark.svg">
&lt;img alt="mach-flac" src="https://machengine.org/assets/mach/flac-full-light.svg" style="width: 250px">
&lt;/picture>
&lt;picture>
&lt;source media="(prefers-color-scheme: dark)" srcset="https://machengine.org/assets/mach/sysaudio-full-dark.svg">
&lt;img alt="mach-sysaudio" src="https://machengine.org/assets/mach/sysaudio-full-light.svg" style="width: 250px">
&lt;/picture>
&lt;picture>
&lt;source media="(prefers-color-scheme: dark)" srcset="https://machengine.org/assets/mach/opus-full-dark.svg">
&lt;img alt="mach-opus" src="https://machengine.org/assets/mach/opus-full-light.svg" style="width: 250px">
&lt;/picture>
&lt;/div>
&lt;p>&lt;a href="https://machengine.org/pkg/mach-sysaudio/">mach-sysaudio&lt;/a> started as Zig bindings to Andrew Kelley&amp;rsquo;s awesome &lt;a href="https://github.com/andrewrk/libsoundio">libsoundio&lt;/a> library, and ended up being a fully-fledged new library written in Zig to achieve similar goals:&lt;/p>
&lt;ul>
&lt;li>Truly cross-platform, low-level, audio IO in Zig - playback and recording with backends for:
&lt;ul>
&lt;li>Linux
&lt;ul>
&lt;li>PulseAudio&lt;/li>
&lt;li>PipeWire&lt;/li>
&lt;li>Jack&lt;/li>
&lt;li>ALSA&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Windows: WASAPI&lt;/li>
&lt;li>macOS/iOS: CoreAudio&lt;/li>
&lt;li>WebAssembly: WebAudio&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>Then just recently we got &lt;a href="https://machengine.org/pkg/mach-flac/">mach-flac&lt;/a> and &lt;a href="https://machengine.org/pkg/mach-opus/">mach-opus&lt;/a>, which combined give you FLAC (lossless audio) and Opus (lossy audio) via the respective battle-hardened xiph.org libraries.&lt;/p>
&lt;h2 id="new-website">New website&lt;/h2>
&lt;img src="https://github.com/hexops/mach/assets/3173176/6a1167bd-330c-47b4-8d8b-69e0c5cd0de2">
&lt;p>We built a &lt;a href="https://machengine.org">brand new website&lt;/a> that will serve us well into the future, featuring:&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://machengine.org/engine/roadmap/">Project roadmap&lt;/a>&lt;/li>
&lt;li>Documentation for Engine, Core, Packages, and more.&lt;/li>
&lt;li>WebGPU documentation and &lt;a href="https://machengine.org/engine/gpu/">learning material&lt;/a>&lt;/li>
&lt;li>Offline-viewing support (see link in the footer)&lt;/li>
&lt;li>All around better design, landing page, etc.&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Note:&lt;/strong> Mobile support isn&amp;rsquo;t great right now, we&amp;rsquo;ll fix it later.&lt;/p>
&lt;h2 id="browser-support-in-development">Browser support: in development&lt;/h2>
&lt;p>Chrome has already shipped WebGPU, and others will follow soon. Mach support for WebAssembly is not yet ready, but is coming along nicely:&lt;/p>
&lt;ul>
&lt;li>Input, audio, etc. is working already (&lt;a href="https://emidoots.com/mach/piano/">piano demo&lt;/a>, click in the frame and type with your A-Z keys.)&lt;/li>
&lt;li>&lt;code>mach build&lt;/code> is a new CLI command written in Zig which:
&lt;ul>
&lt;li>Starts an HTTP development server&lt;/li>
&lt;li>Invokes &lt;code>zig build&lt;/code> for you when you reload the page&lt;/li>
&lt;li>Generally provides a nice browser development experience&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>We do not yet have graphics support in the browser: we are currently doing a full rewrite of &lt;a href="https://machengine.org/pkg/mach-sysjs/">mach-sysjs&lt;/a> to use a code generation approach that lets Zig and JavaScript communicate across the WebAssembly boundary while passing complex types (strings, slices, etc.) instead of just the usual pointers and integers. Once this rewrite is finished, we will generate a &lt;code>gpu.Interface&lt;/code> implementation which simply invokes the browser&amp;rsquo;s JavaScript WebGPU API with minimal overhead.&lt;/p>
&lt;h2 id="dusk-experimental-pure-zig-webgpu-implementation">Dusk: Experimental pure-Zig WebGPU implementation&lt;/h2>
&lt;picture>
&lt;source media="(prefers-color-scheme: dark)" srcset="https://machengine.org/assets/mach/dusk-full-dark.svg">
&lt;img alt="mach-dusk" src="https://machengine.org/assets/mach/dusk-full-light.svg" style="height: 7rem; margin-top: 1rem;">
&lt;/picture>
&lt;p>Dusk is a highly experimental WebGPU implementation in Zig, aiming to be blazingly fast, lean &amp;amp; mean.&lt;/p>
&lt;p>&lt;a href="https://github.com/alichraghi">@alichraghi&lt;/a> has been quietly working on Dusk tirelessly and consistently over the past year, and we think it may begin to be usable in Mach v0.3. Although it is not usable today, it already features:&lt;/p>
&lt;ul>
&lt;li>A &lt;a href="https://github.com/hexops/mach-dusk/tree/main/src/shader">full WGSL shader parser and compiler&lt;/a>, based loosely on the Zig compiler, capable of emitting SPIRV.&lt;/li>
&lt;li>A partial Vulkan implementation.&lt;/li>
&lt;/ul>
&lt;p>Dusk is a long-term bet / investment for us: we intend to always have the option of using Dawn (the Google Chrome WebGPU implementation), and we don&amp;rsquo;t expect Dusk to be the default any time soon. Since both will implement the same &lt;code>gpu.Interface&lt;/code>, it&amp;rsquo;ll just be another backend you can select from at build time.&lt;/p>
&lt;p>Learn more about our goals with Dusk &lt;a href="https://machengine.org/pkg/mach-dusk/#goals">here&lt;/a>, and feel free to join the &lt;code>#dusk&lt;/code> channel in Discord or check out the repository if you&amp;rsquo;re interested in contributing some Vulkan, Metal, or Direct3D knowledge.&lt;/p>
&lt;h2 id="model-loading">Model loading&lt;/h2>
&lt;p>&lt;a href="https://machengine.org/pkg/mach-model3d/">mach-model3d&lt;/a> provides Zig bindings to &lt;a href="https://gitlab.com/bztsrc/model3d/">Model3D&lt;/a>, a compact, featureful model format &amp;amp; alternative to glTF. We may replace this with our own model format in the future, but for now this enables us to load models from Blender in a decent, performant way.&lt;/p>
&lt;video height="800px" autoplay loop muted>
&lt;source src="https://media.machengine.org/core/example/pbr-basic.mp4" type="video/mp4">
&lt;/video>
&lt;h2 id="sprite--2d-examples">Sprite / 2D examples&lt;/h2>
&lt;img src="https://media.machengine.org/core/example/sprite2d.jpg">
&lt;p>mach-core now has a &lt;a href="https://github.com/hexops/mach-core/tree/main/examples/sprite2d">sprite2d example&lt;/a> which is ~400 lines and demonstrates loading sprites from a JSON file and sprite atlas, basic keyboard movement, etc.&lt;/p>
&lt;p>We&amp;rsquo;ve already begun making a higher-level API for 2D graphics, as well - though not ready for use yet.&lt;/p>
&lt;h2 id="community">Community&lt;/h2>
&lt;ul>
&lt;li>Our &lt;a href="https://machengine.org/discord">Discord community&lt;/a> grew to over 700 members, though we aim to keep all valuable information in GitHub issues and on the new website.&lt;/li>
&lt;li>We attended &lt;a href="https://softwareyoucan.love/">Software You Can Love&lt;/a> in Milan, Italy - gave a talk, had a table full of Zig gamedevs, and gave out some cool stickers&lt;/li>
&lt;li>We saw a number of new contributors, both one-off and ongoing.&lt;/li>
&lt;li>Many coffee was drank, and much coding was done over the holidays.&lt;/li>
&lt;/ul>
&lt;div style="align-self: center;">
&lt;img src="https://github.com/hexops/mach/assets/3173176/43573ecf-35ce-4e34-831f-425151d5c281" style="height: 190px">
&lt;img src="https://github.com/hexops/mach/assets/3173176/ad30a37d-a37c-4950-b7b6-36411c9a51f1" style="height: 190px">
&lt;img src="https://github.com/hexops/mach/assets/3173176/9cfe64fe-ba2e-49a1-bfbe-f40b66abde2b" style="height: 190px">
&lt;/div>
&lt;h2 id="a-personal-note">A personal note&lt;/h2>
&lt;p>I work a normal tech job, and every day after I sign off from work I go online to build Mach, almost like working two jobs. I&amp;rsquo;ve been working on Mach double-time like this for over two years now, and dreaming of it for a decade before that.&lt;/p>
&lt;p>FOSS &lt;a href="https://devlog.hexops.org/2021/increasing-my-contribution-to-zig-to-200-a-month#i-grew-up-playing-linux-games-like-mania-drive">is in my roots&lt;/a> and I believe we should own our tools, they should empower &lt;em>us&lt;/em>-not be part of &lt;a href="https://kristoff.it/blog/the-open-source-game/">the &amp;lsquo;open source&amp;rsquo; game&lt;/a> which is all too prevalent today (even among &amp;lsquo;open source&amp;rsquo; engines.) Mach &lt;em>needs&lt;/em> to be for people like you and me-it needs to genuinely be &lt;a href="https://softwareyoucan.love">software you can love&lt;/a>.&lt;/p>
&lt;p>My dream is one day to live a simple, modest future earning a living building Mach for you and creating high-quality games for everyone. Please consider &lt;a href="https://github.com/sponsors/emidoots">sponsoring my work&lt;/a> if you believe in my vision.&lt;/p>
&lt;h2 id="thanks">Thanks&lt;/h2>
&lt;p>To everyone who has contributed to or sponsored the project, and to you for reading this far!&lt;/p>
&lt;div style="display: flex; flex-direction: row; align-items: center;">
&lt;img align="left" style="max-height: 12.5rem;" src="https://devlog.hexops.org/img/2023/mach-v0.2-released/img6.png">&lt;/img>
&lt;ul>
&lt;li>Join the &lt;a href="https://discord.gg/XNG3NZgCqp">Mach Discord server&lt;/a>&lt;/li>
&lt;li>Check out &lt;a href="https://machengine.org">machengine.org&lt;/a>&lt;/li>
&lt;li>Consider &lt;a href="https://github.com/sponsors/emidoots">sponsoring development&lt;/a> so we can do more of it!&lt;/li>
&lt;/ul>
&lt;/div></description></item><item><title>Mach: providing an ecosystem of C libraries using the Zig package manager</title><link>https://devlog.hexops.org/2023/mach-ecosystem-c-libraries/</link><pubDate>Wed, 14 Jun 2023 00:00:00 +0000</pubDate><guid>https://devlog.hexops.com/2023/mach-ecosystem-c-libraries/</guid><description>&lt;p>&lt;a href="https://github.com/andrewrk">Andrew Kelley&lt;/a> gave a keynote speech at &lt;a href="https://softwareyoucanlove.ca">Software You Can Love 2023&lt;/a> in Vancouver last week (a recording will be available later), the outline was:&lt;/p>
&lt;blockquote>
&lt;p>&lt;strong>How to Build Software From Source&lt;/strong>&lt;/p>
&lt;p>[&amp;hellip;] then I&amp;rsquo;ll take things in a completely different direction, by &lt;strong>showing you how to rip apart a project&amp;rsquo;s build system and replace it with the zig build system.&lt;/strong> This will make building things from source work effortlessly for more people and more platforms, as well as annoy a lot of boomers. It&amp;rsquo;s going to be super fun and spicy!&lt;/p>
&lt;/blockquote>
&lt;h2 id="as-we-all-know-zig-is-a-cc-compiler-and-its-own-build-system">As we all know, Zig is a C/C++ compiler and its own build system&lt;/h2>
&lt;p>Zig is really three things:&lt;/p>
&lt;ul>
&lt;li>Programming language&lt;/li>
&lt;li>Build system (build.zig) replacing makefiles/cmake/ninja/etc&lt;/li>
&lt;li>C/C++ compiler with an emphasis on cross-compilation&lt;/li>
&lt;/ul>
&lt;p>We&amp;rsquo;ve been leveraging all three in &lt;a href="https://github.com/hexops/mach">Mach engine&lt;/a> for a while now. For example, we maintain a version of Google Chrome&amp;rsquo;s WebGPU implementation (Dawn) with its rather complex build system (code generation, python scripts, depot_tools, ninja, cmake, etc.) replaced with &lt;code>build.zig&lt;/code>.&lt;/p>
&lt;p>That lets us say that if you have &lt;a href="https://github.com/hexops/mach#supported-zig-version">a recent Zig version&lt;/a> you can get started with Mach in ~60s on Windows, Mac, and Linux:&lt;/p>
&lt;pre tabindex="0">&lt;code>git clone --recursive https://github.com/hexops/mach-examples
cd mach-examples/
zig build run-textured-cube
&lt;/code>&lt;/pre>&lt;p>And instead of getting a bunch of dependency errors that you might need to &lt;code>apt-get&lt;/code> install or whatever, you&amp;rsquo;ll just get something that works out of the box:&lt;/p>
&lt;video autoplay loop muted playsinline style="width:24rem">
&lt;source src="https://devlog.hexops.org/img/2023/mach-ecosystem-c-libraries/img1.webm" type="video/webm">&lt;/video>
&lt;h2 id="zig-has-a-new-package-manager-for-cc-too">Zig has a new package manager (for C/C++ too!)&lt;/h2>
&lt;p>Those in the Zig community know that Zig has a new package manager, it&amp;rsquo;s built into the compiler. Effectively you describe your dependencies in a &lt;code>build.zig.zon&lt;/code> file, and then the &lt;code>zig&lt;/code> compiler is able to fetch them for you as part of &lt;code>zig build&lt;/code>. You&amp;rsquo;re then able to link against/use dependencies in your &lt;code>build.zig&lt;/code> file, which declaratively says how to build your project (except, using a real language instead of a DSL like cmake/etc use.)&lt;/p>
&lt;p>It&amp;rsquo;s still very experimental, &lt;a href="https://github.com/ziglang/zig/pull/14265">has little to no documentation yet&lt;/a> - it&amp;rsquo;s not ready for widespread use. But one strong point is that it also aims to address the issue of building C/C++ projects, not just Zig ones. You can write a &lt;code>build.zig&lt;/code> file in Zig, describing how to build your C/C++ project using Zig as the toolchain. Then for free you get quite solid cross-compilation (since Zig bundles clang, every glibc version, and more), plus now a dependency manager, as well as a declarative way to describe your build using the Zig language.&lt;/p>
&lt;p>One example of this is in Andrew Kelley&amp;rsquo;s &lt;a href="https://github.com/andrewrk/ffmpeg/">fork of ffmpeg&lt;/a>, where he merely forked the ffmpeg repository, removed their build system &amp;amp; unnecessary files, and added &lt;a href="https://github.com/andrewrk/ffmpeg/blob/main/build.zig">a &lt;code>build.zig&lt;/code> file&lt;/a>. This allows you to clone the repository and &lt;code>zig build&lt;/code> will fetch all the required dependencies and build ffmpeg for you. Fancy!&lt;/p>
&lt;h2 id="mach-engine">Mach engine&lt;/h2>
&lt;p>&lt;a href="https://github.com/hexops/mach">Mach engine&lt;/a> is an upcoming game engine built in Zig, that we&amp;rsquo;re building with the aim of becoming competitive with Unity/Unreal/Godot - but with an emphasis on &lt;em>modularity&lt;/em>:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Mach core&lt;/strong>: If you choose to use &lt;em>core&lt;/em>, then it&amp;rsquo;s like an alternative to GLFW+OpenGL or SDL, you just get a window+input+WebGPU with minimal dependencies. Your application runs natively on Windows/Linux/Mac using their respective graphics APIs (DirectX12, Vulkan, Metal), you get cross-compilation and zero-fuss installation, and also web/mobile support in the future with the same codebase. Write once, run everywhere.&lt;/li>
&lt;li>&lt;strong>Mach engine&lt;/strong>: If you choose this option, you &lt;em>additionally&lt;/em> get an entity component system - with a library of standard modules that you can &amp;lsquo;plug and play&amp;rsquo; with for rendering/audio/etc.&lt;/li>
&lt;/ul>
&lt;h3 id="keeping-our-runtime-c-dependencies-small">Keeping our runtime C dependencies small&lt;/h3>
&lt;p>One way that we&amp;rsquo;re keeping &lt;em>runtime C dependencies&lt;/em> (the ones your game/app would ship with!) a smaller, focused, set - is by building tooling: a &lt;code>mach&lt;/code> CLI and fully-fledged GUI editor like other engines have. But how does that help reduce runtime dependencies? Well, at runtime you may need:&lt;/p>
&lt;ul>
&lt;li>Harfbuzz: for Unicode text layout&lt;/li>
&lt;li>GLFW (and some headers): for window management&lt;/li>
&lt;li>Basisu and PNG: for GPU supercompressed textures / lossless textures&lt;/li>
&lt;li>Opus and FLAC: for lossy and lossless audio&lt;/li>
&lt;/ul>
&lt;p>Mach will &amp;lsquo;bless&amp;rsquo; certain formats, being opinionated in what you &lt;em>ship&lt;/em> with your game. You&amp;rsquo;re free to pull in other formats, if you like, but the default/easy path will be these ones. As a result, there&amp;rsquo;s a lot we &lt;em>won&amp;rsquo;t&lt;/em> need at runtime:&lt;/p>
&lt;ul>
&lt;li>Freetype&lt;/li>
&lt;li>JPEG, TGA, or other image formats&lt;/li>
&lt;li>MP3, ffmpeg, or other audio formats&lt;/li>
&lt;/ul>
&lt;p>We don&amp;rsquo;t need these because our &lt;em>tooling&lt;/em> (the CLI and GUI editor) is going to make it easy to convert whatever format you want into the &amp;lsquo;blessed&amp;rsquo; runtime formats. One major benefit of this is that we can nudge you to the right defaults, without you being an expert. For example, you probably want to be using texture compression formats that GPU hardware itself understands, instead of say shipping a JPEG that just gets expanded to an uncompressed texture, eating a bunch of GPU memory and harming your texture bandwidth.&lt;/p>
&lt;h2 id="providing-an-ecosystem-of-c-libraries">Providing an ecosystem of C libraries&lt;/h2>
&lt;p>Similar to Andrew Kelley&amp;rsquo;s ffmpeg fork (although, with a few niceties to verify the supply chain) - Mach is now maintaining forks of various C libraries that we make use of. These aren&amp;rsquo;t Zig bindings to these libraries (which we have separately), but rather are just forks of the actual project with their build system replaced by &lt;code>build.zig&lt;/code>.&lt;/p>
&lt;p>A massive special thanks to &lt;a href="https://mzte.de/git/">@LordMZTE&lt;/a> who has been tirelessly pushing us along here over the past month-ish, helping to inch us ever-closer to fully adopting the new package manager.&lt;/p>
&lt;h3 id="forks-we-maintain">Forks we maintain&lt;/h3>
&lt;p>We have &lt;em>forks&lt;/em> of these projects which switch their build systems to &lt;code>build.zig&lt;/code>:&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://github.com/hexops/harfbuzz">hexops/harfbuzz&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://github.com/hexops/freetype">hexops/freetype&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://github.com/hexops/brotli">hexops/brotli&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://github.com/hexops/glfw">hexops/glfw&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://github.com/hexops/basisu">hexops/basisu&lt;/a> (basis_universal, supercompressed textures)&lt;/li>
&lt;/ul>
&lt;h3 id="a-note-about-supply-chain-verification">A note about supply chain verification&lt;/h3>
&lt;p>I personally care a lot about supply chain security - and more importantly, bugs. In general, I don&amp;rsquo;t ever want anyone to have to &amp;lsquo;wonder&amp;rsquo; if our fork of a library has some strange patches applied to it or something.&lt;/p>
&lt;p>As a result, in each of these forks we&amp;rsquo;ve taken the time to ensure &lt;em>you&lt;/em> know the exact &lt;code>git diff&lt;/code> command you can run to verify that our fork &lt;em>exactly&lt;/em> matches the upstream version - with the only difference being &lt;em>removing the project&amp;rsquo;s old build system, and unnecessary files&lt;/em>.&lt;/p>
&lt;h3 id="header-packages-were-maintaining">Header packages we&amp;rsquo;re maintaining&lt;/h3>
&lt;p>In addition to the above, we&amp;rsquo;re maintaining the following which aren&amp;rsquo;t strict forks (a repository for each would simply be too much for us to maintain), but rather are collections of common headers that you very often need together. These can help you build GLFW, SDL, and other such applications.&lt;/p>
&lt;p>Some headers are generated with platform-specific tools (e.g. in the case of Wayland this is needed.) We always provide the exact steps we used to produce the headers from upstream repositories in an &lt;code>update-upstream.sh&lt;/code> script, with the intent that you can fully reproduce what&amp;rsquo;s in these packages and have confidence it came from the upstream repository.&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://github.com/hexops/vulkan-headers">hexops/vulkan-headers&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://github.com/hexops/linux-audio-headers">hexops/linux-audio-headers&lt;/a> includes:
&lt;ul>
&lt;li>ALSA&lt;/li>
&lt;li>Jack&lt;/li>
&lt;li>PipeWire&lt;/li>
&lt;li>PulseAudio&lt;/li>
&lt;li>SPA&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;a href="https://github.com/hexops/x11-headers">hexops/x11-headers&lt;/a> includes:
&lt;ul>
&lt;li>x11&lt;/li>
&lt;li>xcb&lt;/li>
&lt;li>xkbcommon&lt;/li>
&lt;li>xcursor&lt;/li>
&lt;li>xrandr&lt;/li>
&lt;li>xfixes&lt;/li>
&lt;li>xrender&lt;/li>
&lt;li>xinerama&lt;/li>
&lt;li>xi&lt;/li>
&lt;li>xext&lt;/li>
&lt;li>xorgproto&lt;/li>
&lt;li>GLX&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;a href="https://github.com/hexops/wayland-headers">hexops/wayland-headers&lt;/a> includes:
&lt;ul>
&lt;li>xdg-shell&lt;/li>
&lt;li>xdg-decoration&lt;/li>
&lt;li>viewporter&lt;/li>
&lt;li>pointer-constraints-unstable-v1&lt;/li>
&lt;li>relative-pointer-unstable-v1&lt;/li>
&lt;li>idle-inhibit-unstable-v1&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="how-do-i-use-these">How do I use these?&lt;/h3>
&lt;p>Later we&amp;rsquo;ll provide more details specifically on how to use these. Effectively you just add them to a &lt;code>build.zig.zon&lt;/code> file next to your &lt;code>build.zig&lt;/code> and then call &lt;code>b.dependency(&amp;quot;name&amp;quot;)&lt;/code> to retrieve each one. If you need more help than that, you might need to &lt;a href="https://discord.gg/XNG3NZgCqp">join our Discord&lt;/a> because as mentioned previously the Zig package manager is pretty immature and has sharp edges today. There are &lt;a href="https://github.com/hexops/mach/issues/721">lots of known issues &amp;amp; bugs&lt;/a> that prevent even us from using it fully today.&lt;/p>
&lt;p>But, it is coming along rather quickly! We wanted to let the broader Zig community know we&amp;rsquo;re maintaining these packages to help with collaboration.&lt;/p>
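&lt;p>As a rough sketch of the &lt;code>build.zig.zon&lt;/code> flow described above, here is what this might look like with a Zig version from around the time of writing. The package manager and &lt;code>std.Build&lt;/code> API are still changing, so details may differ in your Zig version. The project name &lt;code>myapp&lt;/code>, the &lt;code>COMMIT&lt;/code> and &lt;code>HASH&lt;/code> placeholders, and the artifact name &lt;code>glfw&lt;/code> are assumptions for illustration, not values taken from our repositories:&lt;/p>
&lt;pre tabindex="0">&lt;code>// build.zig.zon (placed next to build.zig)
.{
    .name = &amp;#34;myapp&amp;#34;, // hypothetical project name
    .version = &amp;#34;0.1.0&amp;#34;,
    .dependencies = .{
        .glfw = .{
            // COMMIT is a placeholder: pin the exact revision you want.
            .url = &amp;#34;https://github.com/hexops/glfw/archive/COMMIT.tar.gz&amp;#34;,
            // HASH is a placeholder for the package content hash.
            .hash = &amp;#34;HASH&amp;#34;,
        },
    },
}
&lt;/code>&lt;/pre>&lt;pre tabindex="0">&lt;code>// build.zig
const std = @import(&amp;#34;std&amp;#34;);

pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});

    const exe = b.addExecutable(.{
        .name = &amp;#34;myapp&amp;#34;,
        .root_source_file = .{ .path = &amp;#34;src/main.zig&amp;#34; },
        .target = target,
        .optimize = optimize,
    });

    // Retrieve the dependency declared in build.zig.zon by its name.
    const glfw_dep = b.dependency(&amp;#34;glfw&amp;#34;, .{
        .target = target,
        .optimize = optimize,
    });

    // Assumption: the package exposes a library artifact named &amp;#34;glfw&amp;#34;.
    exe.linkLibrary(glfw_dep.artifact(&amp;#34;glfw&amp;#34;));

    b.installArtifact(exe);
}
&lt;/code>&lt;/pre>&lt;p>The second argument to &lt;code>b.dependency&lt;/code> forwards build options to the dependency&amp;rsquo;s own &lt;code>build.zig&lt;/code>, so the C library is compiled for the same target and optimization mode as your application.&lt;/p>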
&lt;h2 id="help-us-become-sustainable">Help us become sustainable&lt;/h2>
&lt;p>We&amp;rsquo;re working towards Mach v0.2, this article was one of the first steps in beginning to share the progress we&amp;rsquo;ve been making towards that behind the scenes over the past several months. We have some exciting things to share next, this was the &amp;lsquo;boring&amp;rsquo; article that had to go first. :)&lt;/p>
&lt;p>&lt;img align="left" style="max-height: 150px;" src="https://devlog.hexops.org/img/2023/mach-ecosystem-c-libraries/img2.png">&lt;/img>
&lt;br>&lt;br>
Consider &lt;a href="https://github.com/sponsors/emidoots">sponsoring my work&lt;/a> to help us become a sustainable OSS project and enable us to do more in the future.
&lt;br>&lt;br>
Join the &lt;a href="https://discord.gg/XNG3NZgCqp">Mach Discord&lt;/a> where we&amp;rsquo;re building the future of Zig game development in realtime!&lt;/p></description></item><item><title>Zig tips: v0.11 std.build API / package manager changes</title><link>https://devlog.hexops.org/2023/zig-0-11-breaking-build-changes/</link><pubDate>Mon, 13 Feb 2023 00:00:00 +0000</pubDate><guid>https://devlog.hexops.com/2023/zig-0-11-breaking-build-changes/</guid><description>&lt;p>We&amp;rsquo;ve just updated &lt;a href="https://machengine.org/">Mach engine&lt;/a> to use the latest Zig nightly version, which includes a fair amount of improvements and breaking changes to the &lt;code>std.build&lt;/code> API used in &lt;code>build.zig&lt;/code> files, and figured now would be a good time to share the general changes you may need to make if you want to update your own code.&lt;/p>
&lt;h2 id="package-manager-incoming">Package manager: incoming!&lt;/h2>
&lt;p>Zig is finally starting to see its package manager and build system shape up, some notable mentions:&lt;/p>
&lt;ul>
&lt;li>&lt;code>std.http.Client&lt;/code> and &lt;code>std.crypto.tls&lt;/code> were added (&lt;a href="https://github.com/ziglang/zig/pull/13980">#13980&lt;/a>)&lt;/li>
&lt;li>The package manager MVP landed almost a month ago and has seen steady improvements since (&lt;a href="https://github.com/ziglang/zig/pull/14265">#14265&lt;/a>)&lt;/li>
&lt;li>Zig packages can now expose C headers as part of their public API (&lt;a href="https://github.com/ziglang/zig/pull/14449">#14449&lt;/a>)&lt;/li>
&lt;li>Transitive dependencies are now handled better (&lt;a href="https://github.com/ziglang/zig/pull/14392">#14392&lt;/a>)&lt;/li>
&lt;li>&amp;ldquo;zig build: The breakings will continue until morale improves.&amp;rdquo; (&lt;a href="https://github.com/ziglang/zig/pull/14498">#14498&lt;/a>)&lt;/li>
&lt;li>Zig Object Notation (ZON, an alternative to JSON) was introduced (&lt;a href="https://github.com/ziglang/zig/pull/14523">#14523&lt;/a>)&lt;/li>
&lt;li>The caching system is being moved from the compiler to the std lib to start using it in the build system (&lt;a href="https://github.com/ziglang/zig/pull/14571">#14571&lt;/a>)&lt;/li>
&lt;li>Zig plans to run the build system in a sandboxed WASM environment (&lt;a href="https://github.com/ziglang/zig/issues/14286">#14286&lt;/a>)&lt;/li>
&lt;/ul>
&lt;p>You can get an overview of progress on the package manager on this &lt;a href="https://github.com/ziglang/zig/projects/4">GitHub project board&lt;/a>.&lt;/p>
&lt;p>Mach isn&amp;rsquo;t yet using the new package manager: it&amp;rsquo;s improving rapidly, and we plan to make use of it soon, but things are still changing so we&amp;rsquo;ve held off for now. What we have done, though, is updated to the latest API and want to share those changes with you.&lt;/p>
&lt;h2 id="release-options-have-been-renamed-to-optimization">Release options have been renamed to optimization&lt;/h2>
&lt;p>Previously you would&amp;rsquo;ve used &lt;code>b.standardReleaseOptions()&lt;/code> which would provide your &lt;code>zig build&lt;/code> command with multiple options like &lt;code>zig build -Drelease-fast=true&lt;/code>, &lt;code>zig build -Drelease-safe=true&lt;/code>, etc.&lt;/p>
&lt;p>It&amp;rsquo;s been renamed to &lt;code>b.standardOptimizeOption(.{})&lt;/code> and now exposes a single build option &lt;code>zig build -Doptimize=ReleaseFast&lt;/code>, &lt;code>zig build -Doptimize=ReleaseSafe&lt;/code>, etc. instead.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-diff" data-lang="diff">&lt;span class="line">&lt;span class="cl">&lt;span class="gd">-pub fn build(b: *std.build.Builder) void {
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gd">- const mode = b.standardReleaseOptions();
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gd">- const target = b.standardTargetOptions(.{});
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gd">&lt;/span>&lt;span class="gi">+pub fn build(b: *std.Build) void {
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gi">+ const optimize = b.standardOptimizeOption(.{});
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gi">+ const target = b.standardTargetOptions(.{});
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-diff" data-lang="diff">&lt;span class="line">&lt;span class="cl">&lt;span class="gd">-mode: std.builtin.Mode
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gd">&lt;/span>&lt;span class="gi">+optimize: std.builtin.OptimizeMode
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-diff" data-lang="diff">&lt;span class="line">&lt;span class="cl">&lt;span class="gd">-step.build_mode
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gd">&lt;/span>&lt;span class="gi">+step.optimize
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="creating-tests-libraries-and-executables">Creating tests, libraries, and executables&lt;/h2>
&lt;p>Creating tests, libraries, and executables now takes a struct with options as the parameter instead of using a setter API:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-diff" data-lang="diff">&lt;span class="line">&lt;span class="cl">&lt;span class="gd">-const exe = b.addExecutable(&amp;#34;example&amp;#34;, &amp;#34;src/main.zig&amp;#34;);
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gd">-exe.setBuildMode(mode);
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gd">-exe.setTarget(target);
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gd">&lt;/span>&lt;span class="gi">+const exe = b.addExecutable(.{
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gi">+ .name = &amp;#34;example&amp;#34;,
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gi">+ .root_source_file = .{ .path = &amp;#34;src/main.zig&amp;#34; },
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gi">+ .target = target,
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gi">+ .optimize = optimize,
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gi">+});
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;details>
&lt;summary>See more examples&lt;/summary>
&lt;p>Tests:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-diff" data-lang="diff">&lt;span class="line">&lt;span class="cl">&lt;span class="gd">-const main_tests = b.addTestExe(&amp;#34;glfw-tests&amp;#34;, sdkPath(&amp;#34;/src/main.zig&amp;#34;));
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gd">-main_tests.setBuildMode(mode);
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gd">-main_tests.setTarget(target);
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gd">&lt;/span>&lt;span class="gi">+const main_tests = b.addTest(.{
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gi">+ .name = &amp;#34;glfw-tests&amp;#34;,
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gi">+ .kind = .test_exe,
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gi">+ .root_source_file = .{ .path = sdkPath(&amp;#34;/src/main.zig&amp;#34;) },
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gi">+ .target = target,
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gi">+ .optimize = optimize,
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gi">+});
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Shared libraries:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-diff" data-lang="diff">&lt;span class="line">&lt;span class="cl">&lt;span class="gd">-const lib = b.addSharedLibrary(&amp;#34;glfw&amp;#34;, null, .unversioned);
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gd">-lib.setTarget(target);
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gd">-lib.setBuildMode(mode);
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gd">&lt;/span>&lt;span class="gi">+const lib = b.addSharedLibrary(.{ .name = &amp;#34;glfw&amp;#34;, .target = target, .optimize = optimize });
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-diff" data-lang="diff">&lt;span class="line">&lt;span class="cl">&lt;span class="gd">-const lib = b.addSharedLibrary(&amp;#34;machcore&amp;#34;, &amp;#34;src/platform/libmachcore.zig&amp;#34;, .unversioned);
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gd">-lib.setTarget(target);
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gd">-lib.setBuildMode(mode);
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gd">&lt;/span>&lt;span class="gi">+const lib = b.addSharedLibrary(.{
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gi">+ .name = &amp;#34;machcore&amp;#34;,
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gi">+ .root_source_file = .{ .path = &amp;#34;src/platform/libmachcore.zig&amp;#34; },
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gi">+ .target = target,
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gi">+ .optimize = optimize,
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gi">+});
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Static libraries:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-diff" data-lang="diff">&lt;span class="line">&lt;span class="cl">&lt;span class="gd">-const lib = b.addStaticLibrary(&amp;#34;basisu-transcoder&amp;#34;, null);
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gd">-lib.setTarget(target);
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gd">-lib.setBuildMode(mode);
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gd">&lt;/span>&lt;span class="gi">+const lib = b.addStaticLibrary(.{
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gi">+ .name = &amp;#34;basisu-transcoder&amp;#34;,
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gi">+ .target = target,
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gi">+ .optimize = optimize,
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gi">+});
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;/details>
&lt;h2 id="renamings">Renamings&lt;/h2>
&lt;ul>
&lt;li>&lt;code>std.build.LibExeObjStep&lt;/code> has been renamed to just &lt;code>std.build.CompileStep&lt;/code> (beautiful!)&lt;/li>
&lt;li>&lt;code>*std.build.Builder&lt;/code> has been renamed to just &lt;code>*std.Build&lt;/code> (nice, this is used extensively everywhere!)&lt;/li>
&lt;/ul>
&lt;h2 id="modules">Modules&lt;/h2>
&lt;p>Units of code you &lt;code>@import(&amp;quot;foo&amp;quot;)&lt;/code> (previously known as &lt;em>packages&lt;/em>) are now known as &lt;em>modules&lt;/em>, and &lt;em>packages&lt;/em> now refers to a piece of code you download/depend on using the Zig package manager. &lt;em>Libraries&lt;/em> is reserved for referring to C-style libraries, &lt;code>.dll&lt;/code>s, etc.&lt;/p>
&lt;p>These units of code used to be declared as a &lt;code>std.build.Pkg&lt;/code> struct:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">my_pkg&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">build&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">Pkg&lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">name&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;earcut&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">source&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">path&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;src/main.zig&amp;#34;&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">},&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>And added as a dependency using e.g. &lt;code>exe.addPackage(my_pkg);&lt;/code>&lt;/p>
&lt;p>Now, these are called &lt;em>modules&lt;/em> and can be created in a few ways. One is using &lt;code>b.createModule&lt;/code>:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">my_module&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">b&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">createModule&lt;/span>&lt;span class="p">(.{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">source_file&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">path&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;src/main.zig&amp;#34;&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">},&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">dependencies&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">&amp;amp;&lt;/span>&lt;span class="p">.{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">.{&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">name&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;core&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">module&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">core&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">module&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">b&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">},&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">.{&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">name&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;ecs&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">module&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ecs&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">module&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">b&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">},&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">.{&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">name&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;sysaudio&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">module&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">sysaudio&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">module&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">b&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">},&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">},&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">});&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>And then depend on that module using e.g. &lt;code>exe.addModule(&amp;quot;earcut&amp;quot;, my_module);&lt;/code>&lt;/p>
&lt;p>Notably, modules are created at &lt;em>runtime&lt;/em> via the &lt;code>*std.Build&lt;/code> now - so you may have some reworking to do if you previously depended on &lt;code>std.build.Pkg&lt;/code> being a global constant you could rely on at comptime.&lt;/p>
&lt;p>Another option, which may be preferred, is via &lt;a href="https://github.com/ziglang/zig/blob/fc48467a97021cb872ff2a947f96e882274c39c1/lib/std/Build.zig#L547-L558">&lt;code>addModule&lt;/code>&lt;/a>. It will make the module available to other packages which depend on this package.&lt;/p>
&lt;p>You may also like to know that a &lt;em>pair of dependency name + the module&lt;/em> can be represented as &lt;a href="https://github.com/ziglang/zig/blob/fc48467a97021cb872ff2a947f96e882274c39c1/lib/std/Build.zig#L560-L563">&lt;code>std.Build.ModuleDependency&lt;/code>&lt;/a> now.&lt;/p>
&lt;p>We&amp;rsquo;ve just gone for an initial 1:1 translation in our code, but adoption of the package manager will likely mean structuring your code a bit differently than the above, and the package manager is still a work-in-progress.&lt;/p>
&lt;h2 id="thanks-for-reading">Thanks for reading&lt;/h2>
&lt;p>As we work towards Mach v0.2, we&amp;rsquo;re getting more serious about what &lt;em>stability&lt;/em> means for us. Our intent is to enable us to move quickly, while also helping you to update your code. We will be achieving this through articles like this which help you understand &amp;amp; update your code to the latest APIs. Hopefully this has helped you! You can find other &lt;em>zig: Tips&lt;/em> &lt;a href="https://devlog.hexops.org/categories/zigtips/">here&lt;/a>.&lt;/p>
&lt;p>&lt;img align="left" style="max-height: 150px;" src="https://devlog.hexops.org/img/2023/zig-0-11-breaking-build-changes/img2.png">&lt;/img>
Be sure to join the &lt;a href="https://discord.gg/XNG3NZgCqp">Mach engine Discord&lt;/a> where we&amp;rsquo;re building the future of Zig game development.
&lt;br>&lt;br>
You can also &lt;a href="https://github.com/sponsors/emidoots">sponsor my work&lt;/a> if you like what I&amp;rsquo;m doing! :)&lt;/p></description></item><item><title>Debugging undefined behavior caught by Zig</title><link>https://devlog.hexops.org/2022/debugging-undefined-behavior/</link><pubDate>Mon, 14 Nov 2022 00:00:00 +0000</pubDate><guid>https://devlog.hexops.com/2022/debugging-undefined-behavior/</guid><description>&lt;p>&lt;a href="https://machengine.org/">Mach engine&lt;/a> uses &lt;a href="https://ziglang.org">Zig&lt;/a> as the C/C++ compiler for almost everything. Unlike other toolchains, Zig enables many more safety checks by default - such as clang&amp;rsquo;s undefined behavior sanitizer.&lt;/p>
&lt;p>Using Zig, we&amp;rsquo;ve caught undefined behavior in established projects like GLFW[&lt;a href="https://devlog.hexops.org/2021/perfecting-glfw-for-zig-and-finding-undefined-behavior/">1&lt;/a>], the DirectX Shader Compiler[&lt;a href="https://github.com/microsoft/DirectXShaderCompiler/pull/4178#discussion_r780733405">2&lt;/a>], and more. Undefined behavior is everywhere, often relatively innocuous, and hard to catch.&lt;/p>
&lt;p>Professional C/C++ developers know to run UBSan fuzzers as part of their test suite, but even with that we&amp;rsquo;ve found e.g. Google Chrome&amp;rsquo;s fuzzers weren&amp;rsquo;t running normally at one point, and we caught undefined behavior in Chrome&amp;rsquo;s implementation of WebGPU as a result[&lt;a href="https://dawn-review.googlesource.com/c/dawn/+/87380">3&lt;/a>].&lt;/p>
&lt;p>Zig having UBSan enabled by default is valuable, but it can also lead to tricky-to-debug errors that are a bit annoying to interpret today. So this article is a walkthrough of how to debug such an error, when it arises, using Zig and LLDB.&lt;/p>
&lt;h2 id="the-situation">The situation&lt;/h2>
&lt;p>We&amp;rsquo;re looking at using &lt;a href="https://bztsrc.gitlab.io/model3d/">model3d&lt;/a> in Mach. Model3D is an up-and-coming compact, featureful, universal model format that tries to address the shortcomings of existing formats (yes, including glTF - see &lt;a href="https://gitlab.com/bztsrc/model3d/#rationale">their rationale&lt;/a>.) It is a small, zero-dependency single-header C implementation which in Zig we can simply import.&lt;/p>
&lt;p>As we&amp;rsquo;ve been testing it with various models, though, we found our Zig program just crashes when we use it:&lt;/p>
&lt;pre tabindex="0">&lt;code>% zig build test 2&amp;gt;&amp;amp;1|cat
1/1 test_0... The following command terminated unexpectedly:
cd /mach/libs/model3d &amp;amp;&amp;amp; /mach/libs/model3d/zig-cache/o/26c4104a1643fed2068dfa9244dfe90e/model3d-tests /Users/emidoots/zig-macos-aarch64-0.11.0-dev.38+b40fc7018/zig
error: the following build command failed with exit code 1:
/mach/libs/model3d/zig-cache/o/679e494577315c1bcc3749ee7068ea2f/build /Users/emidoots/zig-macos-aarch64-0.11.0-dev.38+b40fc7018/zig /mach/libs/model3d /mach/libs/model3d/zig-cache /Users/emidoots/.cache/zig test
&lt;/code>&lt;/pre>&lt;p>As you can see, we&amp;rsquo;re not getting much info here on why our tests crashed. This is a telltale sign of undefined behavior in Zig (and &lt;a href="https://github.com/ziglang/zig/issues/5163">there&amp;rsquo;s an open issue to make this error messaging way more clear&lt;/a>). When we compile our program with &lt;code>-Drelease-fast&lt;/code>, which disables safety checks, we find it runs as expected - which confirms our suspicion about it being a safety check.&lt;/p>
&lt;h2 id="debugging-with-lldb">Debugging with LLDB&lt;/h2>
&lt;p>In the output above, we can grab the path to the executable &lt;code>model3d-tests&lt;/code>. We can debug it using lldb (I think we should find a nicer way to invoke &lt;code>lldb&lt;/code> via &lt;code>zig build test&lt;/code> though!):&lt;/p>
&lt;pre tabindex="0">&lt;code>lldb -- /mach/libs/model3d/zig-cache/o/26c4104a1643fed2068dfa9244dfe90e/model3d-tests
&lt;/code>&lt;/pre>&lt;p>Next we just enter the &lt;code>run&lt;/code> command at the &lt;code>(lldb)&lt;/code> prompt:&lt;/p>
&lt;pre tabindex="0">&lt;code>(lldb) run
Process 6830 launched: &amp;#39;/mach/libs/model3d/zig-cache/o/26c4104a1643fed2068dfa9244dfe90e/model3d-tests&amp;#39; (arm64)
Process 6830 stopped
* thread #1, queue = &amp;#39;com.apple.main-thread&amp;#39;, stop reason = EXC_BREAKPOINT (code=1, subcode=0x100033634)
frame #0: 0x0000000100033634 model3d-tests`m3d_load(data=&amp;#34;\xc3.:&amp;gt;&amp;#34;, readfilecb=0x0000000000000000, freecb=0x0000000000000000, mtllib=0x0000000000000000) at m3d.h:3356:56
3353 model-&amp;gt;tmap[i].v = (M3D_FLOAT)(*((uint16_t*)(data+2))) / (M3D_FLOAT)65535.0;
3354 break;
3355 case 4:
-&amp;gt; 3356 model-&amp;gt;tmap[i].u = (M3D_FLOAT)(*((float*)(data+0)));
3357 model-&amp;gt;tmap[i].v = (M3D_FLOAT)(*((float*)(data+4)));
3358 break;
3359 case 8:
&lt;/code>&lt;/pre>&lt;p>If you squint, you can see &lt;code>stop reason = EXC_BREAKPOINT&lt;/code> in the output. This is the sign I was looking for: it tells me that UBSan likely inserted a breakpoint for undefined behavior, and we hit it. There&amp;rsquo;s &lt;em>some&lt;/em> form of undefined behavior going on here!&lt;/p>
&lt;p>Thanks to lldb, we also got an exact line number, and see the source code where the crash occurred. Sometimes you may not get this much, in which case you may need to run the lldb &lt;code>up&lt;/code> command until you get to a line you can make sense of.&lt;/p>
&lt;p>In this instance, we&amp;rsquo;re running code inside of a for loop - and so one question I have is: what iteration of that loop are we at? We can get this info easily using &lt;code>p i&lt;/code> to inspect the &lt;code>i&lt;/code> variable:&lt;/p>
&lt;pre tabindex="0">&lt;code>(lldb) p i
(unsigned int) $1 = 0
&lt;/code>&lt;/pre>&lt;p>This shows us that the &lt;code>i&lt;/code> variable is an unsigned 32-bit int with the value &lt;code>0&lt;/code>. It crashed on the first iteration.&lt;/p>
&lt;p>(Note: we could also use &lt;code>frame variable&lt;/code> to get a list of variables in the local frame (i.e. our function), but in this case it&amp;rsquo;s quite a few and so I didn&amp;rsquo;t find that useful.)&lt;/p>
&lt;p>Next, I wanted to find out: what would the float at that address actually look like? Luckily, the LLDB &lt;code>p &amp;lt;expr&amp;gt;&lt;/code> command actually interprets C-like expressions for us. It&amp;rsquo;s rather easy to write the C expression for that:&lt;/p>
&lt;pre tabindex="0">&lt;code>(lldb) p *(float*)(data+0)
(float) $2 = 0.181819007
&lt;/code>&lt;/pre>&lt;p>Here we can see the float at the address of &lt;code>data&lt;/code> is &lt;code>0.181819007&lt;/code> - which is within the normalized range I&amp;rsquo;d expect for a UV coordinate. It seems correct. My next question was: is the pointer address aligned? I know UBSan has a check for pointer alignment, so I looked at it:&lt;/p>
&lt;pre tabindex="0">&lt;code>(lldb) p data
(unsigned char *) $3 = 0x0000000101008251 &amp;#34;\xc3.:&amp;gt;&amp;#34;
&lt;/code>&lt;/pre>&lt;p>Since we&amp;rsquo;re accessing a float at this address, we&amp;rsquo;d expect it to be aligned to 4 bytes. We can check this easily by plopping the address into Python and dividing by 4. If there&amp;rsquo;s a remainder, it&amp;rsquo;s not aligned:&lt;/p>
&lt;pre tabindex="0">&lt;code>&amp;gt;&amp;gt;&amp;gt; 0x0000000101008251 % 4
1
&lt;/code>&lt;/pre>&lt;h2 id="what-are-the-consequences">What are the consequences?&lt;/h2>
&lt;p>Many times, UBSan will catch undefined behavior that in practice isn&amp;rsquo;t really harmful on modern machines you might care about. I wasn&amp;rsquo;t sure about the consequences of unaligned pointer accesses like this, so I asked someone smarter than myself and &lt;a href="https://gitlab.com/bztsrc/model3d/-/issues/19">filed an issue on model3d&lt;/a> which led to some very interesting insights &lt;a href="https://gitlab.com/bztsrc/model3d/-/issues/19#note_1171783061">from @bztsrc&lt;/a>.&lt;/p>
&lt;blockquote>
&lt;p>Short answer: the bug is in UBSan&lt;/p>
&lt;p>Long answer: it is true that in ancient times there were CPUs that couldn&amp;rsquo;t handle unaligned access. However that&amp;rsquo;s not the case with today mainstream processors: x86, ARM, RISC-V, etc. all handle unaligned access out-of-the-box. (Ok, for ARM it&amp;rsquo;s not out-of-the-box, you have to enable MMU which is surely done by the OS kernel otherwise virtual memory mapping would be impossible.)&lt;/p>
&lt;p>[&amp;hellip;] The reason for the unaligned access is pretty simple. In the binary bit-chunk that a compressed model file is, there&amp;rsquo;s obviously no guarantee that a value is aligned. Such compactness is absolutely needed for small file sizes, padding with zeros would insanely increase the required storage requirements.&lt;/p>
&lt;p>[&amp;hellip;] So the decision I had to make here was: keep UBSan happy but create crappy and slow code, or don&amp;rsquo;t care about UBSan and take advantage of modern CPU features. I&amp;rsquo;ve decided on the latter.&lt;/p>
&lt;/blockquote>
&lt;p>Which is a quite compelling argument for this just being noise for our purposes. :)&lt;/p>
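&lt;p>To make the packed layout concrete, here&amp;rsquo;s a small Python sketch that reproduces the two checks from the LLDB session: whether the address is 4-byte aligned, and what float the four bytes at &lt;code>data&lt;/code> decode to. The helper names &lt;code>misalignment&lt;/code> and &lt;code>decode_f32_le&lt;/code> are made up for illustration, and it assumes a little-endian target (true for the arm64 macOS machine in the session above):&lt;/p>
&lt;pre tabindex="0">&lt;code>import struct

# Values copied from the LLDB session above.
ADDRESS = 0x0000000101008251       # address LLDB printed for data
RAW = bytes.fromhex("c32e3a3e")    # the four bytes LLDB showed at data


def misalignment(address, alignment):
    """Return how many bytes past the previous aligned boundary address is."""
    return address % alignment


def decode_f32_le(raw):
    """Reinterpret four little-endian bytes as an IEEE 754 float."""
    bits = int.from_bytes(raw, "little")
    # Packing and unpacking with the same native layout reinterprets the
    # 32-bit pattern as a float without depending on host byte order.
    return struct.unpack("f", struct.pack("I", bits))[0]


print(misalignment(ADDRESS, 4))    # prints 1, so not 4-byte aligned
print(decode_f32_le(RAW))          # close to the 0.181819007 LLDB reported
&lt;/code>&lt;/pre>&lt;p>The bytes decode to a sensible UV value even though the address is odd, which is what you&amp;rsquo;d expect from a tightly packed file format: the data is fine, only its position in memory is unaligned.&lt;/p>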
&lt;h3 id="risc-v-a-notable-exception">RISC-V: a notable exception&lt;/h3>
&lt;p>It&amp;rsquo;s worth noting (as was pointed out to me by someone much more knowledgeable) that RISC-V cores lack hardware support for unaligned accesses[0][1] (&amp;lsquo;if sifive doesn&amp;rsquo;t do this in hardware (unalignment) there&amp;rsquo;s no way any other risc-v cores do [&amp;hellip;due to sifive&amp;rsquo;s sheer popularity in the space]&amp;rsquo;); unaligned accesses are handled by trap handlers instead:&lt;/p>
&lt;p>&lt;img src="https://devlog.hexops.org/img/2022/debugging-undefined-behavior/img2.png" alt="image">&lt;/p>
&lt;blockquote>
&lt;p>Officially, programs running in any mode but M-Mode on risc-v are allowed to do unaligned accesses but that doesn&amp;rsquo;t mean it can&amp;rsquo;t be pretty painful&lt;/p>
&lt;/blockquote>
&lt;p>[0] &lt;a href="https://forums.sifive.com/t/ld-sd-alignment/5530/6">https://forums.sifive.com/t/ld-sd-alignment/5530/6&lt;/a>&lt;/p>
&lt;p>[1] &lt;a href="https://patchwork.kernel.org/project/linux-riscv/patch/60c1f087-1e8b-8f22-7d25-86f5f3dcee3f@gmail.com/#24313195">https://patchwork.kernel.org/project/linux-riscv/patch/60c1f087-1e8b-8f22-7d25-86f5f3dcee3f@gmail.com/#24313195&lt;/a>&lt;/p>
&lt;p>I don&amp;rsquo;t have any RISC-V hardware to test on, so this doesn&amp;rsquo;t affect us at present, but I figured it worth noting.&lt;/p>
&lt;h2 id="disabling-the-sanitizer">Disabling the sanitizer&lt;/h2>
&lt;p>In our case, we can just disable the alignment sanitizer for this one function:&lt;/p>
&lt;pre tabindex="0">&lt;code>+__attribute__((no_sanitize(&amp;#34;alignment&amp;#34;)))
m3d_t *m3d_load(unsigned char *data, m3dread_t readfilecb, m3dfree_t freecb, m3d_t *mtllib)
&lt;/code>&lt;/pre>&lt;p>And with this, our tests pass. And we continue to get all the other benefits and safety checks of UBSan elsewhere.&lt;/p>
&lt;p>In this case, though, I opted to just disable alignment sanitization entirely when building model3d by &lt;a href="https://github.com/hexops/mach/commit/c96ff64958c241249041856a8ea0e8a4349050a6">adding&lt;/a> the &lt;code>-fno-sanitize=alignment&lt;/code> compiler flag to our &lt;code>build.zig&lt;/code> to avoid this surprising us in other model3d functions.&lt;/p>
&lt;h2 id="thanks-for-reading">Thanks for reading&lt;/h2>
&lt;p>If you&amp;rsquo;re writing Zig, C, or C++ code - then I hope this &lt;code>zig: Tip&lt;/code> helps you! You can find &lt;a href="https://devlog.hexops.org/categories/zigtips/">other &lt;code>zig: Tips&lt;/code> here&lt;/a>.&lt;/p>
&lt;p>&lt;img align="left" style="max-height: 150px;" src="https://devlog.hexops.org/img/2022/debugging-undefined-behavior/img3.png">&lt;/img>
Be sure to join the &lt;a href="https://discord.gg/XNG3NZgCqp">Mach engine Discord&lt;/a> where we&amp;rsquo;re building the future of Zig game development.
&lt;br>&lt;br>
You can also &lt;a href="https://github.com/sponsors/emidoots">sponsor my work&lt;/a> if you like what I&amp;rsquo;m doing! :)&lt;/p></description></item><item><title>Perfecting WebGPU/Dawn native graphics for Zig</title><link>https://devlog.hexops.org/2022/perfecting-webgpu-native/</link><pubDate>Sun, 11 Sep 2022 00:00:00 +0000</pubDate><guid>https://devlog.hexops.com/2022/perfecting-webgpu-native/</guid><description>&lt;picture>
&lt;source srcset="https://devlog.hexops.org/img/media/gpu/logo_dark.svg" media="(prefers-color-scheme: dark)">
&lt;img style="height: 100px;" src="https://devlog.hexops.org/img/media/gpu/logo_light.svg">
&lt;/picture>
&lt;p>We&amp;rsquo;ve just finished a complete rewrite of &lt;code>mach/gpu&lt;/code> (WebGPU/Dawn bindings for Zig), with 700+ commits, ~7.4k LOC, and 100% API coverage.&lt;/p>
&lt;p>WebGPU (not to be confused with WebGL) is a modern graphics API, acting as a unified API to the underlying Vulkan/Metal/DirectX APIs. Despite its name, it is also designed for use in native applications via its C API.&lt;/p>
&lt;p>Dawn is Google&amp;rsquo;s C++ implementation of WebGPU. It is used in Chrome and is planned to ship to millions of browsers in the not-too-distant future.&lt;/p>
&lt;h2 id="machgpu-webgpu-for-zig">&lt;code>mach/gpu&lt;/code>: WebGPU for Zig&lt;/h2>
&lt;p>6 months ago we &lt;a href="https://devlog.hexops.org/2022/mach-v0.1-zig-graphics-in-60s/">released Mach v0.1&lt;/a> which enabled the creation of native applications using WebGPU graphics in Zig:&lt;/p>
&lt;img class="color" style="max-height: 300px;" src="https://devlog.hexops.org/img/2022/perfecting-webgpu-native/img2.gif">
&lt;p>It all Just Works™ out of the box in under ~60s - all you need is &lt;code>zig&lt;/code>, &lt;code>git&lt;/code>, and &lt;code>curl&lt;/code>:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">git clone https://github.com/hexops/mach
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nb">cd&lt;/span> mach/
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">zig build run-example-boids
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;small>(requires Zig v0.10+, see &lt;a href="https://github.com/hexops/mach/blob/main/doc/known-issues.md">known issues&lt;/a>.)&lt;/small>&lt;/p>
&lt;p>We do all the heavy-lifting behind the scenes for you: building Dawn using Zig as a C++ compiler, rewriting build scripts in Zig (so you don&amp;rsquo;t need ninja/cmake/etc), packaging up all required dependencies so you don&amp;rsquo;t need Google&amp;rsquo;s &lt;code>depot_tools&lt;/code>, and more.&lt;/p>
&lt;p>Because of this, cross-compilation to every major desktop OS is available at the flip of a switch:&lt;/p>
&lt;pre tabindex="0">&lt;code>$ zig build example-boids -Dtarget=x86_64-windows
$ zig build example-boids -Dtarget=x86_64-linux
$ zig build example-boids -Dtarget=x86_64-macos.12
$ zig build example-boids -Dtarget=aarch64-macos.12
&lt;/code>&lt;/pre>&lt;p>But this is old news! We released this 6 months ago, so what&amp;rsquo;s new since?&lt;/p>
&lt;h2 id="zig--webgpu-showcase-10-examples">Zig + WebGPU showcase (10+ examples)&lt;/h2>
&lt;p>The new &lt;a href="https://machengine.org/gpu">Zig WebGPU demo showcase&lt;/a> has 12+ examples you can try on your own machine to begin learning Zig and WebGPU quickly:&lt;/p>
&lt;video style="height: 40rem;" autoplay loop controls>
&lt;source src="https://devlog.hexops.org/img/2022/perfecting-webgpu-native/img3.mp4" type="video/mp4">
&lt;/video>
&lt;h2 id="mach-core-vs-mach-engine">Mach core vs. Mach engine&lt;/h2>
&lt;p align="center">
&lt;img class="color-auto" src="https://devlog.hexops.org/img/2022/perfecting-webgpu-native/img4.png" />
&lt;/p>
&lt;p>Mach has a choose-your-journey development strategy, where you don&amp;rsquo;t even have to adopt the entire engine to benefit from it. All the WebGPU examples we provide are &lt;em>Mach core apps&lt;/em>: they rely on Mach for window creation, user input, and setting up the WebGPU API - nothing else. Using &lt;em>Mach core&lt;/em>, you write your own engine!&lt;/p>
&lt;p>Why use this over, say, GLFW and WebGPU on your own? The benefit is that this will work on Desktop, WebAssembly (soon), Mobile (future), and consoles (long term.) You can write Mach core apps in Zig, or other languages via &lt;code>libmach&lt;/code> (more on this later.) Think of Mach core as &lt;em>a competitor to SDL/GLFW.&lt;/em>&lt;/p>
&lt;p>In the future we&amp;rsquo;ll offer &lt;em>Mach engine&lt;/em> apps, where you buy into our ECS, Unity/Unreal-like editor, and other composable building-blocks that make up the engine at your choosing. But this isn&amp;rsquo;t ready today.&lt;/p>
&lt;h3 id="dawnwebgpu-on-the-steam-deck">Dawn/WebGPU on the Steam Deck&lt;/h3>
&lt;p>We believe Linux should be a first-class platform, and because of this we&amp;rsquo;ve found that Mach Just Works™ right out of the box on the Steam Deck (running natively as a Linux Vulkan application, with no DirectX or Proton in the mix):&lt;/p>
&lt;div class="video-container">&lt;video autoplay loop muted src="https://devlog.hexops.org/img/2022/perfecting-webgpu-native/img5.mp4">&lt;/video>&lt;/div>
&lt;h2 id="a-complete-rewrite-of-machgpu-to-be-lean--mean">A complete rewrite of &lt;code>mach/gpu&lt;/code> to be lean &amp;amp; mean&lt;/h2>
&lt;p>When we wrote the initial WebGPU bindings for Zig 6+ months ago, our primary goal was just to get &lt;em>something&lt;/em> working to where we could start building out examples: we always knew we&amp;rsquo;d need to revisit things later, especially as browser support, the use of native extensions in Dawn (like bindless support in the future, etc.), overhead &amp;amp; other aspects became clear.&lt;/p>
&lt;p>We&amp;rsquo;ve finally done that revisit in a month-long complete rewrite of &lt;code>mach/gpu&lt;/code> from the ground up. This brings 700+ commits, zero-overhead bindings, Dawn native extensions, and much more. Here are the highlights.&lt;/p>
&lt;h3 id="righting-our-wrongs-runtime-interfaces">Righting our wrongs: runtime interfaces&lt;/h3>
&lt;p>One goal of &lt;code>mach/gpu&lt;/code> is to be able to intercept WebGPU API calls, so that we can provide superior debugging facilities in the future (imagine record-and-replay, step-by-step debugging of WebGPU API calls, etc.)&lt;/p>
&lt;p>In the old &lt;code>mach/gpu&lt;/code>, we achieved this by wrapping each WebGPU API object that had methods (like textures, render pass encoders, etc.) in a &lt;em>runtime interface&lt;/em> similar to Zig&amp;rsquo;s &lt;code>std.mem.Allocator&lt;/code> interface:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Texture&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">/// The type erased pointer to the Texture implementation
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">/// Equal to c.WGPUTexture for NativeInstance.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">anyopaque&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">vtable&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">VTable&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">VTable&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">destroy&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">anyopaque&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// ...
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">inline&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">destroy&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">tex&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Texture&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">tex&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">vtable&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">destroy&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">tex&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Our thought process was simply to follow any established patterns, learn what didn&amp;rsquo;t work about them by writing examples, and then revisit the API later. Even six months ago, though, we knew there were issues with this approach.&lt;/p>
&lt;p>&lt;strong>The problem:&lt;/strong> In WebGPU, &lt;code>Descriptor&lt;/code> data structures are often passed to methods: these fairly large data structures contain a wide range of options and graphics pipeline state to use, and often involve passing a list of WebGPU objects as a field - or nested field - as part of the Descriptor data structure. Because our &lt;code>Texture&lt;/code> involves keeping a &lt;code>ptr&lt;/code> (the interface implementation) and a &lt;code>vtable&lt;/code> pointer (our implementation methods) it meant that a &lt;code>gpu.Texture&lt;/code> was two pointers, while a C &lt;code>WGPUTexture&lt;/code> was a single pointer - breaking ABI compatibility.&lt;/p>
&lt;p>This meant that our &lt;code>Texture&lt;/code> could not simply be passed to a C API expecting a &lt;code>WGPUTexture&lt;/code>: instead, we needed to pass our &lt;code>.ptr&lt;/code> field only. This had viral effects, though: every &lt;code>Descriptor&lt;/code> struct which embedded a &lt;code>Texture&lt;/code> needed to be copied/rewritten to convert our two-pointer &lt;code>Texture&lt;/code> to a single-pointer &lt;code>WGPUTexture&lt;/code>. Worse yet, some descriptors hold &lt;em>dynamic arrays&lt;/em> of such objects, requiring us to &lt;em>copy an array to a temporary (and, in the worst case, heap-allocated) buffer&lt;/em> just in order to call the actual WebGPU C API.&lt;/p>
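&lt;p>To make the size mismatch concrete, here is a minimal sketch. &lt;code>FatTexture&lt;/code>, &lt;code>ThinTexture&lt;/code>, and this &lt;code>VTable&lt;/code> are hypothetical stand-ins rather than the actual &lt;code>mach/gpu&lt;/code> types, and the sketch assumes a Zig version that has &lt;code>*const fn&lt;/code> function pointer types:&lt;/p>
&lt;pre tabindex="0">&lt;code class="language-zig" data-lang="zig">const std = @import(&amp;#34;std&amp;#34;);

// Hypothetical stand-in for the old vtable (illustration only).
const VTable = struct {
    destroy: *const fn (ptr: *anyopaque) void,
};

// Runtime-interface handle: implementation pointer plus vtable pointer.
const FatTexture = struct {
    ptr: *anyopaque,
    vtable: *const VTable,
};

// Opaque handle: a pointer to this type is a single pointer, like a C WGPUTexture.
const ThinTexture = opaque {};

comptime {
    // The fat handle is two pointers wide, so it cannot stand in for a WGPUTexture.
    std.debug.assert(@sizeOf(FatTexture) == 2 * @sizeOf(*anyopaque));
    // The opaque pointer is exactly one pointer wide.
    std.debug.assert(@sizeOf(*ThinTexture) == @sizeOf(*anyopaque));
}
&lt;/code>&lt;/pre>&lt;p>Any array of &lt;code>FatTexture&lt;/code> values therefore has twice the stride a C API expects for an array of &lt;code>WGPUTexture&lt;/code>, which is why the old bindings had to copy such arrays before each call.&lt;/p>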
&lt;p>Needless to say, this was a cancer we felt we absolutely had to get rid of in the rewrite.&lt;/p>
&lt;h3 id="comptime-interfaces">Comptime interfaces&lt;/h3>
&lt;p>While we want to get rid of runtime interfaces, maintain C ABI compatibility, and be zero-overhead, we&amp;rsquo;d still like to be able to intercept WebGPU API calls if desired, so that we can provide superior debugging facilities in the future.&lt;/p>
&lt;p>Zig&amp;rsquo;s &lt;code>std.mem.Allocator&lt;/code> being a &lt;em>runtime interface&lt;/em> makes sense because allocators have different use cases, no existing ABI to remain compatible with, and importantly there are cases where you would want to have &lt;strong>multiple allocator implementations&lt;/strong> in the same program for different purposes.&lt;/p>
&lt;p>With WebGPU, we have different constraints: it&amp;rsquo;s very unlikely to want multiple WebGPU implementations per program. We do need to maintain ABI compatibility. So to address this, we introduce a &lt;em>comptime interface&lt;/em>.&lt;/p>
&lt;img class="color-auto" style="max-height: 200px;" src="https://devlog.hexops.org/img/2022/perfecting-webgpu-native/img6.png" />
&lt;p>Let&amp;rsquo;s look at the &lt;code>Texture.destroy&lt;/code> method from earlier:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">inline&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">destroy&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">tex&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Texture&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">tex&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">vtable&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">destroy&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">tex&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>As you can see, this would&amp;rsquo;ve called the &lt;em>tex.vtable pointer&lt;/em>, and passed the &lt;code>tex.ptr&lt;/code> interface implementation pointer to it. It&amp;rsquo;s a classical runtime interface implementation. The key point here is that the data type can remain the same, while the &lt;em>implementation pointer&lt;/em> could be replaced at runtime with a different one. On the other side of this invocation, &lt;code>tex.vtable.destroy&lt;/code> would look like this:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">destroy&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">anyopaque&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">c&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">wgpuTextureDestroy&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nb">@ptrCast&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">c&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">WGPUTexture&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="p">));&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Now let&amp;rsquo;s look at how the &lt;em>comptime interface&lt;/em> approach differs:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Texture&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">opaque&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">inline&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">destroy&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">texture&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Texture&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">Impl&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">textureDestroy&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">texture&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// ...
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Firstly, we see that &lt;code>*gpu.Texture&lt;/code> is merely an opaque pointer (a C &lt;code>void*&lt;/code> if you like), just the same as before. Unlike before, however, there is no vtable pointer: there is only one pointer, it&amp;rsquo;s passed directly to the implementor via &lt;code>Impl.textureDestroy&lt;/code> - and the implementation &lt;em>cannot be changed at runtime&lt;/em>.&lt;/p>
&lt;p>This solves the issue of ABI compatibility (we have only one pointer now), but we still need to let the user of the library - say from their &lt;code>main.zig&lt;/code> file - decide which &lt;code>Impl&lt;/code>ementation of the interface to use.&lt;/p>
&lt;p>Traditionally, one might use generics for this (passing an &lt;code>Impl&lt;/code> type parameter to each method for example), but we&amp;rsquo;d rather not pass that around everywhere: after all, we know it will be decided by one user of the API for the entire program, and requiring a type parameter here would have viral effects to every user of the WebGPU API (every API they expose would need that same type parameter.)&lt;/p>
&lt;p>Luckily, in Zig there is a trick: from within our WebGPU API we can import the root file of the program (e.g. &lt;code>main.zig&lt;/code>). Zig allows this since it lazily evaluates code, so there&amp;rsquo;s no dependency loop here. So in our &lt;code>mach/gpu&lt;/code> package, we can define:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Impl&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">blk&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">root&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">@import&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s">&amp;#34;root&amp;#34;&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">if&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="o">!&lt;/span>&lt;span class="nb">@hasDecl&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">root&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;GPUInterface&amp;#34;&lt;/span>&lt;span class="p">))&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">@compileError&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s">&amp;#34;expected to find `pub const GPUInterface = T;` in root file&amp;#34;&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">_&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">gpu&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">Interface&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">root&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">GPUInterface&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// verify the type
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">break&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="n">blk&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">root&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">GPUInterface&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This effectively looks in the user&amp;rsquo;s &lt;code>main.zig&lt;/code> (&amp;ldquo;root&amp;rdquo;) file for a declaration like:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">GPUInterface&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">gpu&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">dawn&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">Interface&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Once resolved, our &lt;code>Impl&lt;/code> constant is known statically at compile time to be an exact interface implementation of the &lt;code>gpu.Interface&lt;/code>: &lt;code>gpu.dawn.Interface&lt;/code> in this case, which is just a struct type with functions in it calling the Dawn C API:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Interface&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">inline&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">textureDestroy&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">texture&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">gpu&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">Texture&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">procs&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">textureDestroy&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="o">?&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nb">@ptrCast&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">c&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">WGPUTexture&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">texture&lt;/span>&lt;span class="p">));&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// ...
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The trick here to ensuring that a type actually satisfies the &lt;code>gpu.Interface&lt;/code> is that you write a type validator function, which checks if the struct passed to it has the desired methods with matching function signatures:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="c1">/// Verifies that a gpu.Interface implementation exposes the expected function declarations.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Interface&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kr">comptime&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">T&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">assertDecl&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">T&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;textureDestroy&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">texture&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">gpu&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">Texture&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">callconv&lt;/span>&lt;span class="p">(.&lt;/span>&lt;span class="n">Inline&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// ...
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">T&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Best of all, since the interface implementation is completely static and known at comptime, we can enforce every method invocation is &lt;code>inline&lt;/code> and we&amp;rsquo;re not adding any overhead.&lt;/p>
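&lt;p>As a rough, self-contained sketch of how such a validator and an intercepting implementation could fit together: the &lt;code>assertDecl&lt;/code> body, the &lt;code>Counting&lt;/code> wrapper, and &lt;code>NoopImpl&lt;/code> below are illustrative assumptions, not the actual &lt;code>mach/gpu&lt;/code> code, and &lt;code>inline&lt;/code> is left out to keep the function types simple:&lt;/p>
&lt;pre tabindex="0">&lt;code class="language-zig" data-lang="zig">// Hypothetical stand-in for gpu.Texture.
pub const Texture = opaque {};

// Hypothetical helper: fails compilation unless T declares `name` with exactly the type Decl.
fn assertDecl(comptime T: type, comptime name: []const u8, comptime Decl: type) void {
    if (!@hasDecl(T, name)) @compileError(&amp;#34;interface is missing declaration: &amp;#34; ++ name);
    if (@TypeOf(@field(T, name)) != Decl) @compileError(&amp;#34;interface declaration has wrong type: &amp;#34; ++ name);
}

// Validates T at compile time and returns it unchanged.
pub fn Interface(comptime T: type) type {
    assertDecl(T, &amp;#34;textureDestroy&amp;#34;, fn (texture: *Texture) void);
    return T;
}

// Wraps another implementation, recording each call before forwarding it.
pub fn Counting(comptime Inner: type) type {
    return struct {
        pub var destroy_calls: usize = 0;

        pub fn textureDestroy(texture: *Texture) void {
            destroy_calls += 1;
            Inner.textureDestroy(texture);
        }
    };
}

// A do-nothing backend, standing in for a real one such as Dawn.
const NoopImpl = struct {
    pub fn textureDestroy(texture: *Texture) void {
        _ = texture;
    }
};

// Chosen once, at compile time, for the whole program.
pub const Impl = Interface(Counting(NoopImpl));
&lt;/code>&lt;/pre>&lt;p>Because &lt;code>Counting(NoopImpl)&lt;/code> is resolved at compile time, the wrapper needs no vtable, and a type that lacks &lt;code>textureDestroy&lt;/code> or declares it with a different signature is rejected with a compile error instead of failing at runtime.&lt;/p>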
&lt;h3 id="libmach-and-gpuexport">libmach and gpu.Export&lt;/h3>
&lt;img class="color-auto" style="max-height: 300px;" src="https://devlog.hexops.org/img/2022/perfecting-webgpu-native/img7.png" />
&lt;p>One recent development is &lt;code>libmach&lt;/code>, which will provide at least a C ABI for the creation of &lt;em>Mach core&lt;/em> applications from other languages (think a bit like SDL, but for WebGPU and it works on Desktop, Mobile, WebAssembly &amp;amp; more in the future.)&lt;/p>
&lt;p>One thing we&amp;rsquo;d like to retain, though, is the ability to have such applications get the same nice WebGPU debugging experience in the future, while still using that language&amp;rsquo;s existing WebGPU bindings. This means instead of calling Dawn&amp;rsquo;s &lt;code>wgpuTextureDestroy&lt;/code> for example, we&amp;rsquo;d need to call &lt;code>libmach&lt;/code>&amp;rsquo;s &lt;code>wgpuTextureDestroy&lt;/code>.&lt;/p>
&lt;p>This is where &lt;code>gpu.Export&lt;/code> comes in: it merely takes a &lt;code>gpu.Interface&lt;/code> struct with all of the Zig functions that implement the WebGPU API, and exports the WebGPU C ABI for them:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="c1">/// Exports C ABI function declarations for the given gpu.Interface implementation.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Export&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kr">comptime&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">T&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">_&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Interface&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">T&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// verify implementation is a valid interface
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// WGPU_EXPORT void wgpuTextureDestroy(WGPUTexture texture);
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">export&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">wgpuTextureDestroy&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">texture&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">gpu&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">Texture&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">T&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">textureDestroy&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">texture&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// ...
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>From this, you might notice something important: we&amp;rsquo;ve maintained 100% C ABI compatibility in the new &lt;code>mach/gpu&lt;/code> rewrite. Every data structure is ABI compatible with Dawn&amp;rsquo;s &lt;code>webgpu.h&lt;/code> header.&lt;/p>
&lt;h3 id="zig-flag-sets">Zig flag sets&lt;/h3>
&lt;p>One nice property of Zig is its &lt;code>packed struct&lt;/code>s. For example, in C there is a &lt;code>WGPUColorWriteMaskFlags&lt;/code> type which is a &lt;code>uint32_t&lt;/code> where the first four bits represent a color write mask for red, green, blue, and alpha respectively. The remaining 28 bits are unused at present.&lt;/p>
&lt;img class="color-auto" style="max-height: 300px;" src="https://devlog.hexops.org/img/2022/perfecting-webgpu-native/img8.png" />
&lt;p>Interacting with &lt;code>WGPUColorWriteMaskFlags&lt;/code> in C can be a bit cumbersome: you need to make sure you remember the right bit masking operations to set bits, check if they are set, and so on.&lt;/p>
&lt;p>In Zig, we have &lt;code>packed struct&lt;/code> in which &lt;code>bool&lt;/code> is just one bit - and we have integers of any bit width we desire. We can use this to compose a 32-bit data structure compatible with the C ABI variant, but using nice bools to represent those first four bits:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ColorWriteMaskFlags&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">packed&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">red&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">bool&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">false&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">green&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">bool&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">false&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">blue&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">bool&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">false&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">alpha&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">bool&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">false&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">_padding&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">u28&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">0&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This is nice because now one can simply check &lt;code>if (write_mask.red and write_mask.blue)&lt;/code> for example, or simply pass it as a parameter to a function like &lt;code>ColorWriteMaskFlags{.red = true, .blue = true}&lt;/code>.&lt;/p>
&lt;p>Read more about how this works: &lt;a href="https://devlog.hexops.org/2022/packed-structs-in-zig/">&amp;ldquo;Packed structs in Zig make bit/flag sets trivial&amp;rdquo;&lt;/a>&lt;/p>
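&lt;p>As a minimal sketch of how this reads in practice, the flag set can be passed around and inspected like any other struct. The sketch assumes the &lt;code>ColorWriteMaskFlags&lt;/code> definition shown above; &lt;code>enabledChannels&lt;/code> is a hypothetical helper invented for this illustration and is not part of &lt;code>mach/gpu&lt;/code>:&lt;/p>

```zig
const std = @import("std");

pub const ColorWriteMaskFlags = packed struct {
    red: bool = false,
    green: bool = false,
    blue: bool = false,
    alpha: bool = false,

    _padding: u28 = 0,
};

// Hypothetical helper for illustration only: counts the enabled channels.
fn enabledChannels(mask: ColorWriteMaskFlags) u32 {
    var count: u32 = 0;
    if (mask.red) count += 1;
    if (mask.green) count += 1;
    if (mask.blue) count += 1;
    if (mask.alpha) count += 1;
    return count;
}

test "flag set behaves like plain bools" {
    // Four 1-bit bools plus 28 padding bits: 32 bits in total.
    try std.testing.expect(@bitSizeOf(ColorWriteMaskFlags) == 32);

    // Fields that are not mentioned fall back to their default (false).
    try std.testing.expect(enabledChannels(.{ .red = true, .blue = true }) == 2);
    try std.testing.expect(enabledChannels(.{}) == 0);
}
```

&lt;p>The file can be checked with &lt;code>zig test&lt;/code>. The defaults are what keep call sites short: only the channels you care about need to be spelled out.&lt;/p>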
&lt;h3 id="dawn-native-extensions">Dawn native extensions&lt;/h3>
&lt;p>One not-so-friendly aspect of &lt;code>webgpu.h&lt;/code> (the C API for WebGPU) is that it allows for arbitrary extension of the API via so-called chaining. For example, let&amp;rsquo;s look at a descriptor struct used as the parameters to create a shader module from its text source code:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-c" data-lang="c">&lt;span class="line">&lt;span class="cl">&lt;span class="k">typedef&lt;/span> &lt;span class="k">struct&lt;/span> &lt;span class="n">WGPUShaderModuleDescriptor&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">WGPUChainedStruct&lt;/span> &lt;span class="k">const&lt;/span> &lt;span class="o">*&lt;/span> &lt;span class="n">nextInChain&lt;/span>&lt;span class="p">;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="kt">char&lt;/span> &lt;span class="k">const&lt;/span> &lt;span class="o">*&lt;/span> &lt;span class="n">label&lt;/span>&lt;span class="p">;&lt;/span> &lt;span class="c1">// nullable
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="p">}&lt;/span> &lt;span class="n">WGPUShaderModuleDescriptor&lt;/span>&lt;span class="p">;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Here you can obviously see there is a &lt;code>label&lt;/code> for the shader module - but where does our shader source code go? It&amp;rsquo;s not clear. And what goes in that &lt;code>nextInChain&lt;/code> field? It looks like this:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-c" data-lang="c">&lt;span class="line">&lt;span class="cl">&lt;span class="k">typedef&lt;/span> &lt;span class="k">struct&lt;/span> &lt;span class="n">WGPUChainedStruct&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">struct&lt;/span> &lt;span class="n">WGPUChainedStruct&lt;/span> &lt;span class="k">const&lt;/span> &lt;span class="o">*&lt;/span> &lt;span class="n">next&lt;/span>&lt;span class="p">;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">WGPUSType&lt;/span> &lt;span class="n">sType&lt;/span>&lt;span class="p">;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">}&lt;/span> &lt;span class="n">WGPUChainedStruct&lt;/span>&lt;span class="p">;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Effectively, WebGPU implementations can take arbitrary data structures via this chaining process - as extensions to the WebGPU API for example - so long as the chained struct &lt;em>begins with these ABI-compatible fields&lt;/em>.&lt;/p>
&lt;p>For example, to construct a shader module in Zig, you might write:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">next_in_chain&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">c&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">WGPUShaderModuleWGSLDescriptor&lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">chain&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">c&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">WGPUChainedStruct&lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">next&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">null&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// nothing else to chain
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">sType&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">c&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">WGPUSType_ShaderModuleWGSLDescriptor&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// so it knows what type we chained!
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">},&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">source&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">my_shader_source_code_text&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">shader_module_descriptor&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">c&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">WGPUShaderModuleDescriptor&lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">nextInChain&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">@ptrCast&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="o">?*&lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">c&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">WGPUChainedStruct&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">&amp;amp;&lt;/span>&lt;span class="n">next_in_chain&lt;/span>&lt;span class="p">),&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">label&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;my shader module&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>That&amp;rsquo;s pretty nasty! Also take note of how the pointer given to &lt;code>nextInChain&lt;/code> needs to be &lt;em>cast to&lt;/em> the &lt;code>WGPUChainedStruct&lt;/code> pointer type: only the &lt;code>sType&lt;/code> field identifies what it really points to (the C type system can&amp;rsquo;t).&lt;/p>
&lt;p>More importantly: because &lt;code>nextInChain&lt;/code> is effectively an opaque pointer, you can&amp;rsquo;t really know which pointer types are legal to give to the API in a given &lt;code>nextInChain&lt;/code> field. Oof!&lt;/p>
&lt;p>Needless to say, we didn&amp;rsquo;t want to adopt this lack of type safety (and lack of documentation), so we worked with the Dawn developers at Google &lt;a href="https://bugs.chromium.org/p/dawn/issues/detail?id=1486&amp;amp;q=reporter%3Ame&amp;amp;can=1">to add documentation about what structs are legal where&lt;/a>, and then in Zig we used this information to replace &lt;code>next_in_chain&lt;/code> fields with a union of pointers so it&amp;rsquo;s type safe (for all known structs) and self-documenting. Our example from before becomes just:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">shader_module_descriptor&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">gpu&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">ShaderModule&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">Descriptor&lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">next_in_chain&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">wgsl_descriptor&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">&amp;amp;&lt;/span>&lt;span class="p">.{.&lt;/span>&lt;span class="n">source&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">my_shader_source_code_text&lt;/span>&lt;span class="p">},&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">},&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">label&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;my shader module&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>It may not seem much more readable, but all of the type system info is there to protect you and that&amp;rsquo;s what counts. Of course, we also added a helper to create WGSL shader modules so this ends up being truly clean:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="n">device&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">createShaderModuleWGSL&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s">&amp;#34;my shader module&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">my_shader_source_code_text&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="upstreamed-patches-to-dawn">Upstreamed patches to Dawn&lt;/h3>
&lt;p>Out of the box, Dawn needed a little love to be compiled with Zig as the C/C++ compiler - so we&amp;rsquo;ve contributed patches upstream for this:&lt;/p>
&lt;ul>
&lt;li>Resolving some undefined behavior in Dawn caught by Zig using UBSAN by default. &lt;a href="https://dawn-review.googlesource.com/c/dawn/+/87380">#87380&lt;/a>&lt;/li>
&lt;li>Improving constexpr compatibility for a DirectX constant, due to using MinGW DirectX headers. &lt;a href="https://dawn-review.googlesource.com/c/dawn/+/87381">#87381&lt;/a>&lt;/li>
&lt;li>Correcting an invocation of &lt;code>_uuidof&lt;/code> on Windows. &lt;a href="https://dawn-review.googlesource.com/c/dawn/+/87309">#87309&lt;/a>&lt;/li>
&lt;li>Adding an option to disable use of (Windows 10+) Windows UI, as we don&amp;rsquo;t have headers for it. &lt;a href="https://dawn-review.googlesource.com/c/dawn/+/87383">#87383&lt;/a>&lt;/li>
&lt;/ul>
&lt;h3 id="obvious-improvements">Obvious improvements&lt;/h3>
&lt;p>There were many other &lt;a href="https://github.com/hexops/mach/tree/main/gpu">obvious improvements&lt;/a> we won&amp;rsquo;t enumerate in detail here:&lt;/p>
&lt;ul>
&lt;li>Achieving 100% API coverage, and coming up with processes/rules/conventions to ensure this all remains up-to-date and correct going forward as Dawn&amp;rsquo;s &lt;code>webgpu.h&lt;/code> API changes.&lt;/li>
&lt;li>Setting the right default values for every field in the entire API, which reduces verbosity of the API substantially.&lt;/li>
&lt;li>Adding slice helpers where the C ABI uses pointers-and-lengths distinctly.&lt;/li>
&lt;li>Adding type-safe helpers to callbacks which would have a &lt;code>void*&lt;/code> userdata pointer in the C API.&lt;/li>
&lt;li>Exposing every Dawn native extension, e.g. in anticipation of bindless support in the future.&lt;/li>
&lt;/ul>
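&lt;p>To illustrate the default-value and slice-helper items above, here is a small sketch. &lt;code>ExampleDescriptor&lt;/code> and its fields are hypothetical and invented for this illustration; the real descriptors in &lt;code>mach/gpu&lt;/code> and &lt;code>webgpu.h&lt;/code> have different names and shapes. The idea is only that the struct keeps a C-compatible pointer-and-length layout, while callers hand over a single Zig slice:&lt;/p>

```zig
const std = @import("std");

// Hypothetical descriptor, invented for this sketch.
const ExampleDescriptor = extern struct {
    // Defaults mean callers only spell out the fields they care about.
    value_count: usize = 0,
    values: ?[*]const u32 = null,

    // Slice helper: fills in pointer and length from one slice, so the
    // two can never get out of sync.
    pub fn init(items: []const u32) ExampleDescriptor {
        return .{ .value_count = items.len, .values = items.ptr };
    }
};

test "slice helper keeps pointer and length in sync" {
    const data = [_]u32{ 1, 2, 3 };
    const desc = ExampleDescriptor.init(data[0..]);

    try std.testing.expect(desc.value_count == 3);
    try std.testing.expect(desc.values.?[2] == 3);

    // With defaults, an empty descriptor needs no arguments at all.
    const empty = ExampleDescriptor{};
    try std.testing.expect(empty.value_count == 0);
}
```

&lt;p>Because the struct is declared &lt;code>extern&lt;/code>, its layout follows the C ABI, which is what would let a helper like this sit on top of an unchanged C-compatible data structure.&lt;/p>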
&lt;h2 id="standalone-repository">Standalone repository&lt;/h2>
&lt;p>As with all &lt;a href="https://github.com/hexops/mach/tree/main/libs">standalone Mach libraries&lt;/a> that reach a certain level of maturity, &lt;code>mach/gpu&lt;/code> is now available in its own standalone repository with an example using it with GLFW: &lt;a href="https://github.com/hexops/mach-gpu">https://github.com/hexops/mach-gpu&lt;/a>&lt;/p>
&lt;h2 id="whats-next-browser-support-more-examples">What&amp;rsquo;s next: browser support, more examples&lt;/h2>
&lt;p>I&amp;rsquo;d say we&amp;rsquo;re well on our way to having a perfect WebGPU/Dawn API for Zig, but we do have a little ways to go. Things coming up include:&lt;/p>
&lt;ul>
&lt;li>More examples&lt;/li>
&lt;li>Adding browser support: this will be achieved in the near future by direct WebAssembly-&amp;gt;JS calls (not via Emscripten.)&lt;/li>
&lt;li>Adding higher-level helpers (always 100% optional, the C ABI is always available and present via &lt;code>gpu.Impl.foobar&lt;/code> methods.)&lt;/li>
&lt;/ul>
&lt;p>Otherwise, we&amp;rsquo;re continuing to work towards &lt;a href="https://github.com/hexops/mach/issues/355">the Mach v0.2 release&lt;/a> (special thanks to all those contributing to Mach today!)&lt;/p>
&lt;h2 id="thanks-for-reading">Thanks for reading&lt;/h2>
&lt;div style="display: flex; flex-direction: row; align-items: center;">
&lt;img align="left" style="max-height: 12.5rem;" src="https://devlog.hexops.org/img/2022/perfecting-webgpu-native/img9.png">&lt;/img>
&lt;ul>
&lt;li>Join the &lt;a href="https://discord.gg/XNG3NZgCqp">Mach Discord server&lt;/a>&lt;/li>
&lt;li>Check out the mach/gpu &lt;a href="https://machengine.org/gpu">example showcase&lt;/a>&lt;/li>
&lt;li>Help us &lt;a href="https://github.com/hexops/mach/issues/230">port/write more WebGPU examples&lt;/a> to Zig&lt;/li>
&lt;li>Read up on WebGPU &lt;a href="https://surma.dev/things/webgpu/">compute&lt;/a> and &lt;a href="https://alain.xyz/blog/raw-webgpu">rendering&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://github.com/sponsors/emidoots">Sponsor development&lt;/a> if you like what we're doing!&lt;/li>
&lt;/ul>
&lt;/div></description></item><item><title>Packed structs in Zig make bit/flag sets trivial</title><link>https://devlog.hexops.org/2022/packed-structs-in-zig/</link><pubDate>Mon, 29 Aug 2022 00:00:00 +0000</pubDate><guid>https://devlog.hexops.com/2022/packed-structs-in-zig/</guid><description>&lt;p>As we&amp;rsquo;ve been building &lt;a href="https://machengine.org/">Mach engine&lt;/a>, we&amp;rsquo;ve been using a neat little pattern in Zig that enables writing flag sets more nicely in Zig than in other languages.&lt;/p>
&lt;h2 id="what-is-a-flag-set">What is a flag set?&lt;/h2>
&lt;p>We&amp;rsquo;ve been rewriting &lt;code>mach/gpu&lt;/code> (WebGPU bindings for Zig) from scratch recently, so let&amp;rsquo;s take a flag set from the WebGPU C API:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-c" data-lang="c">&lt;span class="line">&lt;span class="cl">&lt;span class="k">typedef&lt;/span> &lt;span class="kt">uint32_t&lt;/span> &lt;span class="n">WGPUFlags&lt;/span>&lt;span class="p">;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="k">typedef&lt;/span> &lt;span class="n">WGPUFlags&lt;/span> &lt;span class="n">WGPUColorWriteMaskFlags&lt;/span>&lt;span class="p">;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Effectively, &lt;code>WGPUColorWriteMaskFlags&lt;/code> here is a 32-bit unsigned integer where you can set specific bits in it to represent whether or not to write certain colors:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-c" data-lang="c">&lt;span class="line">&lt;span class="cl">&lt;span class="k">typedef&lt;/span> &lt;span class="k">enum&lt;/span> &lt;span class="n">WGPUColorWriteMask&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">WGPUColorWriteMask_None&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="mh">0x00000000&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">WGPUColorWriteMask_Red&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="mh">0x00000001&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">WGPUColorWriteMask_Green&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="mh">0x00000002&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">WGPUColorWriteMask_Blue&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="mh">0x00000004&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">WGPUColorWriteMask_Alpha&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="mh">0x00000008&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">WGPUColorWriteMask_All&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="mh">0x0000000F&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">WGPUColorWriteMask_Force32&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="mh">0x7FFFFFFF&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">}&lt;/span> &lt;span class="n">WGPUColorWriteMask&lt;/span>&lt;span class="p">;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Then to use it you&amp;rsquo;d use the various bit operations with those masks, e.g.:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-c" data-lang="c">&lt;span class="line">&lt;span class="cl">&lt;span class="n">WGPUColorWriteMaskFlags&lt;/span> &lt;span class="n">mask&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">WGPUColorWriteMask_Red&lt;/span> &lt;span class="o">|&lt;/span> &lt;span class="n">WGPUColorWriteMask_Green&lt;/span>&lt;span class="p">;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">mask&lt;/span> &lt;span class="o">|=&lt;/span> &lt;span class="n">WGPUColorWriteMask_Blue&lt;/span>&lt;span class="p">;&lt;/span> &lt;span class="c1">// set blue bit
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This all works, and people have been doing it for years in C, C++, Java, Rust, and more. In Zig, we can do better.&lt;/p>
&lt;h2 id="zig-packed-structs">Zig packed structs&lt;/h2>
&lt;img class="color-auto" style="max-height: 300px;" src="https://devlog.hexops.org/img/2022/packed-structs-in-zig/img1.png" />
&lt;p>Zig has &lt;code>packed struct&lt;/code>s: these let us pack memory tightly, where a &lt;code>bool&lt;/code> is actually a single bit (in most other languages, this is not true.) Zig also has arbitrary bit-width integers, like &lt;code>u28&lt;/code>, &lt;code>u1&lt;/code> and so on.&lt;/p>
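&lt;p>A quick way to convince yourself of this is to ask the compiler. The sketch below uses an illustrative &lt;code>Example&lt;/code> type (not taken from any real API) and the &lt;code>@bitSizeOf&lt;/code> and &lt;code>@bitOffsetOf&lt;/code> builtins to check where each field lands:&lt;/p>

```zig
const std = @import("std");

// Illustrative type, not taken from any real API.
const Example = packed struct {
    flag: bool, // occupies bit 0
    small: u3, // occupies bits 1..3
    rest: u28, // occupies bits 4..31
};

test "packed struct fields are laid out bit by bit" {
    try std.testing.expect(@bitSizeOf(bool) == 1);
    try std.testing.expect(@bitSizeOf(u28) == 28);

    // Each field starts right after the previous one, with no gaps.
    try std.testing.expect(@bitOffsetOf(Example, "small") == 1);
    try std.testing.expect(@bitOffsetOf(Example, "rest") == 4);
    try std.testing.expect(@bitSizeOf(Example) == 32);
}
```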
&lt;p>We can write &lt;code>WGPUColorWriteMaskFlags&lt;/code> from earlier in Zig using:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ColorWriteMaskFlags&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">packed&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">red&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">bool&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">false&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">green&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">bool&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">false&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">blue&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">bool&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">false&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">alpha&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">bool&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">false&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">_padding&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">u28&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">0&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This is still just 32 bits of memory, and so can be passed to the same C APIs that expect a &lt;code>WGPUColorWriteMaskFlags&lt;/code> - but interacting with it is much nicer:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">mask&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ColorWriteMaskFlags&lt;/span>&lt;span class="p">{.&lt;/span>&lt;span class="n">red&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">true&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">green&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">true&lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="n">mask&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">blue&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">true&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// set blue bit
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>In C you would need to write code like this:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-c" data-lang="c">&lt;span class="line">&lt;span class="cl">&lt;span class="k">if&lt;/span> &lt;span class="p">(&lt;/span>&lt;span class="n">mask&lt;/span> &lt;span class="o">&amp;amp;&lt;/span> &lt;span class="n">WGPUColorWriteMask_Alpha&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1">// alpha is set..
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="p">}&lt;/span>
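&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="k">if&lt;/span> &lt;span class="p">((&lt;/span>&lt;span class="n">mask&lt;/span> &lt;span class="o">&amp;amp;&lt;/span> &lt;span class="p">(&lt;/span>&lt;span class="n">WGPUColorWriteMask_Alpha&lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="n">WGPUColorWriteMask_Blue&lt;/span>&lt;span class="p">))&lt;/span> &lt;span class="o">==&lt;/span> &lt;span class="p">(&lt;/span>&lt;span class="n">WGPUColorWriteMask_Alpha&lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="n">WGPUColorWriteMask_Blue&lt;/span>&lt;span class="p">))&lt;/span> &lt;span class="p">{&lt;/span>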
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1">// alpha and blue are set..
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="p">}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="k">if&lt;/span> &lt;span class="p">((&lt;/span>&lt;span class="n">mask&lt;/span> &lt;span class="o">&amp;amp;&lt;/span> &lt;span class="n">WGPUColorWriteMask_Green&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="o">==&lt;/span> &lt;span class="mi">0&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1">// green not set
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="p">}&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>In Zig it&amp;rsquo;s just:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="k">if&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">mask&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">alpha&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// alpha is set..
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="k">if&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">mask&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">alpha&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">and&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">mask&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">blue&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// alpha and blue are set..
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="k">if&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="o">!&lt;/span>&lt;span class="n">mask&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">green&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// green not set
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="comptime-validation">Comptime validation&lt;/h2>
&lt;p>Making sure that our &lt;code>ColorWriteMaskFlags&lt;/code> ends up being the same size as a &lt;code>uint32&lt;/code> could be a bit tricky: what if we miscount the &lt;code>bool&lt;/code> fields? Or what if we accidentally get the padding size wrong? Then it might not be the same size as a &lt;code>uint32&lt;/code> anymore.&lt;/p>
&lt;p>Luckily, we can verify our expectations at comptime:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ColorWriteMaskFlags&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">packed&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">red&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">bool&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">false&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">green&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">bool&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">false&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">blue&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">bool&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">false&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">alpha&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">bool&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">false&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">_padding&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">u28&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">0&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">comptime&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">debug&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">assert&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nb">@sizeOf&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nb">@This&lt;/span>&lt;span class="p">())&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">==&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">@sizeOf&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kt">u32&lt;/span>&lt;span class="p">));&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">debug&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">assert&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nb">@bitSizeOf&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nb">@This&lt;/span>&lt;span class="p">())&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">==&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">@bitSizeOf&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kt">u32&lt;/span>&lt;span class="p">));&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The Zig compiler will take care of running the &lt;code>comptime&lt;/code> code block here for us when building, and it will verify that the byte size of &lt;code>@This()&lt;/code> (the type we&amp;rsquo;re inside of, the &lt;code>ColorWriteMaskFlags&lt;/code> struct in this case) matches the &lt;code>@sizeOf(u32)&lt;/code>.&lt;/p>
&lt;p>Similarly, we can check the &lt;code>@bitSizeOf&lt;/code> of both types, as the second assertion above does.&lt;/p>
&lt;p>Note that &lt;a href="https://ziglang.org/documentation/master/#sizeOf">&lt;code>@sizeOf&lt;/code>&lt;/a> may include the size of padding for more complex types, while &lt;a href="https://ziglang.org/documentation/master/#bitSizeOf">&lt;code>@bitSizeOf&lt;/code>&lt;/a> returns the number of bits it takes to store &lt;code>T&lt;/code> in memory &lt;em>if the type were a field in a packed struct/union&lt;/em>. For flag sets like this, it doesn&amp;rsquo;t matter and either will do. For more complex types, be sure to recall this.&lt;/p>
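&lt;p>To make that distinction concrete, here is a small sketch (not from the original post) using &lt;code>u24&lt;/code>. The exact byte size of a &lt;code>u24&lt;/code> depends on the target, so this only asserts a lower bound rather than a specific value:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import(&amp;#34;std&amp;#34;);

comptime {
    // @bitSizeOf counts only the bits needed to store the value.
    std.debug.assert(@bitSizeOf(u24) == 24);
    // @sizeOf counts whole bytes and may include padding, so it is at least 3 here.
    std.debug.assert(@sizeOf(u24) &amp;gt;= 3);
    std.debug.assert(@sizeOf(u24) * 8 &amp;gt;= @bitSizeOf(u24));
}
&lt;/code>&lt;/pre>&lt;/div>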
&lt;h2 id="explicit-backing-integers-for-packed-structs">Explicit backing integers for packed structs&lt;/h2>
&lt;p>It&amp;rsquo;s worth noting that in Zig 0.10 (shipping in Nov), the new self-hosted compiler has support for &lt;a href="https://github.com/ziglang/zig/pull/12379">explicit backing integers for packed structs&lt;/a> which will simplify this even further.&lt;/p>
&lt;p>Instead of manually adding padding to make up 32 bits, one could simply write &lt;code>packed struct(u32)&lt;/code>:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ColorWriteMaskFlags&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">packed&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kt">u32&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">red&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">bool&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">false&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">green&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">bool&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">false&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">blue&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">bool&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">false&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">alpha&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">bool&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">false&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="thanks-for-reading">Thanks for reading&lt;/h2>
&lt;p>&lt;img align="left" style="max-height: 150px;" src="https://devlog.hexops.org/img/2022/packed-structs-in-zig/img2.png">&lt;/img>
Be sure to join the new &lt;a href="https://discord.gg/XNG3NZgCqp">Mach engine Discord server&lt;/a> where we&amp;rsquo;re building the future of Zig game development.
&lt;br>&lt;br>
You can also &lt;a href="https://github.com/sponsors/emidoots">sponsor my work&lt;/a> if you like what I&amp;rsquo;m doing! :)&lt;/p>
&lt;h2 id="but-c-has-had-bitfields-since-forever">&amp;ldquo;But C has had bitfields since forever!&amp;rdquo;&lt;/h2>
&lt;p>Shortly after posting this article I was inundated with comments proclaiming &amp;ldquo;But C has had bitfields since forever!&amp;rdquo;&lt;/p>
&lt;p>First, I&amp;rsquo;d like to say I was not aware of C bitfields at the time of writing - I simply had not ever come across usage of them. Secondly, I&amp;rsquo;d like to ask: if C has bitfields, then why do seemingly all modern C APIs not use them? Why do they all expose integer types instead?&lt;/p>
&lt;p>And then I found the answer in the TC3 C specification:&lt;/p>
&lt;img width="803" alt="image" src="https://devlog.hexops.org/img/2022/packed-structs-in-zig/img3.png">
&lt;p>As &lt;a href="https://news.ycombinator.com/item?id=32648232">this user writes&lt;/a>:&lt;/p>
&lt;blockquote>
&lt;p>The in-memory representation of bit fields is implementation-defined. Therefore, if you&amp;rsquo;re calling into an external API that takes a uint32_t like in the example without an explicit remapping, you may or may not like the results.&lt;/p>
&lt;p>In practice, everything you&amp;rsquo;re likely to come across will be little endian nowadays, and the ABI you&amp;rsquo;re using will most likely order your struct from top to bottom in memory, so they will look the same most of the time. However, it&amp;rsquo;s still technically not portable.&lt;/p>
&lt;/blockquote>
&lt;p>My intention behind this article wasn&amp;rsquo;t to say C is bad; but rather to say that I find Zig&amp;rsquo;s packed structs quite nice. I actually come from a background mostly in Go - which absolutely does not have bitfields, packed structs, or arbitrary bit-width integers. Having never come across them in C either, my claims against C bitfields today could be summarized as:&lt;/p>
&lt;ul>
&lt;li>C&amp;rsquo;s bitfields are more implementation-defined than Zig&amp;rsquo;s.&lt;/li>
&lt;li>C&amp;rsquo;s bitfields, being so implementation-defined, tend not to be used in modern APIs - so the fact that Zig has come up with a variant which &lt;em>is used in practice in most APIs&lt;/em> is very important.&lt;/li>
&lt;/ul>
&lt;p>In any case, I am not an expert in C bitfields! I just hate masking to check if bits are set, and the &lt;a href="https://news.ycombinator.com/item?id=32646998">insane number of ways&lt;/a> that exact same logic can be written - both correctly and incorrectly. We deserve nicer syntax, out of the box, for checking whether a bit field is set. Zig provides that and I am happier for it.&lt;/p>
&lt;p>Please stop messaging me about how C has bitfields :)&lt;/p></description></item><item><title>Let's build an Entity Component System (part 2): databases</title><link>https://devlog.hexops.org/2022/lets-build-ecs-part-2-databases/</link><pubDate>Sat, 28 May 2022 00:00:00 +0000</pubDate><guid>https://devlog.hexops.com/2022/lets-build-ecs-part-2-databases/</guid><description>&lt;p>
&lt;img alt="ECS connected to databases and data oriented design" class="color-auto-light" style="height: 20rem; float: left; padding-right: 1rem;" src="https://devlog.hexops.org/img/2022/lets-build-ecs-part-2-databases/img2.png">
&lt;br>&lt;br>
In this series we build the &lt;a href="https://machengine.org">Mach engine&lt;/a> Entity Component System from scratch in &lt;a href="https://ziglang.org">the Zig programming language&lt;/a>.
&lt;br>&lt;br>
In part one, we looked at how ECS intersects with &lt;em>data oriented design&lt;/em>, starting without any foundational understanding of how ECS typically works and instead working from first-principles to arrive at what would probably be the most computationally efficient implementation.
&lt;br>&lt;br>
In this ~24 page part two, we examine functionality gaps our first approach had, explore how databases relate to ECS, and begin writing our actual implementation in Zig! By the end, you'll have an archetypal ECS with the ability to add/remove entities and components. In part 3, we'll cover queries.
&lt;br>&lt;br>
Check out the &lt;a href="https://devlog.hexops.org/categories/build-an-ecs/">prior parts of this series&lt;/a> if you haven't already!
&lt;/p>
&lt;h1 id="the-case-for-a-general-purpose-runtime-ecs">The case for a general-purpose runtime ECS&lt;/h1>
&lt;p>In part 1 we proposed an architecture which would have had you end up with &lt;a href="https://github.com/hexops/mach/blob/dcb5c3aed2ba705d8d0ec854148a90628369f410/ecs/src/main.zig">something like this&lt;/a> where you declare an entity archetype &lt;em>at comptime&lt;/em>, as a struct type:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Player&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">name&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">[]&lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// a string / byte slice
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">location&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Vec3&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">velocity&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Vec3&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">health&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">team&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Team&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>In this code, our struct fields &lt;code>name&lt;/code>, &lt;code>location&lt;/code>, etc. are said to be our entity &lt;em>components&lt;/em> for the &lt;code>Player&lt;/code> archetype. To create a &lt;code>Player&lt;/code> entity, we would simply create a value of this type. We proposed using &lt;code>std.MultiArrayList&lt;/code> to store lists of &lt;code>Player&lt;/code> entities for efficient CPU cache utilization:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">players&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">MultiArrayList&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Player&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{};&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// all players
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This approach is minimal, simplistic, and has an extremely efficient memory layout. Anyone who has used a production-worthy ECS, though, will tell you: &lt;em>that lacks flexibility.&lt;/em>&lt;/p>
&lt;p>Here are a few reasons why a general-purpose runtime ECS is more flexible.&lt;/p>
&lt;h2 id="operating-on-components-not-entities">Operating on components, not entities&lt;/h2>
&lt;p>Imagine you&amp;rsquo;d like to have your physics system operate on every entity with &lt;code>velocity&lt;/code> and &lt;code>location&lt;/code> components. We know that as long as we have those two values, we can do some maths and update the location of an entity to where it should be. But wait, which entities?&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">players&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">MultiArrayList&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Player&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{};&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// all players
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">monsters&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">MultiArrayList&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Monster&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{};&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// all monsters
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">cameras&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">MultiArrayList&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Camera&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{};&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// all cameras
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">lights&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">MultiArrayList&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Light&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{};&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// all lights
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>We could look at these types as a programmer and find which ones have &lt;code>velocity&lt;/code> and &lt;code>location&lt;/code> fields, probably just &lt;code>players&lt;/code> and &lt;code>monsters&lt;/code> do. But now, how do we write our physics code to operate on both lists of &lt;code>Player&lt;/code> and &lt;code>Monster&lt;/code> entities? We just need the &lt;code>velocity&lt;/code> and &lt;code>location&lt;/code> fields - we don&amp;rsquo;t care if it&amp;rsquo;s a player or monster!&lt;/p>
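&lt;p>One comptime-only workaround is to make the physics code generic over the entity type. The sketch below is only an illustration: &lt;code>Vec3&lt;/code>, the trimmed-down &lt;code>Player&lt;/code> and &lt;code>Monster&lt;/code> structs, and the &lt;code>integrate&lt;/code> helper are all hypothetical names, not Mach APIs.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import(&amp;#34;std&amp;#34;);

// Hypothetical types, for illustration only.
const Vec3 = struct { x: f32 = 0, y: f32 = 0, z: f32 = 0 };
const Player = struct { location: Vec3, velocity: Vec3, health: u8 };
const Monster = struct { location: Vec3, velocity: Vec3 };

// Works for any entity type T with `location` and `velocity` fields;
// a type without them fails to compile.
fn integrate(comptime T: type, list: *std.MultiArrayList(T), dt: f32) void {
    const locations = list.items(.location);
    const velocities = list.items(.velocity);
    var i: usize = 0;
    while (i &amp;lt; locations.len) : (i += 1) {
        locations[i].x += velocities[i].x * dt;
        locations[i].y += velocities[i].y * dt;
        locations[i].z += velocities[i].z * dt;
    }
}
&lt;/code>&lt;/pre>&lt;/div>
&lt;p>Even with this helper, someone still has to remember to call &lt;code>integrate(Player, &amp;amp;players, dt)&lt;/code>, &lt;code>integrate(Monster, &amp;amp;monsters, dt)&lt;/code>, and so on for every list that happens to qualify, which is exactly the bookkeeping we would like the ECS to do for us.&lt;/p>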
&lt;h2 id="rapid-iterative-game-design">Rapid, iterative, game design&lt;/h2>
&lt;p>
&lt;img alt="server giving the stovetop entity a sword, scripting language giving the stovetop entity physics" style="height: 30rem; float:right;" src="https://devlog.hexops.org/img/2022/lets-build-ecs-part-2-databases/img3.png">
If we wish to add a &lt;code>weapon: Weapon&lt;/code> component to our &lt;code>Player&lt;/code> entity, all good: we update the &lt;code>Player&lt;/code> struct to have that field. If we want to add a weapon at runtime, we make it an optional &lt;code>weapon: ?Weapon&lt;/code> so it can be &lt;code>null&lt;/code>.
&lt;br>&lt;br>
Let's say we're working on a whacky new cooking simulator game: you've got a kitchen stove, ingredients, utensils, etc. as entities. Some code checks if an entity touches the stovetop and, if it has a &lt;code>cookable: void&lt;/code> component, then it gets cooked. If we're trying to build a 100% science-based cooking simulator, well, then we could probably plan ahead and "know" that &lt;code>Ingredient&lt;/code> entities should have the &lt;code>cookable&lt;/code> component while &lt;code>Utensil&lt;/code> entities should not. But often, there's &lt;em>immense joy in strange mechanics:&lt;/em> What if utensils were &lt;code>cookable&lt;/code>?!
&lt;br>&lt;br>
Maybe even a game server has made this decision, or a scripting language. We didn't anticipate this at compile time! It'd be great if we could quickly try it out at the flip of a switch, though, while the game is running. And especially without having to track down every codepath handling &lt;code>cookable&lt;/code> &lt;code>ingredients&lt;/code> to now handle &lt;code>cookable&lt;/code> &lt;code>utensils&lt;/code>!
&lt;/p>
&lt;h1 id="runtime-components-just-as-fast">Runtime components? Just as fast&lt;/h1>
&lt;p>&lt;strong>We want runtime components in Mach engine for the reasons above, all of which boil down to &lt;em>rapid, iterative game design&lt;/em>.&lt;/strong> Integrating our ECS with a GUI level editor, etc. requires deep levels of runtime introspection of the data in our ECS.&lt;/p>
&lt;p>One may assume that runtime ECS will just naturally be slower than a comptime ECS.&lt;/p>
&lt;p>It&amp;rsquo;s important to note that just because we&amp;rsquo;re defining components at runtime, it doesn&amp;rsquo;t mean we cannot take special care to follow data oriented design and structure our memory in a way that is very efficient for CPU cache.&lt;/p>
&lt;h1 id="thinking-in-terms-of-databases">Thinking in terms of databases&lt;/h1>
&lt;p>With more complex aspects of an ECS, there are just &lt;em>tradeoffs&lt;/em>, &lt;em>tradeoffs&lt;/em>, &lt;em>tradeoffs&lt;/em> everywhere!&lt;/p>
&lt;ul>
&lt;li>Querying
&lt;ul>
&lt;li>&amp;ldquo;find all entities that have Physics and Location components&amp;rdquo;&lt;/li>
&lt;li>&amp;ldquo;find all entities within 5 units distance from (x, y, z)&amp;rdquo;&lt;/li>
&lt;li>&amp;ldquo;find me all player entities whose Name component starts with &amp;lsquo;ziggy&amp;rsquo;&amp;rdquo;&lt;/li>
&lt;li>&amp;hellip;&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Indexing queries (how to make complex queries fast?)&lt;/li>
&lt;li>Dense vs. sparse storage
&lt;ul>
&lt;li>&amp;ldquo;almost every player has a Weapon component&amp;rdquo;&lt;/li>
&lt;li>&amp;ldquo;only a few players have a Weapon component, most don&amp;rsquo;t&amp;rdquo;&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>Could there ever be a perfect way to represent ECS data in memory to handle all possible ways someone might want to use it? You might see this as a drawback - there cannot be a perfect ECS! &amp;ldquo;Maybe that means you shouldn&amp;rsquo;t use one at all,&amp;rdquo; you might think.&lt;/p>
&lt;p>But, if we begin to think about an ECS as nothing more than an in-memory database for game entities, it&amp;rsquo;s incredibly tempting to draw analogies with traditional databases:&lt;/p>
&lt;img style="max-width: 100%; max-height: unset;" src="https://devlog.hexops.org/img/2022/lets-build-ecs-part-2-databases/img4.png">
&lt;h1 id="pushing-the-database-analogy-further">Pushing the database analogy further&lt;/h1>
&lt;h2 id="multi-threaded-queries--writes">Multi-threaded queries / writes&lt;/h2>
&lt;p>A physics system which wishes to calculate physics for any entity with &lt;code>location&lt;/code> and &lt;code>velocity&lt;/code> components (columns on any table) ideally can run in parallel with other systems which wish to query and mutate entities.&lt;/p>
&lt;p>Such a physics system could interact with the ECS through a &amp;ldquo;database connection&amp;rdquo; or &amp;ldquo;database handle&amp;rdquo; which synchronizes access (say through table locks, column locks, row locks, etc.) to ensure conflict-free parallel execution with other systems.&lt;/p>
&lt;p>Additionally, finding entities with &lt;code>location&lt;/code> and &lt;code>velocity&lt;/code> components is as simple as asking: which tables have those columns? Every entity in such a table is guaranteed to have those components, so we don&amp;rsquo;t need to check each entity individually.&lt;/p>
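&lt;p>A rough sketch of that lookup, assuming a hypothetical &lt;code>Table&lt;/code> type that records only its column (component) names; a real implementation would also store the column data:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import(&amp;#34;std&amp;#34;);

// Hypothetical: a table (archetype) described only by its column names.
const Table = struct {
    name: []const u8,
    columns: []const []const u8,

    fn hasColumn(self: Table, wanted: []const u8) bool {
        for (self.columns) |column| {
            if (std.mem.eql(u8, column, wanted)) return true;
        }
        return false;
    }

    // True when the table has every wanted column, so every row in it qualifies.
    fn hasColumns(self: Table, wanted: []const []const u8) bool {
        for (wanted) |w| {
            if (!self.hasColumn(w)) return false;
        }
        return true;
    }
};
&lt;/code>&lt;/pre>&lt;/div>
&lt;p>The check runs once per table rather than once per entity, so its cost scales with the number of tables, not the number of rows.&lt;/p>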
&lt;h2 id="indexing-queries">Indexing queries&lt;/h2>
&lt;img src="https://devlog.hexops.org/img/2022/lets-build-ecs-part-2-databases/img5.png">
&lt;p>The natural row-by-row order of database tables is great, but we could have &lt;em>indexes&lt;/em> to optimize specific query usage patterns without fundamentally changing our architecture:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Spatial index:&lt;/strong> Maybe there are 1 million entities spread across a huge area, but we only want to find those within 10 meters of our player. A spatial index could utilize an octree to optimize such queries.&lt;/li>
&lt;li>&lt;strong>Graph relation index:&lt;/strong> If you anticipate walking up/down a graph of entities (think a GUI / scene graph) a lot, then it&amp;rsquo;s important to have a fast way to look up a given entity&amp;rsquo;s parent/children - a graph relation index could efficiently keep track of such relations.&lt;/li>
&lt;li>&lt;strong>Generic probability index:&lt;/strong> Sometimes you&amp;rsquo;ll need to &amp;ldquo;find all entities where component X has a value Y&amp;rdquo;. A probability index could maintain &lt;a href="https://github.com/hexops/fastfilter">fastfilters&lt;/a> to statistically answer &amp;ldquo;these entities likely have value Y (though a few might not)&amp;rdquo; extremely quickly.&lt;/li>
&lt;li>&lt;strong>Generic function index:&lt;/strong> An escape hatch - maybe you want to find all entities where &lt;code>arbitraryFunction(entity)&lt;/code> returns &lt;code>true&lt;/code>. A generic index could keep track of when entities (rows) are changed and only invoke &lt;code>arbitraryFunction&lt;/code> when changes occur.&lt;/li>
&lt;/ul>
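&lt;p>As a minimal sketch of the last idea, a function index can cache the predicate&amp;rsquo;s result per row and re-run it only when that row changes. Everything here is hypothetical (the &lt;code>FunctionIndex&lt;/code> type, its fixed capacity of four rows, and the &lt;code>isLowHealth&lt;/code> predicate), and it stands in for the change tracking a real ECS would provide:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import(&amp;#34;std&amp;#34;);

// Hypothetical fixed-size index that caches a predicate result per row.
const FunctionIndex = struct {
    predicate: *const fn (health: u8) bool,
    matches: [4]bool = [_]bool{false} ** 4,

    // Call whenever a row changes; the predicate runs only then.
    fn rowChanged(self: *FunctionIndex, row: usize, health: u8) void {
        self.matches[row] = self.predicate(health);
    }
};

// Example predicate standing in for arbitraryFunction.
fn isLowHealth(health: u8) bool {
    return health &amp;lt; 20;
}
&lt;/code>&lt;/pre>&lt;/div>
&lt;p>Queries then read &lt;code>matches&lt;/code> directly instead of invoking the predicate for every row.&lt;/p>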
&lt;h2 id="other-ecs-implementations">Other ECS implementations&lt;/h2>
&lt;p>After thinking about this analogy quite far, writing our implementation around it, etc. I was quite happy to find that I wasn&amp;rsquo;t the only one who thought of this: the Rust Bevy authors &lt;a href="https://bevy-cheatbook.github.io/programming/ecs-intro.html#ecs-as-a-data-structure">also describe ECS as a data structure in this way&lt;/a>, and after writing my implementation I got in touch with them to discuss tradeoffs, get advice, etc. (many thanks!)&lt;/p>
&lt;p>While this is a helpful analogy to have in the back of your head, we won&amp;rsquo;t take it &lt;em>too far&lt;/em> - it&amp;rsquo;s not completely perfect. For example, the database equivalent of &amp;lsquo;sparse storage&amp;rsquo; might be &amp;ldquo;every row of our &amp;lsquo;players&amp;rsquo; table has a foreign key (the row ID of another table with fewer rows)&amp;rdquo;, but in reality we wouldn&amp;rsquo;t want our ECS sparse storage to pay the cost of storing that ID for every table row: only for rows of entities where we want such a component value. Instead, sparse storage in an ECS is more like a mapping of &lt;code>row ID -&amp;gt; component_value&lt;/code>.&lt;/p>
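&lt;p>That mapping could be sketched with a hash map from the standard library. The &lt;code>Weapon&lt;/code> component, the &lt;code>SparseWeapons&lt;/code> alias, and the &lt;code>weaponDamage&lt;/code> helper below are assumptions made for illustration, not how Mach necessarily stores sparse components:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import(&amp;#34;std&amp;#34;);

// Hypothetical component type.
const Weapon = struct { damage: u8 };

// Sparse storage: row ID -&amp;gt; component value.
// Only rows that actually have a Weapon take up space here.
const SparseWeapons = std.AutoHashMap(u32, Weapon);

fn weaponDamage(weapons: *const SparseWeapons, row_id: u32) u8 {
    if (weapons.get(row_id)) |weapon| return weapon.damage;
    return 0; // this row has no Weapon component
}
&lt;/code>&lt;/pre>&lt;/div>
&lt;p>The tradeoff is a hash lookup per access instead of a direct column index, which is why this layout suits components that only a few entities have.&lt;/p>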
&lt;h1 id="writing-our-ecs-implementation-in-zig">Writing our ECS implementation in Zig&lt;/h1>
&lt;p>At this point we&amp;rsquo;ve made many of the architecture decisions for our ECS: we understand how ECS relates to data oriented design, databases, and the tradeoffs we&amp;rsquo;ll make with our implementation. From this point on, this series will be much more code-heavy!&lt;/p>
&lt;h2 id="representing-a-world-of-entities">Representing a &amp;ldquo;world&amp;rdquo; of entities&lt;/h2>
&lt;p>The first thing we need is a way to represent our &amp;ldquo;database of tables&amp;rdquo;: the tables that will contain our entities. Usage may look something like:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">world&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">init&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// create a world
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="k">defer&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">world&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">deinit&lt;/span>&lt;span class="p">();&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// free the world
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>We can define this as a struct:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">init&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Allocator&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">deinit&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">_&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// TODO: release anything we allocate
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>So if &lt;code>Entities&lt;/code> is our &amp;ldquo;database&amp;rdquo;, we need a way to represent our &amp;ldquo;tables&amp;rdquo; (or &amp;ldquo;archetypes&amp;rdquo;) that will store our entities&amp;rsquo; actual component data:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">/// A mapping of archetype hash to their storage.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">///
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">/// Database equivalent: table name -&amp;gt; tables representing entities.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">archetypes&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">AutoArrayHashMapUnmanaged&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kt">u64&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ArchetypeStorage&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{},&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">init&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Allocator&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">deinit&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">_&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// TODO: release anything we allocate
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">iter&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">archetypes&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">iterator&lt;/span>&lt;span class="p">();&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">while&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">iter&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">next&lt;/span>&lt;span class="p">())&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="n">entry&lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entry&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">value_ptr&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">deinit&lt;/span>&lt;span class="p">();&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">archetypes&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">deinit&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The &lt;code>Entities.archetypes&lt;/code> field is a hashmap of &lt;em>archetype hashes&lt;/em> (more on these later) to the &lt;em>archetype storage&lt;/em>, where our entities&amp;rsquo; actual component values will be stored.&lt;/p>
&lt;p>&lt;code>std.AutoArrayHashMapUnmanaged(u64, ArchetypeStorage)&lt;/code> is a bit of a mouthful! If you&amp;rsquo;re keen to understand more about Zig hashmaps, I wrote &lt;a href="https://devlog.hexops.org/2022/zig-hashmaps-explained">a quick article explaining Zig hashmaps&lt;/a> that you should check out, but all you need to know is this: it&amp;rsquo;s a hashmap of &lt;code>u64&lt;/code> keys to &lt;code>ArchetypeStorage&lt;/code> struct values.&lt;/p>
&lt;p>The &lt;code>deinit&lt;/code> function we&amp;rsquo;ve added just iterates over each value in the hashmap and calls &lt;code>deinit&lt;/code> on it so it has a chance to free its allocated memory before we free the entire hashmap.&lt;/p>
&lt;h2 id="arrayhashmap-as-an-alternative-to-sparse-sets">&lt;code>ArrayHashMap&lt;/code> as an alternative to sparse sets&lt;/h2>
&lt;p>Importantly, we use an &lt;code>ArrayHashMap&lt;/code> here, not a regular hash map. An &lt;code>ArrayHashMap&lt;/code> is backed by an ordered array behind the scenes, and because of this it&amp;rsquo;s optimized for &lt;em>iteration over the hashmap values&lt;/em> rather than &lt;em>hashmap lookups&lt;/em>, since consecutive values are very likely to be in CPU cache.&lt;/p>
&lt;p>Critically, we can directly index into the ordered backing array: if we know the index of a table we&amp;rsquo;d like to look up, that&amp;rsquo;s a simple O(1) index operation and not a hashmap lookup. We&amp;rsquo;ll take great advantage of this later as an alternative to the &amp;lsquo;sparse sets&amp;rsquo; you may read about in other ECS implementations.&lt;/p>
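&lt;p>As a rough sketch of what that looks like with the standard library: &lt;code>getIndex&lt;/code>, &lt;code>keys&lt;/code> and &lt;code>values&lt;/code> are real &lt;code>ArrayHashMap&lt;/code> methods, but the string values below are just a stand-in for &lt;code>ArchetypeStorage&lt;/code>, and the &lt;code>.{}&lt;/code> initialization assumes a Zig version that accepts it:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import("std");

test "index into an ArrayHashMap's backing array" {
    const allocator = std.testing.allocator;

    // Stand-in value type; our real map stores ArchetypeStorage instead.
    var tables: std.AutoArrayHashMapUnmanaged(u64, []const u8) = .{};
    defer tables.deinit(allocator);

    try tables.put(allocator, 0xAAAA, "first table");
    try tables.put(allocator, 0xBBBB, "second table");

    // One hash lookup tells us where the entry lives in the backing array...
    const index = tables.getIndex(0xBBBB).?;
    try std.testing.expectEqual(@as(usize, 1), index);

    // ...after which reaching the key or value is a plain slice index.
    try std.testing.expectEqualStrings("second table", tables.values()[index]);
    try std.testing.expectEqual(@as(u64, 0xBBBB), tables.keys()[index]);
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>One caveat worth keeping in mind: an index like this is only meaningful while entries keep their positions, so removing entries from the map can invalidate indices you&amp;rsquo;ve stored elsewhere.&lt;/p>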
&lt;h2 id="why-our-archetype-table-names-are-hashes-entities-move-between-tables">Why our archetype table names are hashes: entities move between tables&lt;/h2>
&lt;p>You may have noticed we use &lt;code>u64&lt;/code> values to name our archetype storage tables: why not strings? In a traditional database, these would be strings:&lt;/p>
&lt;img src="https://devlog.hexops.org/img/2022/lets-build-ecs-part-2-databases/img6.png">
&lt;p>In our ECS, though, there&amp;rsquo;s a trick: tables will not be user-defined, they&amp;rsquo;ll be automatically created and destroyed as needed for you. We&amp;rsquo;ll just put entities into the table that has all the needed columns, and so our &lt;code>players&lt;/code> and &lt;code>monsters&lt;/code> tables above would actually just be one big table (since they have identical components) with a name like &lt;code>has_sword__and__health__and__location&lt;/code>:&lt;/p>
&lt;img src="https://devlog.hexops.org/img/2022/lets-build-ecs-part-2-databases/img7.png">
&lt;p>Let&amp;rsquo;s say we want to give a &lt;code>player&lt;/code> entity a new component, like a &lt;code>rotation&lt;/code>. We&amp;rsquo;ll just create a new table with &lt;code>has_sword, health, location, rotation&lt;/code> columns and move &lt;em>just that one entity&lt;/em> over to the new table.&lt;/p>
&lt;p>And so the &lt;em>names&lt;/em> for our archetype tables are actually just a hash of all the component names/types! This means that when we add that &lt;code>rotation&lt;/code> component to another &lt;code>player&lt;/code> entity, we can merely hash all the component names the entity will &lt;em>now&lt;/em> have to quickly check: does a table for storing this archetype of entity already exist, or do we need to create a new one?&lt;/p>
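&lt;p>Here is one way such a hash could be computed, purely as a sketch. The &lt;code>archetypeHash&lt;/code> helper is hypothetical, and combining per-name hashes with XOR is my assumption for this example (it makes the result independent of the order the names are listed in, at the cost that a name listed twice cancels itself out); the actual implementation may combine them differently:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import("std");

/// Hypothetical helper: hashes each component name and XORs the results,
/// so the same set of names yields the same value in any order.
fn archetypeHash(component_names: []const []const u8) u64 {
    var hash: u64 = 0;
    for (component_names) |name| {
        hash ^= std.hash.Wyhash.hash(0, name);
    }
    return hash;
}

test "same set of components, same table name" {
    const a = archetypeHash(&amp;amp;.{ "has_sword", "health", "location" });
    const b = archetypeHash(&amp;amp;.{ "location", "has_sword", "health" });
    const c = archetypeHash(&amp;amp;.{ "has_sword", "health", "location", "rotation" });

    // Same components in a different order name the same table.
    try std.testing.expectEqual(a, b);
    // Adding rotation names a different table.
    try std.testing.expect(a != c);
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>With something like this in hand, &amp;ldquo;does a table for this archetype already exist?&amp;rdquo; becomes a single lookup of that &lt;code>u64&lt;/code> in &lt;code>Entities.archetypes&lt;/code>.&lt;/p>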
&lt;h2 id="creating-our-first-archetype-table">Creating our first archetype table&lt;/h2>
&lt;p>When we first create an entity, it&amp;rsquo;s not going to have any components. We need a way to represent entities that do not have any components - for this, we&amp;rsquo;ll create a special &amp;ldquo;void archetype&amp;rdquo;, an empty table where entities will start out:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="o">+&lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">void_archetype_hash&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">math&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">maxInt&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kt">u64&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">init&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Allocator&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">init&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Allocator&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">!&lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">archetypes&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">put&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">void_archetype_hash&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ArchetypeStorage&lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">components&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{},&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">hash&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">void_archetype_hash&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">});&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This puts a single item into our &lt;code>entities.archetypes&lt;/code> hashmap: &lt;code>void_archetype_hash&lt;/code> as the key, which will be a special key for entities without any components, and the value is a new &lt;code>ArchetypeStorage{...}&lt;/code> table, which looks like this:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ArchetypeStorage&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">/// The hash of every component name in this archetype, i.e. the name of this archetype.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">hash&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u64&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">/// A string hashmap of component_name -&amp;gt; type-erased *ComponentStorage(Component)
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">components&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">StringArrayHashMapUnmanaged&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">ErasedComponentStorage&lt;/span>&lt;span class="p">),&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">deinit&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">ArchetypeStorage&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">for&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">components&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">values&lt;/span>&lt;span class="p">())&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="n">erased&lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">erased&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">deinit&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">erased&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">components&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">deinit&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;code>ArchetypeStorage&lt;/code> represents &lt;em>all entities&lt;/em> that have the exact same set of component types (think back to our &amp;ldquo;monsters and players go in one table&amp;rdquo; diagram above). The database equivalent of &lt;code>ArchetypeStorage&lt;/code> is a table, where rows are entities and columns are components (dense storage).&lt;/p>
&lt;p>It&amp;rsquo;s aware of its own &lt;code>hash&lt;/code> (table name), and maintains its own hashmap &lt;code>components&lt;/code>, which maps &lt;em>component names&lt;/em> (strings) to the actual place in memory where we store the components&amp;rsquo; values. Here, this is &lt;code>ErasedComponentStorage&lt;/code>, a type-erased pointer to &lt;code>*ComponentStorage(Component)&lt;/code>, which brings us to...&lt;/p>
&lt;h2 id="storing-components-in-memory">Storing components in memory&lt;/h2>
&lt;p>Within our &lt;code>ArchetypeStorage&lt;/code> database tables, we need to actually store the values for components &lt;em>somewhere&lt;/em>. And how we represent these in memory is critical. You may recall from Andrew Kelley&amp;rsquo;s &lt;a href="https://media.handmade-seattle.com/practical-data-oriented-design/">“A Practical Guide to Applying Data-Oriented Design”&lt;/a> talk that if we use &lt;code>std.MultiArrayList&lt;/code> it would store our data in a way that is more efficient for CPU cache, leading to much greater performance. We get to take advantage of that here by storing component values as a struct-of-arrays instead of array-of-structs which, as the talk describes, helps to reduce the in-memory size of our data and ensure more of them are in CPU cache.&lt;/p>
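&lt;p>If you haven&amp;rsquo;t used &lt;code>std.MultiArrayList&lt;/code> before, this small sketch shows the struct-of-arrays idea in isolation. The &lt;code>Monster&lt;/code> struct is a hypothetical example type, not one of our components, and the &lt;code>.{}&lt;/code> initialization again assumes a Zig version that accepts it:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import("std");

// Hypothetical struct, used only for this illustration.
const Monster = struct {
    hp: u32,
    alive: bool,
};

test "MultiArrayList keeps each field in its own array" {
    const allocator = std.testing.allocator;

    var monsters: std.MultiArrayList(Monster) = .{};
    defer monsters.deinit(allocator);

    try monsters.append(allocator, .{ .hp = 10, .alive = true });
    try monsters.append(allocator, .{ .hp = 25, .alive = false });

    // Struct-of-arrays: all hp values are contiguous, stored separately
    // from the alive flags.
    const hps: []u32 = monsters.items(.hp);
    const alive: []bool = monsters.items(.alive);

    try std.testing.expectEqual(@as(usize, 2), hps.len);
    try std.testing.expectEqual(@as(u32, 25), hps[1]);
    try std.testing.expect(alive[0]);
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>A loop that only reads &lt;code>hp&lt;/code> walks one tightly packed &lt;code>[]u32&lt;/code> and never touches the &lt;code>alive&lt;/code> flags, and there is no per-element padding between the two fields as there typically would be in an array of &lt;code>Monster&lt;/code> structs.&lt;/p>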
&lt;p>A component will be a relatively simple, small value - such as:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Location&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">x&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">f32&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">0&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">y&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">f32&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">0&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">z&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">f32&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">0&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>An entity may have &lt;em>many of these&lt;/em>. Thanks to our database model and the way tables are laid out, we get to store all &lt;code>Location&lt;/code> component values in contiguous memory, which is great for CPU caches:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="c1">/// Represents the storage for a single type of component within a single type of entity.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">///
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">/// Database equivalent: a column within a table.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ComponentStorage&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kr">comptime&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">/// A reference to the total number of entities with the same type as is being stored here.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">total_rows&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="kt">usize&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">/// The actual densely stored component data.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">data&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">ArrayListUnmanaged&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{},&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Self&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">@This&lt;/span>&lt;span class="p">();&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">deinit&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Self&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Allocator&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">data&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">deinit&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>So when we have a table of entities that have a &lt;code>Location&lt;/code> component, our &lt;code>ArchetypeStorage.components&lt;/code> hashmap will have, for example, a &lt;code>&amp;quot;location&amp;quot;&lt;/code> entry that points to a &lt;code>*ComponentStorage(Location)&lt;/code> densely storing the location values of every entity in the table.&lt;/p>
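&lt;p>As a rough sketch of what one such dense column looks like in use, the test below appends a couple of &lt;code>Location&lt;/code> values and reads them back as a single contiguous slice. The &lt;code>column&lt;/code> variable is a hypothetical stand-in for the &lt;code>data&lt;/code> field of a &lt;code>ComponentStorage(Location)&lt;/code>, and the code assumes a Zig release where &lt;code>.{}&lt;/code> still initializes &lt;code>std.ArrayListUnmanaged&lt;/code>:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import(&amp;quot;std&amp;quot;);

const Location = struct {
    x: f32 = 0,
    y: f32 = 0,
    z: f32 = 0,
};

test &amp;quot;one dense column of Location values&amp;quot; {
    const allocator = std.testing.allocator;

    // Hypothetical stand-in for `ComponentStorage(Location).data`.
    var column: std.ArrayListUnmanaged(Location) = .{};
    defer column.deinit(allocator);

    // The index of each value is the row of the entity in the table.
    try column.append(allocator, .{ .x = 1 });
    try column.append(allocator, .{ .x = 2 });

    // `column.items` is one contiguous `[]Location` slice, so a system
    // that only cares about locations walks memory linearly.
    var sum: f32 = 0;
    for (column.items) |loc| sum += loc.x;
    try std.testing.expectEqual(@as(f32, 3), sum);
}
&lt;/code>&lt;/pre>&lt;/div>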
&lt;h2 id="type-erased-component-storage">Type-erased component storage&lt;/h2>
&lt;p>You might recall that the values in our table&amp;rsquo;s &lt;code>components&lt;/code> hashmap are &lt;code>ErasedComponentStorage&lt;/code>, not &lt;code>*ComponentStorage(Component)&lt;/code>:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ArchetypeStorage&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">components&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">StringArrayHashMapUnmanaged&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">ErasedComponentStorage&lt;/span>&lt;span class="p">),&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>What gives? Well, the problem is that we need to store multiple component types in this hashmap. For example, if this table represents player entities we may need two entries:&lt;/p>
&lt;ul>
&lt;li>&lt;code>&amp;quot;weapon&amp;quot;&lt;/code> -&amp;gt; &lt;code>*ComponentStorage(Weapon)&lt;/code>&lt;/li>
&lt;li>&lt;code>&amp;quot;location&amp;quot;&lt;/code> -&amp;gt; &lt;code>*ComponentStorage(Location)&lt;/code>&lt;/li>
&lt;/ul>
&lt;p>Here &lt;code>Weapon&lt;/code> and &lt;code>Location&lt;/code> are the types passed as the generic &lt;code>Component&lt;/code> parameter. Our &lt;code>ArchetypeStorage.components&lt;/code> hashmap can only hold one type of value, though! So we must first turn our &lt;code>*ComponentStorage(Weapon)&lt;/code> into a type-erased pointer &lt;code>*anyopaque&lt;/code> (equivalent to C&amp;rsquo;s &lt;code>void*&lt;/code>). Of course, this can make working with the values quite difficult because then we don&amp;rsquo;t know what data type they were supposed to have! To aid with this, we introduce an &lt;code>ErasedComponentStorage&lt;/code> type:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="c1">/// A type-erased representation of ComponentStorage(T) (where T is unknown).
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ErasedComponentStorage&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">anyopaque&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// Casts this `ErasedComponentStorage` into `*ComponentStorage(Component)` with the given type
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// (unsafe).
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">cast&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">anyopaque&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">comptime&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">ComponentStorage&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">aligned&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">@alignCast&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nb">@alignOf&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">ComponentStorage&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="p">)),&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">@ptrCast&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">ComponentStorage&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="p">),&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">aligned&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This is useful as it allows us to store all of our typed &lt;code>ComponentStorage(T)&lt;/code> values in one hashmap despite their different &lt;code>T&lt;/code> types, and to still interact with them in consistent ways even though we don&amp;rsquo;t remember what the underlying type is. For example, we can add the requirement that &lt;code>ErasedComponentStorage&lt;/code> knows how to deinitialize itself:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ErasedComponentStorage&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">anyopaque&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">deinit&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">erased&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">anyopaque&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Allocator&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>When we go to actually create an &lt;code>ErasedComponentStorage&lt;/code> value, we know the type, and so we can set up a function that does the deinitialization for us:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">initErasedStorage&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">total_rows&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="kt">usize&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">comptime&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">!&lt;/span>&lt;span class="n">ErasedComponentStorage&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_ptr&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">create&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">ComponentStorage&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="p">));&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">new_ptr&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ComponentStorage&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="p">){&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">total_rows&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">total_rows&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ErasedComponentStorage&lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_ptr&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">deinit&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">deinit&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">erased&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">anyopaque&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Allocator&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ErasedComponentStorage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">cast&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">erased&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">deinit&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">destroy&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}).&lt;/span>&lt;span class="n">deinit&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Here &lt;code>initErasedStorage&lt;/code> is a way for us to say:&lt;/p>
&lt;blockquote>
&lt;p>Hey! I anticipate storing &lt;code>total_rows&lt;/code> of &lt;code>Component&lt;/code> values in a table; please allocate that for me and give me an &lt;code>ErasedComponentStorage&lt;/code> value in return.&lt;/p>
&lt;/blockquote>
&lt;p>The first two lines create a pointer where we can store our &lt;code>ComponentStorage(Component)&lt;/code> struct value itself (not the items inside of it), and initialize &lt;code>new_ptr.*&lt;/code> with a value:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_ptr&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">create&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">ComponentStorage&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="p">));&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="n">new_ptr&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ComponentStorage&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="p">){&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">total_rows&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">total_rows&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Then we create the &lt;code>ErasedComponentStorage&lt;/code> value, giving it our &lt;code>*ComponentStorage(Component)&lt;/code> pointer via &lt;code>.ptr = new_ptr&lt;/code> (erasing its type in the process), and create our &lt;code>deinit&lt;/code> helper:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="p">.&lt;/span>&lt;span class="n">deinit&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">deinit&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">erased&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">anyopaque&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Allocator&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ErasedComponentStorage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">cast&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">erased&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">deinit&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">destroy&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">}).&lt;/span>&lt;span class="n">deinit&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;code>.deinit =&lt;/code> is setting the &lt;code>deinit&lt;/code> field of &lt;code>ErasedComponentStorage&lt;/code> to a value. The &lt;code>(struct { ... }).deinit,&lt;/code> part is just creating an anonymous struct with a &lt;code>deinit&lt;/code> function in it so we can pick it back out. This is some syntactic cruft &lt;a href="https://github.com/ziglang/zig/issues/1717">Zig currently requires&lt;/a> for writing function expressions.&lt;/p>
&lt;p>You&amp;rsquo;ll see that what the function does is pretty simple, though. It takes that &lt;code>erased: *anyopaque&lt;/code> pointer and casts it back to the typed &lt;code>*ComponentStorage(Component)&lt;/code> pointer (within this function we know what the &lt;code>Component&lt;/code> type is). It then calls &lt;code>ptr.deinit(allocator)&lt;/code>, a standard method on the &lt;code>ComponentStorage&lt;/code> struct, so the storage has a chance to free any memory it allocated. Finally, &lt;code>allocator.destroy(ptr)&lt;/code> frees the memory holding the &lt;code>ComponentStorage(Component)&lt;/code> struct value we allocated earlier.&lt;/p>
&lt;p>Now if we have an &lt;code>*ErasedComponentStorage&lt;/code> value, we can call its &lt;code>.deinit&lt;/code> function and it knows how to cast back to the appropriate pointer type before freeing everything. We&amp;rsquo;ll reuse this pattern to do other generic operations on component storage later.&lt;/p>
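&lt;p>Putting the pieces together, the following is a self-contained sketch of the type-erased &lt;code>deinit&lt;/code> pattern that can be run with &lt;code>zig test&lt;/code>. It is a simplified adaptation rather than the exact code above: it assumes a newer Zig (0.11 or later) where function-pointer fields are written &lt;code>*const fn&lt;/code> and &lt;code>@ptrCast&lt;/code>/&lt;code>@alignCast&lt;/code> infer their result type, it drops the &lt;code>total_rows&lt;/code> field, and it takes the allocator as a hypothetical &lt;code>gpa&lt;/code> parameter instead of reading it from &lt;code>Entities&lt;/code>:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import(&amp;quot;std&amp;quot;);
const Allocator = std.mem.Allocator;

// Simplified column type: only the dense data, no `total_rows`.
fn ComponentStorage(comptime Component: type) type {
    return struct {
        data: std.ArrayListUnmanaged(Component) = .{},

        const Self = @This();

        pub fn deinit(storage: *Self, allocator: Allocator) void {
            storage.data.deinit(allocator);
        }
    };
}

const ErasedComponentStorage = struct {
    ptr: *anyopaque,
    deinit: *const fn (erased: *anyopaque, allocator: Allocator) void,

    // Unsafe: the caller must pass the same `Component` the storage was created with.
    pub fn cast(ptr: *anyopaque, comptime Component: type) *ComponentStorage(Component) {
        return @ptrCast(@alignCast(ptr));
    }
};

fn initErasedStorage(gpa: Allocator, comptime Component: type) !ErasedComponentStorage {
    const new_ptr = try gpa.create(ComponentStorage(Component));
    new_ptr.* = .{};

    return ErasedComponentStorage{
        .ptr = new_ptr,
        // `Component` is known at compile time here, so the helper can
        // recover the typed pointer before freeing it.
        .deinit = (struct {
            fn deinit(erased: *anyopaque, allocator: Allocator) void {
                const ptr = ErasedComponentStorage.cast(erased, Component);
                ptr.deinit(allocator);
                allocator.destroy(ptr);
            }
        }).deinit,
    };
}

test &amp;quot;erased storage frees itself&amp;quot; {
    const allocator = std.testing.allocator;

    const erased = try initErasedStorage(allocator, f32);

    // While we still know the type, we can cast back and add data.
    const typed = ErasedComponentStorage.cast(erased.ptr, f32);
    try typed.data.append(allocator, 1.5);

    // Later, without knowing the type, everything is still freed.
    // `std.testing.allocator` reports a failure if anything leaks.
    erased.deinit(erased.ptr, allocator);
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>The key design choice is that the only place the concrete &lt;code>Component&lt;/code> type needs to be known is at creation time; every later caller works through the function pointer.&lt;/p>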
&lt;h2 id="managing-entity-ids--pointers">Managing entity IDs / pointers&lt;/h2>
&lt;p>At this point we&amp;rsquo;ve got:&lt;/p>
&lt;ul>
&lt;li>&lt;code>Entities&lt;/code> (our database)&lt;/li>
&lt;li>&lt;code>ArchetypeStorage&lt;/code> (a table)&lt;/li>
&lt;li>&lt;code>ErasedComponentStorage&lt;/code> and &lt;code>ComponentStorage&lt;/code> (the columns in a table)&lt;/li>
&lt;/ul>
&lt;p>It&amp;rsquo;s time to actually represent entities! We&amp;rsquo;ll do so with just an ID:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="c1">/// An entity ID uniquely identifies an entity globally within an Entities set.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">EntityID&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u64&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>We need a way to know which table an entity is stored in, and which row in that table its component values are located at. You may be tempted to think that &lt;code>EntityID&lt;/code> could just be &lt;em>that information&lt;/em>, but remember that when we add or remove a component from an entity it will &lt;em>move&lt;/em> between &lt;code>ArchetypeStorage&lt;/code> tables! When that happens, it&amp;rsquo;s nice if other user code referencing that &lt;code>EntityID&lt;/code> can stay oblivious to the move, so we&amp;rsquo;ll store a mapping of entity IDs to &lt;code>Pointer&lt;/code> values:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">counter&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">EntityID&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">0&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">/// A mapping of entity IDs (array indices) to where an entity&amp;#39;s component values are actually
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">/// stored.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">AutoHashMapUnmanaged&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">EntityID&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Pointer&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{},&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">/// Points to where an entity is stored, specifically in which archetype table and in which row
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">/// of that table.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Pointer&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">archetype_index&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u16&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">row_index&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u32&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">deinit&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">deinit&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Remember how I said earlier it was important that the mapping of table names -&amp;gt; tables (&lt;code>Entities.archetypes&lt;/code>) is an &lt;em>array hash map&lt;/em>, not a regular hash map? That&amp;rsquo;s because we can index directly into it! Say we&amp;rsquo;re given an arbitrary &lt;code>EntityID&lt;/code> and want to find its component values. First, we find out which table/row the entity points to via a hash map lookup:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Pointer&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">get&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entity_id&lt;/span>&lt;span class="p">).&lt;/span>&lt;span class="o">?&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Now we know exactly which table and row it&amp;rsquo;s stored in, and can look up the table, or component values, with simple O(1) array access operations. For example, to get the archetype table the entity is stored in:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">archetype&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">archetypes&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">entries&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">get&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">archetype_index&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Here, &lt;code>entities.archetypes&lt;/code> is our &lt;code>AutoArrayHashMapUnmanaged&lt;/code> mapping table names to their &lt;code>ArchetypeStorage&lt;/code>, and &lt;code>.entries&lt;/code> is the array stored inside that hash map. We index into that array directly instead of performing a hash map lookup.&lt;/p>
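&lt;p>To make the indirection concrete, here is a minimal sketch of the two-step lookup. &lt;code>MiniEntities&lt;/code> and &lt;code>tableOf&lt;/code> are hypothetical names used only for illustration, tables are represented by plain strings rather than &lt;code>ArchetypeStorage&lt;/code>, and the sketch indexes with &lt;code>values()&lt;/code> instead of &lt;code>entries.get&lt;/code>. It assumes a Zig version where &lt;code>.{}&lt;/code> initializes the unmanaged maps, as in the snippets above; on other versions the initializer may need adjusting:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import("std");

const EntityID = u64;

const Pointer = struct {
    archetype_index: u16,
    row_index: u32,
};

// Hypothetical stand-in for the real database: each table is just a name.
const MiniEntities = struct {
    entities: std.AutoHashMapUnmanaged(EntityID, Pointer) = .{},
    archetypes: std.AutoArrayHashMapUnmanaged(u64, []const u8) = .{},

    // Resolves an entity ID to the name of the table it currently lives in.
    fn tableOf(self: *const MiniEntities, id: EntityID) ?[]const u8 {
        // Step 1: hash map lookup, entity ID to Pointer.
        const ptr = self.entities.get(id) orelse return null;
        // Step 2: plain array index into the array hash map, no hashing.
        return self.archetypes.values()[ptr.archetype_index];
    }
};

test "an entity ID stays stable while its Pointer changes" {
    const allocator = std.testing.allocator;
    var db = MiniEntities{};
    defer db.entities.deinit(allocator);
    defer db.archetypes.deinit(allocator);

    // Insertion order determines the array index: void is 0, the next is 1.
    try db.archetypes.put(allocator, 0, "void");
    try db.archetypes.put(allocator, 1234, "Location, Name");

    const id: EntityID = 42;
    try db.entities.put(allocator, id, .{ .archetype_index = 0, .row_index = 0 });
    try std.testing.expectEqualStrings("void", db.tableOf(id).?);

    // Moving the entity only rewrites its Pointer; the ID does not change.
    try db.entities.put(allocator, id, .{ .archetype_index = 1, .row_index = 7 });
    try std.testing.expectEqualStrings("Location, Name", db.tableOf(id).?);
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Moving the entity between tables only rewrites its &lt;code>Pointer&lt;/code>, so any code holding the &lt;code>EntityID&lt;/code> keeps working unchanged. The sketch can be run with &lt;code>zig test&lt;/code>.&lt;/p>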
&lt;h2 id="creating-an-entity">Creating an entity&lt;/h2>
&lt;p>&lt;a class="imglink" href="https://devlog.hexops.org/img/2022/lets-build-ecs-part-2-databases/img8.png">&lt;img src="https://devlog.hexops.org/img/2022/lets-build-ecs-part-2-databases/img9.png">&lt;/a>&lt;/p>
&lt;p>To create new entities, we&amp;rsquo;ll use an &lt;code>Entities.new&lt;/code> method that returns a new entity ID by incrementing a global counter in our database:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">/// Returns a new entity.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">!&lt;/span>&lt;span class="n">EntityID&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_id&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">counter&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">counter&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">+=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">1&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_id&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Initially, an entity will have no components, and thus we&amp;rsquo;ll put it into that special &amp;ldquo;void&amp;rdquo; archetype mentioned earlier (this just gives us a guarantee that an entity is &lt;em>always&lt;/em> residing in an archetype, even if it has &lt;em>no components&lt;/em> - a property that will come in handy later):&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">/// Returns a new entity.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">!&lt;/span>&lt;span class="n">EntityID&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_id&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">counter&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">counter&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">+=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">1&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">void_archetype&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">archetypes&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">getPtr&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">void_archetype_hash&lt;/span>&lt;span class="p">).&lt;/span>&lt;span class="o">?&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_row&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">void_archetype&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">new&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">new_id&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_id&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;code>void_archetype&lt;/code> here is, of course, an &lt;code>ArchetypeStorage&lt;/code> (a database table). We&amp;rsquo;re invoking &lt;code>void_archetype.new(new_id)&lt;/code> to reserve a row in the &lt;em>void archetype table&lt;/em>, which is implemented like this:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ArchetypeStorage&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entity_ids&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">ArrayListUnmanaged&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">EntityID&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{},&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">deinit&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">ArchetypeStorage&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">for&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">components&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">values&lt;/span>&lt;span class="p">())&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="n">erased&lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">erased&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">deinit&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">erased&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">entity_ids&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">deinit&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">components&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">deinit&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">ArchetypeStorage&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entity&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">EntityID&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">!&lt;/span>&lt;span class="kt">u32&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// Return a new row index
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_row_index&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">entity_ids&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">items&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">len&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">entity_ids&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">append&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entity&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">@intCast&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kt">u32&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_row_index&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Now &lt;code>ArchetypeStorage&lt;/code> maintains a mapping of &lt;em>rows in the table&lt;/em> (&lt;code>entity_ids&lt;/code> indices) to the &lt;em>entity ID&lt;/em>. This will come in handy later - the only important thing to note here is that this is &lt;em>reserving&lt;/em> a new row in the table where the entity can live, but it&amp;rsquo;s not actually &lt;em>allocating the storage for that entity&amp;rsquo;s component values&lt;/em> yet.&lt;/p>
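&lt;p>As a rough illustration of that row reservation, the sketch below reserves a couple of rows and reads the entity IDs back by row index. &lt;code>MiniTable&lt;/code> is a hypothetical, stripped-down table; unlike the real &lt;code>ArchetypeStorage&lt;/code>, it takes the allocator as a parameter and returns &lt;code>usize&lt;/code> rather than &lt;code>u32&lt;/code>, so it needs no integer cast. It assumes a Zig version where &lt;code>.{}&lt;/code> initializes &lt;code>ArrayListUnmanaged&lt;/code>:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import("std");

const EntityID = u64;

// Hypothetical, stripped-down table: only the row to entity ID mapping.
const MiniTable = struct {
    entity_ids: std.ArrayListUnmanaged(EntityID) = .{},

    // Reserves the next row for `entity` and returns its index.
    // No component storage is allocated here.
    fn new(table: *MiniTable, allocator: std.mem.Allocator, entity: EntityID) !usize {
        const row = table.entity_ids.items.len;
        try table.entity_ids.append(allocator, entity);
        return row;
    }
};

test "rows are handed out in order and map back to entity IDs" {
    const allocator = std.testing.allocator;
    var table = MiniTable{};
    defer table.entity_ids.deinit(allocator);

    const row_a = try table.new(allocator, 100);
    const row_b = try table.new(allocator, 200);

    try std.testing.expectEqual(@as(usize, 0), row_a);
    try std.testing.expectEqual(@as(usize, 1), row_b);
    // Going from a row index to its entity ID is a plain array access.
    try std.testing.expectEqual(@as(EntityID, 200), table.entity_ids.items[row_b]);
}
&lt;/code>&lt;/pre>&lt;/div>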
&lt;p>Back over in &lt;code>Entities&lt;/code> (the database), we need to record which archetype (table), and which row in that table, the entity ID actually points to (remember, entity IDs are merely pointers to a specific table and row):&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">/// Returns a new entity.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">!&lt;/span>&lt;span class="n">EntityID&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_id&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">counter&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">counter&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">+=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">1&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">void_archetype&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">archetypes&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">getPtr&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">void_archetype_hash&lt;/span>&lt;span class="p">).&lt;/span>&lt;span class="o">?&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_row&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">void_archetype&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">new&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">new_id&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">void_pointer&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Pointer&lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">archetype_index&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">0&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// void archetype is guaranteed to be first index
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">row_index&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_row&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">put&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_id&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">void_pointer&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">catch&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="n">err&lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">void_archetype&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">undoNew&lt;/span>&lt;span class="p">();&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">err&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_id&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Remember how &lt;code>void_archetype.new&lt;/code> from earlier &lt;em>reserves&lt;/em> a row in the table? Well, what happens if we reserved that row, but then fail to record the pointer (&lt;code>entities.entities.put&lt;/code> OOMs)? In this case, our table has reserved a row for the entity to-be, but we don&amp;rsquo;t have enough memory to record which table/row the entity ID points to. So we need to &lt;em>undo&lt;/em> that reservation to ensure our table doesn&amp;rsquo;t have an unused (reserved) row. &lt;code>undoNew&lt;/code> does exactly that:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ArchetypeStorage&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">ArchetypeStorage&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entity&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">EntityID&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">!&lt;/span>&lt;span class="kt">u32&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// Return a new row index
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_row_index&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">entity_ids&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">items&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">len&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">entity_ids&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">append&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entity&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">@intCast&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kt">u32&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_row_index&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">undoNew&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">ArchetypeStorage&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">_&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">entity_ids&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">pop&lt;/span>&lt;span class="p">();&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Since the call to &lt;code>new&lt;/code> merely appended a value to &lt;code>entity_ids&lt;/code>, we only need to &lt;code>pop&lt;/code> the last value off in order to undo the call to &lt;code>new&lt;/code> and effectively unreserve the row we last reserved.&lt;/p>
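&lt;p>To see why a single &lt;code>pop&lt;/code> is enough, here is a minimal, self-contained sketch of the reserve-then-undo pattern. It is not the actual ECS code: it uses a bare &lt;code>std.ArrayListUnmanaged(u64)&lt;/code> as a stand-in for &lt;code>ArchetypeStorage.entity_ids&lt;/code>, assumes entity IDs are &lt;code>u64&lt;/code>, and assumes a Zig version where the list can be initialized with &lt;code>.{}&lt;/code> as in the code above:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import(&amp;#34;std&amp;#34;);

test &amp;#34;pop undoes the most recent append&amp;#34; {
    const allocator = std.testing.allocator;

    // Hypothetical stand-in for ArchetypeStorage.entity_ids (illustration only).
    var entity_ids: std.ArrayListUnmanaged(u64) = .{};
    defer entity_ids.deinit(allocator);

    // An existing entity already occupies row 0.
    try entity_ids.append(allocator, 1);
    const rows_before = entity_ids.items.len;

    // new: reserve a row for entity 2 by appending its ID.
    try entity_ids.append(allocator, 2);

    // undoNew: recording the pointer failed, so drop the reserved row.
    _ = entity_ids.pop();

    // The table has exactly the rows it had before the reservation.
    try std.testing.expectEqual(rows_before, entity_ids.items.len);
    try std.testing.expectEqual(@as(u64, 1), entity_ids.items[0]);
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>This only works because the reserved row is always the most recently appended one, so the undo has to run before anything else is appended to the same table.&lt;/p>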
&lt;p>At this point, the following test works:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="k">test&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;ecs&amp;#34;&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">testing&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">world&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">init&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">defer&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">world&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">deinit&lt;/span>&lt;span class="p">();&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">player&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">world&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">new&lt;/span>&lt;span class="p">();&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">_&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">player&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>(A copy of the full code at this point &lt;a href="https://gist.github.com/emidoots/477f9f4c68667e71fbe584a700cfd87d">is available here&lt;/a> and you can run it using &lt;code>zig test ecs.zig&lt;/code>)&lt;/p>
&lt;h2 id="working-with-component-storage">Working with component storage&lt;/h2>
&lt;p>As we work with component storage (table columns, where component values for a single type &lt;code>T&lt;/code> are stored contiguously in memory) we&amp;rsquo;re going to need some helper functions. The first one is to swap remove a value from a column:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ComponentStorage&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kr">comptime&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">data&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">ArrayListUnmanaged&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{},&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">remove&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Self&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">row_index&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u32&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">if&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">data&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">items&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">len&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">&amp;gt;&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">row_index&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">_&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">data&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">swapRemove&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">row_index&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Note that &lt;code>ComponentStorage&lt;/code> memory is lazily allocated, so we only remove if the table column does in fact have storage allocated for that row.&lt;/p>
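&lt;p>If swap removal is unfamiliar, the sketch below shows what it does to a column: the last value is moved into the removed slot, so removal is O(1) but row order is not preserved. This is an illustration only, using a bare &lt;code>std.ArrayListUnmanaged(u32)&lt;/code> as a hypothetical column rather than the real &lt;code>ComponentStorage(T)&lt;/code>, and assuming a Zig version where the list can be initialized with &lt;code>.{}&lt;/code> as above:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import(&amp;#34;std&amp;#34;);

test &amp;#34;swapRemove moves the last row into the removed slot&amp;#34; {
    const allocator = std.testing.allocator;

    // Hypothetical column holding one u32 component value per row.
    var column: std.ArrayListUnmanaged(u32) = .{};
    defer column.deinit(allocator);

    try column.append(allocator, 10); // row 0
    try column.append(allocator, 20); // row 1
    try column.append(allocator, 30); // row 2

    // Remove row 0: the last value (30) takes its place.
    _ = column.swapRemove(0);

    try std.testing.expectEqual(@as(usize, 2), column.items.len);
    try std.testing.expectEqual(@as(u32, 30), column.items[0]);
    try std.testing.expectEqual(@as(u32, 20), column.items[1]);
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Because the value that used to live in the last row now lives at the removed index, anything that refers to rows by index has to account for that move.&lt;/p>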
&lt;p>Next up is a simple copy function, which copies a specific row&amp;rsquo;s value from &lt;code>src&lt;/code> to &lt;code>dst&lt;/code>:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">inline&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">copy&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">dst&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Self&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">src_row&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u32&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">dst_row&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u32&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">src&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Self&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">!&lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">dst&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">set&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">dst_row&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">src&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">get&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">src_row&lt;/span>&lt;span class="p">));&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>And a helper to get the actual component value from a column:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">inline&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">get&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Self&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">row_index&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u32&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">data&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">items&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">row_index&lt;/span>&lt;span class="p">];&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Here we do not need to check the length of &lt;code>storage.data.items&lt;/code>, because we assert the column must have that row.&lt;/p>
&lt;h2 id="working-with-type-erasedcomponentstorage">Working with type-ErasedComponentStorage&lt;/h2>
&lt;p>As we discussed earlier, we often won&amp;rsquo;t have a typed &lt;code>ComponentStorage(T)&lt;/code> and instead will have &lt;code>ErasedComponentStorage&lt;/code>. We need a few more helpers to operate on the columns of a table, this time without knowing the underlying data type.&lt;/p>
&lt;h3 id="cloning-componentstorage-types">Cloning ComponentStorage types&lt;/h3>
&lt;p>The first helper we need is the ability to create a new value of type &lt;code>ComponentStorage(T)&lt;/code> when we don&amp;rsquo;t know the actual type of &lt;code>T&lt;/code>:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">/// A type-erased representation of ComponentStorage(T) (where T is unknown).
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ErasedComponentStorage&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">anyopaque&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">cloneType&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">erased&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ErasedComponentStorage&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">total_entities&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="kt">usize&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">retval&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">ErasedComponentStorage&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">error&lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="n">OutOfMemory&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="o">!&lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The goal of this helper is purely to clone the type. Let&amp;rsquo;s see how it is implemented when initializing erased component storage:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">initErasedStorage&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">total_rows&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="kt">usize&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">comptime&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">!&lt;/span>&lt;span class="n">ErasedComponentStorage&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ErasedComponentStorage&lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">cloneType&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">cloneType&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">erased&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ErasedComponentStorage&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">_total_rows&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="kt">usize&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">retval&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">ErasedComponentStorage&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">!&lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_clone&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">create&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">ComponentStorage&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="p">));&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_clone&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ComponentStorage&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="p">){&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">total_rows&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">_total_rows&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">tmp&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">erased&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">tmp&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_clone&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">retval&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">tmp&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">}).&lt;/span>&lt;span class="n">cloneType&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Notably, this &lt;em>doesn&amp;rsquo;t copy the actual values in the component storage&lt;/em>. It only creates a new &lt;code>ComponentStorage(T)&lt;/code> &lt;code>struct&lt;/code> value for us, as if we&amp;rsquo;d written &lt;code>ComponentStorage(T){.total_rows = total_rows}&lt;/code>. No storage is allocated for the rows yet; the new value only says we anticipate there will be that many.&lt;/p>
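&lt;p>To make the point about allocation concrete, here is a minimal sketch. It uses a hypothetical &lt;code>Column(T)&lt;/code> type as a stand-in for &lt;code>ComponentStorage(T)&lt;/code>; the optional &lt;code>data&lt;/code> field is an assumption made for illustration, not the real layout. Constructing the struct value needs no allocator, and the row memory stays empty until something is actually stored:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import(&amp;#34;std&amp;#34;);

/// Hypothetical stand-in for ComponentStorage(T), for illustration only.
fn Column(comptime T: type) type {
    return struct {
        /// Shared row count owned by the table.
        total_rows: *usize,
        /// Row memory; stays null until a value is actually stored.
        data: ?[]T = null,
    };
}

test &amp;#34;a fresh column owns no row memory&amp;#34; {
    var total_rows: usize = 100;
    const column = Column(f32){ .total_rows = &amp;amp;total_rows };

    // Building the struct value needed no allocator and reserved nothing.
    try std.testing.expect(column.data == null);
    // It only knows how many rows the table expects to hold.
    try std.testing.expectEqual(@as(usize, 100), column.total_rows.*);
}
&lt;/code>&lt;/pre>&lt;/div>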
&lt;h3 id="copying-component-values-between-tables">Copying component values between tables&lt;/h3>
&lt;p>The second helper we&amp;rsquo;ll need copies a component value from one row of a &lt;code>ComponentStorage(T)&lt;/code> column to a row of another column of the same type &lt;code>T&lt;/code>. As before, it has to work when we don&amp;rsquo;t know the underlying type and only have the two &lt;code>ErasedComponentStorage&lt;/code> values to copy between:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">/// A type-erased representation of ComponentStorage(T) (where T is unknown).
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ErasedComponentStorage&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">anyopaque&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">copy&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">dst_erased&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">anyopaque&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">src_row&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u32&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">dst_row&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u32&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">src_erased&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">anyopaque&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">error&lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="n">OutOfMemory&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="o">!&lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The implementation is simple:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">initErasedStorage&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">total_rows&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="kt">usize&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">comptime&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">!&lt;/span>&lt;span class="n">ErasedComponentStorage&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ErasedComponentStorage&lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">copy&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">copy&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">dst_erased&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">anyopaque&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">src_row&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u32&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">dst_row&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u32&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">src_erased&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">anyopaque&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">!&lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">dst&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ErasedComponentStorage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">cast&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">dst_erased&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">src&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ErasedComponentStorage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">cast&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">src_erased&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">dst&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">copy&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">src_row&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">dst_row&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">src&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">}).&lt;/span>&lt;span class="n">copy&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="removing-a-row-from-a-column">Removing a row from a column&lt;/h3>
&lt;p>Removing a single component value from a column in a table looks as you&amp;rsquo;d expect:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">/// A type-erased representation of ComponentStorage(T) (where T is unknown).
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ErasedComponentStorage&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">anyopaque&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">remove&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">erased&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">anyopaque&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">row&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u32&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The implementation is simple:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">initErasedStorage&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">total_rows&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="kt">usize&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">comptime&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">!&lt;/span>&lt;span class="n">ErasedComponentStorage&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ErasedComponentStorage&lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">remove&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">remove&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">erased&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">anyopaque&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">row&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u32&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ErasedComponentStorage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">cast&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">erased&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">remove&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">row&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">}).&lt;/span>&lt;span class="n">remove&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="removing-an-entire-row-from-a-table">Removing an entire row from a table&lt;/h2>
&lt;p>When we want to remove an &lt;em>entire row&lt;/em> (all column values), we need to invoke &lt;code>remove&lt;/code> on each column:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ArchetypeStorage&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">remove&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">ArchetypeStorage&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">row_index&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u32&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">!&lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">_&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">entity_ids&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">swapRemove&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">row_index&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">for&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">components&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">values&lt;/span>&lt;span class="p">())&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="n">component_storage&lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">component_storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">remove&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">component_storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">row_index&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>In addition to removing the value from each column, this swap-removes the row from the table&amp;rsquo;s &lt;code>entity_ids&lt;/code> list, which maps row index -&amp;gt; entity ID.&lt;/p>
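&lt;p>Swap removal is cheap, but it changes which row the last entity lives in. The sketch below illustrates that with a hypothetical free function &lt;code>swapRemove&lt;/code> over plain slices. It assumes each column&amp;rsquo;s &lt;code>remove&lt;/code> uses the same strategy as &lt;code>entity_ids&lt;/code>, so that rows stay aligned across columns:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import(&amp;#34;std&amp;#34;);

/// Hypothetical helper for illustration: removes items[index] by moving the
/// last element into its place, then returns the shortened slice.
/// Assumes items is non-empty and index is in range.
fn swapRemove(comptime T: type, items: []T, index: usize) []T {
    const last = items.len - 1;
    items[index] = items[last];
    return items[0..last];
}

test &amp;#34;swap removal keeps columns aligned&amp;#34; {
    var entity_ids = [_]u32{ 10, 11, 12, 13 };
    var names = [_][]const u8{ &amp;#34;a&amp;#34;, &amp;#34;b&amp;#34;, &amp;#34;c&amp;#34;, &amp;#34;d&amp;#34; };

    // Remove row 1 from both the entity ID list and the component column.
    const ids = swapRemove(u32, &amp;amp;entity_ids, 1);
    const ns = swapRemove([]const u8, &amp;amp;names, 1);

    // The former last row (entity 13) now lives at row 1 in both.
    try std.testing.expectEqual(@as(usize, 3), ids.len);
    try std.testing.expectEqual(@as(u32, 13), ids[1]);
    try std.testing.expectEqualStrings(&amp;#34;d&amp;#34;, ns[1]);

    // Row order is not preserved, but the other rows are untouched.
    try std.testing.expectEqual(@as(u32, 12), ids[2]);
    try std.testing.expectEqualStrings(&amp;#34;c&amp;#34;, ns[2]);
}
&lt;/code>&lt;/pre>&lt;/div>
&lt;p>In a layout like this, anything that remembered the old row index of the moved entity would need updating after a removal.&lt;/p>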
&lt;h2 id="adding-components">Adding components&lt;/h2>
&lt;p>Now on to the fun part: adding components to our entity. Suppose we want to do this:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">test&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;ecs&amp;#34;&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Location&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">x&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">f32&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">0&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">y&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">f32&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">0&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">z&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">f32&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">0&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">world&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">setComponent&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">player&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;Name&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;jane&amp;#34;&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// add Name component
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">world&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">setComponent&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">player&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;Location&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Location&lt;/span>&lt;span class="p">{});&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// add Location component
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">world&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">setComponent&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">player&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;Name&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;joe&amp;#34;&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// update Name component
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>When we call &lt;code>setComponent&lt;/code>, we may be updating the value of an existing component OR adding a new component to the entity. If the latter, we need to move the entity to a new archetype table (which may or may not exist!) - so the first thing we need to do is figure out: what archetype table is this entity currently in, and where does it need to be?&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">inline&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">archetypeByID&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entity&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">EntityID&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">ArchetypeStorage&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">get&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entity&lt;/span>&lt;span class="p">).&lt;/span>&lt;span class="o">?&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">&amp;amp;&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">archetypes&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">values&lt;/span>&lt;span class="p">()[&lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">archetype_index&lt;/span>&lt;span class="p">];&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">setComponent&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entity&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">EntityID&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">name&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">[]&lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">component&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">anytype&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">!&lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">archetype&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">archetypeByID&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entity&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">old_hash&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">archetype&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">hash&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">have_already&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">archetype&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">components&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">contains&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">name&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_hash&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">if&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">have_already&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">old_hash&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">else&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">old_hash&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">^&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">hash_map&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">hashString&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">name&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;code>archetypeByID&lt;/code> takes an entity ID and gives us an actual memory pointer to the &lt;code>*ArchetypeStorage&lt;/code> table where the entity is stored. It does this by looking up the entity ID (&lt;code>entities.entities.get(entity).?&lt;/code>) so we know which &lt;code>archetype_index&lt;/code> it is stored in, and then simply returns a pointer to that table.&lt;/p>
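&lt;p>To make the two-step lookup concrete, here is a minimal, self-contained sketch of the same idea. The &lt;code>World&lt;/code>, &lt;code>Pointer&lt;/code> and &lt;code>Table&lt;/code> types (and their field widths) are simplified stand-ins assumed for illustration, not the exact definitions used in this article, and the &lt;code>.{}&lt;/code> defaults assume a Zig version that still accepts them for unmanaged maps:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import(&amp;#34;std&amp;#34;);

const EntityID = u64;

// Assumed shape: which table an entity lives in, and which row of that table.
const Pointer = struct { archetype_index: u16, row_index: u32 };

// Stand-in for ArchetypeStorage; only the hash matters for this sketch.
const Table = struct { hash: u64 };

const World = struct {
    entities: std.AutoHashMapUnmanaged(EntityID, Pointer) = .{},
    archetypes: std.AutoArrayHashMapUnmanaged(u64, Table) = .{},

    // Entity ID, then archetype index, then a pointer into the values slice.
    fn tableOf(world: *World, entity: EntityID) *Table {
        const ptr = world.entities.get(entity).?;
        return &amp;amp;world.archetypes.values()[ptr.archetype_index];
    }
};

test &amp;#34;entity ID to table pointer&amp;#34; {
    const allocator = std.testing.allocator;
    var world = World{};
    defer world.entities.deinit(allocator);
    defer world.archetypes.deinit(allocator);

    // Array hash maps keep insertion order, so these land at indices 0 and 1.
    try world.archetypes.put(allocator, 0, Table{ .hash = 0 });
    try world.archetypes.put(allocator, 1234, Table{ .hash = 1234 });
    try world.entities.put(allocator, 7, Pointer{ .archetype_index = 1, .row_index = 0 });

    try std.testing.expectEqual(@as(u64, 1234), world.tableOf(7).hash);
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>One thing to keep in mind with this approach: the returned pointer points into the map&amp;rsquo;s values slice, so it is only safe to use until the map next grows and reallocates that slice.&lt;/p>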
&lt;p>Next, &lt;code>setComponent&lt;/code> finds out which archetype table the entity is stored in &lt;em>now&lt;/em>, and determines whether that table already has the component we&amp;rsquo;re adding or updating. Recall how earlier we mentioned &lt;a href="#why-our-archetype-table-names-are-hashes-entities-move-between-tables">why our archetype table names are hashes&lt;/a> - the hash here is simply the hash of every component name stored by the archetype table, XORed together.&lt;/p>
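&lt;p>A small sketch of why XOR is a convenient way to combine the name hashes: the result does not depend on the order of the components, and adding one component is a single XOR on top of the old hash. &lt;code>archetypeHash&lt;/code> is a hypothetical helper written for this illustration, not part of the ECS code itself:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import(&amp;#34;std&amp;#34;);
const hashString = std.hash_map.hashString;

// Hypothetical helper: XOR together the hashes of every component name.
// An empty table hashes to 0.
fn archetypeHash(names: []const []const u8) u64 {
    var hash: u64 = 0;
    for (names) |name| hash ^= hashString(name);
    return hash;
}

test &amp;#34;archetype hash is order-independent and incremental&amp;#34; {
    const a = archetypeHash(&amp;amp;[_][]const u8{ &amp;#34;Location&amp;#34;, &amp;#34;Name&amp;#34; });
    const b = archetypeHash(&amp;amp;[_][]const u8{ &amp;#34;Name&amp;#34;, &amp;#34;Location&amp;#34; });
    try std.testing.expectEqual(a, b);

    // Adding a component is one XOR on top of the old hash.
    const with_weapon = a ^ hashString(&amp;#34;Weapon&amp;#34;);
    const all = archetypeHash(&amp;amp;[_][]const u8{ &amp;#34;Location&amp;#34;, &amp;#34;Name&amp;#34;, &amp;#34;Weapon&amp;#34; });
    try std.testing.expectEqual(all, with_weapon);

    // XORing the same name in a second time cancels it out again.
    try std.testing.expectEqual(a, with_weapon ^ hashString(&amp;#34;Weapon&amp;#34;));
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>The last assertion is presumably why &lt;code>setComponent&lt;/code> checks &lt;code>have_already&lt;/code> before touching the hash: XORing in the name of a component the table already has would remove it from the hash rather than leave it unchanged.&lt;/p>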
&lt;h3 id="creating-new-archetype-tables">Creating new archetype tables&lt;/h3>
&lt;p>As noted earlier, when adding a new component to an entity (say going from components &lt;code>(Location, Name)&lt;/code> -&amp;gt; &lt;code>(Location, Name, Weapon)&lt;/code>) it will move from the old table to the new one. If the new one doesn&amp;rsquo;t exist yet we need to create it. We can find out whether it exists by looking up &lt;code>new_hash&lt;/code>, since that is the name of the table, encompassing all the component types it stores:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">setComponent&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entity&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">EntityID&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">name&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">[]&lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">component&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">anytype&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">!&lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">archetype_entry&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">archetypes&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">getOrPut&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_hash&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">if&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="o">!&lt;/span>&lt;span class="n">archetype_entry&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">found_existing&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">archetype_entry&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">value_ptr&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ArchetypeStorage&lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">components&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{},&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">hash&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">0&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_archetype&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">archetype_entry&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">value_ptr&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Merely creating a new &lt;code>ArchetypeStorage&lt;/code> table is not enough, though: we also need to create storage columns in it for all of the components the entity already has, &lt;code>(Location, Name)&lt;/code>. We do this by iterating the components in the old table. We don&amp;rsquo;t know their actual types (they&amp;rsquo;re &lt;code>ErasedComponentStorage&lt;/code>, not &lt;code>ComponentStorage(Location)&lt;/code>), so we use a &lt;code>cloneType&lt;/code> helper (which we&amp;rsquo;ll define later):&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">column_iter&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">archetype&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">components&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">iterator&lt;/span>&lt;span class="p">();&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">while&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">column_iter&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">next&lt;/span>&lt;span class="p">())&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="n">entry&lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">erased&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ErasedComponentStorage&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">undefined&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entry&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">value_ptr&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">cloneType&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entry&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">value_ptr&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">&amp;amp;&lt;/span>&lt;span class="n">new_archetype&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">entity_ids&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">items&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">len&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">&amp;amp;&lt;/span>&lt;span class="n">erased&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">catch&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">...;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_archetype&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">components&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">put&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entry&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">key_ptr&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">erased&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">catch&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">...;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>And finally, create storage / a new column for the new &lt;code>component&lt;/code> we&amp;rsquo;re adding to the entity (&lt;code>Weapon&lt;/code>):&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// Create storage/column for the new component.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">erased&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">initErasedStorage&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="o">&amp;amp;&lt;/span>&lt;span class="n">new_archetype&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">entity_ids&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">items&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">len&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">@TypeOf&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">component&lt;/span>&lt;span class="p">))&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">catch&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">...;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_archetype&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">components&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">put&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">name&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">erased&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">catch&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">...;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_archetype&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">calculateHash&lt;/span>&lt;span class="p">();&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>You may have noticed we wrote &lt;code>catch ...&lt;/code> in the snippets above. Each of these is simply written as:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="k">catch&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="n">err&lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">assert&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">archetypes&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">swapRemove&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">new_hash&lt;/span>&lt;span class="p">));&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">err&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The reason for this is simple: if we fail to clone the storage columns, or to add the new storage column, then we have failed to create the archetype storage table! In this case, we need to clean up after ourselves so as to not leave the database in a bad state, which we do by removing the entry we added to &lt;code>entities.archetypes&lt;/code> earlier.&lt;/p>
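&lt;p>The same insert-then-roll-back pattern can be sketched in isolation. Everything here is a simplified stand-in assumed for illustration: &lt;code>Table&lt;/code> replaces &lt;code>ArchetypeStorage&lt;/code>, and the hypothetical &lt;code>populate&lt;/code> function stands in for the column-creating calls that may fail:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import(&amp;#34;std&amp;#34;);

// Stand-in for ArchetypeStorage; the real type holds component columns.
const Table = struct { columns: u32 };
const Tables = std.AutoArrayHashMapUnmanaged(u64, Table);

// Hypothetical fallible step, standing in for creating a storage column.
fn populate(table: *Table, fail: bool) !void {
    if (fail) return error.OutOfMemory;
    table.columns += 1;
}

fn getOrCreate(tables: *Tables, allocator: std.mem.Allocator, hash: u64, fail: bool) !*Table {
    const entry = try tables.getOrPut(allocator, hash);
    if (!entry.found_existing) {
        entry.value_ptr.* = Table{ .columns = 0 };
        populate(entry.value_ptr, fail) catch |err| {
            // Roll back the half-built entry so nobody can observe it.
            std.debug.assert(tables.swapRemove(hash));
            return err;
        };
    }
    return entry.value_ptr;
}

test &amp;#34;failed table creation leaves no entry behind&amp;#34; {
    const allocator = std.testing.allocator;
    var tables: Tables = .{};
    defer tables.deinit(allocator);

    try std.testing.expectError(error.OutOfMemory, getOrCreate(&amp;amp;tables, allocator, 42, true));
    try std.testing.expectEqual(@as(usize, 0), tables.count());

    const table = try getOrCreate(&amp;amp;tables, allocator, 42, false);
    try std.testing.expectEqual(@as(u32, 1), table.columns);
    try std.testing.expectEqual(@as(usize, 1), tables.count());
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Without the rollback, the first call would leave an empty table registered under hash &lt;code>42&lt;/code>, and the second call would find it via &lt;code>found_existing&lt;/code> and skip creating its columns.&lt;/p>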
&lt;p>Finally, we implement &lt;code>ArchetypeStorage.calculateHash&lt;/code>:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ArchetypeStorage&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">hash&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u64&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">calculateHash&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">ArchetypeStorage&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">hash&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">0&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">iter&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">components&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">iterator&lt;/span>&lt;span class="p">();&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">while&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">iter&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">next&lt;/span>&lt;span class="p">())&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="n">entry&lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">component_name&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entry&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">key_ptr&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">hash&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">^=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">hash_map&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">hashString&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">component_name&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This walks over each &lt;code>storage.components&lt;/code> entry (the columns in the table) and XORs the hash of each component name into &lt;code>storage.hash&lt;/code>.&lt;/p>
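&lt;p>As a minimal standalone sketch (not code from the ECS itself), the snippet below shows why XOR is a convenient way to combine the name hashes: the result does not depend on the order in which the names are visited, and XOR-ing the same name in twice cancels it out. The &lt;code>combinedHash&lt;/code> helper and the component names are hypothetical, chosen only for illustration:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import("std");

// Hypothetical helper (not part of the post's ECS): XOR together the hashes
// of a list of component names, mirroring the loop in calculateHash.
fn combinedHash(names: []const []const u8) u64 {
    var hash: u64 = 0;
    for (names) |name| {
        hash ^= std.hash_map.hashString(name);
    }
    return hash;
}

test "XOR of name hashes ignores ordering" {
    const a = [_][]const u8{ "location", "rotation" };
    const b = [_][]const u8{ "rotation", "location" };
    try std.testing.expectEqual(combinedHash(a[0..]), combinedHash(b[0..]));
}

test "XOR-ing the same name twice cancels out" {
    const twice = [_][]const u8{ "location", "location" };
    try std.testing.expectEqual(@as(u64, 0), combinedHash(twice[0..]));
}
&lt;/code>&lt;/pre>&lt;/div>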
&lt;h3 id="making-entitiessetcomponent-update-component-values">Making &lt;code>Entities.setComponent&lt;/code> update component values&lt;/h3>
&lt;p>At this point our &lt;code>setComponent&lt;/code> method finds the &lt;code>archetype_entry&lt;/code> table that needs to be updated, and has created it if necessary:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">setComponent&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entity&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">EntityID&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">name&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">[]&lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">component&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">anytype&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">!&lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">archetype_entry&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">archetypes&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">getOrPut&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_hash&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">if&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="o">!&lt;/span>&lt;span class="n">archetype_entry&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">found_existing&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// ... creates new archetype table
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Now it&amp;rsquo;s time to actually &lt;em>update&lt;/em> the table, putting our component values into it:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">current_archetype_storage&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">archetype_entry&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">value_ptr&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">if&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">new_hash&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">==&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">old_hash&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">get&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entity&lt;/span>&lt;span class="p">).&lt;/span>&lt;span class="o">?&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">current_archetype_storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">set&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">row_index&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">name&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">component&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Here, &lt;code>current_archetype_storage&lt;/code> is either the new storage table (if the entity is moving from an old table to a new one) or the prior storage table (if we&amp;rsquo;re just updating the value of a component that was already on the entity). If &lt;code>new_hash == old_hash&lt;/code>, the entity stays in the same table, so we only need to update the value of the existing component and return.&lt;/p>
&lt;p>Now, if we&amp;rsquo;re moving the entity to a new table, things are a bit more involved. First, we need to copy all component values for our entity from the &lt;em>old archetype storage table&lt;/em> to the &lt;em>destination storage table&lt;/em>. We do this by creating a new row in the destination table, iterating over each component value in the old row, and copying it over:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_row&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">current_archetype_storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">new&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entity&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">old_ptr&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">get&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entity&lt;/span>&lt;span class="p">).&lt;/span>&lt;span class="o">?&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">column_iter&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">archetype&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">components&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">iterator&lt;/span>&lt;span class="p">();&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">while&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">column_iter&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">next&lt;/span>&lt;span class="p">())&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="n">entry&lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">old_component_storage&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entry&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">value_ptr&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_component_storage&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">current_archetype_storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">components&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">get&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entry&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">key_ptr&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="p">).&lt;/span>&lt;span class="o">?&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_component_storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">copy&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">new_component_storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_row&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">old_ptr&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">row_index&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">old_component_storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">catch&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="n">err&lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">current_archetype_storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">undoNew&lt;/span>&lt;span class="p">();&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">err&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>We also need to record the entity in the destination table&amp;rsquo;s &lt;code>entity_ids&lt;/code> list, which maps row index -&amp;gt; entity ID:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">current_archetype_storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">entity_ids&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">items&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">new_row&lt;/span>&lt;span class="p">]&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entity&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Since we only copied the entity&amp;rsquo;s old components into the new table row, the slot for the &lt;em>new component&lt;/em> in the &lt;em>new table row&lt;/em> is still undefined memory. We set it now:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">current_archetype_storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">set&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">new_row&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">name&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">component&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">catch&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="n">err&lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">current_archetype_storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">undoNew&lt;/span>&lt;span class="p">();&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">err&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>At this point, our entity would be in the new table! The new table has a new row with all of our component values, too! But the old table row still exists: we need to remove it.&lt;/p>
&lt;h3 id="removing-the-old-table-row-updating-pointers">Removing the old table row, updating pointers&lt;/h3>
&lt;p>We&amp;rsquo;ll use a swap removal: the row being removed (which may be anywhere in the table) is swapped with the last row in the table, and the size of the table is then decreased by one.&lt;/p>
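&lt;p>To make the idea concrete, here is a small standalone sketch of swap removal on a single fixed-size column of entity IDs. The &lt;code>Column&lt;/code> type and its &lt;code>swapRemove&lt;/code> method are hypothetical and far simpler than the real &lt;code>ArchetypeStorage&lt;/code>; the point is only that the entity from the last row ends up in the removed row:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import("std");

// Hypothetical, simplified stand-in for one column of a table.
const Column = struct {
    items: [4]u32,
    len: usize,

    // Removes `row` by overwriting it with the last row, then shrinking the
    // column by one. Returns the value that was in the last row.
    fn swapRemove(column: *Column, row: usize) u32 {
        const last = column.len - 1;
        const moved = column.items[last];
        column.items[row] = moved;
        column.len = last;
        return moved;
    }
};

test "swap removal moves the last row into the removed slot" {
    var entity_ids = Column{ .items = [_]u32{ 10, 20, 30, 40 }, .len = 4 };

    // Remove row 1 (entity 20); entity 40 from the last row takes its place.
    const swapped_entity_id = entity_ids.swapRemove(1);

    const expected = [_]u32{ 10, 40, 30 };
    try std.testing.expectEqual(@as(u32, 40), swapped_entity_id);
    try std.testing.expectEqualSlices(u32, expected[0..], entity_ids.items[0..entity_ids.len]);
}
&lt;/code>&lt;/pre>&lt;/div>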
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">swapped_entity_id&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">archetype&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">entity_ids&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">items&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">archetype&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">entity_ids&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">items&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">len&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">1&lt;/span>&lt;span class="p">];&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">archetype&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">remove&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">old_ptr&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">row_index&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">catch&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="n">err&lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">current_archetype_storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">undoNew&lt;/span>&lt;span class="p">();&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">err&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Notably, &lt;code>archetype.remove&lt;/code> swap-removes &lt;code>old_ptr.row_index&lt;/code> from the table. In doing so, the entity that was in the last row now lives at &lt;code>old_ptr.row_index&lt;/code>, so its entry in our global &lt;code>entities&lt;/code> mapping (entity ID -&amp;gt; entity pointer) has become invalid. We correct it:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">put&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">swapped_entity_id&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">old_ptr&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Last but not least, the entity we were using &lt;code>setComponent&lt;/code> on has moved to a new archetype table, and a new row. We update its pointer in the global &lt;code>entities&lt;/code> map:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">put&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entity&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Pointer&lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">archetype_index&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">@intCast&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kt">u16&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">archetype_entry&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">index&lt;/span>&lt;span class="p">),&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">row_index&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">new_row&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">});&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="setting-the-value-of-a-component-in-a-table">Setting the value of a component in a table&lt;/h3>
&lt;p>Earlier, in &lt;code>Entities.setComponent&lt;/code>, we invoked &lt;code>ArchetypeStorage.set&lt;/code>:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">if&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">new_hash&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">==&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">old_hash&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">get&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entity&lt;/span>&lt;span class="p">).&lt;/span>&lt;span class="o">?&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">current_archetype_storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">set&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">row_index&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">name&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">component&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This function finds the &lt;code>ErasedComponentStorage&lt;/code> (column storage) for &lt;code>name&lt;/code>, casts it to the type of &lt;code>component&lt;/code> so that we have a &lt;code>ComponentStorage(T)&lt;/code>, and sets the value at &lt;code>row_index&lt;/code> to &lt;code>component&lt;/code>:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ArchetypeStorage&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">set&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">ArchetypeStorage&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">row_index&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u32&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">name&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">[]&lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">component&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">anytype&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">!&lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">component_storage_erased&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">components&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">get&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">name&lt;/span>&lt;span class="p">).&lt;/span>&lt;span class="o">?&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">component_storage&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ErasedComponentStorage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">cast&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">component_storage_erased&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">@TypeOf&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">component&lt;/span>&lt;span class="p">));&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">component_storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">set&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">row_index&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">component&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Over in &lt;code>ComponentStorage&lt;/code>, we implement the &lt;code>set&lt;/code> method, which is quite simple. If the data array isn&amp;rsquo;t large enough (we haven&amp;rsquo;t actually allocated storage for the row yet), we grow it with &lt;code>undefined&lt;/code> values until it covers the row, and finally we set the element at &lt;code>row_index&lt;/code> to the &lt;code>component&lt;/code> value:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ComponentStorage&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kr">comptime&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">total_rows&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="kt">usize&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">data&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">ArrayListUnmanaged&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{},&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Self&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">@This&lt;/span>&lt;span class="p">();&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">set&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Self&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">row_index&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u32&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">component&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">!&lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">if&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">data&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">items&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">len&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">&amp;lt;=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">row_index&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">data&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">appendNTimes&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">undefined&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">row_index&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">1&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">data&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">items&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">len&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">data&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">items&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">row_index&lt;/span>&lt;span class="p">]&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">component&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="finally-we-can-create-entities-and-addupdate-components-on-them">Finally, we can create entities &lt;em>and&lt;/em> add/update components on them!&lt;/h2>
&lt;p>&lt;code>Entities.setComponent&lt;/code> is fully implemented! These lines from our test earlier now work:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">world&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">setComponent&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">player&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;Name&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;jane&amp;#34;&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// add Name component
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">world&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">setComponent&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">player&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;Location&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Location&lt;/span>&lt;span class="p">{});&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// add Location component
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">world&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">setComponent&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">player&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;Name&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;joe&amp;#34;&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// update Name component
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>A copy of the full code at this point &lt;a href="https://gist.github.com/emidoots/7c3d36a8324dd733fb0377b087ed057c">is available here&lt;/a> and you can run it using &lt;code>zig test ecs.zig&lt;/code> as before.&lt;/p>
&lt;h2 id="getting-component-values">Getting component values&lt;/h2>
&lt;p>Getting component values is pretty simple:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">getComponent&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Entities&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entity&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">EntityID&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">name&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">[]&lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">comptime&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">?&lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">archetype&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">archetypeByID&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entity&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">component_storage_erased&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">archetype&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">components&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">get&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">name&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">orelse&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">null&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">entities&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">get&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entity&lt;/span>&lt;span class="p">).&lt;/span>&lt;span class="o">?&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">component_storage&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ErasedComponentStorage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">cast&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">component_storage_erased&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Component&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">component_storage&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">get&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">ptr&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">row_index&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This finds the &lt;code>archetype&lt;/code> table the entity is stored in, then finds the &lt;code>components&lt;/code> column the named component is stored in, and finally casts the &lt;code>ErasedComponentStorage&lt;/code> -&amp;gt; &lt;code>ComponentStorage(Component)&lt;/code> so we can get the row value. Notably, this means &lt;em>both the name of the component and the provided type must be correct&lt;/em>, or else undefined behavior could occur. This is a fatal flaw in our ECS implementation which we will fix!&lt;/p>
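&lt;p>To make the hazard concrete, here is a minimal, self-contained sketch of the same erase-then-cast pattern. &lt;code>Column&lt;/code> and &lt;code>Erased&lt;/code> are hypothetical stand-ins for &lt;code>ComponentStorage&lt;/code> and &lt;code>ErasedComponentStorage&lt;/code>, not the types from this article, and the cast assumes the single-argument &lt;code>@ptrCast&lt;/code> and &lt;code>@alignCast&lt;/code> builtins of Zig 0.11 or newer:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import(&amp;#34;std&amp;#34;);

// Hypothetical stand-in for ComponentStorage(Component): a tiny typed column.
fn Column(comptime T: type) type {
    return struct { items: [2]T };
}

// Hypothetical stand-in for ErasedComponentStorage: only an untyped pointer survives.
const Erased = struct {
    ptr: *anyopaque,

    // Reinterprets the untyped pointer as a column of T. Nothing verifies
    // that T is the type the column was created with.
    fn cast(ptr: *anyopaque, comptime T: type) *Column(T) {
        return @ptrCast(@alignCast(ptr));
    }
};

test &amp;#34;casting back with the stored type recovers the column&amp;#34; {
    var names = Column([]const u8){ .items = undefined };
    names.items[0] = &amp;#34;jane&amp;#34;;
    names.items[1] = &amp;#34;joe&amp;#34;;

    const erased = Erased{ .ptr = &amp;amp;names };
    const column = Erased.cast(erased.ptr, []const u8);
    try std.testing.expectEqualStrings(&amp;#34;jane&amp;#34;, column.items[0]);

    // Erased.cast(erased.ptr, u32) would compile just as happily, but reading
    // through it would reinterpret the string slices as integers.
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Nothing in the cast compares the requested type with the type that was stored, which is why a mismatched name or type goes unnoticed until the data is read.&lt;/p>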
&lt;h2 id="removing-components-entities">Removing components, entities&lt;/h2>
&lt;p>Removing components is similar to adding them, because the entity needs to move between &lt;code>ArchetypeStorage&lt;/code> tables. Removing entities is similar as well. The code is lengthy and nearly identical, so we won&amp;rsquo;t cover it here.&lt;/p>
&lt;h2 id="conclusions">Conclusions&lt;/h2>
&lt;p>By this point you have a relatively solid archetypal ECS. The full source code for this article &lt;a href="https://gist.github.com/emidoots/aecbf725896d2947459ba915fc9103a7">is available here&lt;/a>.&lt;/p>
&lt;p>Notably, it is lacking the following which we&amp;rsquo;ll cover in part 3:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Querying&lt;/strong> of actual entities, iterators over components for a single entity, etc.&lt;/li>
&lt;li>&lt;strong>Type-safety&lt;/strong>: as noted earlier, if you pass the wrong name or component type it will result in undefined behavior!&lt;/li>
&lt;li>&amp;hellip; and more&lt;/li>
&lt;/ul>
&lt;p>&lt;a href="https://github.com/hexops/mach/tree/main/ecs">&lt;code>mach/ecs&lt;/code> is available on GitHub&lt;/a>, slightly ahead of this series and changing rapidly. Once it becomes stable, it will also be available as a standalone Zig library anyone can use in their own engine/game.&lt;/p>
&lt;p>Join &lt;a href="https://matrix.to/#/#ecs:matrix.org">our Matrix chat room&lt;/a> for ECS discussion &amp;amp; to help us reach Mach 1.0.&lt;/p>
&lt;h2 id="support-my-work">Support my work&lt;/h2>
&lt;p>If you like my work on &lt;a href="https://machengine.org">Mach engine&lt;/a>, &lt;a href="https://zigmonthly.org">zigmonthly.org&lt;/a>, etc. you can &lt;a href="https://github.com/sponsors/emidoots">sponsor me on GitHub&lt;/a>.&lt;/p></description></item><item><title>Mach v0.1 - cross-platform Zig graphics in ~60 seconds</title><link>https://devlog.hexops.org/2022/mach-v0.1-zig-graphics-in-60s/</link><pubDate>Sat, 26 Mar 2022 00:00:00 +0000</pubDate><guid>https://devlog.hexops.com/2022/mach-v0.1-zig-graphics-in-60s/</guid><description>&lt;p>Five months ago we announced some of our vision for Mach &amp;amp; &lt;a href="https://devlog.hexops.org/2021/mach-engine-the-future-of-graphics-with-zig">the future of graphics with Zig&lt;/a>. Today we&amp;rsquo;ve reached Mach v0.1 with over 1,100 commits.&lt;/p>
&lt;h2 id="cross-platform-graphics-in-60-seconds">Cross-platform graphics in 60 seconds&lt;/h2>
&lt;p>If you have &lt;a href="https://ziglang.org/">Zig v0.10&lt;/a> you can get started with cross-platform graphics in under 60 seconds. Try it for yourself:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">git clone https://github.com/hexops/mach
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nb">cd&lt;/span> mach/gpu
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">zig build run-example
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>(not working? see &lt;a href="https://github.com/hexops/mach/blob/main/doc/known-issues.md#known-issues">known issues&lt;/a>)&lt;/p>
&lt;img class="color img-center" src="https://devlog.hexops.org/img/2022/mach-v0.1-zig-graphics-in-60s/img2.png">
&lt;h2 id="all-you-need-is-zig-git-and-curl">All you need is zig, git, and curl.&lt;/h2>
&lt;p>One key point we&amp;rsquo;re solving with Mach is the developer experience. We&amp;rsquo;re tired of people wasting hours and sometimes days getting the right versions of dependencies on their system in order to build projects!&lt;/p>
&lt;img class="color img-center" src="https://devlog.hexops.org/img/2022/mach-v0.1-zig-graphics-in-60s/img3.png">
&lt;p>Our &lt;a href="https://github.com/hexops/mach-glfw">glfw bindings&lt;/a> build GLFW 100% from source using Zig. We even go so far as to build the DirectX Shader Compiler from source via Zig&amp;rsquo;s build system.&lt;/p>
&lt;p>For the few inescapable system dependencies, such as &lt;code>Metal.framework&lt;/code> or &lt;code>libx11&lt;/code>, we &lt;a href="https://github.com/hexops/mach-system-sdk">package them up ourselves&lt;/a> and our build system knows how to fetch them as needed.&lt;/p>
&lt;h2 id="effortless-cross-compilation">Effortless cross-compilation&lt;/h2>
&lt;p>Because of our aggressive approach to solving dependencies, you get effortless cross-compilation to any OS literally at the flip of a switch:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">$ zig build -Dtarget&lt;span class="o">=&lt;/span>x86_64-windows
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">$ zig build -Dtarget&lt;span class="o">=&lt;/span>x86_64-linux
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">$ zig build -Dtarget&lt;span class="o">=&lt;/span>x86_64-macos.12
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">$ zig build -Dtarget&lt;span class="o">=&lt;/span>aarch64-macos.12
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The binaries you end up with are virtually static.&lt;/p>
&lt;p>For Linux, Zig lets you target any glibc version, and musl too, so no more worrying if that binary will run on other machines.&lt;/p>
&lt;h3 id="unified-graphics-api-metal-vulkan-directx-12">Unified graphics API (Metal, Vulkan, DirectX 12)&lt;/h3>
&lt;p>Backed by Metal, Vulkan, DirectX 12 &amp;amp; OpenGL (as a fallback), you get a truly cross-platform graphics API for Windows, Linux, and macOS (Browser and Mobile coming in the future).&lt;/p>
&lt;img class="color img-of-code" src="https://devlog.hexops.org/img/2022/mach-v0.1-zig-graphics-in-60s/img4.png">
&lt;h3 id="unified-shader-language--compute-shaders">Unified shader language &amp;amp; compute shaders&lt;/h3>
&lt;p>There&amp;rsquo;s no need to write shaders for each graphics backend. Instead, you write shaders in a single modern language (WGSL):&lt;/p>
&lt;img class="color img-of-code" src="https://devlog.hexops.org/img/2022/mach-v0.1-zig-graphics-in-60s/img5.png">
&lt;p>With compute shaders, you have the ability to leverage computation on the GPU outside of graphical applications (machine learning, physics calculations, etc.) using a straightforward &amp;amp; approachable API that works with every GPU vendor.&lt;/p>
&lt;h3 id="behind-the-scenes">Behind the scenes&lt;/h3>
&lt;img class="color" style="max-height: 175px; display: block; margin: auto;" src="https://devlog.hexops.org/img/2022/mach-v0.1-zig-graphics-in-60s/img6.png">
&lt;p>Mach &lt;a href="https://gpuweb.github.io/gpuweb/explainer/">leverages WebGPU&lt;/a>, a successor to WebGL. WebGPU is an up and coming graphics API being built by Mozilla, Google, Apple, Microsoft, Intel and others.&lt;/p>
&lt;p>Natively, Mach uses Zig as a C/C++ compiler to build &lt;a href="https://github.com/hexops/mach-gpu-dawn">Google Chrome&amp;rsquo;s native WebGPU implementation&lt;/a> and we use Zig&amp;rsquo;s build system so you don&amp;rsquo;t even have to deal with cmake/ninja/gn/etc.&lt;/p>
&lt;p>Our infrastructure produces binary releases so you don&amp;rsquo;t even have to wait the handful of minutes it would take to compile by default. From-source builds are literally at your fingertips, though, just add &lt;code>-Ddawn-from-source=true&lt;/code> to your &lt;code>zig build&lt;/code> command.&lt;/p>
&lt;h2 id="machgpu-the-gpu-interface-for-zig">&lt;code>mach/gpu&lt;/code>: the GPU interface for Zig&lt;/h2>
&lt;p>&lt;a href="https://github.com/hexops/mach/tree/main/gpu">mach/gpu&lt;/a> is our Zig interface to WebGPU and comes in at just over 250 commits.&lt;/p>
&lt;img class="img-center color-auto" style="max-height: 125px;" src="https://devlog.hexops.org/img/2022/mach-v0.1-zig-graphics-in-60s/img7.png">
&lt;p>It provides a &lt;code>gpu.Interface&lt;/code>, similar to how Zig provides a &lt;code>std.mem.Allocator&lt;/code> interface, which can be backed by any implementation:&lt;/p>
&lt;ul>
&lt;li>A &lt;code>NativeInstance&lt;/code> like Dawn (Google Chrome&amp;rsquo;s WebGPU implementation.)&lt;/li>
&lt;li>(future) A browser implementation when targeting WebAssembly.&lt;/li>
&lt;/ul>
&lt;p>Imagine future implementations: maybe a pure-Zig implementation? Maybe a debugger implementation that &lt;em>wraps an existing one&lt;/em> and streams all API calls to disk so you can replay them later and step through graphics API calls. Lots of possibilities here!&lt;/p>
&lt;p>It&amp;rsquo;s a comprehensive interface, covering the 176 methods, 73 data structures, and 43 enum types that WebGPU exposes today. There&amp;rsquo;s still much to do around documentation, fixing bugs, and ensuring we match the browser API nicely, but the foundation is all there!&lt;/p>
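&lt;p>For readers unfamiliar with this style of interface, here is a rough sketch of the pointer-plus-vtable pattern that &lt;code>std.mem.Allocator&lt;/code> uses, including a wrapper implementation that records calls before forwarding them. Everything in it (&lt;code>Device&lt;/code>, &lt;code>CountingBackend&lt;/code>, &lt;code>RecordingBackend&lt;/code>, &lt;code>draw&lt;/code>) is a hypothetical name invented for illustration and is not the actual &lt;code>gpu.Interface&lt;/code> API. It assumes Zig 0.11 or newer:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import(&amp;#34;std&amp;#34;);

// Hypothetical interface (NOT the real gpu.Interface): an untyped pointer
// to the implementation plus a table of function pointers.
const Device = struct {
    ptr: *anyopaque,
    vtable: *const VTable,

    const VTable = struct {
        draw: *const fn (ptr: *anyopaque, vertex_count: u32) void,
    };

    fn draw(self: Device, vertex_count: u32) void {
        self.vtable.draw(self.ptr, vertex_count);
    }
};

// Stands in for a native implementation; it only counts vertices.
const CountingBackend = struct {
    vertices_drawn: u64 = 0,

    const vtable = Device.VTable{ .draw = draw };

    fn device(self: *CountingBackend) Device {
        return .{ .ptr = self, .vtable = &amp;amp;vtable };
    }

    fn draw(ptr: *anyopaque, vertex_count: u32) void {
        const self: *CountingBackend = @ptrCast(@alignCast(ptr));
        self.vertices_drawn += vertex_count;
    }
};

// Wraps any other Device: records each call, then forwards it unchanged.
const RecordingBackend = struct {
    inner: Device,
    calls: u32 = 0,

    const vtable = Device.VTable{ .draw = draw };

    fn device(self: *RecordingBackend) Device {
        return .{ .ptr = self, .vtable = &amp;amp;vtable };
    }

    fn draw(ptr: *anyopaque, vertex_count: u32) void {
        const self: *RecordingBackend = @ptrCast(@alignCast(ptr));
        self.calls += 1;
        self.inner.draw(vertex_count);
    }
};

test &amp;#34;a wrapper forwards calls through the same interface&amp;#34; {
    var backend = CountingBackend{};
    var recorder = RecordingBackend{ .inner = backend.device() };
    const dev = recorder.device();

    dev.draw(3);
    dev.draw(6);

    try std.testing.expectEqual(@as(u32, 2), recorder.calls);
    try std.testing.expectEqual(@as(u64, 9), backend.vertices_drawn);
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Code that only sees a &lt;code>Device&lt;/code> cannot tell whether it is talking to the backend directly or to the wrapper, which is what makes ideas like a call-recording debugger implementation possible.&lt;/p>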
&lt;h2 id="zig--greatness">Zig ≈ greatness&lt;/h2>
&lt;p>Zig provides &lt;a href="https://ziglang.org/learn/overview/#performance-and-safety-choose-two">some excellent runtime safety&lt;/a> and catches many of the mistakes people make in C/C++ (memory leaks, integer overflow, index out of bounds, etc.)&lt;/p>
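&lt;p>As a small illustration of two of these checks, the sketch below uses only the standard library: &lt;code>std.math.add&lt;/code> reports overflow as an error instead of wrapping silently, and &lt;code>std.testing.allocator&lt;/code> fails a test that leaks memory. It is an illustrative example written for this post, not code from Mach:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import(&amp;#34;std&amp;#34;);

test &amp;#34;checked arithmetic reports overflow instead of wrapping&amp;#34; {
    // 255 + 1 does not fit in a u8, so std.math.add returns error.Overflow.
    try std.testing.expectError(error.Overflow, std.math.add(u8, 255, 1));
    // 254 + 1 fits, so the sum is returned normally.
    try std.testing.expectEqual(@as(u8, 255), try std.math.add(u8, 254, 1));
}

test &amp;#34;the testing allocator fails the test if memory is leaked&amp;#34; {
    const allocator = std.testing.allocator;
    const buf = try allocator.alloc(u8, 16);
    // Removing this line makes `zig test` report a leak for this test.
    defer allocator.free(buf);
    try std.testing.expectEqual(@as(usize, 16), buf.len);
}
&lt;/code>&lt;/pre>&lt;/div>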
&lt;p>In fact, we&amp;rsquo;ve caught several instances of undefined behavior in GLFW, and even &lt;a href="https://github.com/microsoft/DirectXShaderCompiler/pull/4178#discussion_r780733405">illegal integer coercion in the DirectX Shader Compiler&lt;/a> - all just by compiling C/C++ code using Zig.&lt;/p>
&lt;p>&lt;a href="https://devlog.hexops.org/2021/perfecting-glfw-for-zig-and-finding-undefined-behavior">&lt;img class="color img-center" style="max-height: 125px;" src="https://devlog.hexops.org/img/2022/mach-v0.1-zig-graphics-in-60s/img8.png">&lt;/a>&lt;/p>
&lt;p>The reason we&amp;rsquo;re &lt;em>really&lt;/em> ecstatic about Zig, though, is what it promises us in the future:&lt;/p>
&lt;ul>
&lt;li>Blazing fast compilation, compiling and running a program faster than Python can interpret it - impressive!&lt;/li>
&lt;li>&lt;a href="http://www.jakubkonka.com/2022/03/16/hcs-zig.html">Hot code swapping&lt;/a>, how cool would it be to edit variables etc. as your game is running?&lt;/li>
&lt;/ul>
&lt;h2 id="entity-component-system">Entity Component System&lt;/h2>
&lt;p>We&amp;rsquo;re building an Entity Component System &lt;a href="https://devlog.hexops.org/categories/build-an-ecs/">in a blog series&lt;/a>, inspired by:&lt;/p>
&lt;ul>
&lt;li>Data Oriented Design&lt;/li>
&lt;li>Database design&lt;/li>
&lt;li>Advice from the authors of &lt;a href="https://bevyengine.org">Bevy&lt;/a> (a very popular ECS)&lt;/li>
&lt;/ul>
&lt;p>It&amp;rsquo;s still early stages, we&amp;rsquo;ve got some ways to go! But we&amp;rsquo;re &lt;a href="https://github.com/hexops/mach/tree/main/ecs">on our third major revision&lt;/a> and beginning to face the interesting problems. Keep an eye out for updates on that blog series coming soon.&lt;/p>
&lt;h2 id="sounds-great-whats-the-catch">Sounds great! What&amp;rsquo;s the catch?&lt;/h2>
&lt;p>Mach (and Zig) are still in very early stages! APIs are going to change and break. Mach is missing &lt;em>a lot!&lt;/em>&lt;/p>
&lt;ul>
&lt;li>Documentation..&lt;/li>
&lt;li>Examples..&lt;/li>
&lt;li>Demos..&lt;/li>
&lt;li>Browser and Mobile support..&lt;/li>
&lt;li>..Literally everything else that makes a game engine&lt;/li>
&lt;/ul>
&lt;p>If you&amp;rsquo;re looking for cross-platform graphics in Zig, Mach is for you! Otherwise, you&amp;rsquo;ll probably need to wait a bit.&lt;/p>
&lt;h2 id="getting-started">Getting started&lt;/h2>
&lt;p>Check out &lt;a href="https://github.com/hexops/mach">the GitHub&lt;/a> and in particular &lt;a href="https://github.com/hexops/mach/tree/main/gpu/examples">this example code&lt;/a>.&lt;/p>
&lt;p>There&amp;rsquo;s a ton of material out there about WebGPU already, check out &lt;a href="https://surma.dev/things/webgpu/">this excellent and comprehensive introductory article&lt;/a> and &lt;a href="https://github.com/austinEng/webgpu-samples">these awesome samples&lt;/a>. It should be easy to map any of these to the Mach &lt;code>gpu.Interface&lt;/code> since it&amp;rsquo;s the same API, just Ziggified!&lt;/p>
&lt;p>Join our &lt;a href="https://matrix.to/#/#hexops:matrix.org">Matrix chat room&lt;/a> to get help, discuss, etc.&lt;/p>
&lt;h2 id="whats-next">What&amp;rsquo;s next?&lt;/h2>
&lt;img class="color img-center" style="max-height: 250px;" src="https://devlog.hexops.org/img/2022/mach-v0.1-zig-graphics-in-60s/img9.png">
&lt;p>My lightning talk in which I&amp;rsquo;ll be making the case for Mach engine and conveying the vision for where we go from here will be presented at the first-ever European &lt;a href="https://zig.news/kristoff/zig-milan-party-2022-final-info-schedule-1jc1">Zig meetup in Milan, Italy on Apr 9-10&lt;/a>.&lt;/p>
&lt;p>If, like me, you are unable to attend in person, the short video will be available afterwards!&lt;/p>
&lt;h2 id="help-us-reach-mach-v10">Help us reach Mach v1.0&lt;/h2>
&lt;p>Consider &lt;a href="https://github.com/sponsors/emidoots">sponsoring me on GitHub&lt;/a> if you like my work, so I can do more of it!&lt;/p>
&lt;p>Join our &lt;a href="https://matrix.to/#/#hexops:matrix.org">Matrix chat room&lt;/a> to discuss ideas - collaboration very welcome!&lt;/p>
&lt;p>Thanks for coming along in our journey!&lt;/p></description></item><item><title>Zig hashmaps explained</title><link>https://devlog.hexops.org/2022/zig-hashmaps-explained/</link><pubDate>Sat, 29 Jan 2022 00:00:00 +0000</pubDate><guid>https://devlog.hexops.com/2022/zig-hashmaps-explained/</guid><description>&lt;p>If you just got started with &lt;a href="https://ziglang.org">Zig&lt;/a>, you might quickly want to use a hashmap. Zig provides good defaults, with a lot of customization options.&lt;/p>
&lt;p>Here I will try to guide you into choosing the right hashmap type.&lt;/p>
&lt;h2 id="60-second-explainer">60-second explainer&lt;/h2>
&lt;p>You probably want:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">my_hash_map&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">StringHashMap&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">V&lt;/span>&lt;span class="p">).&lt;/span>&lt;span class="n">init&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Or if you do not have string keys, you can use an &lt;code>Auto&lt;/code> hashmap instead:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">my_hash_map&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">AutoHashMap&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">K&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">V&lt;/span>&lt;span class="p">).&lt;/span>&lt;span class="n">init&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Where &lt;code>K&lt;/code> and &lt;code>V&lt;/code> are your key and value data types, respectively. e.g. &lt;code>[]const u8&lt;/code> for a string.&lt;/p>
&lt;p>You can then use these APIs:&lt;/p>
&lt;h3 id="insert-a-value">Insert a value&lt;/h3>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">my_hash_map&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">put&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">key&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">value&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="insert-a-value-assert-entry-does-not-already-exist">Insert a value, assert entry does not already exist&lt;/h3>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">my_hash_map&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">putNoClobber&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">key&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">value&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Note &lt;code>putNoClobber&lt;/code> may be renamed to something like &lt;code>putAssumeNoEntry&lt;/code> in the near future: &lt;a href="https://github.com/ziglang/zig/issues/10736">ziglang/zig#10736&lt;/a>&lt;/p>
&lt;h3 id="get-a-value">Get a value&lt;/h3>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">value&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">my_hash_map&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">get&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">key&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="k">if&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">value&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="n">v&lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// got value &amp;#34;v&amp;#34;
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">else&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// doesn&amp;#39;t exist
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="get-a-value-insert-if-not-exist">Get a value, insert if not exist&lt;/h3>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">v&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">my_hash_map&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">getOrPut&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">key&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="k">if&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="o">!&lt;/span>&lt;span class="n">v&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">found_existing&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// We inserted an entry, specify the new value
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// This is a conditional in case creating the new value is expensive
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">v&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">value_ptr&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;my value&amp;#34;&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">value&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">v&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">value_ptr&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// use the value
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>You can find more APIs &lt;a href="https://github.com/ziglang/zig/blob/master/lib/std/hash_map.zig#L342">by going here&lt;/a> and using your browser&amp;rsquo;s builtin search for &lt;code>pub fn&lt;/code>.&lt;/p>
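&lt;p>Putting the calls above together, here is a minimal sketch that counts words with &lt;code>getOrPut&lt;/code>. The &lt;code>countWords&lt;/code> helper is a made-up name for illustration, and the code assumes a reasonably recent Zig release, since standard library details shift between versions.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import(&amp;#34;std&amp;#34;);

// Hypothetical helper: counts how often each word occurs.
// Keys are borrowed: the map stores the slices, not copies of the bytes,
// so `words` must outlive the returned map.
fn countWords(allocator: std.mem.Allocator, words: []const []const u8) !std.StringHashMap(u32) {
    var counts = std.StringHashMap(u32).init(allocator);
    errdefer counts.deinit();

    for (words) |word| {
        const entry = try counts.getOrPut(word);
        // A freshly inserted entry has an undefined value until we set it.
        if (!entry.found_existing) entry.value_ptr.* = 0;
        entry.value_ptr.* += 1;
    }
    return counts;
}

test &amp;#34;count words&amp;#34; {
    var counts = try countWords(std.testing.allocator, &amp;amp;.{ &amp;#34;a&amp;#34;, &amp;#34;b&amp;#34;, &amp;#34;a&amp;#34; });
    defer counts.deinit();

    try std.testing.expect(counts.get(&amp;#34;a&amp;#34;).? == 2);
    try std.testing.expect(counts.get(&amp;#34;b&amp;#34;).? == 1);
    try std.testing.expect(counts.get(&amp;#34;c&amp;#34;) == null);
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Save it to a file and run it with &lt;code>zig test&lt;/code>. The testing allocator also reports a leak if you forget the &lt;code>deinit&lt;/code>.&lt;/p>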
&lt;h2 id="about-key-data-types">About key data types&lt;/h2>
&lt;p>Zig hash map types start with the data type of the key:&lt;/p>
&lt;ul>
&lt;li>&lt;code>std.StringHashMap&lt;/code> - uses a good default hashing function for string keys&lt;/li>
&lt;li>&lt;code>std.AutoHashMap&lt;/code> - uses a good default hashing function for most data types&lt;/li>
&lt;li>&lt;code>std.HashMap&lt;/code> - the &amp;ldquo;bring your own hashing function&amp;rdquo; option&lt;/li>
&lt;/ul>
&lt;p>Note: &lt;code>AutoHashMap&lt;/code> does not support &lt;em>slices&lt;/em>, such as &lt;code>[]const u8&lt;/code> string slices, because a slice is a pointer to an array and it is ambiguous whether you intend to hash &lt;em>the array elements&lt;/em> or &lt;em>the pointer itself&lt;/em>. You can use the generic &lt;code>std.HashMap&lt;/code> for any slice type; you just have to provide your own hash and equality functions.&lt;/p>
&lt;h2 id="hashmaps-are-also-sets">Hashmaps are also sets&lt;/h2>
&lt;p>A set in Zig is just a hashmap with a &lt;code>void&lt;/code> value:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">my_hash_map&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">AutoHashMap&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">K&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="p">).&lt;/span>&lt;span class="n">init&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">my_hash_map&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">put&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">key&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{});&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// `{}` is a value of type `void`
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="advanced-usages">Advanced usages&lt;/h2>
&lt;p>If you&amp;rsquo;re just getting started with Zig, don&amp;rsquo;t worry too much about the below. Just know that you have options available should you need to reduce memory usage or optimize your use of hashmaps in the future.&lt;/p>
&lt;h3 id="managed-vs-unmanaged-hashmaps">Managed vs. unmanaged hashmaps&lt;/h3>
&lt;p>You can add &lt;code>Unmanaged&lt;/code> to the end of a Zig hashmap data type, e.g. &lt;code>std.StringHashMapUnmanaged&lt;/code> in order to get the &lt;em>unmanaged&lt;/em> version.&lt;/p>
&lt;p>The unmanaged version simply doesn&amp;rsquo;t carry an &lt;code>allocator&lt;/code> internally; instead, you must pass the allocator into every method that may allocate or free memory. While the saving is only a few bytes per map, this can be a useful optimization if you&amp;rsquo;re storing many hashmaps, for example.&lt;/p>
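&lt;p>To make the trade-off concrete, here is a small sketch of a struct that owns several unmanaged maps and keeps a single allocator for all of them. &lt;code>Registry&lt;/code> and its fields are hypothetical names, not anything from the standard library.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import(&amp;#34;std&amp;#34;);

// Hypothetical container: several maps sharing one stored allocator,
// instead of each map carrying its own copy.
const Registry = struct {
    allocator: std.mem.Allocator,
    tags: std.StringHashMapUnmanaged(u32) = .{},
    scores: std.StringHashMapUnmanaged(u32) = .{},

    fn deinit(self: *Registry) void {
        // Freeing needs the allocator too.
        self.tags.deinit(self.allocator);
        self.scores.deinit(self.allocator);
    }
};

test &amp;#34;maps share the owner allocator&amp;#34; {
    var registry = Registry{ .allocator = std.testing.allocator };
    defer registry.deinit();

    // put may allocate, so it takes the allocator.
    try registry.tags.put(registry.allocator, &amp;#34;player&amp;#34;, 1);
    try registry.scores.put(registry.allocator, &amp;#34;player&amp;#34;, 42);

    // Lookups never allocate, so they take no allocator.
    try std.testing.expect(registry.scores.get(&amp;#34;player&amp;#34;).? == 42);
}
&lt;/code>&lt;/pre>&lt;/div>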
&lt;p>Managed:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">my_hash_map&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">StringHashMap&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">V&lt;/span>&lt;span class="p">).&lt;/span>&lt;span class="n">init&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Unmanaged:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">my_hash_map&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">StringHashMapUnmanaged&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">V&lt;/span>&lt;span class="p">){};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="array-hash-maps">Array hash maps&lt;/h3>
&lt;p>Zig actually provides &lt;a href="https://github.com/ziglang/zig/pull/5999">&lt;em>two hashmap implementations&lt;/em>&lt;/a> in the standard library:&lt;/p>
&lt;p>&lt;code>std.HashMap&lt;/code>, perfect for every-day use cases:&lt;/p>
&lt;ul>
&lt;li>Optimized for lookup times primarily&lt;/li>
&lt;li>Optimized for insertion/removal times secondarily&lt;/li>
&lt;/ul>
&lt;p>&lt;code>std.ArrayHashMap&lt;/code>, useful in &lt;em>some&lt;/em> situations:&lt;/p>
&lt;ul>
&lt;li>Iterating over the hashmap is an order of magnitude faster (a contiguous array)&lt;/li>
&lt;li>Insertion order is preserved.&lt;/li>
&lt;li>You can index into the underlying data like an array if you like&lt;/li>
&lt;li>Deletions can be performed one of two ways, mirroring the &lt;code>ArrayList&lt;/code> API:
&lt;ul>
&lt;li>&lt;code>swapRemove&lt;/code>: swaps the target element with the last element in the list to remove it&lt;/li>
&lt;li>&lt;code>orderedRemove&lt;/code>: removes target element by shifting all elements forward, maintaining current ordering&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
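&lt;p>The ordering behavior listed above is easiest to see in a short test. This is a sketch against &lt;code>std.AutoArrayHashMap&lt;/code> as found in recent Zig releases; the keys and values are arbitrary.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import(&amp;#34;std&amp;#34;);

test &amp;#34;array hash map keeps insertion order&amp;#34; {
    var map = std.AutoArrayHashMap(u32, []const u8).init(std.testing.allocator);
    defer map.deinit();

    try map.put(10, &amp;#34;ten&amp;#34;);
    try map.put(20, &amp;#34;twenty&amp;#34;);
    try map.put(30, &amp;#34;thirty&amp;#34;);
    try map.put(40, &amp;#34;forty&amp;#34;);

    // keys() and values() are plain slices, in insertion order.
    try std.testing.expectEqualSlices(u32, &amp;amp;[_]u32{ 10, 20, 30, 40 }, map.keys());

    // swapRemove fills the hole with the last entry, so the order changes.
    _ = map.swapRemove(10);
    try std.testing.expectEqualSlices(u32, &amp;amp;[_]u32{ 40, 20, 30 }, map.keys());

    // orderedRemove shifts the later entries down, so the order is kept.
    _ = map.orderedRemove(20);
    try std.testing.expectEqualSlices(u32, &amp;amp;[_]u32{ 40, 30 }, map.keys());
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Both removal functions return a &lt;code>bool&lt;/code> telling you whether the key was present, which the sketch discards.&lt;/p>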
&lt;h3 id="hashmap-context">Hashmap context&lt;/h3>
&lt;p>If you choose to use &lt;code>std.HashMap&lt;/code> or &lt;code>std.ArrayHashMap&lt;/code> directly (without the &lt;code>String&lt;/code> or &lt;code>Auto&lt;/code> prefix), then you&amp;rsquo;ll find it wants a &lt;em>context&lt;/em> parameter, plus a &lt;em>max load percentage&lt;/em> for &lt;code>std.HashMap&lt;/code> (&lt;code>std.ArrayHashMap&lt;/code> takes a &lt;code>store_hash&lt;/code> flag in that position instead):&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">my_hash_map&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">HashMap&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">K&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">V&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">hash_map&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">AutoContext&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">K&lt;/span>&lt;span class="p">),&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">hash_map&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">default_max_load_percentage&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The &lt;em>context&lt;/em> parameter lets you embed some of your own data within the hash map type. This can be useful for &lt;a href="https://zig.news/andrewrk/how-to-use-hash-map-contexts-to-save-memory-when-doing-a-string-table-3l33">reducing the amount of memory that a hash map takes up when doing a string table&lt;/a>.&lt;/p>
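&lt;p>A context is also how you bring your own hash and equality functions, for example for slice keys that &lt;code>AutoHashMap&lt;/code> rejects. Below is a minimal sketch for &lt;code>std.HashMap&lt;/code>; &lt;code>SliceContext&lt;/code> and &lt;code>PathMap&lt;/code> are hypothetical names, and hashing the elements with Wyhash is just one reasonable choice. The array variant expects a slightly different context shape, which is not shown here.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import(&amp;#34;std&amp;#34;);

// Hypothetical context: hashes and compares the *elements* of a
// []const u32 key rather than the pointer.
const SliceContext = struct {
    pub fn hash(_: SliceContext, key: []const u32) u64 {
        return std.hash.Wyhash.hash(0, std.mem.sliceAsBytes(key));
    }
    pub fn eql(_: SliceContext, a: []const u32, b: []const u32) bool {
        return std.mem.eql(u32, a, b);
    }
};

const PathMap = std.HashMap(
    []const u32,
    bool,
    SliceContext,
    std.hash_map.default_max_load_percentage,
);

test &amp;#34;slice keys compare by contents&amp;#34; {
    var map = PathMap.init(std.testing.allocator);
    defer map.deinit();

    const a = [_]u32{ 1, 2, 3 };
    const b = [_]u32{ 1, 2, 3 }; // a separate array with equal contents
    try map.put(&amp;amp;a, true);

    // Found via `b` because the context looks at the elements.
    try std.testing.expect(map.get(&amp;amp;b).? == true);
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>As with &lt;code>StringHashMap&lt;/code>, the map stores the slice, not a copy of its elements, so the key memory has to outlive the entry.&lt;/p>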
&lt;h3 id="pick-your-hashmap">Pick your hashmap&lt;/h3>
&lt;p>Regular implementation:&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Key type&lt;/th>
&lt;th>Managed?&lt;/th>
&lt;th>How to initialize&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;code>String&lt;/code>&lt;/td>
&lt;td>yes&lt;/td>
&lt;td>&lt;code>std.StringHashMap(V).init(allocator)&lt;/code>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>Auto&lt;/code>&lt;/td>
&lt;td>yes&lt;/td>
&lt;td>&lt;code>std.AutoHashMap(K, V).init(allocator)&lt;/code>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>String&lt;/code>&lt;/td>
&lt;td>&lt;code>Unmanaged&lt;/code>&lt;/td>
&lt;td>&lt;code>std.StringHashMapUnmanaged(V){}&lt;/code>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>Auto&lt;/code>&lt;/td>
&lt;td>&lt;code>Unmanaged&lt;/code>&lt;/td>
&lt;td>&lt;code>std.AutoHashMapUnmanaged(K, V){}&lt;/code>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>&lt;code>ArrayHashMap&lt;/code> implementation:&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Key type&lt;/th>
&lt;th>Managed?&lt;/th>
&lt;th>How to initialize&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;code>String&lt;/code>&lt;/td>
&lt;td>yes&lt;/td>
&lt;td>&lt;code>std.StringArrayHashMap(V).init(allocator)&lt;/code>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>Auto&lt;/code>&lt;/td>
&lt;td>yes&lt;/td>
&lt;td>&lt;code>std.AutoArrayHashMap(K, V).init(allocator)&lt;/code>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>String&lt;/code>&lt;/td>
&lt;td>&lt;code>Unmanaged&lt;/code>&lt;/td>
&lt;td>&lt;code>std.StringArrayHashMapUnmanaged(V){}&lt;/code>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>Auto&lt;/code>&lt;/td>
&lt;td>&lt;code>Unmanaged&lt;/code>&lt;/td>
&lt;td>&lt;code>std.AutoArrayHashMapUnmanaged(K, V){}&lt;/code>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h3 id="learn-more">Learn more&lt;/h3>
&lt;p>The source code is very readable:&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://github.com/ziglang/zig/blob/master/lib/std/hash_map.zig">&lt;code>std.HashMap&lt;/code>&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://github.com/ziglang/zig/blob/master/lib/std/array_hash_map.zig">&lt;code>std.ArrayHashMap&lt;/code>&lt;/a>&lt;/li>
&lt;/ul>
&lt;h3 id="help-improve-this-page">Help improve this page&lt;/h3>
&lt;p>I wrote this article quickly because I needed to explain my choice of hashmaps in the &lt;a href="https://devlog.hexops.org/categories/build-an-ecs/">&amp;ldquo;Let&amp;rsquo;s build an Entity Component System from scratch&amp;rdquo;&lt;/a> series and there was no better source of this info. I&amp;rsquo;m sure there are things that can be improved.&lt;/p>
&lt;p>&lt;a href="https://github.com/hexops/devlog/blob/main/_posts/2022-01-29-zig-hashmaps-explained.md">Feel free to send a PR!&lt;/a>&lt;/p></description></item><item><title>Let's build an Entity Component System from scratch (part 1)</title><link>https://devlog.hexops.org/2022/lets-build-ecs-part-1/</link><pubDate>Sun, 16 Jan 2022 00:00:00 +0000</pubDate><guid>https://devlog.hexops.com/2022/lets-build-ecs-part-1/</guid><description>&lt;p>In this multi-part series we&amp;rsquo;ll build the Entity Component System used in &lt;a href="https://hexops.com/mach">Mach engine&lt;/a> in &lt;a href="https://ziglang.org">the Zig programming language&lt;/a> from first principles (asking what an ECS is and walking through what problems it solves) all the way to writing an implementation in a low-level programming language. The only thing you need to follow along is some programming experience and a desire to learn.&lt;/p>
&lt;p>In this article, we&amp;rsquo;ll mostly go over the problem space, data oriented design, the things we need our ECS to solve, etc. In the next article, implementation will begin.&lt;/p>
&lt;ul>
&lt;li>&lt;a href="#motivation">Motivation&lt;/a>&lt;/li>
&lt;li>&lt;a href="#my-approach-to-complex-software-architecture">My approach to complex software architecture&lt;/a>&lt;/li>
&lt;li>&lt;a href="#what-really-is-an-entity-component-system-anyway">What really is an entity component system, anyway?&lt;/a>&lt;/li>
&lt;li>&lt;a href="#what-problems-does-an-ecs-solve">What problems does an ECS solve?&lt;/a>&lt;/li>
&lt;li>&lt;a href="#start-with-data-oriented-design">Start with data oriented design&lt;/a>&lt;/li>
&lt;li>&lt;a href="#what-would-data-oriented-design-look-like-code-starts-here">What would data oriented design look like? (code starts here!)&lt;/a>&lt;/li>
&lt;li>&lt;a href="#sparse-data-storage">Sparse data storage&lt;/a>
&lt;ul>
&lt;li>&lt;a href="#comptime-sparse-data">Comptime sparse data&lt;/a>&lt;/li>
&lt;li>&lt;a href="#runtime-sparse-data">Runtime sparse data&lt;/a>&lt;/li>
&lt;li>&lt;a href="#improving-performance">Improving performance&lt;/a>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;a href="#archetype-storage">Archetype storage&lt;/a>
&lt;ul>
&lt;li>&lt;a href="#comptime-archetype-storage">Comptime archetype storage&lt;/a>&lt;/li>
&lt;li>&lt;a href="#runtime-archetype-storage">Runtime archetype storage&lt;/a>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;a href="#designing-our-ecs">Designing our ECS&lt;/a>&lt;/li>
&lt;li>&lt;a href="#next-up-starting-our-ecs-implementation">Next up: starting our ECS implementation&lt;/a>&lt;/li>
&lt;/ul>
&lt;h2 id="motivation">Motivation&lt;/h2>
&lt;p>&lt;a class="imglink" href="https://devlog.hexops.org/img/2022/lets-build-ecs-part-1/img2.png">&lt;img class="color-auto" src="https://devlog.hexops.org/img/2022/lets-build-ecs-part-1/img2.png">&lt;/a>&lt;/p>
&lt;p>I&amp;rsquo;ve used and written more traditional &lt;a href="https://en.wikipedia.org/wiki/Object-oriented_programming">OOP&lt;/a> &lt;a href="https://en.wikipedia.org/wiki/Scene_graph">scene graphs&lt;/a> in the past. These are often the core engine architecture used to represent everything in game worlds: they&amp;rsquo;re used in Unity historically (which is now migrating to ECS due to popular demand) and even &lt;a href="https://godotengine.org/article/why-isnt-godot-ecs-based-game-engine">in other modern engines such as Godot&lt;/a>.&lt;/p>
&lt;p>For &lt;a href="https://hexops.com/mach">Mach engine&lt;/a>, however, we&amp;rsquo;re adopting an ECS as our core architecture. ECS has gained great momentum in recent years for its composition and performance benefits.&lt;/p>
&lt;h2 id="my-approach-to-complex-software-architecture">My approach to complex software architecture&lt;/h2>
&lt;ol>
&lt;li>What user problems does the proposed architecture (scene graphs, ECS, React-like frameworks, etc.) solve?&lt;/li>
&lt;li>How does the proposed architecture &lt;em>typically&lt;/em> solve such problems?&lt;/li>
&lt;/ol>
&lt;p>The key point here is that, personally, I find it useful to intentionally avoid looking directly at code for the implementations themselves.&lt;/p>
&lt;p>&lt;a class="imglink" href="https://devlog.hexops.org/img/2022/lets-build-ecs-part-1/img3.png">&lt;img class="color-auto" src="https://devlog.hexops.org/img/2022/lets-build-ecs-part-1/img3.png">&lt;/a>&lt;/p>
&lt;p>I&amp;rsquo;ve used this approach &lt;a href="https://github.com/hexops/vecty">to great success before&lt;/a>: the nice thing about this is that the end result really &lt;em>fits the language&lt;/em>, using patterns and features specific to the language - it doesn&amp;rsquo;t just end up feeling like a port of some other language&amp;rsquo;s implementation.&lt;/p>
&lt;p>I&amp;rsquo;ve researched a bit about ECS in general, and have chatted with people familiar with ECS, but haven&amp;rsquo;t read anyone else&amp;rsquo;s code. No doubt, initially, I&amp;rsquo;ll get some aspects wrong! As this series of articles progresses over the coming months, though, you&amp;rsquo;ll see how this can be a winning tactic as we learn together!&lt;/p>
&lt;h2 id="what-really-is-an-entity-component-system-anyway">What really is an entity component system, anyway?&lt;/h2>
&lt;p>&lt;a class="imglink" href="https://devlog.hexops.org/img/2022/lets-build-ecs-part-1/img4.png">&lt;img class="color-auto" src="https://devlog.hexops.org/img/2022/lets-build-ecs-part-1/img4.png">&lt;/a>&lt;/p>
&lt;p>I&amp;rsquo;ve found the Rust project &lt;a href="https://bevyengine.org/learn/book/getting-started/ecs/#bevy-ecs">Bevy ECS to have a great succinct explanation&lt;/a>, which I further simplify here:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Entities&lt;/strong>: a unique integer&lt;/li>
&lt;li>&lt;strong>Components&lt;/strong>: structs of plain old data&lt;/li>
&lt;li>&lt;strong>Systems&lt;/strong>: normal functions&lt;/li>
&lt;/ul>
&lt;p>When you hear this, things may start to sound a whole lot simpler! Those are the core concepts of an ECS.&lt;/p>
&lt;p>There is one other concept of an ECS that I think is particularly important:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Archetype&lt;/strong>: A &lt;em>chosen set of components&lt;/em> that an entity of a certain type will have.&lt;/li>
&lt;/ul>
&lt;p>&lt;a class="imglink" href="https://devlog.hexops.org/img/2022/lets-build-ecs-part-1/img5.png">&lt;img class="color-auto" src="https://devlog.hexops.org/img/2022/lets-build-ecs-part-1/img5.png">&lt;/a>&lt;/p>
&lt;h2 id="what-problems-does-an-ecs-solve">What problems does an ECS solve?&lt;/h2>
&lt;p>I&amp;rsquo;ve identified two problems it solves.&lt;/p>
&lt;p>&lt;a class="imglink" href="https://devlog.hexops.org/img/2022/lets-build-ecs-part-1/img6.png">&lt;img class="color-auto" src="https://devlog.hexops.org/img/2022/lets-build-ecs-part-1/img6.png">&lt;/a>&lt;/p>
&lt;p>First and foremost is &lt;em>making it easy for game developers to architect their code&lt;/em> compared to them doing it manually. If it&amp;rsquo;s easier for someone to structure their code themselves, manually, then such a system is not useful at all! Of course, as complexity and the scale of software increases then a &lt;em>consistent&lt;/em> system is &lt;em>far more useful&lt;/em> than a bunch of ad-hoc systems.&lt;/p>
&lt;p>The second problem ECS solves, I believe, is making your software architecture &lt;em>efficient&lt;/em> without you really having to think too much about it. You don&amp;rsquo;t have to think about how to structure all your code &amp;amp; data for logic first, &lt;em>and then for performance&lt;/em>, but rather get good performance by nature of following patterns.&lt;/p>
&lt;h2 id="start-with-data-oriented-design">Start with data oriented design&lt;/h2>
&lt;p>&lt;a class="imglink" href="https://devlog.hexops.org/img/2022/lets-build-ecs-part-1/img7.png">&lt;img class="color-auto" src="https://devlog.hexops.org/img/2022/lets-build-ecs-part-1/img7.png">&lt;/a>&lt;/p>
&lt;p>ECS overlaps with &lt;a href="https://dataorienteddesign.com/site.php">&lt;em>data oriented design&lt;/em>&lt;/a> in many ways (although its &lt;a href="https://github.com/hexops/mach/issues/127#issuecomment-1014176503">roots are &lt;em>much&lt;/em> earlier&lt;/a>). There are many talks about data oriented design including &lt;a href="https://www.youtube.com/watch?v=rX0ItVEVjHc">Mike Acton&amp;rsquo;s at CppCon&lt;/a>, and my personal favorite &lt;a href="https://media.handmade-seattle.com/practical-data-oriented-design/">&amp;ldquo;A Practical Guide to Applying Data-Oriented Design&amp;rdquo;&lt;/a> by Andrew Kelley. You don&amp;rsquo;t have to watch either, I&amp;rsquo;ll cover the important concepts we use here. But I highly suggest &lt;strong>every&lt;/strong> developer watch Andrew Kelley&amp;rsquo;s talk above. It&amp;rsquo;s eye-opening no matter what kind of programming you are doing.&lt;/p>
&lt;p>Let&amp;rsquo;s work forwards, not backwards: We&amp;rsquo;re not starting by building an ECS, we&amp;rsquo;re starting by building a proper data oriented design for CPU cache and memory efficiency, and then we&amp;rsquo;re working towards &amp;ldquo;how do we make that easier for people to do by default?&amp;rdquo; and looking to existing ECS architectures for inspiration.&lt;/p>
&lt;h2 id="what-would-data-oriented-design-look-like-code-starts-here">What would data oriented design look like? (code starts here!)&lt;/h2>
&lt;p>A simple first approach would be something like this:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Player&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">name&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">[]&lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// a string / byte slice
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">location&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Vec3&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">velocity&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Vec3&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">health&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">team&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Team&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">alive&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">bool&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Cat&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">name&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">[]&lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// a string / byte slice
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">location&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Vec3&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Monster&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">location&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Vec3&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">health&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="c1">// All the players, cats, monsters in our game world.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">players&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ArrayList&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Player&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">cats&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ArrayList&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Cat&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">monsters&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ArrayList&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Monster&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="c1">// The index of a player in the players array, a cat in the cats array, etc.!
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Entity&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u32&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Now we can refer to players, cats, or monsters by using an entity ID (their index in the array), which we call an &lt;em>entity&lt;/em>. We could also write functions (called systems) which iterate over these arrays and e.g. compute physics for players.&lt;/p>
&lt;p>However, we can improve this quite a bit!&lt;/p>
&lt;h2 id="sparse-data-storage">Sparse data storage&lt;/h2>
&lt;h3 id="comptime-sparse-data">Comptime sparse data&lt;/h3>
&lt;p>It&amp;rsquo;s likely that most players in our game will be alive, with only a few dead at any given time - yet we&amp;rsquo;re paying the cost of storing which players are dead for &lt;em>every living player&lt;/em> (via the &lt;code>Player.alive&lt;/code> struct field)!&lt;/p>
&lt;p>We can eliminate paying the cost of &lt;code>alive: bool&lt;/code> per player by removing the field entirely, and having what I call &lt;em>compile time sparse data&lt;/em> instead:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">alive_players&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ArrayList&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Player&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">dead_players&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ArrayList&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Player&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This not only reduces the amount of memory each &lt;code>Player&lt;/code> entity takes up (because we no longer store an &lt;code>alive: bool&lt;/code> per player), but it also:&lt;/p>
&lt;ol>
&lt;li>Improves performance by ensuring more players fit into L1/L2/L3 cache.&lt;/li>
&lt;li>Reduces the number of players we must skip (and the potential cache misses that come with skipping): when we&amp;rsquo;re only interested in alive players, we no longer have to step over dead ones while iterating.&lt;/li>
&lt;/ol>
&lt;p>This introduces some complexity for us to deal with, though:&lt;/p>
&lt;ul>
&lt;li>Now if a player goes from dead-&amp;gt;alive, or alive-&amp;gt;dead, we need logic to remove it from the old array and put it in the new one.&lt;/li>
&lt;li>When we move a player from one array to another, the Entity ID we use to refer to that player (the array index) has changed! So if someone is storing a player Entity ID in order to refer to that player later, we&amp;rsquo;d need logic to update it.&lt;/li>
&lt;/ul>
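&lt;p>Here&amp;rsquo;s a rough sketch of what that bookkeeping might look like. It assumes &lt;code>ArrayList&lt;/code> is &lt;code>std.ArrayListUnmanaged&lt;/code> (newer Zig releases may prefer &lt;code>.empty&lt;/code> over &lt;code>.{}&lt;/code> as the initializer), and &lt;code>killPlayer&lt;/code> is a hypothetical helper written purely for illustration:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">const std = @import("std");
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">const Entity = u32;
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">const Player = struct { name: []const u8, health: u8 };
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">// Assumption: the allocator-per-call ("unmanaged") list type.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">const ArrayList = std.ArrayListUnmanaged;
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">var alive_players: ArrayList(Player) = .{};
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">var dead_players: ArrayList(Player) = .{};
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">// Hypothetical helper: moves a player from alive_players to dead_players
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">// and returns the index it now has in dead_players.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">fn killPlayer(allocator: std.mem.Allocator, id: Entity) !Entity {
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">    // Reserve space first, so a failed allocation cannot lose the player.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">    try dead_players.ensureUnusedCapacity(allocator, 1);
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">    // swapRemove fills the hole with the last alive player, so that
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">    // player's index changes as well.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">    const player = alive_players.swapRemove(id);
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">    dead_players.appendAssumeCapacity(player);
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">    return @intCast(dead_players.items.len - 1);
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">}
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Note that up to two IDs go stale in this sketch: the player who died, and whichever alive player &lt;code>swapRemove&lt;/code> moved into the vacated slot.&lt;/p>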
&lt;p>Now we start to see one thing our ECS needs to make simpler!&lt;/p>
&lt;p>I call this type of data &lt;em>comptime sparse data&lt;/em>.&lt;/p>
&lt;h3 id="runtime-sparse-data">Runtime sparse data&lt;/h3>
&lt;p>In an ideal world, we&amp;rsquo;re able to pre-declare all sparse data at compile time like we did above:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">alive_players&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ArrayList&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Player&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">dead_players&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ArrayList&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Player&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>But sometimes, this just isn&amp;rsquo;t possible:&lt;/p>
&lt;ul>
&lt;li>Maybe players in your game can give other players a custom nickname to display above their head. Again, for most players this won&amp;rsquo;t be set - but for some players it will be! Ideally we don&amp;rsquo;t have to pay the cost of storing a nickname string pointer for every player in the game without one.&lt;/li>
&lt;li>Maybe a handful of players out of thousands are given the speciality of having a custom weapon: they get to choose its type, a custom name for it, and even the damage it should do! Where should we store that information?&lt;/li>
&lt;li>&amp;hellip;&lt;/li>
&lt;/ul>
&lt;p>In this case, we could use a hash map:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">PlayerNickname&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">[]&lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// a string
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Weapon&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">custom_name&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">[]&lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// a string
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">WeaponType&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">damage&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">players&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ArrayList&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Player&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{};&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// all players
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">players_with_nicknames&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">AutoHashMap&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Entity&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">PlayerNickname&lt;/span>&lt;span class="p">).&lt;/span>&lt;span class="n">init&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">players_with_weapons&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">AutoHashMap&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Entity&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Weapon&lt;/span>&lt;span class="p">).&lt;/span>&lt;span class="n">init&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Now we&amp;rsquo;ve got a mapping of player Entity IDs -&amp;gt; their nicknames and weapons. We only pay the cost of storing this information for players that do actually have these specialties - not for every player.&lt;/p>
&lt;p>I call this type of data &lt;em>runtime sparse data&lt;/em>.&lt;/p>
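&lt;p>To make that a bit more concrete, here is a small sketch of reading runtime sparse data back out. &lt;code>displayName&lt;/code> is a hypothetical helper I made up for illustration; it assumes the managed &lt;code>std.AutoHashMap&lt;/code> used above:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">const std = @import("std");
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">const Entity = u32;
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">const PlayerNickname = []const u8; // a string
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">// Hypothetical helper: the name to draw above a player's head.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">// Players without an entry in the map fall back to their regular name.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">fn displayName(
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">    nicknames: *const std.AutoHashMap(Entity, PlayerNickname),
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">    entity: Entity,
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">    regular_name: []const u8,
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">) []const u8 {
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">    // get() returns an optional; null means no nickname was stored.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">    return nicknames.get(entity) orelse regular_name;
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">}
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Assigning a nickname would be a &lt;code>put(entity, nickname)&lt;/code> call on the same map, and clearing it a &lt;code>remove(entity)&lt;/code>. Keep in mind the map is keyed by Entity ID, so it inherits the same problem as before: if a player&amp;rsquo;s array index changes, its key here has to change too.&lt;/p>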
&lt;h3 id="improving-performance">Improving performance&lt;/h3>
&lt;p>Consider our player storage as it stands right now:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Player&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">name&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">[]&lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// a string / byte slice
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">location&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Vec3&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">velocity&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Vec3&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">health&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">team&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Team&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">players&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ArrayList&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Player&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{};&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// all players
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Because of the way structs get laid out in memory with padding, our players array above would end up having a larger memory footprint than needed. So we actually benefit from using a separate array for every type of data (thanks, Unity!):&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">player_names&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ArrayList&lt;/span>&lt;span class="p">([]&lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">player_locations&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ArrayList&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Vec3&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">player_velocities&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ArrayList&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Vec3&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">player_healths&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ArrayList&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">player_teams&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ArrayList&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Team&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Luckily, we don&amp;rsquo;t actually have to enumerate all our fields out like this: Zig has a nice &lt;code>MultiArrayList&lt;/code> type which does this for us. We only need to change one line:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-diff" data-lang="diff">&lt;span class="line">&lt;span class="cl">const Player = struct {
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> name: []const u8, // a string / byte slice
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> location: Vec3,
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> velocity: Vec3,
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> health: u8,
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> team: Team,
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">};
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gd">-var players: ArrayList(Player) = .{}; // all players
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="gd">&lt;/span>&lt;span class="gi">+var players: MultiArrayList(Player) = .{}; // all players
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Not only does this use less memory, it also improves CPU cache efficiency a ton, especially when iterating over a lot of players to do work with them. If you&amp;rsquo;re curious why, then you should watch Andrew Kelley&amp;rsquo;s &lt;a href="https://media.handmade-seattle.com/practical-data-oriented-design/">&amp;ldquo;A Practical Guide to Applying Data-Oriented Design&amp;rdquo;&lt;/a> talk!&lt;/p>
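&lt;p>As a small illustration of why iteration gets cheaper, here is a sketch of a system that only touches one field. &lt;code>regenerate&lt;/code> and the trimmed-down &lt;code>Player&lt;/code> are hypothetical, written just for this example; &lt;code>items(.health)&lt;/code> is the &lt;code>std.MultiArrayList&lt;/code> accessor for a single field&amp;rsquo;s array:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">const std = @import("std");
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">const Player = struct {
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">    name: []const u8, // a string / byte slice
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">    health: u8,
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">};
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">// Hypothetical system: heal every player by one point.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">fn regenerate(players: *std.MultiArrayList(Player)) void {
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">    // items(.health) is a plain []u8 holding every player's health
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">    // back to back; this loop never reads the name slices.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">    for (players.items(.health)) |*health| {
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">        health.* +|= 1; // saturating add: stays at 255 instead of overflowing
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">    }
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">}
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>With a regular &lt;code>ArrayList(Player)&lt;/code>, the same loop would pull each player&amp;rsquo;s name (and every other field) through the cache just to reach one byte of health.&lt;/p>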
&lt;h2 id="archetype-storage">Archetype storage&lt;/h2>
&lt;h3 id="comptime-archetype-storage">Comptime archetype storage&lt;/h3>
&lt;p>Up until now, we&amp;rsquo;ve assumed we have pre-defined archetypes (&amp;ldquo;player&amp;rdquo;, &amp;ldquo;cat&amp;rdquo;, &amp;ldquo;monster&amp;rdquo;):&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="c1">// All the players, cats, monsters in our game world.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">players&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ArrayList&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Player&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">cats&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ArrayList&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Cat&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">monsters&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ArrayList&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Monster&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This is ideal: we don&amp;rsquo;t need to ask the computer to do any work to find out where players, cats, or monsters are stored - we just &lt;em>know at compile time&lt;/em> because &lt;em>they&amp;rsquo;re in that variable&lt;/em>. When someone uses our ECS, we could have them write a compile time function like:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="n">World&lt;/span>&lt;span class="p">(.{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">Player&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">Cat&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">Monster&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">})&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>And that&amp;rsquo;s great because it means our ECS &amp;ldquo;world&amp;rdquo; can be aware ahead of time exactly which archetypes it needs to store. It could write out those &lt;code>var players: ArrayList...&lt;/code> variables for us.&lt;/p>
&lt;p>I call this &lt;em>comptime archetype storage&lt;/em>.&lt;/p>
&lt;h3 id="runtime-archetype-storage">Runtime archetype storage&lt;/h3>
&lt;p>However, real games are much more complex: we might not really know at the time we&amp;rsquo;re declaring the &lt;code>World&lt;/code> all the different archetypes we plan on storing. Code gets messy. In some cases, maybe we even need to define some archetypes &lt;em>of a common type&lt;/em> at runtime. For example, if we wanted to allow configuring &lt;code>red&lt;/code> and &lt;code>blue&lt;/code> here (or the number of teams) via a configuration file on disk or via a GUI:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">red_team_players&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ArrayList&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Player&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">blue_team_players&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">ArrayList&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Player&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Of course our &lt;code>Player&lt;/code> could have a &lt;code>team&lt;/code> field in it to represent the team, but there may be cases where storing &lt;em>a separate list of entities&lt;/em> like this is needed without pre-declaring it. If we want to do that, we could use a hashmap:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">runtime_archetypes&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">StringHashMap&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">anyopaque&lt;/span>&lt;span class="p">).&lt;/span>&lt;span class="n">init&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>In this model, we could store the &lt;em>archetype string name&lt;/em> as the hashmap key (for example, the &lt;code>@typeName(Player)&lt;/code> if we wanted, or maybe a custom name like &lt;code>red&lt;/code>, &lt;code>blue&lt;/code>, etc.). We use &lt;code>StringHashMap&lt;/code> here because it hashes the string contents; &lt;code>AutoHashMap&lt;/code> refuses slice keys like &lt;code>[]const u8&lt;/code> at compile time. The values of the hashmap would need to be different types (an &lt;code>ArrayList(Player)&lt;/code>, an &lt;code>ArrayList(Monster)&lt;/code>, etc.), so we would store a type-erased &lt;code>*anyopaque&lt;/code> pointer (like a C &lt;code>void*&lt;/code>). When we get a value out, we&amp;rsquo;ll need to &amp;ldquo;know&amp;rdquo; what type of &lt;code>ArrayList&lt;/code> to cast the pointer back to. The map won&amp;rsquo;t store that info for us.&lt;/p>
&lt;p>I call this &lt;em>runtime archetype storage&lt;/em>.&lt;/p>
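&lt;p>A rough sketch of getting a typed list back out of such a map might look like the following. It assumes a &lt;code>std.StringHashMap&lt;/code> (which hashes string contents) holding the type-erased pointers and &lt;code>std.ArrayListUnmanaged&lt;/code> as the list type; &lt;code>playersIn&lt;/code> and &lt;code>PlayerList&lt;/code> are hypothetical names for illustration:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">const std = @import("std");
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">const Player = struct { name: []const u8, health: u8 };
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">const PlayerList = std.ArrayListUnmanaged(Player);
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">// Hypothetical helper: look up an archetype by name and cast it back.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">// The caller must already know that `name` refers to a list of players;
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">// the map cannot check the element type for us.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">fn playersIn(
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">    archetypes: *const std.StringHashMap(*anyopaque),
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">    name: []const u8,
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">) ?*PlayerList {
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">    const erased = archetypes.get(name) orelse return null;
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">    // Restore the alignment and pointer type that were erased on insert.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">    const list: *PlayerList = @ptrCast(@alignCast(erased));
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">    return list;
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">}
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>If the name actually pointed at an &lt;code>ArrayList(Monster)&lt;/code>, this cast would happily hand back garbage, which is exactly the kind of thing an ECS should take care of for us.&lt;/p>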
&lt;h2 id="designing-our-ecs">Designing our ECS&lt;/h2>
&lt;p>We now start to see &lt;em>some&lt;/em> of the things our ECS architecture should solve:&lt;/p>
&lt;ul>
&lt;li>Typed entity storage (how you interact with a list of players, monsters, etc.)&lt;/li>
&lt;li>Sparse data: both comptime and runtime&lt;/li>
&lt;li>Archetype storage: both comptime and runtime&lt;/li>
&lt;/ul>
&lt;p>Additionally, these are the design principles I&amp;rsquo;ve come up with:&lt;/p>
&lt;ul>
&lt;li>Clean-room implementation (I&amp;rsquo;ve not read any other ECS implementation code), just working from first principles as an engineer.&lt;/li>
&lt;li>Solve the problems ECS solves, in a way that is natural to Zig and leverages Zig comptime.&lt;/li>
&lt;li>Fast. Optimal for CPU caches, multi-threaded, leverage comptime as much as is reasonable.&lt;/li>
&lt;li>Simple. Small API footprint, should be natural and fun - not like you&amp;rsquo;re writing boilerplate.&lt;/li>
&lt;li>Enable other libraries to provide tracing, editors, visualizers, profilers, etc.&lt;/li>
&lt;/ul>
&lt;p>From this, you can easily gather that storing entities is actually only a small (but critical) portion of this system. In the next article we will get into the details of implementing this in code, and go on to explore more challenging topics like multi-threading, systems, and scheduling in future articles.&lt;/p>
&lt;h2 id="next-up-starting-our-ecs-implementation">Next up: starting our ECS implementation&lt;/h2>
&lt;p>As this series develops, all the code is being developed in the Mach repository&amp;rsquo;s &lt;code>ecs&lt;/code> subfolder &lt;a href="https://github.com/hexops/mach/tree/main/libs/ecs">on GitHub&lt;/a>. The articles will lag slightly behind.&lt;/p>
&lt;p>&lt;a href="https://devlog.hexops.org/categories/build-an-ecs">As more articles come out, you can find them here&lt;/a>. Join us in developing it, give us advice, etc. &lt;a href="https://matrix.to/#/#ecs:matrix.org">on Matrix chat&lt;/a>.&lt;/p>
&lt;p>If you like what I&amp;rsquo;m doing, you can &lt;a href="https://github.com/sponsors/emidoots">sponsor me on GitHub&lt;/a>.&lt;/p></description></item><item><title>Perfecting GLFW for Zig, and finding lurking undefined behavior that went unnoticed for 6+ years</title><link>https://devlog.hexops.org/2021/perfecting-glfw-for-zig-and-finding-undefined-behavior/</link><pubDate>Sun, 31 Oct 2021 00:00:00 +0000</pubDate><guid>https://devlog.hexops.com/2021/perfecting-glfw-for-zig-and-finding-undefined-behavior/</guid><description>&lt;p>&lt;strong>Today, I am announcing &lt;a href="https://github.com/hexops/mach-glfw">mach-glfw&lt;/a>: Ziggified GLFW bindings with 100% API coverage, zero-fuss installation, cross compilation, and more.&lt;/strong>&lt;/p>
&lt;h2 id="building-mach-for-everyone">Building Mach for everyone&lt;/h2>
&lt;p>If &lt;a href="https://github.com/hexops/mach">Mach engine&lt;/a> only benefits people interested in using that engine, and not the broader Zig (and even gamedev) community, I would consider that &lt;em>a total failure&lt;/em>.&lt;/p>
&lt;p>Whether you&amp;rsquo;re interested in using all of Mach, just some of it with your own engine / project, or just the tools/ideas we develop in the future (with Unity, Unreal, etc.), &lt;em>I truly aim to produce something that benefits you&lt;/em>.&lt;/p>
&lt;p>Mach is in super early stages. I&amp;rsquo;ve spent the last four months perfecting a Zig interface to GLFW, and making no-fuss installation and cross-compilation a reality. Today, you can benefit from that work too.&lt;/p>
&lt;h2 id="building-glfw-for-every-platform">Building GLFW for every platform&lt;/h2>
&lt;p>Just &lt;code>zig&lt;/code> and &lt;code>git&lt;/code>, that&amp;rsquo;s the idea. The GLFW C code is compiled with &lt;code>zig&lt;/code>, and the &lt;code>build.zig&lt;/code> file automatically uses &lt;code>git&lt;/code> to clone (a very minimal set of) system dependencies for you (X11 libraries, etc.).&lt;/p>
&lt;p>No installing apt packages. No dealing with missing header errors. It should just work out-of-the-box, and for every platform:&lt;/p>
&lt;p>&lt;a href="https://devlog.hexops.org/img/2021/perfecting-glfw-for-zig-and-finding-undefined-behavior/img2.png">&lt;img alt="Mach engine platform support, including Windows, Linux, Mac and cross-compilation between them with Android/iOS coming soon." class="color" src="https://devlog.hexops.org/img/2021/perfecting-glfw-for-zig-and-finding-undefined-behavior/img2.png">&lt;/a>&lt;/p>
&lt;p>Today, this works for GLFW itself. Cross-compilation of &lt;em>OpenGL and Vulkan apps&lt;/em> is not yet fully functional. &lt;a href="https://github.com/hexops/mach/issues/59">We&amp;rsquo;re working on it, though.&lt;/a>&lt;/p>
&lt;h2 id="perfecting-glfw-for-zig">Perfecting GLFW for Zig&lt;/h2>
&lt;p>Aside from platform support, mach-glfw now has:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>100% API coverage&lt;/strong> of GLFW v3.3.4. Every function, type, constant, etc. has been wrapped in a ziggified API.&lt;/li>
&lt;li>&lt;strong>130+ tests&lt;/strong>, with CI testing Linux, Windows, Mac (x86 and M1/ARM) and cross-compilation between them.&lt;/li>
&lt;/ul>
&lt;p>You might be asking: &lt;em>why Zig bindings, when Zig can interface directly with C?&lt;/em> Ziggified bindings to GLFW get us:&lt;/p>
&lt;ul>
&lt;li>Errors as &lt;a href="https://ziglang.org/documentation/master/#Errors">zig errors&lt;/a> instead of via a callback function.&lt;/li>
&lt;li>&lt;strong>Enums&lt;/strong>: always know what value a GLFW function can accept as everything is strictly typed. And use the nice Zig syntax to access enums, like &lt;code>window.getKey(.escape)&lt;/code> instead of &lt;code>c.glfwGetKey(window, c.GLFW_KEY_ESCAPE)&lt;/code>&lt;/li>
&lt;li>Slices instead of C pointers and lengths.&lt;/li>
&lt;li>&lt;a href="https://ziglang.org/documentation/master/#packed-struct">packed structs&lt;/a> represent bit masks, so you can use &lt;code>if (joystick.down and joystick.right)&lt;/code> instead of &lt;code>&amp;amp;&lt;/code> &lt;code>|&lt;/code> etc. bitwise operators.&lt;/li>
&lt;li>&lt;code>true&lt;/code> and &lt;code>false&lt;/code> instead of &lt;code>c.GLFW_TRUE&lt;/code> and &lt;code>c.GLFW_FALSE&lt;/code>.&lt;/li>
&lt;li>Generics: use &lt;code>window.hint&lt;/code> instead of &lt;code>glfwWindowHint&lt;/code>, &lt;code>glfwWindowHintString&lt;/code>, etc.&lt;/li>
&lt;li>Methods, e.g. &lt;code>my_window.hint(...)&lt;/code> instead of &lt;code>glfwWindowHint(my_window, ...)&lt;/code>&lt;/li>
&lt;/ul>
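&lt;p>To make the bit mask point concrete, here is a rough C approximation of the idea. It is not mach-glfw code: the &lt;code>HAT_*&lt;/code> constants, &lt;code>struct hat_state&lt;/code>, and &lt;code>hat_from_mask&lt;/code> are hypothetical names made up for this sketch. In Zig, a packed struct of booleans can map onto the bits directly, so the manual unpacking step shown here is only needed in the C version:&lt;/p>
&lt;pre tabindex="0">&lt;code class="language-c" data-lang="c">#include &amp;lt;stdbool.h&amp;gt;

/* Hypothetical bit flags, in the style of a C API that packs
   several directions into one integer. */
enum {
    HAT_UP    = 1,
    HAT_RIGHT = 2,
    HAT_DOWN  = 4,
    HAT_LEFT  = 8
};

/* One named boolean per bit, so callers never touch the mask. */
struct hat_state {
    bool up;
    bool right;
    bool down;
    bool left;
};

/* Unpack the raw mask once, at the API boundary. */
static struct hat_state hat_from_mask(unsigned mask)
{
    struct hat_state state;
    state.up    = (mask &amp;amp; HAT_UP) != 0;
    state.right = (mask &amp;amp; HAT_RIGHT) != 0;
    state.down  = (mask &amp;amp; HAT_DOWN) != 0;
    state.left  = (mask &amp;amp; HAT_LEFT) != 0;
    return state;
}
&lt;/code>&lt;/pre>
&lt;p>A caller can then write &lt;code>if (state.down &amp;amp;&amp;amp; state.right)&lt;/code> rather than &lt;code>if ((mask &amp;amp; HAT_DOWN) &amp;amp;&amp;amp; (mask &amp;amp; HAT_RIGHT))&lt;/code>.&lt;/p>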
&lt;h2 id="explicit-error-handling-solves-a-real-problem">Explicit error handling solves a real problem&lt;/h2>
&lt;p>GLFW traditionally passes errors to the user via a callback. This can make errors easy to ignore, as well as difficult to correlate and handle effectively at the time of the function invocation.&lt;/p>
&lt;p>We translated &lt;a href="https://github.com/hexops/mach-glfw-vulkan-example">a Vulkan example to mach-glfw&lt;/a>, which you can try for yourself today:&lt;/p>
&lt;p>&lt;a href="https://devlog.hexops.org/img/2021/perfecting-glfw-for-zig-and-finding-undefined-behavior/img1.png">&lt;img alt="mach-glfw and vulkan-zig libraries working together to produce a triangle." class="color" src="https://devlog.hexops.org/img/2021/perfecting-glfw-for-zig-and-finding-undefined-behavior/img1.png">&lt;/a>&lt;/p>
&lt;p>After porting it, we found that the example was crashing with a &lt;code>NoWindowContext&lt;/code> error. Strange?&lt;/p>
&lt;p>As it turns out, we had found &lt;a href="https://github.com/Snektron/vulkan-zig/pull/21">a small bug in the vulkan-zig example code&lt;/a>: it was calling &lt;code>glfwSwapBuffers&lt;/code>, which is not needed for Vulkan. The error went unnoticed because it&amp;rsquo;s easy to miss errors with GLFW&amp;rsquo;s error callback handling style. But with mach-glfw, it was an explicit error you have to handle, e.g. via &lt;code>try glfw.swapBuffers()&lt;/code>, so we literally couldn&amp;rsquo;t miss it.&lt;/p>
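&lt;p>The difference between the two styles can be sketched in plain C. This is a toy model, not GLFW or mach-glfw code: the error codes, &lt;code>set_error_callback&lt;/code>, and both &lt;code>swap_buffers_*&lt;/code> functions are hypothetical stand-ins invented for the illustration.&lt;/p>
&lt;pre tabindex="0">&lt;code class="language-c" data-lang="c">#include &amp;lt;stddef.h&amp;gt;

/* Hypothetical error codes; these are not GLFW constants. */
enum { ERR_NONE = 0, ERR_NO_WINDOW_CONTEXT = 1 };

typedef void (*error_callback)(int code, const char *description);

static error_callback installed_callback = NULL;

static void set_error_callback(error_callback callback)
{
    installed_callback = callback;
}

/* Callback style: the function returns nothing, so a failure is only
   visible if a callback was installed and someone reads its output. */
static void swap_buffers_callback_style(int has_context)
{
    if (!has_context &amp;amp;&amp;amp; installed_callback != NULL)
        installed_callback(ERR_NO_WINDOW_CONTEXT, &amp;#34;no window context&amp;#34;);
}

/* Explicit style: the failure is the return value, so the call site
   has to decide what to do with it. */
static int swap_buffers_explicit(int has_context)
{
    return has_context ? ERR_NONE : ERR_NO_WINDOW_CONTEXT;
}
&lt;/code>&lt;/pre>
&lt;p>With no callback installed, &lt;code>swap_buffers_callback_style(0)&lt;/code> fails silently, while &lt;code>swap_buffers_explicit(0)&lt;/code> hands the error straight back to the caller.&lt;/p>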
&lt;h2 id="finding-lurking-undefined-behavior-in-6-year-old-glfw-code">Finding lurking undefined behavior in 6+ year old GLFW code&lt;/h2>
&lt;p>One &lt;em>particularly frustrating&lt;/em> issue was tracking down why the last part of the GLFW API we needed to wrap for 100% coverage, the &lt;code>glfwSetWindowIcon&lt;/code> function, was crashing:&lt;/p>
&lt;pre tabindex="0">&lt;code>Test [76/135] Window.test &amp;#34;setIcon&amp;#34;... Illegal instruction at address 0x2cee09
upstream/glfw/src/x11_window.c:0:0: 0x2cee09 in _glfwPlatformSetWindowIcon (/mach/glfw/upstream/glfw/src/x11_window.c)
upstream/glfw/src/window.c:511:5: 0x2de484 in glfwSetWindowIcon (/mach/glfw/upstream/glfw/src/window.c)
_glfwPlatformSetWindowIcon(window, count, images);
^
/mach/glfw/src/Window.zig:508:28: 0x23a083 in Window.test &amp;#34;setIcon&amp;#34; (test)
c.glfwSetWindowIcon(self.handle, @intCast(c_int, im.len), &amp;amp;tmp[0]);
^
/usr/local/bin/lib/std/special/test_runner.zig:77:28: 0x25a0d1 in std.special.main (test)
} else test_fn.func();
^
/usr/local/bin/lib/std/start.zig:517:22: 0x2896bc in std.start.callMain (test)
root.main();
^
/usr/local/bin/lib/std/start.zig:469:12: 0x25c117 in std.start.callMainWithArgs (test)
return @call(.{ .modifier = .always_inline }, callMain, .{});
^
/usr/local/bin/lib/std/start.zig:434:12: 0x25bec2 in std.start.main (test)
return @call(.{ .modifier = .always_inline }, callMainWithArgs, .{ @intCast(usize, c_argc), c_argv, envp });
^
???:?:?: 0x7f4b7c3280b2 in ??? (???)
&lt;/code>&lt;/pre>&lt;p>That&amp;rsquo;s odd? &lt;code>Illegal instruction at address 0x2cee09&lt;/code> - are we corrupting the stack somehow? Is this a Zig compiler bug?&lt;/p>
&lt;p>Running in &lt;code>lldb&lt;/code> didn&amp;rsquo;t help with shining any light on the problem, either:&lt;/p>
&lt;p>&lt;a href="https://devlog.hexops.org/img/2021/perfecting-glfw-for-zig-and-finding-undefined-behavior/img3.png">&lt;img alt="lldb showing nothing particularly useful" class="color" src="https://devlog.hexops.org/img/2021/perfecting-glfw-for-zig-and-finding-undefined-behavior/img3.png">&lt;/a>&lt;/p>
&lt;p>After poking around at the stack, checking all pointers and lengths were valid, etc. I was at a loss. The mach-glfw code &lt;em>sure seemed valid&lt;/em>, and yet, this crash. I managed to track the crash down to the first iteration of a loop in GLFW&amp;rsquo;s &lt;code>x11_window.c&lt;/code>:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-c" data-lang="c">&lt;span class="line">&lt;span class="cl">&lt;span class="kt">void&lt;/span> &lt;span class="nf">_glfwSetWindowIconX11&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">_GLFWwindow&lt;/span>&lt;span class="o">*&lt;/span> &lt;span class="n">window&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="kt">int&lt;/span> &lt;span class="n">count&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="k">const&lt;/span> &lt;span class="n">GLFWimage&lt;/span>&lt;span class="o">*&lt;/span> &lt;span class="n">images&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">if&lt;/span> &lt;span class="p">(&lt;/span>&lt;span class="n">count&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="kt">int&lt;/span> &lt;span class="n">longCount&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="mi">0&lt;/span>&lt;span class="p">;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">for&lt;/span> &lt;span class="p">(&lt;/span>&lt;span class="kt">int&lt;/span> &lt;span class="n">i&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="mi">0&lt;/span>&lt;span class="p">;&lt;/span> &lt;span class="n">i&lt;/span> &lt;span class="o">&amp;lt;&lt;/span> &lt;span class="n">count&lt;/span>&lt;span class="p">;&lt;/span> &lt;span class="n">i&lt;/span>&lt;span class="o">++&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">longCount&lt;/span> &lt;span class="o">+=&lt;/span> &lt;span class="mi">2&lt;/span> &lt;span class="o">+&lt;/span> &lt;span class="n">images&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">i&lt;/span>&lt;span class="p">].&lt;/span>&lt;span class="n">width&lt;/span> &lt;span class="o">*&lt;/span> &lt;span class="n">images&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">i&lt;/span>&lt;span class="p">].&lt;/span>&lt;span class="n">height&lt;/span>&lt;span class="p">;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="kt">long&lt;/span>&lt;span class="o">*&lt;/span> &lt;span class="n">icon&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="nf">_glfw_calloc&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">longCount&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="k">sizeof&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kt">long&lt;/span>&lt;span class="p">));&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="kt">long&lt;/span>&lt;span class="o">*&lt;/span> &lt;span class="n">target&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">icon&lt;/span>&lt;span class="p">;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">for&lt;/span> &lt;span class="p">(&lt;/span>&lt;span class="kt">int&lt;/span> &lt;span class="n">i&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="mi">0&lt;/span>&lt;span class="p">;&lt;/span> &lt;span class="n">i&lt;/span> &lt;span class="o">&amp;lt;&lt;/span> &lt;span class="n">count&lt;/span>&lt;span class="p">;&lt;/span> &lt;span class="n">i&lt;/span>&lt;span class="o">++&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="o">*&lt;/span>&lt;span class="n">target&lt;/span>&lt;span class="o">++&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">images&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">i&lt;/span>&lt;span class="p">].&lt;/span>&lt;span class="n">width&lt;/span>&lt;span class="p">;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="o">*&lt;/span>&lt;span class="n">target&lt;/span>&lt;span class="o">++&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">images&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">i&lt;/span>&lt;span class="p">].&lt;/span>&lt;span class="n">height&lt;/span>&lt;span class="p">;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">for&lt;/span> &lt;span class="p">(&lt;/span>&lt;span class="kt">int&lt;/span> &lt;span class="n">j&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="mi">0&lt;/span>&lt;span class="p">;&lt;/span> &lt;span class="n">j&lt;/span> &lt;span class="o">&amp;lt;&lt;/span> &lt;span class="n">images&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">i&lt;/span>&lt;span class="p">].&lt;/span>&lt;span class="n">width&lt;/span> &lt;span class="o">*&lt;/span> &lt;span class="n">images&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">i&lt;/span>&lt;span class="p">].&lt;/span>&lt;span class="n">height&lt;/span>&lt;span class="p">;&lt;/span> &lt;span class="n">j&lt;/span>&lt;span class="o">++&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1">// illegal instruction on first iteration?
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span> &lt;span class="o">*&lt;/span>&lt;span class="n">target&lt;/span>&lt;span class="o">++&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">(&lt;/span>&lt;span class="n">images&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">i&lt;/span>&lt;span class="p">].&lt;/span>&lt;span class="n">pixels&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">j&lt;/span> &lt;span class="o">*&lt;/span> &lt;span class="mi">4&lt;/span> &lt;span class="o">+&lt;/span> &lt;span class="mi">0&lt;/span>&lt;span class="p">]&lt;/span> &lt;span class="o">&amp;lt;&amp;lt;&lt;/span> &lt;span class="mi">16&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="o">|&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="p">(&lt;/span>&lt;span class="n">images&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">i&lt;/span>&lt;span class="p">].&lt;/span>&lt;span class="n">pixels&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">j&lt;/span> &lt;span class="o">*&lt;/span> &lt;span class="mi">4&lt;/span> &lt;span class="o">+&lt;/span> &lt;span class="mi">1&lt;/span>&lt;span class="p">]&lt;/span> &lt;span class="o">&amp;lt;&amp;lt;&lt;/span> &lt;span class="mi">8&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="o">|&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="p">(&lt;/span>&lt;span class="n">images&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">i&lt;/span>&lt;span class="p">].&lt;/span>&lt;span class="n">pixels&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">j&lt;/span> &lt;span class="o">*&lt;/span> &lt;span class="mi">4&lt;/span> &lt;span class="o">+&lt;/span> &lt;span class="mi">2&lt;/span>&lt;span class="p">]&lt;/span> &lt;span class="o">&amp;lt;&amp;lt;&lt;/span> &lt;span class="mi">0&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="o">|&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="p">(&lt;/span>&lt;span class="n">images&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">i&lt;/span>&lt;span class="p">].&lt;/span>&lt;span class="n">pixels&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">j&lt;/span> &lt;span class="o">*&lt;/span> &lt;span class="mi">4&lt;/span> &lt;span class="o">+&lt;/span> &lt;span class="mi">3&lt;/span>&lt;span class="p">]&lt;/span> &lt;span class="o">&amp;lt;&amp;lt;&lt;/span> &lt;span class="mi">24&lt;/span>&lt;span class="p">);&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="p">}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="p">}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">...&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="reaching-my-limits">Reaching my limits&lt;/h2>
&lt;p>At this point, I feel confident in saying:&lt;/p>
&lt;ul>
&lt;li>The Zig code is correct, the pointers are valid, the lengths are correct, everything&amp;rsquo;s right.&lt;/li>
&lt;li>The GLFW code is pretty popular, and it&amp;rsquo;s been around for 6 years. Seems unlikely it&amp;rsquo;s a bug in GLFW?&lt;/li>
&lt;/ul>
&lt;p>Luckily, my brother (and reverse engineer) &lt;a href="https://github.com/Andoryuuta">@Andoryuuta&lt;/a> was available to help debug, so I pulled him in. Stepping through instructions, we could see clearly that after a bit shift we were stepping into the abyss:&lt;/p>
&lt;pre tabindex="0">&lt;code>* thread #1, name = &amp;#39;test&amp;#39;, stop reason = instruction step over
frame #0: 0x00000000002c6f84 test`_glfwPlatformSetWindowIcon(window=0x00000000004e53d0, count=1, images=0x00007fffec0b3000) at x11_window.c:2156:58
2153 *target++ = (images[i].pixels[j * 4 + 0] &amp;lt;&amp;lt; 16) |
2154 (images[i].pixels[j * 4 + 1] &amp;lt;&amp;lt; 8) |
2155 (images[i].pixels[j * 4 + 2] &amp;lt;&amp;lt; 0) |
-&amp;gt; 2156 (images[i].pixels[j * 4 + 3] &amp;lt;&amp;lt; 24);
2157 printf(&amp;#34;DID WE GET HERE???x\n&amp;#34;);
2158 }
2159 }
(lldb)
Process 6516 stopped
* thread #1, name = &amp;#39;test&amp;#39;, stop reason = instruction step over
frame #0: 0x00000000002c6c21 test`_glfwPlatformSetWindowIcon(window=0x00000000004e53d0, count=1, images=0x00007fffec0b3000) at x11_window.c:0
1 //========================================================================
2 // GLFW 3.3 X11 - www.glfw.org
3 //------------------------------------------------------------------------
4 // Copyright (c) 2002-2006 Marcus Geelnard
5 // Copyright (c) 2006-2019 Camilla Löwy &amp;lt;elmindreda@glfw.org&amp;gt;
6 //
7 // This software is provided &amp;#39;as-is&amp;#39;, without any express or implied
(lldb)
Process 6516 stopped
&lt;/code>&lt;/pre>&lt;p>Inspecting the binary in IDA Pro we were able to see that we were jumping into an &lt;code>__asm { ud1 }&lt;/code> section (ud1 standing for &amp;ldquo;undefined instruction 1&amp;rdquo;):&lt;/p>
&lt;p>&lt;a href="https://devlog.hexops.org/img/2021/perfecting-glfw-for-zig-and-finding-undefined-behavior/img4.png">&lt;img alt="IDA Pro showing a jump to an undefined instruction 1" class="color" src="https://devlog.hexops.org/img/2021/perfecting-glfw-for-zig-and-finding-undefined-behavior/img4.png">&lt;/a>&lt;/p>
&lt;p>It turns out that clang&amp;rsquo;s UBSan inserts these instructions as traps for when the compiler thinks there is undefined behavior occurring, such as if a pointer addition leads to an overflow. This is super interesting, but unfortunately doesn&amp;rsquo;t always give a compiler error. We got lucky and found someone else who ran into this through Google:&lt;/p>
&lt;blockquote>
&lt;p>I &lt;em>believe&lt;/em> LLVM explicitly generates a ud2 x86 instruction because &amp;ldquo;it determined&amp;rdquo; there&amp;rsquo;s undefined behavior in the C code. So first I wonder which flags you&amp;rsquo;re passing it through zig (i.e. how strict are you being with the settings?) — Abner (@AbnerCoimbre)&lt;/p>
&lt;/blockquote>
&lt;p>And indeed, compiling via &lt;code>zig build test -Drelease-fast&lt;/code> (which turns off UBSan) made the crash go away. So where&amp;rsquo;s the undefined behavior?&lt;/p>
&lt;p>If you squint at the code and assume all pointers, counts, and indices are correct, you might be able to spot it:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-c" data-lang="c">&lt;span class="line">&lt;span class="cl">&lt;span class="kt">void&lt;/span> &lt;span class="nf">_glfwSetWindowIconX11&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">_GLFWwindow&lt;/span>&lt;span class="o">*&lt;/span> &lt;span class="n">window&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="kt">int&lt;/span> &lt;span class="n">count&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="k">const&lt;/span> &lt;span class="n">GLFWimage&lt;/span>&lt;span class="o">*&lt;/span> &lt;span class="n">images&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="p">...&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="kt">long&lt;/span>&lt;span class="o">*&lt;/span> &lt;span class="n">target&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">icon&lt;/span>&lt;span class="p">;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">for&lt;/span> &lt;span class="p">(&lt;/span>&lt;span class="kt">int&lt;/span> &lt;span class="n">i&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="mi">0&lt;/span>&lt;span class="p">;&lt;/span> &lt;span class="n">i&lt;/span> &lt;span class="o">&amp;lt;&lt;/span> &lt;span class="n">count&lt;/span>&lt;span class="p">;&lt;/span> &lt;span class="n">i&lt;/span>&lt;span class="o">++&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="p">...&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">for&lt;/span> &lt;span class="p">(&lt;/span>&lt;span class="kt">int&lt;/span> &lt;span class="n">j&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="mi">0&lt;/span>&lt;span class="p">;&lt;/span> &lt;span class="n">j&lt;/span> &lt;span class="o">&amp;lt;&lt;/span> &lt;span class="n">images&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">i&lt;/span>&lt;span class="p">].&lt;/span>&lt;span class="n">width&lt;/span> &lt;span class="o">*&lt;/span> &lt;span class="n">images&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">i&lt;/span>&lt;span class="p">].&lt;/span>&lt;span class="n">height&lt;/span>&lt;span class="p">;&lt;/span> &lt;span class="n">j&lt;/span>&lt;span class="o">++&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1">// illegal instruction on first iteration?
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span> &lt;span class="o">*&lt;/span>&lt;span class="n">target&lt;/span>&lt;span class="o">++&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">(&lt;/span>&lt;span class="n">images&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">i&lt;/span>&lt;span class="p">].&lt;/span>&lt;span class="n">pixels&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">j&lt;/span> &lt;span class="o">*&lt;/span> &lt;span class="mi">4&lt;/span> &lt;span class="o">+&lt;/span> &lt;span class="mi">0&lt;/span>&lt;span class="p">]&lt;/span> &lt;span class="o">&amp;lt;&amp;lt;&lt;/span> &lt;span class="mi">16&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="o">|&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="p">(&lt;/span>&lt;span class="n">images&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">i&lt;/span>&lt;span class="p">].&lt;/span>&lt;span class="n">pixels&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">j&lt;/span> &lt;span class="o">*&lt;/span> &lt;span class="mi">4&lt;/span> &lt;span class="o">+&lt;/span> &lt;span class="mi">1&lt;/span>&lt;span class="p">]&lt;/span> &lt;span class="o">&amp;lt;&amp;lt;&lt;/span> &lt;span class="mi">8&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="o">|&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="p">(&lt;/span>&lt;span class="n">images&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">i&lt;/span>&lt;span class="p">].&lt;/span>&lt;span class="n">pixels&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">j&lt;/span> &lt;span class="o">*&lt;/span> &lt;span class="mi">4&lt;/span> &lt;span class="o">+&lt;/span> &lt;span class="mi">2&lt;/span>&lt;span class="p">]&lt;/span> &lt;span class="o">&amp;lt;&amp;lt;&lt;/span> &lt;span class="mi">0&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="o">|&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="p">(&lt;/span>&lt;span class="n">images&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">i&lt;/span>&lt;span class="p">].&lt;/span>&lt;span class="n">pixels&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">j&lt;/span> &lt;span class="o">*&lt;/span> &lt;span class="mi">4&lt;/span> &lt;span class="o">+&lt;/span> &lt;span class="mi">3&lt;/span>&lt;span class="p">]&lt;/span> &lt;span class="o">&amp;lt;&amp;lt;&lt;/span> &lt;span class="mi">24&lt;/span>&lt;span class="p">);&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="p">}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="p">}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">...&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>What is happening here is that:&lt;/p>
&lt;ul>
&lt;li>&lt;code>images[i].pixels[j * 4 + 0]&lt;/code> is returning an &lt;code>unsigned char&lt;/code> (8 bits)&lt;/li>
&lt;li>&lt;del>It is then being shifted left by &lt;code>&amp;lt;&amp;lt; 16&lt;/code> bits. !!! That&amp;rsquo;s further than an 8-bit number can be shifted left by, so that&amp;rsquo;s UB&lt;/del>
&lt;ul>
&lt;li>EDIT: Actually, it turns out that&amp;rsquo;s not exactly right, it&amp;rsquo;s the &lt;code>&amp;lt;&amp;lt; 24&lt;/code> that&amp;rsquo;s the cause of the UB, thanks &lt;a href="https://github.com/Maato">@Maato&lt;/a> for &lt;a href="https://github.com/glfw/glfw/pull/1986#issuecomment-955784179">pointing this out and explaining in better detail than I could&lt;/a>.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
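&lt;p>To see the rule in isolation, here is a small sketch. It is not the actual GLFW patch: &lt;code>shift_is_defined&lt;/code> and &lt;code>pack_argb&lt;/code> are hypothetical helper names, and the numbers in the follow-up note assume a 32-bit &lt;code>int&lt;/code>. In C, an &lt;code>unsigned char&lt;/code> operand is promoted to &lt;code>int&lt;/code> before &lt;code>&amp;lt;&amp;lt;&lt;/code> is applied, and left-shifting a signed &lt;code>int&lt;/code> so far that the result no longer fits is undefined behavior:&lt;/p>
&lt;pre tabindex="0">&lt;code class="language-c" data-lang="c">#include &amp;lt;limits.h&amp;gt;

/* Returns 1 when `byte &amp;lt;&amp;lt; shift` is well-defined. The byte is promoted
   to int before shifting, so the shifted value must still fit in an int.
   Assumes 0 &amp;lt;= shift and shift smaller than the bit width of int. */
static int shift_is_defined(unsigned char byte, int shift)
{
    return byte &amp;lt;= (INT_MAX &amp;gt;&amp;gt; shift);
}

/* Packs one RGBA pixel into ARGB order. Each byte is widened to
   unsigned long first, so no shift can overflow a signed int. */
static unsigned long pack_argb(const unsigned char *rgba)
{
    return ((unsigned long) rgba[0] &amp;lt;&amp;lt; 16) |
           ((unsigned long) rgba[1] &amp;lt;&amp;lt; 8) |
           ((unsigned long) rgba[2] &amp;lt;&amp;lt; 0) |
           ((unsigned long) rgba[3] &amp;lt;&amp;lt; 24);
}
&lt;/code>&lt;/pre>
&lt;p>With a 32-bit &lt;code>int&lt;/code>, any alpha byte of 128 or more makes the &lt;code>&amp;lt;&amp;lt; 24&lt;/code> overflow, which is what a fully opaque icon pixel (alpha 255) runs into. Widening to an unsigned type before shifting avoids the problem.&lt;/p>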
&lt;p>Suddenly, it all makes sense. And &lt;a href="https://godbolt.org/z/ddq75WsYK">if we load an equivalent snippet of code into Godbolt&lt;/a> we can see what is happening when we compile without UBSan / the &lt;code>-fsanitize=undefined&lt;/code> flag:&lt;/p>
&lt;p>&lt;a href="https://devlog.hexops.org/img/2021/perfecting-glfw-for-zig-and-finding-undefined-behavior/img5.png">&lt;img alt="Compilation with godbolt with UBSan turned off shows movement into 32-bit EAX register" class="color" src="https://devlog.hexops.org/img/2021/perfecting-glfw-for-zig-and-finding-undefined-behavior/img5.png">&lt;/a>&lt;/p>
&lt;p>Without UBSan, clang simply uses the 32-bit EAX register. It loads the 8-bit number into the 32-bit register, and then performs the left shift. Although the shift exceeds 8 bits, it &lt;em>does not get truncated to zero&lt;/em>. Instead, it is effectively as if the number was converted to a 32-bit integer prior to the left-shift operation.&lt;/p>
&lt;p>This explains why nobody has caught this UB in GLFW yet, too: it works by accident! Just because the compiler likes to use 32-bit registers in this context.&lt;/p>
&lt;p>And the fix benefits all the languages out there using GLFW: &lt;a href="https://github.com/glfw/glfw/pull/1986">glfw/glfw#1986&lt;/a>&lt;/p>
&lt;h2 id="defaults-are-_critical_">Defaults are &lt;em>critical&lt;/em>&lt;/h2>
&lt;p>This code, and undefined behavior, has been in GLFW for over 6 years according to &lt;code>git blame&lt;/code>.&lt;/p>
&lt;p>Anybody using GLFW &lt;em>could have&lt;/em> enabled UBSan in their C compiler. Anybody &lt;em>could have&lt;/em> run into this same crash and debugged it in the last 6 years. But they didn&amp;rsquo;t.&lt;/p>
&lt;p>In mach-glfw, we compile all of GLFW&amp;rsquo;s C code with Zig (which is also a fully functional C and C++ compiler), with UBSan enabled by default.&lt;/p>
&lt;p>It is only because Zig has good defaults, because it places so much emphasis on things being right &lt;em>out of the box&lt;/em>, and because there is such an emphasis on having safety checks for undefined behavior, that we were able to catch this undefined behavior that went unnoticed in GLFW for the last 6 years.&lt;/p>
&lt;h2 id="thanks-for-reading">Thanks for reading&lt;/h2>
&lt;p>All key Mach engine developments will be posted here.&lt;/p>
&lt;p>Follow &lt;a href="https://github.com/hexops/mach">Mach engine on GitHub&lt;/a>, and if you like what I&amp;rsquo;m doing please consider &lt;a href="https://github.com/sponsors/emidoots">sponsoring my work&lt;/a>.&lt;/p></description></item><item><title>Mach Engine: The future of graphics (with Zig)</title><link>https://devlog.hexops.org/2021/mach-engine-the-future-of-graphics-with-zig/</link><pubDate>Sun, 17 Oct 2021 00:00:00 +0000</pubDate><guid>https://devlog.hexops.com/2021/mach-engine-the-future-of-graphics-with-zig/</guid><description>&lt;p>In the coming months, we&amp;rsquo;ll begin to have truly cross-platform low-level graphics, with the ability to cross compile GPU-accelerated applications written in Zig from any OS and deploy to desktop, mobile, and (in the future) web.&lt;/p>
&lt;h2 id="mach-engine">Mach engine&lt;/h2>
&lt;img class="color-auto" alt="Mach: Game engine &amp; graphics toolkit for the future" src="https://devlog.hexops.org/img/2021/mach-engine-the-future-of-graphics-with-zig/img1.png">
&lt;p>I&amp;rsquo;ve been working on &lt;a href="https://github.com/hexops/mach">Mach Engine&lt;/a> for about 4 months now, although the project has been many years in the making. I believe that in the next 4-6 months we&amp;rsquo;ll complete the first key milestone: truly cross-platform graphics and seamless cross compilation.&lt;/p>
&lt;h2 id="vision">Vision&lt;/h2>
&lt;p>Today, I share only the first milestone: Mach engine core. I&amp;rsquo;ve been working on this for around 1 year now, and we&amp;rsquo;re close to completion (maybe 4-6 months away):&lt;/p>
&lt;p>&lt;a class="imglink" href="https://devlog.hexops.org/img/2021/mach-engine-the-future-of-graphics-with-zig/img2.png">&lt;img class="color-auto" alt="Zero fuss installation, out of the box cross compilation, and a truly cross-platform graphics API" src="https://devlog.hexops.org/img/2021/mach-engine-the-future-of-graphics-with-zig/img2.png">&lt;/a>&lt;/p>
&lt;h2 id="zero-fuss-installation--cross-compilation">Zero fuss installation &amp;amp; cross compilation&lt;/h2>
&lt;p>Only &lt;code>zig&lt;/code> and &lt;code>git&lt;/code> are needed to build from any OS and produce binaries for every OS. You do &lt;strong>not&lt;/strong> need any system dependencies, C libraries, SDKs (Xcode, etc.), C compilers or anything else.&lt;/p>
&lt;p>We&amp;rsquo;re able to achieve this thanks to two things:&lt;/p>
&lt;ol>
&lt;li>Zig has fantastic cross-compilation support, including its own custom linker &lt;code>zld&lt;/code> written by &lt;a href="http://www.jakubkonka.com/">Jakub Konka&lt;/a> which is capable of supporting MacOS cross compilation.&lt;/li>
&lt;li>Mach doing the heavy lifting of packaging the required system SDK libraries and C sources for e.g. GLFW so our Zig build scripts can simply &lt;code>git clone&lt;/code> them for you as needed for the target OS you&amp;rsquo;re building for, completely automagically.&lt;/li>
&lt;/ol>
&lt;h2 id="truly-cross-platform-graphics-api">Truly cross-platform graphics API&lt;/h2>
&lt;h3 id="directx-12-metal-vulkan--opengl">DirectX 12, Metal, Vulkan &amp;amp; OpenGL&lt;/h3>
&lt;p>Imagine a low-level, little to no overhead graphics API that unifies DirectX, Metal, Vulkan, and OpenGL (if no others are available):&lt;/p>
&lt;p>&lt;a class="imglink" href="https://devlog.hexops.org/img/2021/mach-engine-the-future-of-graphics-with-zig/img3.png">&lt;img class="color-auto" alt="Simple, low-level unified graphics API mapping to DirectX 12, Metal, Vulkan, and OpenGL" src="https://devlog.hexops.org/img/2021/mach-engine-the-future-of-graphics-with-zig/img3.png">&lt;/a>&lt;/p>
&lt;p>&lt;em>This isn&amp;rsquo;t anything new:&lt;/em> all modern engines provide this, Godot has been working towards this for &lt;em>years&lt;/em> (and still is), and there exist abstraction layers for Vulkan over most of these APIs as well.&lt;/p>
&lt;h3 id="vendor-support">Vendor support&lt;/h3>
&lt;p>&lt;strong>An API is only as good as the momentum behind it.&lt;/strong> What modern API can target the largest array of platforms with the most vendor backing?&lt;/p>
&lt;p>&lt;a class="imglink" href="https://devlog.hexops.org/img/2021/mach-engine-the-future-of-graphics-with-zig/img4.png">&lt;img class="color-auto" alt="Google to Vulkan, Microsoft to DirectX, Apple to Metal, AMD and NVidia to everything." src="https://devlog.hexops.org/img/2021/mach-engine-the-future-of-graphics-with-zig/img4.png">&lt;/a>&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Microsoft sees DirectX as the future, not Vulkan.&lt;/strong> (DirectX 13 is coming by the end of 2022.)&lt;/li>
&lt;li>&lt;strong>Apple sees Metal as the future, not Vulkan.&lt;/strong> OpenGL and OpenCL are deprecated, and private legal arguments with Khronos make it unlikely we&amp;rsquo;ll see OpenGL or Vulkan on Apple hardware ever again.&lt;/li>
&lt;li>Google, with their Fuchsia OS, &lt;a href="https://fuchsia.dev/fuchsia-src/concepts/graphics/magma">appears to be primarily into Vulkan&lt;/a> from a system-level POV.&lt;/li>
&lt;li>&lt;strong>NVIDIA, AMD, and Intel generally support as many graphics APIs as possible&lt;/strong>: they want to sell hardware.&lt;/li>
&lt;/ul>
&lt;h3 id="one-api-that-apple-microsoft-and-google-can-all-agree-on">One API that Apple, Microsoft, and Google can all agree on&lt;/h3>
&lt;p>&lt;a class="imglink" href="https://devlog.hexops.org/img/2021/mach-engine-the-future-of-graphics-with-zig/img5.png">&lt;img class="color-auto" alt="Mozilla, Google, Microsoft, Apple, and Intel all to WebGPU" src="https://devlog.hexops.org/img/2021/mach-engine-the-future-of-graphics-with-zig/img5.png">&lt;/a>&lt;/p>
&lt;p>Outside the bounds of traditional graphics APIs there exists an attempt to provide a unified API across all platforms, &lt;a href="https://en.wikipedia.org/wiki/WebGPU">WebGPU&lt;/a> (not to be confused with the much older &lt;em>WebGL&lt;/em>).&lt;/p>
&lt;p>Mozilla, Google, Apple, and Microsoft all got together to build an abstraction layer over the modern graphics APIs - finding the common ground between Direct3D 12, Metal, and Vulkan - plus a safe way to expose that functionality in browsers.&lt;/p>
&lt;p>The name &lt;em>WebGPU&lt;/em> might lead you to believe that this is only for browsers, and that it may not be low-level or fast - but this really couldn&amp;rsquo;t be further from the truth.&lt;/p>
&lt;h3 id="apple--googles-role-is-what-makes-webgpu-unique-and-why-we-chose-it">Apple &amp;amp; Google&amp;rsquo;s role is what makes WebGPU unique, and why we chose it&lt;/h3>
&lt;p>&lt;a class="imglink" href="https://devlog.hexops.org/img/2021/mach-engine-the-future-of-graphics-with-zig/img6.png">&lt;img class="color-auto" alt="Khronos group out of the picture in the future" src="https://devlog.hexops.org/img/2021/mach-engine-the-future-of-graphics-with-zig/img6.png">&lt;/a>&lt;/p>
&lt;p>What is new about WebGPU in my view is the vendors playing key roles in its development, and the fact that it grew outside the Khronos Group.&lt;/p>
&lt;p>Although abstraction layers over modern graphics APIs are nothing new - as Apple, Google, and Microsoft continue to get more into manufacturing their own hardware (it&amp;rsquo;s clear this is a strategic move for them) we should ask ourselves how this will change the landscape, and WebGPU is the first cross-vendor API to be produced by this new ecosystem.&lt;/p>
&lt;h3 id="webgpu-extended-thoughts">WebGPU extended thoughts&lt;/h3>
&lt;details>
&lt;summary>Is WebGPU "native enough"? Yes&lt;/summary>
&lt;p>For browsers, WebGPU will require sandboxing and validation layers. But in native uses, this can all be turned off, and the WebGPU developers are clearly thinking about this use case:&lt;/p>
&lt;ul>
&lt;li>Google's implementation of WebGPU, &lt;a href="https://dawn.googlesource.com/dawn">Dawn&lt;/a>, can be configured to effectively turn off all browser sandboxing / validation that could harm performance due to its client/server architecture.&lt;/li>
&lt;li>Mozilla / gfx-rs Rust engineers have published articles such as &lt;a href="http://kvark.github.io/web/gpu/native/2020/05/03/point-of-webgpu-native.html">"The point of WebGPU on native"&lt;/a>.&lt;/li>
&lt;/ul>
&lt;p>As for the quality of implementations, we could compare the amount of resources going into e.g. Google's WebGPU implementation vs. the amount of resources going into Unity/Unreal/MoltenVK/other graphics abstraction layers - but I suspect they're &lt;em>about equal&lt;/em>.&lt;/p>
&lt;/details>
&lt;details>
&lt;summary>Will WebGPU be implemented on GPUs natively? Maybe someday&lt;/summary>
&lt;p>Not anytime soon. We get some insight into this &lt;a href="https://github.com/gpuweb/gpuweb/issues/847#issuecomment-642883924">via @kvark&lt;/a>, a WebGPU developer:&lt;/p>
&lt;blockquote>
&lt;p>[...] We are not in Khronos, and therefore we have limited participation from IHVs (only Intel and Apple are active). WebGPU was never designed to be implemented by the drivers. I mean, it would totally be rad, in the context of how usable WebGPU &lt;a href="http://kvark.github.io/web/gpu/native/2020/05/03/point-of-webgpu-native.html">can be on native&lt;/a>, but it couldn't be the requirement from the start.&lt;/p>
&lt;/blockquote>
&lt;p>But as WebGPU usage grows, or even becomes predominant due to it being the most powerful API in browsers, and as Microsoft, Google, and Apple continue to develop their own hardware, I think it's not unreasonable to expect that some day WebGPU will be an even more direct 1:1 mapping between a cross-platform API and low-level APIs, more direct than Vulkan abstraction layers such as MoltenVK (which is required to get Vulkan working on top of MacOS's Metal API) - with the potential that some vendor starts asking "what would a GPU native WebGPU implementation look like?"&lt;/p>
&lt;/details>
&lt;details>
&lt;summary>Momentum of WebGPU vs. Vulkan&lt;/summary>
&lt;p>To &lt;a href="https://news.ycombinator.com/item?id=23090432">quote&lt;/a> &lt;a href="http://kvark.github.io/about/">Dzmitry Malyshau / kvark&lt;/a>, a Mozilla engineer working on gfx-rs and WebGPU:&lt;/p>
&lt;blockquote>
&lt;p>At some point, it comes down to the amount of momentum behind the API. In case of WebGPU, we have strong support from Intel and Apple, which are hardware vendors, as well as Google, who can influence mobile hardware vendors. We are making the specification and have resources to appropriately test it and develop the necessary workarounds. It's the quantity to quality transition that sometimes just needs to cross a certain threshold in order to succeed.&lt;/p>
&lt;/blockquote>
&lt;p>According to some, Nvidia and AMD tend to develop new features with Microsoft as part of DirectX. Only then are they "ported" back to Vulkan and OpenGL. I think that says a lot.&lt;/p>
&lt;/details>
&lt;h2 id="what-progress-has-been-made-so-far-on-mach-engine">What progress has been made so far on Mach Engine?&lt;/h2>
&lt;p>Today, we have cross-compilation of GLFW on all desktop OSs working out of the box with nothing more than &lt;code>zig&lt;/code> and &lt;code>git&lt;/code>:&lt;/p>
&lt;p>&lt;a class="imglink" href="https://devlog.hexops.org/img/2021/mach-engine-the-future-of-graphics-with-zig/img7.png">&lt;img class="color-auto" alt="Cross compilation from Mac, Linux, and Windows to each other on all major architectures." src="https://devlog.hexops.org/img/2021/mach-engine-the-future-of-graphics-with-zig/img7.png">&lt;/a>&lt;/p>
&lt;p>This involved:&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://github.com/hexops/sdk-macos-11.3">Packaging MacOS SDKs&lt;/a> and &lt;a href="https://github.com/hexops/sdk-linux-x86_64">Linux system X11/Wayland libraries&lt;/a> into SDKs, and creating Zig build scripts that could merely &lt;code>git clone&lt;/code> them and utilize them for cross-compilation.&lt;/li>
&lt;li>Purchasing Apple M1 hardware to test on and to run GitHub Actions jobs, as GitHub Actions doesn&amp;rsquo;t support M1 itself.&lt;/li>
&lt;li>Normalizing symlinks in Mac/Linux SDKs everywhere so that Windows users don&amp;rsquo;t have a hard time with Git symlink management.&lt;/li>
&lt;li>Contributing &lt;a href="https://github.com/ziglang/zig/pull/9734">a small fix to the Zig linker&lt;/a>&lt;/li>
&lt;/ul>
&lt;p>All this to say, we&amp;rsquo;re really taking a holistic approach to achieve this.&lt;/p>
&lt;h2 id="whats-next-webgpu">What&amp;rsquo;s next? WebGPU&lt;/h2>
&lt;p>I&amp;rsquo;m happy to report that a fair amount of progress on this front has been made.&lt;/p>
&lt;p>Here is Google&amp;rsquo;s WebGPU implementation, Dawn, compiled using &lt;code>zig&lt;/code>:
&lt;a class="imglink" href="https://devlog.hexops.org/img/2021/mach-engine-the-future-of-graphics-with-zig/img8.png">&lt;img alt="A red triangle in a black window titled 'Dawn Window', the" src="https://devlog.hexops.org/img/2021/mach-engine-the-future-of-graphics-with-zig/img8.png">&lt;/a>
&lt;a class="imglink" href="https://devlog.hexops.org/img/2021/mach-engine-the-future-of-graphics-with-zig/img9.png">&lt;img alt="A Zig code file, hello_triangle.zig showing Dawn and WebGPU API usage in Zig" src="https://devlog.hexops.org/img/2021/mach-engine-the-future-of-graphics-with-zig/img9.png">&lt;/a>&lt;/p>
&lt;p>This includes:&lt;/p>
&lt;ul>
&lt;li>A ~500 line port of the &lt;code>hello_triangle&lt;/code> example from Dawn to Zig&lt;/li>
&lt;li>A ~1200 line &lt;code>build.zig&lt;/code> file which compiles all the Dawn sources using Zig, without using Google&amp;rsquo;s ninja/etc development tools.&lt;/li>
&lt;li>A hack to work around a bug in Zig where ObjC++ &lt;code>.mm&lt;/code> files are not yet recognized.&lt;/li>
&lt;li>C shims for the &lt;code>dawn_native&lt;/code> C++ API and utility APIs, which are required in order to bind Dawn to an actual GLFW window.&lt;/li>
&lt;/ul>
&lt;p>There are a few weeks of work to do before this can be merged and will be usable by others, please stay tuned for that.&lt;/p>
&lt;p>After that will be development of idiomatic Zig bindings to the &lt;a href="https://github.com/webgpu-native/webgpu-headers">WebGPU C API&lt;/a>, which is shared between implementations such as Dawn and Rust&amp;rsquo;s &lt;a href="https://github.com/gfx-rs/wgpu-native">gfx-rs/wgpu-native&lt;/a> implementation. (We could theoretically switch between them at startup in the future, but we&amp;rsquo;ll probably stick with Dawn, as it does not require a separate Rust toolchain, which would prevent out-of-the-box cross compilation.)&lt;/p>
&lt;h2 id="when-will-there-be-games-examples-etc">When will there be games, examples, etc.?&lt;/h2>
&lt;p>It&amp;rsquo;ll be a while because I am focusing purely on the groundwork first. It&amp;rsquo;s unlikely you&amp;rsquo;ll see anything with &lt;em>real demo value&lt;/em> before later next year.&lt;/p>
&lt;p>I&amp;rsquo;m sure that will be disheartening to hear - and may make you think there&amp;rsquo;s nothing of substance here. I totally understand that view, but I hope you&amp;rsquo;ll stay tuned because I&amp;rsquo;m in this for the long haul and it&amp;rsquo;s not my first rodeo: I previously spent 4 years writing &lt;a href="https://azul3d.org">a game engine in Go&lt;/a>, and have worked &lt;a href="https://sourcegraph.com">at a devtools startup for 7 years&lt;/a>, with my biggest lesson from those experiences being the importance of demos and examples.&lt;/p>
&lt;h2 id="follow-along">Follow along&lt;/h2>
&lt;p>Major developments will be posted here.&lt;/p>
&lt;p>You can also follow the project at &lt;a href="https://github.com/hexops/mach">github.com/hexops/mach&lt;/a>.&lt;/p>
&lt;p>If you like what I&amp;rsquo;m doing, you can &lt;a href="https://github.com/sponsors/emidoots">sponsor me on GitHub&lt;/a>.&lt;/p>
&lt;p>Thanks for reading!&lt;/p></description></item><item><title>Unicode data file compression: achieving 40-70% reduction over gzip alone</title><link>https://devlog.hexops.org/2021/unicode-data-file-compression/</link><pubDate>Sat, 03 Jul 2021 00:00:00 +0000</pubDate><guid>https://devlog.hexops.com/2021/unicode-data-file-compression/</guid><description>&lt;p>A little story about how writing a domain-specific compression algorithm in a few days can sometimes yield big benefits, why it&amp;rsquo;s sometimes worth giving it a shot, and how to tell when you should try. Note: this is about Unicode spec data files, not general purpose text compression.&lt;/p>
&lt;ul>
&lt;li>&lt;a href="#background">Background&lt;/a>&lt;/li>
&lt;li>&lt;a href="#problem">Problem&lt;/a>&lt;/li>
&lt;li>&lt;a href="#investigation">Investigation&lt;/a>
&lt;ul>
&lt;li>&lt;a href="#binary-encoding">Binary encoding?&lt;/a>&lt;/li>
&lt;li>&lt;a href="#differential-encodingcompression">Differential encoding/compression?&lt;/a>&lt;/li>
&lt;li>&lt;a href="#go-implementation">Go implementation&lt;/a>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;a href="#zig-implementation">Zig implementation&lt;/a>
&lt;ul>
&lt;li>&lt;a href="#differential-encoding-state-machine">Differential encoding state machine&lt;/a>&lt;/li>
&lt;li>&lt;a href="#a-stream-of-op-codes">A stream of op codes&lt;/a>&lt;/li>
&lt;li>&lt;a href="#iteratively-finding-the-most-lucrative-opcodes">Iteratively finding the most lucrative opcodes&lt;/a>&lt;/li>
&lt;li>&lt;a href="#a-stream-of-opcodes-for-a-state-machine-a-natural-progression-from-a-binary-format">A stream of opcodes for a state machine: a natural progression from a binary format?&lt;/a>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;a href="#results-better-than-gzipbrotli-and-even-better-with-them">Results? Better than gzip/brotli; and even better &lt;em>with&lt;/em> them!&lt;/a>
&lt;ul>
&lt;li>&lt;a href="#why-test-with-gzipbrotli-but-not-others">Why test with gzip/brotli but not others?&lt;/a>&lt;/li>
&lt;li>&lt;a href="#how-complex-is-the-implementation">How complex is the implementation?&lt;/a>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;a href="#notable-mention">Notable mention&lt;/a>&lt;/li>
&lt;li>&lt;a href="#conclusion">Conclusion&lt;/a>&lt;/li>
&lt;/ul>
&lt;h2 id="background">Background&lt;/h2>
&lt;p>Two weeks ago, I began using &lt;a href="https://github.com/jecolon/ziglyph">Ziglyph&lt;/a> (&amp;ldquo;Unicode processing with Zig, and a UTF-8 string type: Zigstr.&amp;rdquo;) - an awesome library by &lt;a href="https://github.com/jecolon">@jecolon&lt;/a>, for grapheme cluster sorting in &lt;a href="https://github.com/hexops/zorex">Zorex, an omnipotent regexp engine&lt;/a>.&lt;/p>
&lt;p>I don&amp;rsquo;t personally have any prior experience working with the lower level details of Unicode, or compression algorithms for that matter.&lt;/p>
&lt;h2 id="problem">Problem&lt;/h2>
&lt;p>As I stumbled into the wondrous world that is Unicode text sorting (see also my article: &lt;a href="https://devlog.hexops.org/2021/unicode-sorting-why-browsers-added-special-emoji-matching">Unicode sorting is hard &amp;amp; why browsers added special emoji matching to regexp&lt;/a>) and began using Ziglyph, I came across an issue: the standard Unicode collation algorithm, which Ziglyph implements, depends on some large Unicode data tables for normalization and sort keys - even gzipped these were fairly large:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">hexops-mac:zorex emidoots$ du -sh asset/*
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">308K asset/uca-allkeys.txt.gz
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">260K asset/ucd-UnicodeData.txt.gz
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>These file sizes may seem small, but one of my goals is to make Zorex a real competitor to e.g. a browser&amp;rsquo;s native regexp engine. That&amp;rsquo;s challenging because WebAssembly bundle sizes matter &lt;em>a lot&lt;/em> in that context, and using the browser&amp;rsquo;s regexp implementation is virtually free.&lt;/p>
&lt;h2 id="investigation">Investigation&lt;/h2>
&lt;p>I set out to try and reduce the size of these data files. First I &lt;a href="https://github.com/jecolon/ziglyph/issues/3">opened an issue and asked&lt;/a> if anyone else had thoughts around reducing the size of this data. The author of Ziglyph, &lt;a href="https://github.com/jecolon">@jecolon&lt;/a>, is awesome, readily had some ideas, and was able to reduce the two files substantially by removing unnecessary data (comments, etc.)&lt;/p>
&lt;p>Curious how much further we could go, I kept squinting at the data files (warning: large):&lt;/p>
&lt;ul>
&lt;li>&lt;a href="http://www.unicode.org/Public/UCA/latest/allkeys.txt">http://www.unicode.org/Public/UCA/latest/allkeys.txt&lt;/a>&lt;/li>
&lt;li>&lt;a href="http://www.unicode.org/Public/UNIDATA/UnicodeData.txt">http://www.unicode.org/Public/UNIDATA/UnicodeData.txt&lt;/a>&lt;/li>
&lt;/ul>
&lt;h3 id="binary-encoding">Binary encoding?&lt;/h3>
&lt;p>My first thought was that a binary encoding would likely reduce the size a lot. I pulled in some help from hobbyist reverse engineer &lt;a href="https://github.com/Andoryuuta">@Andoryuuta&lt;/a> and he got started on a binary encoding for UnicodeData.txt based on the spec. With that, he was able to reduce the original 1.9M allkeys.txt file down to 250K (125K gzipped) - quite a win.&lt;/p>
&lt;h3 id="differential-encodingcompression">Differential encoding/compression?&lt;/h3>
&lt;p>My second thought was that, scrolling through these data files, it was obvious most entries were derived from prior entries. Many entries were long runs of data where the next entry had the same value, plus a small increment. For example, at the start of the &lt;code>allkeys.txt&lt;/code> file:&lt;/p>
&lt;pre tabindex="0">&lt;code>0000 ; [.0000.0000.0000] # NULL (in ISO 6429)
0001 ; [.0000.0000.0000] # START OF HEADING (in ISO 6429)
0002 ; [.0000.0000.0000] # START OF TEXT (in ISO 6429)
0003 ; [.0000.0000.0000] # END OF TEXT (in ISO 6429)
0004 ; [.0000.0000.0000] # END OF TRANSMISSION (in ISO 6429)
0005 ; [.0000.0000.0000] # ENQUIRY (in ISO 6429)
0006 ; [.0000.0000.0000] # ACKNOWLEDGE (in ISO 6429)
0007 ; [.0000.0000.0000] # BELL (in ISO 6429)
0008 ; [.0000.0000.0000] # BACKSPACE (in ISO 6429)
000E ; [.0000.0000.0000] # SHIFT OUT (in ISO 6429)
000F ; [.0000.0000.0000] # SHIFT IN (in ISO 6429)
&lt;/code>&lt;/pre>&lt;p>Of course, not all sections are so sequential. Many sections are a bit more arbitrary:&lt;/p>
&lt;pre tabindex="0">&lt;code>FF9A ; [.4304.0020.0012] # HALFWIDTH KATAKANA LETTER RE
32F9 ; [.4304.0020.0013] # CIRCLED KATAKANA RE
3355 ; [.4304.0020.001C][.42FB.0020.001C] # SQUARE REMU
3356 ; [.4304.0020.001C][.430A.0020.001C][.42EE.0020.001C][.42E3.0020.001C][.0000.0037.001C][.430A.0020.001C] # SQUARE RENTOGEN
308D ; [.4305.0020.000E] # HIRAGANA LETTER RO
31FF ; [.4305.0020.000F] # KATAKANA LETTER SMALL RO
30ED ; [.4305.0020.0011] # KATAKANA LETTER RO
&lt;/code>&lt;/pre>&lt;p>Still, there are obvious patterns one can see in the way these values change.&lt;/p>
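&lt;p>To make the pattern concrete, here is a minimal Go sketch of differential (delta) encoding applied to the codepoint column of the first &lt;code>allkeys.txt&lt;/code> excerpt above. It is an illustration under simplifying assumptions rather than the actual implementation: the names &lt;code>deltaEncode&lt;/code> and &lt;code>deltaDecode&lt;/code> are hypothetical, and the codepoints are treated as a plain integer sequence with the collation elements ignored.&lt;/p>

```go
package main

import "fmt"

// deltaEncode (hypothetical name) replaces each value with its
// difference from the previous value; the first value is kept as-is.
func deltaEncode(values []int32) []int32 {
	out := make([]int32, len(values))
	var prev int32
	for i, v := range values {
		out[i] = v - prev
		prev = v
	}
	return out
}

// deltaDecode (hypothetical name) reverses deltaEncode by keeping a
// running sum of the deltas.
func deltaDecode(deltas []int32) []int32 {
	out := make([]int32, len(deltas))
	var prev int32
	for i, d := range deltas {
		prev += d
		out[i] = prev
	}
	return out
}

func main() {
	// Codepoints from the first allkeys.txt excerpt: 0000..0008, 000E, 000F.
	codepoints := []int32{
		0x0000, 0x0001, 0x0002, 0x0003, 0x0004, 0x0005,
		0x0006, 0x0007, 0x0008, 0x000E, 0x000F,
	}
	deltas := deltaEncode(codepoints)
	fmt.Println(deltas) // prints [0 1 1 1 1 1 1 1 1 6 1]

	// Decoding the deltas restores the original sequence exactly.
	fmt.Println(deltaDecode(deltas))
}
```

&lt;p>Runs of identical small deltas like these are typically much easier for a general-purpose compressor such as gzip to squeeze than the raw, ever-increasing values.&lt;/p>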
&lt;h3 id="go-implementation">Go implementation&lt;/h3>
&lt;p>I did a quick hacky Go implementation of differential encoding on these files to see how well that would work. The results were pretty good, and already beat just &lt;code>gzip -9&lt;/code> compression of the files:&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>File&lt;/th>
&lt;th>Original&lt;/th>
&lt;th>Original + &lt;code>gzip -9&lt;/code>&lt;/th>
&lt;th>My compression&lt;/th>
&lt;th>My compression + &lt;code>gzip -9&lt;/code>&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>Decompositions.txt&lt;/td>
&lt;td>72K&lt;/td>
&lt;td>28K&lt;/td>
&lt;td>48K&lt;/td>
&lt;td>12K&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>allkeys-minimal.txt&lt;/td>
&lt;td>500K&lt;/td>
&lt;td>148K&lt;/td>
&lt;td>204K&lt;/td>
&lt;td>36K&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>However, because I chose to do these experiments in Go I found a number of inefficiencies:&lt;/p>
&lt;ul>
&lt;li>There were a lot of locations where I encoded things as 8-bit unsigned integers (Go&amp;rsquo;s smallest value type) instead of a more optimal 4-bit unsigned integer. I could&amp;rsquo;ve done bit shifting, but it would&amp;rsquo;ve been annoying.&lt;/li>
&lt;li>There were also many places where I encoded Unicode codepoints as 32-bit unsigned integers, rather than a more optimal 21-bit unsigned integer (because valid Unicode codepoints do not exceed that range.)&lt;/li>
&lt;/ul>
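&lt;p>For a sense of what the manual bit shifting looks like in Go, here is a minimal sketch of a bit packer. It is not the code from my experiment: the &lt;code>bitWriter&lt;/code> type and its &lt;code>writeBits&lt;/code> method are hypothetical names, and the most-significant-bit-first order is an assumption made purely for illustration.&lt;/p>

```go
package main

import "fmt"

// bitWriter (hypothetical) appends values of arbitrary bit width to a
// byte slice, most significant bit first.
type bitWriter struct {
	buf   []byte
	nbits uint // total number of bits written so far
}

// writeBits writes the low `width` bits of v, one bit at a time.
func (w *bitWriter) writeBits(v uint32, width uint) {
	for i := width; i > 0; i-- {
		bit := byte((v >> (i - 1)) % 2)
		if w.nbits%8 == 0 {
			// Start a new, zeroed byte whenever the previous one is full.
			w.buf = append(w.buf, 0)
		}
		// Set the bit at its position within the current byte.
		w.buf[len(w.buf)-1] |= bit * (128 >> (w.nbits % 8))
		w.nbits++
	}
}

func main() {
	var w bitWriter
	w.writeBits(0xA, 4)      // a 4-bit value
	w.writeBits(0x1F600, 21) // a 21-bit Unicode codepoint

	// 4 + 21 = 25 bits, which rounds up to 4 bytes; storing the same
	// two values as a uint8 and a uint32 would take 5 bytes.
	fmt.Println(len(w.buf))      // prints 4
	fmt.Printf("% x\n", w.buf)   // prints a0 fb 00 00
}
```

&lt;p>Going bit by bit keeps the sketch short; a real encoder would likely batch whole bytes, and every field written this way needs a matching reader that knows the same widths.&lt;/p>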
&lt;p>For a real implementation, I switched over to Zig.&lt;/p>
&lt;h2 id="zig-implementation">Zig implementation&lt;/h2>
&lt;p>Actually, two things made working on this in Zig much easier than in Go:&lt;/p>
&lt;ol>
&lt;li>Zig has variable bit-width integers: I could just write &lt;code>u4&lt;/code> and &lt;code>u21&lt;/code> values instead of needing to handle bit packing within larger size integers myself. That was &lt;em>nice&lt;/em>.&lt;/li>
&lt;li>The Zig standard library provides:&lt;/li>
&lt;/ol>
&lt;ul>
&lt;li>&lt;a href="https://sourcegraph.com/github.com/ziglang/zig@0.8.0/-/blob/lib/std/io/bit_writer.zig?L152-202">&lt;code>std.io.BitWriter&lt;/code>&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://sourcegraph.com/github.com/ziglang/zig@0.8.0/-/blob/lib/std/io/bit_reader.zig?L176-248">&lt;code>std.io.BitReader&lt;/code>&lt;/a>&lt;/li>
&lt;/ul>
&lt;p>With these two features, it became incredibly easy to write the most optimal bit-packed encoding of the data.&lt;/p>
&lt;p>In fact, the basic uncompressed binary format &lt;a href="https://github.com/jecolon/ziglyph/pull/7/commits/7d4042d8df21cc11eaf42177c2f4d9b3afd9c4a7">was only a few lines to encode&lt;/a>:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">compressTo&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">self&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">DecompFile&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">writer&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">anytype&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">!&lt;/span>&lt;span class="kt">void&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">buf_writer&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">io&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">bufferedWriter&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">writer&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">out&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">io&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">bitWriter&lt;/span>&lt;span class="p">(.&lt;/span>&lt;span class="n">Little&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">buf_writer&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">writer&lt;/span>&lt;span class="p">());&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">out&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">writeBits&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nb">@intCast&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kt">u16&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">self&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">entries&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">items&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">len&lt;/span>&lt;span class="p">),&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">16&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">while&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">self&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">next&lt;/span>&lt;span class="p">())&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="n">entry&lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">out&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">writeBits&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entry&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">key_len&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">3&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">_&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">out&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">write&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entry&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">key&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="mi">0&lt;/span>&lt;span class="p">..&lt;/span>&lt;span class="n">entry&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">key_len&lt;/span>&lt;span class="p">]);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">out&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">writeBits&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nb">@enumToInt&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entry&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">value&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">form&lt;/span>&lt;span class="p">),&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">@bitSizeOf&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Form&lt;/span>&lt;span class="p">));&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">out&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">writeBits&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entry&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">value&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">len&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">5&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">for&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">entry&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">value&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">seq&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="mi">0&lt;/span>&lt;span class="p">..&lt;/span>&lt;span class="n">entry&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">value&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">len&lt;/span>&lt;span class="p">])&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="n">s&lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">out&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">writeBits&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">s&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">21&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">out&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">flushBits&lt;/span>&lt;span class="p">();&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">buf_writer&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">flush&lt;/span>&lt;span class="p">();&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="differential-encoding-state-machine">Differential encoding state machine&lt;/h3>
&lt;p>To handle the compression, I started out &lt;em>really&lt;/em> simple. First I encoded just a binary version of the data with no compression. The most important thing was to get to a point where I could start testing some theories about what would compress the data really well, and validate that it was in fact being losslessly compressed/decompressed without issues via tests:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="k">test&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;compression_is_lossless&amp;#34;&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">testing&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// Compress UnicodeData.txt -&amp;gt; Decompositions.bin
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">file&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">parseFile&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;src/data/ucd/UnicodeData.txt&amp;#34;&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">defer&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">file&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">deinit&lt;/span>&lt;span class="p">();&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">file&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">compressToFile&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s">&amp;#34;src/data/ucd/Decompositions.bin&amp;#34;&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// Reset the raw file iterator.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">file&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">iter&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">0&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// Decompress the file.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">decompressed&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">decompressFile&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;src/data/ucd/Decompositions.bin&amp;#34;&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">defer&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">decompressed&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">deinit&lt;/span>&lt;span class="p">();&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">while&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">file&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">next&lt;/span>&lt;span class="p">())&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="n">expected&lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">actual&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">decompressed&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">next&lt;/span>&lt;span class="p">().&lt;/span>&lt;span class="o">?&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">testing&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">expectEqual&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">expected&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">actual&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="a-stream-of-op-codes">A stream of op codes&lt;/h3>
&lt;p>I settled on a really simple idea: these data files all have basically just a variable number of integers per line. If I kept &amp;ldquo;registers&amp;rdquo; representing the current value of each integer, I could compute the difference between each line and the one before it. If I encoded that difference as a stream of opcodes with associated data, then to decompress the file I could simply &amp;ldquo;replay&amp;rdquo; those operations. From there, I could iteratively come up with more specific opcodes to handle specific types of data.&lt;/p>
&lt;p>I started out simple, really just with two opcodes:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-Zig" data-lang="Zig">&lt;span class="line">&lt;span class="cl">&lt;span class="c1">// A UDDC opcode for a decomposition file.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Opcode&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">enum&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">u4&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// Sets all the register values with no compression.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">set&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// Denotes the end of the opcode stream. This is so that we don&amp;#39;t need to encode the total
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// number of opcodes in the stream up front (note also the file is bit packed: there may be
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// a few remaining zero bits at the end as padding so we need an EOF opcode rather than say
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// catching the actual file read EOF.)
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">eof&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Using these two opcodes, I was able to encode the entire file. The &lt;code>set&lt;/code> opcode carried associated data expressing an entire raw, uncompressed entry in the file (one line). This actually increased the file size, since each entry now had 4 extra bits (the opcode) of overhead.&lt;/p>
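&lt;p>To make the &amp;ldquo;replay&amp;rdquo; idea concrete, here is a minimal sketch of a two-opcode stream, written in JavaScript rather than Zig for brevity. It is not the actual implementation: the numeric opcode values, the explicit length field, and the &lt;code>encode&lt;/code>/&lt;code>decode&lt;/code> names are assumptions for illustration, and nothing is bit packed.&lt;/p>

```javascript
// Hypothetical opcode values mirroring the two-opcode enum; a real encoder would bit-pack them.
const SET = 0;
const EOF = 1;

// Encode each entry (an array of integers) as a `set` opcode, a length, and the raw values.
function encode(entries) {
  const stream = [];
  for (const entry of entries) {
    stream.push(SET, entry.length, ...entry);
  }
  stream.push(EOF);
  return stream;
}

// Replay the opcode stream to rebuild the original entries.
function decode(stream) {
  const entries = [];
  let i = 0;
  while (stream[i] !== EOF) {
    if (stream[i] !== SET) throw new Error('unknown opcode ' + stream[i]);
    const len = stream[i + 1];
    entries.push(stream.slice(i + 2, i + 2 + len));
    i += 2 + len;
  }
  return entries;
}

// Sample entries: U+00C0 and U+00C1 followed by their canonical decompositions.
const entries = [[0x00C0, 0x0041, 0x0300], [0x00C1, 0x0041, 0x0301]];
const roundTripped = decode(encode(entries));
console.log(JSON.stringify(roundTripped) === JSON.stringify(entries)); // prints true
```

&lt;p>Every entry costs one extra opcode here, which is the same overhead described above: the stream is lossless but not yet any smaller.&lt;/p>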
&lt;h3 id="iteratively-finding-the-most-lucrative-opcodes">Iteratively finding the most lucrative opcodes&lt;/h3>
&lt;p>To find the most lucrative (i.e. compressed) opcodes, I printed the data I would associate with an opcode (like &lt;code>set&lt;/code>) and then looked for repetitions. Sometimes manually, and sometimes by e.g. piping data to a combination of &lt;code>sort|uniq -c|sort -r&lt;/code> to find common patterns.&lt;/p>
&lt;p>Since I was printing &lt;em>differences&lt;/em> between e.g. the current value and previous value, it was really easy to find common patterns that appeared in the file very frequently, such as specific fields incrementing by specific amounts with one field being arbitrary:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// increments key[3] += 1; sets value.seq[0]; emits an entry.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// 1685 instances
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">increment_key_3_and_set_value_seq_0_and_emit&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Once I had narrowed down to a larger group of opcodes that more specifically represented the data, I was able to print the number of bits required to store the change in specific fields (like &lt;code>value.seq[0]&lt;/code>) and add even more specific opcodes to use variable bit widths:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// increments key[3] += 1; sets value.seq[0]; emits an entry.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// 1685 instances
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">increment_key_3_and_set_value_seq_0_2bit_and_emit&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// 978 instances, 2323 byte reduction
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">increment_key_3_and_set_value_seq_0_8bit_and_emit&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// 269 instances, 437 byte reduction
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">increment_key_3_and_set_value_seq_0_21bit_and_emit&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// 438 instances
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>It being a stream of opcodes was quite nice, because it allowed me to determine how much space a given opcode consumed in total, and then target the opcodes that took up the most space. It also made it really easy to find opcodes that I thought &lt;em>might&lt;/em> help, but in practice turned out to not be that frequent. Just print them, pipe to &lt;code>sort|uniq -c|sort -r&lt;/code> to count them - and remove the ones that rarely occur.&lt;/p>
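&lt;p>The same kind of frequency count can be sketched in a few lines of JavaScript. This is only an illustration of the idea behind &lt;code>sort|uniq -c|sort -r&lt;/code>: the helper names &lt;code>diffKey&lt;/code> and &lt;code>countPatterns&lt;/code> and the sample entries are made up, not taken from the real data files.&lt;/p>

```javascript
// Express an entry as its field-by-field difference from the previous entry.
function diffKey(prev, cur) {
  return cur.map((v, i) => v - (prev[i] || 0)).join(',');
}

// Count how often each pattern occurs, most frequent first.
function countPatterns(patterns) {
  const counts = new Map();
  for (const p of patterns) counts.set(p, (counts.get(p) || 0) + 1);
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

// Made-up sample entries, three integers per line.
const sample = [[0, 0, 10], [0, 1, 15], [0, 2, 20], [0, 3, 7]];
const diffs = sample.slice(1).map((entry, i) => diffKey(sample[i], entry));
console.log(countPatterns(diffs)); // [ [ '0,1,5', 2 ], [ '0,1,-13', 1 ] ]
```

&lt;p>A pattern that tops this list is a candidate for its own opcode; one near the bottom is probably not worth a slot in the enum.&lt;/p>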
&lt;h3 id="a-stream-of-opcodes-for-a-state-machine-a-natural-progression-from-a-binary-format">A stream of opcodes for a state machine: a natural progression from a binary format?&lt;/h3>
&lt;p>I chose an opcode stream for a reason: so that I could encode some complex logic in the form of a state machine. This came in handy for the &lt;code>allkeys.txt&lt;/code> file in particular, as it allowed me to introduce &lt;em>incrementors&lt;/em> into the mix which would &lt;em>increment register values by a chosen amount each iteration (value &amp;ldquo;emission&amp;rdquo;)&lt;/em>.&lt;/p>
&lt;p>The final opcodes for the allkeys.txt file ended up being:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="c1">// A UDDC opcode for an allkeys file.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Opcode&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">enum&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">u3&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// Sets an incrementor for the key register, incrementing the key by this much on each emission.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// 10690 instances, 13,480.5 bytes
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">inc_key&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// Sets an incrementor for the value register, incrementing the value by this much on each emission.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// 7668 instances, 62,970 bytes
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">inc_value&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// Emits a single value.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// 31001 instances, 15,500.5 bytes
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">emit_1&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">emit_2&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">emit_4&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">emit_8&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">emit_32&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// Denotes the end of the opcode stream. This is so that we don&amp;#39;t need to encode the total
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// number of opcodes in the stream up front (note also the file is bit packed: there may be
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// a few remaining zero bits at the end as padding so we need an EOF opcode rather than say
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// catching the actual file read EOF.)
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">eof&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This meant I could determine the difference in the &lt;code>key&lt;/code> and &lt;code>value&lt;/code> fields (what those actually are isn&amp;rsquo;t important, just that they are all minor incremental differences on the prior entry in the file) - set an &lt;em>incrementor&lt;/em> to do some work on each emission, such as say increment the &lt;code>key&lt;/code> array by &lt;code>[0, 1, 5]&lt;/code> each emission, and then say &amp;ldquo;now emit_32 values!&amp;rdquo;.&lt;/p>
&lt;p>Suddenly, instead of encoding 32 key entries (32 * 3 key values * 21 bits) I am just setting an incrementor (3 key values * 21 bits) and a single opcode to emit 32 values (3 bits).&lt;/p>
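&lt;p>As a rough sketch of how such a state machine replays an incrementor, consider the JavaScript below. The opcode object shapes and the &lt;code>replay&lt;/code> function are hypothetical, and applying the incrementor just before each emission is an assumption; the real format is bit packed and has more opcode variants.&lt;/p>

```javascript
// Hypothetical opcode objects; the real format bit-packs opcodes and has more variants.
function replay(ops) {
  let key = [0, 0, 0]; // register holding the current key
  let inc = [0, 0, 0]; // incrementor added to the key on every emission
  const emitted = [];
  for (const op of ops) {
    if (op.type === 'inc_key') {
      inc = op.by;
    } else if (op.type === 'emit') {
      for (let n = op.count; n > 0; n--) {
        key = key.map((v, i) => v + inc[i]);
        emitted.push(key);
      }
    } else if (op.type === 'eof') {
      break;
    }
  }
  return emitted;
}

const keys = replay([
  { type: 'inc_key', by: [0, 1, 5] },
  { type: 'emit', count: 32 },
  { type: 'eof' },
]);
console.log(keys.length); // 32

// Size comparison using the figures from the text above.
const rawBits = 32 * 3 * 21; // 2016 bits for 32 raw key entries
const compactBits = 3 * 21 + 3; // 66 bits: one incrementor plus one emit_32 opcode
console.log(rawBits, compactBits); // 2016 66
```

&lt;p>With those figures, a run of 32 regularly spaced keys shrinks from 2016 bits to 66 bits.&lt;/p>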
&lt;p>Overall, this gave me a very nice, natural-feeling progression from a &amp;ldquo;raw binary format&amp;rdquo; to something a bit more specific - a bit more &lt;em>compressed.&lt;/em>&lt;/p>
&lt;h2 id="results-better-than-gzipbrotli-and-even-better-_with_-them">Results? Better than gzip/brotli; and even better &lt;em>with&lt;/em> them!&lt;/h2>
&lt;p>For lack of better words, I&amp;rsquo;ll call my compression algorithm here Unicode Data Differential Compression, since it&amp;rsquo;s differential and specifically for the Unicode data table files - or UDDC for short.&lt;/p>
&lt;p>The two files went from the original 568K (with gzip) down to just 61K (with UDDC+gzip). UDDC on its own produces smaller files than either &lt;code>gzip -9&lt;/code> or &lt;code>brotli -9&lt;/code> alone, AND when combined with gzip or brotli it reduces their output by a further 40-70%:&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>File&lt;/th>
&lt;th>Before (bytes)&lt;/th>
&lt;th>After (bytes)&lt;/th>
&lt;th>Change&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;code>Decompositions.bin&lt;/code>&lt;/td>
&lt;td>48,242&lt;/td>
&lt;td>19,072&lt;/td>
&lt;td>-60.5% (-29,170 bytes)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>Decompositions.bin.br&lt;/code>&lt;/td>
&lt;td>24,411&lt;/td>
&lt;td>14,783&lt;/td>
&lt;td>-39.4% (-9,628 bytes)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>Decompositions.bin.gz&lt;/code>&lt;/td>
&lt;td>30,931&lt;/td>
&lt;td>15,670&lt;/td>
&lt;td>-49.3% (-15,261 bytes)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;/td>
&lt;td>&lt;/td>
&lt;td>&lt;/td>
&lt;td>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>allkeys.bin&lt;/code>&lt;/td>
&lt;td>373,719&lt;/td>
&lt;td>100,907&lt;/td>
&lt;td>-73.0% (-272,812 bytes)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>allkeys.bin.br&lt;/code>&lt;/td>
&lt;td>108,982&lt;/td>
&lt;td>44,860&lt;/td>
&lt;td>-58.8% (-64,122 bytes)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>allkeys.bin.gz&lt;/code>&lt;/td>
&lt;td>163,237&lt;/td>
&lt;td>46,996&lt;/td>
&lt;td>-71.2% (-116,241 bytes)&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;ul>
&lt;li>Before represents binary format without UDDC compression.&lt;/li>
&lt;li>After represents binary format with UDDC compression.&lt;/li>
&lt;li>&lt;code>.br&lt;/code> represents &lt;code>brotli -9 &amp;lt;file&amp;gt;&lt;/code> compression&lt;/li>
&lt;li>&lt;code>.gz&lt;/code> represents &lt;code>gzip -9 &amp;lt;file&amp;gt;&lt;/code> compression&lt;/li>
&lt;/ul>
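&lt;p>The Change column can be reproduced from the Before and After columns with a little arithmetic. The &lt;code>change&lt;/code> helper below is just an illustration, rounding to one decimal place as the table does:&lt;/p>

```javascript
// Compute the byte delta and the percentage change, rounded to one decimal place.
function change(before, after) {
  const delta = after - before;
  return { delta, percent: Math.round((delta / before) * 1000) / 10 };
}

console.log(change(48242, 19072)); // { delta: -29170, percent: -60.5 }
console.log(change(30931, 15670)); // { delta: -15261, percent: -49.3 }
console.log(change(373719, 100907)); // { delta: -272812, percent: -73 }
```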
&lt;h3 id="why-test-with-gzipbrotli-but-not-others">Why test with gzip/brotli but not others?&lt;/h3>
&lt;p>I chose to compare against gzip/brotli specifically because you get those effectively for free in WebAssembly: browsers already know how to decompress those and ship with gzip/brotli decompressors - so you can use them for free without shipping any additional code.&lt;/p>
&lt;h3 id="how-complex-is-the-implementation">How complex is the implementation?&lt;/h3>
&lt;p>The final implementation for both files is only a few hundred lines (excluding blank lines, comments, and tests):&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://github.com/jecolon/ziglyph/blob/main/src/collator/AllKeysFile.zig">&lt;code>AllKeysFile.zig&lt;/code>&lt;/a>: 298 lines&lt;/li>
&lt;li>&lt;a href="https://github.com/jecolon/ziglyph/blob/main/src/normalizer/DecompFile.zig">&lt;code>DecompFile.zig&lt;/code>&lt;/a> 336 lines&lt;/li>
&lt;/ul>
&lt;p>I have not measured produced machine code size yet, but suspect it is relatively negligible compared to the gains.&lt;/p>
&lt;h2 id="notable-mention">Notable mention&lt;/h2>
&lt;p>I should mention that the Unicode spec, as &lt;a href="https://github.com/jecolon">@jecolon&lt;/a> pointed out to me, does suggest ways to reduce sort key lengths and implement Run-length Compression:&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://unicode.org/reports/tr10/#Reducing_Sort_Key_Lengths">https://unicode.org/reports/tr10/#Reducing_Sort_Key_Lengths&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://unicode.org/reports/tr10/#Run-length_Compression">https://unicode.org/reports/tr10/#Run-length_Compression&lt;/a>&lt;/li>
&lt;/ul>
&lt;p>I wasn&amp;rsquo;t able to locate an implementation of this (I&amp;rsquo;d be curious to compare results!) but suspect that, as the run-length compression does not fit the data as tightly, it would not compress quite as well (although it would cope better with major changes to the type of data in the files, without requiring changes to the compression algorithm).&lt;/p>
&lt;p>Also of note is that their algorithm only seems to be mentioned in the context of allkeys.txt / the Unicode Collation Algorithm, not in the context of normalization/decompositions from &lt;code>UnicodeData.txt&lt;/code>.&lt;/p>
&lt;h2 id="conclusion">Conclusion&lt;/h2>
&lt;p>Ask questions, stay curious, don&amp;rsquo;t be afraid to experiment even if it&amp;rsquo;s outside of your domain of expertise. You might surprise yourself and find something interesting, challenging, and worthwhile.&lt;/p></description></item><item><title>Unicode sorting is hard &amp; why browsers added special emoji matching to regexp</title><link>https://devlog.hexops.org/2021/unicode-sorting-why-browsers-added-special-emoji-matching/</link><pubDate>Sun, 27 Jun 2021 00:00:00 +0000</pubDate><guid>https://devlog.hexops.com/2021/unicode-sorting-why-browsers-added-special-emoji-matching/</guid><description>&lt;p>As I work on &lt;a href="https://github.com/hexops/zorex">Zorex, an omnipotent regexp engine&lt;/a> I have stumbled into a world of tales about why Unicode text sorting is so annoying in the modern day. Let&amp;rsquo;s talk about that.&lt;/p>
&lt;ul>
&lt;li>&lt;a href="#why-ascii-sorting-is-not-enough">Why ASCII sorting is not enough&lt;/a>&lt;/li>
&lt;li>&lt;a href="#twitters-emoji-problem---or-when-unicode-locale-aware-sorting-really-matters">Twitter&amp;rsquo;s emoji problem - or when Unicode locale-aware sorting Really Matters™&lt;/a>&lt;/li>
&lt;li>&lt;a href="#browsers-added-special-emoji-matching-to-regexp">Browsers added special emoji matching to regexp&lt;/a>&lt;/li>
&lt;li>&lt;a href="#language-comparison">Language comparison&lt;/a>
&lt;ul>
&lt;li>&lt;a href="#javascript-collator-sorting-is-not-guaranteed-across-browsers">JavaScript Collator sorting is not guaranteed across browsers&lt;/a>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;a href="#go-sortstrings-is-not-locale-aware">Go sort.Strings is not locale aware&lt;/a>
&lt;ul>
&lt;li>&lt;a href="#rust-vec-sorting-is-not-locale-aware">Rust Vec sorting is not locale aware&lt;/a>&lt;/li>
&lt;li>&lt;a href="#swifts-default-is-not-locale-aware-but-unicode-support-is-notable">Swift&amp;rsquo;s default is not locale aware, but unicode support is notable&lt;/a>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;a href="#zigs-ziglyph-package">Zig&amp;rsquo;s ziglyph package&lt;/a>&lt;/li>
&lt;li>&lt;a href="#why-is-localized-text-sorting-hard">Why is localized text sorting hard?&lt;/a>&lt;/li>
&lt;li>&lt;a href="#webassembly-may-make-things-worse">WebAssembly may make things worse?&lt;/a>&lt;/li>
&lt;/ul>
&lt;h2 id="why-ascii-sorting-is-not-enough">Why ASCII sorting is not enough&lt;/h2>
&lt;p>Perhaps you are sorting strings in JavaScript like this:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-javascript" data-lang="javascript">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">const&lt;/span> &lt;span class="nx">words&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="s1">&amp;#39;Bears&amp;#39;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s1">&amp;#39;Beetle&amp;#39;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s1">&amp;#39;kiss&amp;#39;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s1">&amp;#39;Similar&amp;#39;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s1">&amp;#39;Apples&amp;#39;&lt;/span>&lt;span class="p">];&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nx">words&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nx">sort&lt;/span>&lt;span class="p">();&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">// [ &amp;#34;Apples&amp;#34;, &amp;#34;Bears&amp;#34;, &amp;#34;Beetle&amp;#34;, &amp;#34;Similar&amp;#34;, &amp;#34;kiss&amp;#34; ]
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>And that works pretty well, until someone translates it to German:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-javascript" data-lang="javascript">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">const&lt;/span> &lt;span class="nx">words&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="s1">&amp;#39;Bären&amp;#39;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s1">&amp;#39;Käfer&amp;#39;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s1">&amp;#39;küssen&amp;#39;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s1">&amp;#39;Ähnlich&amp;#39;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s1">&amp;#39;Äpfel&amp;#39;&lt;/span>&lt;span class="p">];&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nx">words&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nx">sort&lt;/span>&lt;span class="p">();&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">// [ &amp;#34;Bären&amp;#34;, &amp;#34;Käfer&amp;#34;, &amp;#34;küssen&amp;#34;, &amp;#34;Ähnlich&amp;#34;, &amp;#34;Äpfel&amp;#34; ]
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The preferred alphabetical sorting would be &lt;code>[ &amp;quot;Ähnlich&amp;quot;, &amp;quot;Äpfel&amp;quot;, &amp;quot;Bären&amp;quot;, &amp;quot;Käfer&amp;quot;, &amp;quot;küssen&amp;quot; ]&lt;/code> - &lt;code>Array.sort&lt;/code> doesn&amp;rsquo;t do that.&lt;/p>
&lt;p>That is because it sorts lexicographically by the UTF-16 code unit values in the string, and does not take locales into account.&lt;/p>
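&lt;p>A quick way to see this is to look at the numeric value of the first UTF-16 code unit of each word, which is what the default &lt;code>sort()&lt;/code> compares first. The &lt;code>firstUnit&lt;/code> helper is only for illustration:&lt;/p>

```javascript
// Array.prototype.sort with no comparator compares UTF-16 code unit values.
const firstUnit = (s) => s.charCodeAt(0);

console.log(firstUnit('B')); // 66
console.log(firstUnit('k')); // 107
console.log(firstUnit('\u00C4')); // 196, the code unit for 'Ä'

// Uppercase ASCII sorts before lowercase ASCII, and 'Ä' sorts after both.
const sorted = ['\u00C4pfel', 'kiss', 'Bears'].sort();
console.log(sorted); // [ 'Bears', 'kiss', 'Äpfel' ]
```

&lt;p>This is also why &lt;code>kiss&lt;/code> ended up after &lt;code>Similar&lt;/code> in the English example: every uppercase ASCII letter has a lower value than every lowercase one.&lt;/p>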
&lt;h2 id="twitters-emoji-problem---or-when-unicode-locale-aware-sorting-really-matters">Twitter&amp;rsquo;s emoji problem - or when Unicode locale-aware sorting Really Matters™&lt;/h2>
&lt;p>Twitter is &lt;a href="https://9to5google.com/2018/05/21/twitter-android-emoji-updates/">no stranger to issues with emojis&lt;/a>, but have you ever thought about how they check if a hashtag contains only legal characters and emojis? Regexp, of course!&lt;/p>
&lt;p>You might think one could just use a regexp unicode character class, like &lt;code>[\u{1f300}-\u{1f5ff}]&lt;/code> - but that only covers a single codepoint! Emojis and other text rely on combining multiple Unicode codepoints to compose &lt;em>grapheme clusters&lt;/em> - and often what we see as a single visible character on our screen.&lt;/p>
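&lt;p>As a small illustration, take the family emoji built from man, woman, and girl joined by zero-width joiners (U+200D). It usually renders as one glyph, but it is five codepoints, and the character class above only ever matches them one at a time:&lt;/p>

```javascript
// Family emoji: man + ZWJ + woman + ZWJ + girl (one visible glyph on most platforms).
const family = '\u{1F468}\u200D\u{1F469}\u200D\u{1F467}';

console.log(Array.from(family).length); // 5 codepoints
console.log(family.length); // 8 UTF-16 code units

const range = /[\u{1f300}-\u{1f5ff}]/gu;
console.log(family.match(range).length); // 3 separate matches, one per person

// Anchored to the whole string, the single-codepoint class does not match the cluster.
console.log(/^[\u{1f300}-\u{1f5ff}]$/u.test(family)); // false
```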
&lt;p>The full regexp needed to match all emojis with codepoints would be:&lt;/p>
&lt;pre tabindex="0">&lt;code class="language-regexp" data-lang="regexp">(?:\ud83e\uddd1\ud83c\udffb\u200d\u2764\ufe0f\u200d\ud83d\udc8b\u200d\ud83e\uddd1\ud83c\udffc|\ud83e\uddd1\ud83c\udffb\u200d\u2764\ufe0f\u200d\ud83d
[102,816 characters omitted]
&lt;/code>&lt;/pre>&lt;p>For your sake, I&amp;rsquo;ve omitted the other 102,816 characters of that regexp. You can view it here: &lt;a href="https://regex101.com/r/2ia4m2/7">https://regex101.com/r/2ia4m2/7&lt;/a>&lt;/p>
&lt;h2 id="browsers-added-special-emoji-matching-to-regexp">Browsers added special emoji matching to regexp&lt;/h2>
&lt;p>Luckily for Twitter and others, ECMAScript&amp;rsquo;s &lt;a href="https://github.com/tc39/proposal-regexp-unicode-property-escapes">TC39 proposal a few years back&lt;/a> extended the regexp engine to support Unicode property escapes for emojis and a few other Unicode properties so you can write e.g.:&lt;/p>
&lt;pre tabindex="0">&lt;code class="language-regexp" data-lang="regexp">\p{Emoji_Presentation}
&lt;/code>&lt;/pre>&lt;p>Without packing several thousand bytes of Unicode data tables or regexp into your JS bundle.&lt;/p>
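&lt;p>In JavaScript, property escapes only work when the regexp has the &lt;code>u&lt;/code> flag. A minimal sketch, where the &lt;code>emojiOnly&lt;/code> name and the sample strings are just for illustration:&lt;/p>

```javascript
// The `u` flag is required; without it, \p is not treated as a property escape.
const emojiOnly = /^\p{Emoji_Presentation}+$/u;

console.log(emojiOnly.test('\u{1F600}\u{1F680}')); // true: grinning face, rocket
console.log(emojiOnly.test('hello')); // false
console.log(emojiOnly.test('hi \u{1F600}')); // false: contains non-emoji characters
```

&lt;p>Note that this matches individual codepoints with the &lt;code>Emoji_Presentation&lt;/code> property; sequences joined with zero-width joiners need additional handling.&lt;/p>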
&lt;h2 id="language-comparison">Language comparison&lt;/h2>
&lt;p>As &lt;a href="https://lemire.me/blog/2018/12/17/sorting-strings-properly-is-stupidly-hard/">Daniel Lemire said&lt;/a>: &lt;em>sorting strings is stupidly hard&lt;/em>.&lt;/p>
&lt;h3 id="javascript-collator-sorting-is-not-guaranteed-across-browsers">JavaScript Collator sorting is not guaranteed across browsers&lt;/h3>
&lt;p>You may have found the browser&amp;rsquo;s &lt;a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/localeCompare">&lt;code>String.prototype.localeCompare&lt;/code>&lt;/a> or &lt;a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Intl/Collator">&lt;code>Intl.Collator&lt;/code>&lt;/a> and they &lt;strong>DO&lt;/strong> fix the issue&lt;a href="https://ourcodeworld.com/articles/read/958/how-to-sort-an-array-of-strings-alphabetically-with-special-characters-properly-with-javascript">[1]&lt;/a>:&lt;/p>
&lt;pre tabindex="0">&lt;code>const words = [&amp;#39;Bären&amp;#39;, &amp;#39;Käfer&amp;#39;, &amp;#39;küssen&amp;#39;, &amp;#39;Ähnlich&amp;#39;, &amp;#39;Äpfel&amp;#39;];
words.sort(Intl.Collator().compare);
// [ &amp;#34;Ähnlich&amp;#34;, &amp;#34;Äpfel&amp;#34;, &amp;#34;Bären&amp;#34;, &amp;#34;Käfer&amp;#34;, &amp;#34;küssen&amp;#34; ]
&lt;/code>&lt;/pre>&lt;p>(note, however, you may wish to use &lt;code>Intl.Collator('de').compare&lt;/code> instead to sort according to German language customs)&lt;/p>
&lt;p>However, beware that if you look at &lt;a href="https://tc39.es/ecma402/#sec-collator-comparestrings">the ECMA spec&lt;/a> for this you will find:&lt;/p>
&lt;blockquote>
&lt;p>It is &lt;strong>recommended&lt;/strong> that the CompareStrings abstract operation be implemented following Unicode Technical Standard 10, Unicode Collation Algorithm [&amp;hellip;]&lt;/p>
&lt;p>Applications should not assume that the behaviour of the CompareStrings abstract operation for Collator instances with the same resolved options will remain the same for different versions of the same implementation.&lt;/p>
&lt;/blockquote>
&lt;p>Although many browsers may produce similar sorting results - not all will. For one thing, not all locales are available across browsers.&lt;/p>
&lt;p>Further, different browsers may choose to sort things differently. For example, IE 11 sorts &amp;ldquo;co-op&amp;rdquo; after &amp;ldquo;coop&amp;rdquo; while other browsers do the opposite.&lt;a href="https://stackoverflow.com/questions/33919257/sorting-strings-with-punctuation-using-intl-collator-is-inconsistent-across-brow">[2]&lt;/a>&lt;/p>
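&lt;p>One way to at least detect the missing-locale case is &lt;code>Intl.Collator.supportedLocalesOf&lt;/code>. The following is only a sketch: &lt;code>sortWords&lt;/code> is a hypothetical helper, and the resulting order still depends on the collation data shipped with the runtime:&lt;/p>
&lt;pre tabindex="0">&lt;code class="language-js" data-lang="js">// sortWords is a hypothetical helper used only for this sketch.
function sortWords(words, locale) {
  // supportedLocalesOf returns the subset of requested locales the runtime can honor.
  const supported = Intl.Collator.supportedLocalesOf([locale]);
  if (supported.length === 0) {
    throw new Error('locale not supported by this runtime: ' + locale);
  }
  const collator = new Intl.Collator(supported[0]);
  // Copy before sorting so the caller's array is left untouched.
  return [...words].sort(collator.compare);
}

console.log(sortWords(['Bären', 'Käfer', 'küssen', 'Ähnlich', 'Äpfel'], 'de'));
&lt;/code>&lt;/pre>&lt;p>Failing loudly here is a design choice: silently falling back to the default locale would hide exactly the cross-browser difference described above.&lt;/p>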
&lt;h3 id="go-sortstrings-is-not-locale-aware">Go sort.Strings is not locale aware&lt;/h3>
&lt;p>It may be interesting to note that Go&amp;rsquo;s &lt;code>sort.Strings&lt;/code> operates on byte comparisons, and has the same issue as JavaScript&amp;rsquo;s &lt;code>Array.prototype.sort&lt;/code>:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-Go" data-lang="Go">&lt;span class="line">&lt;span class="cl">&lt;span class="nx">words&lt;/span> &lt;span class="o">:=&lt;/span> &lt;span class="p">[]&lt;/span>&lt;span class="kt">string&lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="s">&amp;#34;Bären&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s">&amp;#34;Käfer&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s">&amp;#34;küssen&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s">&amp;#34;Ähnlich&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s">&amp;#34;Äpfel&amp;#34;&lt;/span>&lt;span class="p">}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nx">sort&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">Strings&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nx">words&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">// [Bären Käfer küssen Ähnlich Äpfel]
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>One can easily perform Unicode code point (rune) sorting in Go, but because UTF-8 preserves code point order under byte comparison, that produces the same result for the above example. Rune sorting is also not locale-aware, and importantly &lt;a href="https://www.reddit.com/r/golang/comments/o1o5hr/fyi_a_single_go_rune_is_not_the_same_as_a_single">a Go rune is not the same as a visible character&lt;/a>, so it would not take grapheme clusters into account.&lt;/p>
&lt;p>For proper Unicode locale-aware sorting in Go, you need to use the Unicode Collation Algorithm via &lt;a href="https://pkg.go.dev/golang.org/x/text/collate">golang.org/x/text/collate&lt;/a>, but be sure to also apply normalization to your text first via &lt;a href="https://pkg.go.dev/golang.org/x/text@v0.3.6/unicode/norm">golang.org/x/text/unicode/norm&lt;/a>.&lt;/p>
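&lt;p>To illustrate why normalization matters, here is a small sketch, written in JavaScript (using the built-in &lt;code>String.prototype.normalize&lt;/code>) rather than Go to keep it dependency-free. The same visible word can be encoded two ways, and the two only compare equal after normalizing:&lt;/p>
&lt;pre tabindex="0">&lt;code class="language-js" data-lang="js">// "Ä" as a single precomposed code point (U+00C4).
const composed = '\u00C4pfel';
// "A" followed by a combining diaeresis (U+0308); renders identically.
const decomposed = 'A\u0308pfel';

console.log(composed === decomposed); // false
console.log(composed.normalize('NFC') === decomposed.normalize('NFC')); // true
&lt;/code>&lt;/pre>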
&lt;h3 id="rust-vec-sorting-is-not-locale-aware">Rust Vec sorting is not locale aware&lt;/h3>
&lt;p>A Rust &lt;code>Vec&lt;/code> of strings sorts&lt;a href="https://doc.rust-lang.org/std/primitive.str.html#impl-Ord">[3]&lt;/a> lexicographically by byte value, consistent with Go&amp;rsquo;s &lt;code>sort.Strings&lt;/code> and JavaScript&amp;rsquo;s &lt;code>Array.prototype.sort&lt;/code>:&lt;/p>
&lt;pre tabindex="0">&lt;code>let mut vec = Vec::new();
vec.push(&amp;#34;Bären&amp;#34;);
vec.push(&amp;#34;Käfer&amp;#34;);
vec.push(&amp;#34;küssen&amp;#34;);
vec.push(&amp;#34;Ähnlich&amp;#34;);
vec.push(&amp;#34;Äpfel&amp;#34;);
vec.sort();
println!(&amp;#34;{:?}&amp;#34;, vec);
// [&amp;#34;Bären&amp;#34;, &amp;#34;Käfer&amp;#34;, &amp;#34;küssen&amp;#34;, &amp;#34;Ähnlich&amp;#34;, &amp;#34;Äpfel&amp;#34;]
&lt;/code>&lt;/pre>&lt;p>Locale-aware sorting in Rust is provided &lt;a href="https://github.com/google/rust_icu">by ICU4C bindings by Google, google/rust_icu&lt;/a> (note however, there have been a number of &lt;a href="https://github.com/rust-lang/rust/issues/14656#issuecomment-45164318">vulnerabilities in the ICU4C library&lt;/a>) and there is ongoing work to implement internationalization in pure Rust as a safer alternative: &lt;a href="https://github.com/unicode-org/icu4x">unicode-org/icu4x&lt;/a>.&lt;/p>
&lt;h3 id="swifts-default-is-not-locale-aware-but-unicode-support-is-notable">Swift&amp;rsquo;s default is not locale aware, but unicode support is notable&lt;/h3>
&lt;p>Swift remains consistent with other languages in sorting strings lexicographically by byte value:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-swift" data-lang="swift">&lt;span class="line">&lt;span class="cl">&lt;span class="kd">var&lt;/span> &lt;span class="nv">words&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="s">&amp;#34;Bären&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s">&amp;#34;Käfer&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s">&amp;#34;küssen&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s">&amp;#34;Ähnlich&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s">&amp;#34;Äpfel&amp;#34;&lt;/span>&lt;span class="p">]&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">words&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="bp">sort&lt;/span>&lt;span class="p">()&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">// [&amp;#34;Bären&amp;#34;, &amp;#34;Käfer&amp;#34;, &amp;#34;küssen&amp;#34;, &amp;#34;Ähnlich&amp;#34;, &amp;#34;Äpfel&amp;#34;]&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>However, it is notable that Swift includes locale sensitive sorting out of the box&lt;a href="https://sarunw.com/posts/different-ways-to-sort-array-of-strings-in-swift/">[4]&lt;/a>:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-swift" data-lang="swift">&lt;span class="line">&lt;span class="cl">&lt;span class="kd">var&lt;/span> &lt;span class="nv">words&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="s">&amp;#34;Bären&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s">&amp;#34;Käfer&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s">&amp;#34;küssen&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s">&amp;#34;Ähnlich&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s">&amp;#34;Äpfel&amp;#34;&lt;/span>&lt;span class="p">]&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="kd">let&lt;/span> &lt;span class="nv">sorted&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="n">words&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="bp">sorted&lt;/span> &lt;span class="p">{&lt;/span> &lt;span class="p">(&lt;/span>&lt;span class="n">lhs&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="nb">String&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">rhs&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="nb">String&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="p">-&amp;gt;&lt;/span> &lt;span class="nb">Bool&lt;/span> &lt;span class="k">in&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">return&lt;/span> &lt;span class="n">lhs&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">localizedStandardCompare&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">rhs&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="p">==&lt;/span> &lt;span class="p">.&lt;/span>&lt;span class="n">orderedAscending&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">// [&amp;#34;Ähnlich&amp;#34;, &amp;#34;Äpfel&amp;#34;, &amp;#34;Bären&amp;#34;, &amp;#34;Käfer&amp;#34;, &amp;#34;küssen&amp;#34;]&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>It also seems quite notable just &lt;a href="https://developer.apple.com/documentation/swift/string">how very unicode-aware the Swift documentation is on their String type&lt;/a>. Other languages could learn a thing or two here in educating developers.&lt;/p>
&lt;h2 id="zigs-ziglyph-package">Zig&amp;rsquo;s ziglyph package&lt;/h2>
&lt;p>Zig&amp;rsquo;s standard library is still very much under development, but it seems likely that major Unicode functionality will live outside the stdlib.&lt;/p>
&lt;p>Luckily, &lt;a href="https://github.com/jecolon">@jecolon&lt;/a> in the Zig community is working on an excellent package for this: &lt;a href="https://github.com/jecolon/ziglyph">ziglyph&lt;/a>.&lt;/p>
&lt;p>I mention this because I&amp;rsquo;m a fan of the language and have recently begun contributing to that package; but otherwise Zig isn&amp;rsquo;t any different than other languages listed here aside from there being no real &amp;ldquo;default&amp;rdquo; way to sort strings from what I know.&lt;/p>
&lt;h2 id="why-is-localized-text-sorting-hard">Why is localized text sorting hard?&lt;/h2>
&lt;p>I believe there is a combination of factors at play:&lt;/p>
&lt;ul>
&lt;li>Most languages leave Unicode locale-aware text sorting as an afterthought.&lt;/li>
&lt;li>Most developers don&amp;rsquo;t care enough to use Unicode, let alone implement locale-aware text sorting. Internationalization is always &amp;ldquo;that thing we&amp;rsquo;ll do if somebody complains&amp;rdquo; or an afterthought.&lt;/li>
&lt;li>It&amp;rsquo;s hard. It wasn&amp;rsquo;t until recently that we got semi-decent support for it across browsers, and what is there still leaves a lot to be desired.&lt;/li>
&lt;li>Many are still running into dated software, like NodeJS versions from around 2019 that &lt;a href="https://github.com/nodejs/node/issues/19214">didn&amp;rsquo;t have full ICU support on by default&lt;/a>.&lt;/li>
&lt;/ul>
&lt;h2 id="webassembly-may-make-things-worse">WebAssembly may make things worse?&lt;/h2>
&lt;p>As a closing thought, I just want to hint at why I think WebAssembly will make things worse before they get better.&lt;/p>
&lt;p>Whether your application is in Go and has its own Unicode Collation Algorithm (UCA) implementation, or Rust and uses bindings to the popular ICU4C library - one thing is going to remain true: it requires large data files to work.&lt;/p>
&lt;p>The UCA algorithm depends on two quite large data table files to work:&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://www.unicode.org/Public/9.0.0/ucd/UnicodeData.txt">UnicodeData.txt&lt;/a> for normalization, a step required before sorting can take place.&lt;/li>
&lt;li>&lt;a href="http://www.unicode.org/Public/UCA/12.0.0/allkeys.txt">allkeys.txt&lt;/a> for weighting certain text above others.&lt;/li>
&lt;li>And more, if you want truly locale-aware sorting and not just &amp;ldquo;the default&amp;rdquo; the UCA algorithm gives you.&lt;/li>
&lt;/ul>
&lt;p>Together, these files can add up to over half a megabyte.&lt;/p>
&lt;p>While WASM languages could shell out to JavaScript browser APIs for collation, I suspect they won&amp;rsquo;t due to the lack of guarantees around those APIs.&lt;/p>
&lt;p>A more likely scenario is languages continuing to leave locale-aware sorting as an optional, opt-in feature - that also makes your application larger.&lt;/p>
&lt;p>I think this is a worthwhile problem to solve, so I am working on &lt;a href="https://github.com/jecolon/ziglyph/issues/3">compression algorithms for these files specifically&lt;/a> in Zig to reduce them to only a few tens of kilobytes.&lt;/p></description></item><item><title>My game development journey &amp; why I'm increasing my contribution to Zig to $200/mo</title><link>https://devlog.hexops.org/2021/increasing-my-contribution-to-zig-to-200-a-month/</link><pubDate>Sat, 10 Apr 2021 00:00:00 +0000</pubDate><guid>https://devlog.hexops.com/2021/increasing-my-contribution-to-zig-to-200-a-month/</guid><description>&lt;p>Today, I increased my monthly donation to Zig to $200 a month. Before Zig, I had not contributed financially to any open source project.&lt;/p>
&lt;p>Before I can explain why I am so extremely excited about the &lt;a href="https://ziglang.org/">Zig&lt;/a> programming language and its community, I need to explain where I come from.&lt;/p>
&lt;ul>
&lt;li>&lt;a href="#i-grew-up-playing-linux-games-like-mania-drive">I grew up playing Linux games like Mania Drive&lt;/a>&lt;/li>
&lt;li>&lt;a href="#it-wasnt-long-before-i-found-that-the-mania-drive-game-engine-was-open-source">It wasn&amp;rsquo;t long before I found that the Mania Drive game engine was open-source.&lt;/a>&lt;/li>
&lt;li>&lt;a href="#i-was-so-infatuated-with-this-game-engine-i-convinced-my-dads-coworkers-to-pay-me-to-build-them-a-virtual-meeting-world">I was so infatuated with this game engine, I convinced my dad&amp;rsquo;s coworkers to pay me to build them a virtual meeting world&lt;/a>&lt;/li>
&lt;li>&lt;a href="#but-the-game-kept-crashing-at-random-and-i-had-no-idea-why">But the game kept crashing at random, and I had no idea why&lt;/a>&lt;/li>
&lt;li>&lt;a href="#panda3d-disneys-pythonc-game-engine">Panda3D: Disney&amp;rsquo;s Python/C++ game engine&lt;/a>&lt;/li>
&lt;li>&lt;a href="#the-panda3d-game-engine-opened-new-doors-for-me">The Panda3D game engine opened new doors for me&lt;/a>&lt;/li>
&lt;li>&lt;a href="#i-began-to-prevail">I began to prevail&lt;/a>&lt;/li>
&lt;li>&lt;a href="#but-my-limited-knowledge-hit-me-again">But my limited knowledge hit me again&lt;/a>&lt;/li>
&lt;li>&lt;a href="#learning-c">Learning C++&lt;/a>&lt;/li>
&lt;li>&lt;a href="#learning-go-writing-my-own-game-engine">Learning Go, writing my own game engine&lt;/a>&lt;/li>
&lt;li>&lt;a href="#my-game-engine-appeared-on-hacker-news-2014">My game engine appeared on Hacker News (2014)&lt;/a>&lt;/li>
&lt;li>&lt;a href="#joining-sourcegraph">Joining Sourcegraph&lt;/a>&lt;/li>
&lt;li>&lt;a href="#six-and-a-half-years-later-im-still-at-sourcegraph">Six and a half years later, I&amp;rsquo;m still at Sourcegraph.&lt;/a>&lt;/li>
&lt;li>&lt;a href="#but-im-still-a-game-developer-at-heart">But I&amp;rsquo;m still a game developer at heart&lt;/a>&lt;/li>
&lt;li>&lt;a href="#c-was-easier-for-me-as-a-beginner-than-c">C was easier for me as a beginner than C++&lt;/a>&lt;/li>
&lt;li>&lt;a href="#unity-is-the-new-flash">Unity is the new Flash&lt;/a>&lt;/li>
&lt;li>&lt;a href="#why-do-we-encourage-building-but-not-understanding">Why do we encourage building, but not understanding?&lt;/a>&lt;/li>
&lt;li>&lt;a href="#one-language-to-write-your-game-and-engine-in">One language to write your game and engine in&lt;/a>&lt;/li>
&lt;li>&lt;a href="#looking-for-the-one-language-to-rule-them-all">Looking for the one language to rule them all&lt;/a>
&lt;ul>
&lt;li>&lt;a href="#could-rust-be-it">Could Rust be it?&lt;/a>&lt;/li>
&lt;li>&lt;a href="#could-the-v-language-be-it">Could the V language be it?&lt;/a>&lt;/li>
&lt;li>&lt;a href="#could-i-build-it">Could I build it?&lt;/a>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;a href="#discovering-zig">Discovering Zig&lt;/a>
&lt;ul>
&lt;li>&lt;a href="#learning-zig">Learning Zig&lt;/a>&lt;/li>
&lt;li>&lt;a href="#working-in-it">Working in it&lt;/a>&lt;/li>
&lt;li>&lt;a href="#the-community-is-incredible">The community is incredible&lt;/a>&lt;/li>
&lt;li>&lt;a href="#my-commitment-to-zig">My commitment to Zig&lt;/a>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="i-grew-up-playing-linux-games-like-mania-drive">I grew up playing Linux games like Mania Drive&lt;/h2>
&lt;iframe width="720" height="480" src="https://www.youtube.com/embed/7YFicbaXHw0" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen>&lt;/iframe>
&lt;p>Mania Drive was an open-source clone of the popular Trackmania series. My siblings and I, in our early teens, easily spent hundreds, if not thousands, of hours in Mania Drive.&lt;/p>
&lt;p>In retrospect it has quite bad graphics, physics, game-play mechanics, etc. But it was customizable! There was a simple tile-based level editor. We would spend days building the most confusing, crazy, impossible maps to beat so we could challenge each other. We would play it all night.&lt;/p>
&lt;p>Obsession over this game led to even more modding: the discovery of &lt;a href="https://www.blender.org">Blender&lt;/a> meant we could create even more custom maps than in the limited tile-based map editor. Although the Blender UI was pretty rough back then:&lt;/p>
&lt;img class="color" src="https://devlog.hexops.org/img/2021/increasing-my-contribution-to-zig-to-200-a-month/img2.png">
&lt;h2 id="it-wasnt-long-before-i-found-that-the-mania-drive-game-engine-was-open-source">It wasn&amp;rsquo;t long before I found that the Mania Drive game engine was open-source.&lt;/h2>
&lt;p>&lt;a href="http://memak.raydium.org/index.php">Raydium&lt;/a>, the C game engine behind Mania Drive, is still around today - one of the beauties of open source software! At the time, the things about it that just blew my mind were:&lt;/p>
&lt;ul>
&lt;li>It supported scripting through PHP! I had used PHP a lot with LAMP stacks, so the idea that I could script the engine in PHP was &lt;em>mind blowing to then-14-year-old me.&lt;/em>&lt;/li>
&lt;li>2 years later, I got an iPod touch and the Raydium developers had just posted a demo video showing the engine running on the iPhone. 16-year-old me thought this was &lt;em>literally&lt;/em> the coolest thing ever, though I was immensely disappointed that I did not have a Mac to build it for my iPod:&lt;/li>
&lt;/ul>
&lt;iframe width="720" height="480" src="https://www.youtube.com/embed/wcPfxr9BgA4" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen>&lt;/iframe>
&lt;h2 id="i-was-so-infatuated-with-this-game-engine-i-convinced-my-dads-coworkers-to-pay-me-to-build-them-a-virtual-meeting-world">I was so infatuated with this game engine, I convinced my dad&amp;rsquo;s coworkers to pay me to build them a virtual meeting world&lt;/h2>
&lt;p>My dad was running one of his many startups at the time - it had some momentum behind it, basically a platform like eBay but for selling services instead of goods. Several of his work friends were funding it with significant amounts of their own money.&lt;/p>
&lt;p>Unfortunately for them, they spent more of their focus on business operations than on actually getting a product out the door. Lucky for me, however, this meant they had come across Sun&amp;rsquo;s &lt;a href="https://en.wikipedia.org/wiki/Open_Wonderland">Project Wonderland&lt;/a> - the delightfully terrible 3D virtual workplace of the future (or so Sun thought, before they had to sell to Oracle.) It was &lt;em>terrible,&lt;/em> barely a good demo:&lt;/p>
&lt;img class="color" src="https://devlog.hexops.org/img/2021/increasing-my-contribution-to-zig-to-200-a-month/img3.png">
&lt;p>It required something like 32 CPUs and 64G of memory to run the server for just 8 players - no small feat back in 2010! The client was laggy, there were virtual whiteboards you could draw on but everything was slow. Even its VOIP feature was glitchy - although quite novel at the time. It was all around &lt;em>a terrible experience.&lt;/em>&lt;/p>
&lt;p>16-year-old me convinced my dad and his coworkers to instead pay me to build them a better version: one using Raydium, C - and PHP.&lt;/p>
&lt;p>It wasn&amp;rsquo;t long before I had some amateur copy of Project Wonderland - ironically better than Sun&amp;rsquo;s in &lt;em>many aspects&lt;/em> - and surely worse in others. It even had a client auto-updater built with wxWidgets and Python (it just shelled out to an &lt;code>svn&lt;/code> client to download the latest copy of the game, hah!)&lt;/p>
&lt;h2 id="but-the-game-kept-crashing-at-random-and-i-had-no-idea-why">But the game kept crashing at random, and I had no idea why&lt;/h2>
&lt;p>The truth was that I was literally a 16-year-old script kiddie copying and pasting C code from various demos of Raydium, without a care in the world for freeing memory or avoiding stack corruption.&lt;/p>
&lt;pre tabindex="0">&lt;code>// Don&amp;#39;t remove this print statement. Game will crash!
&lt;/code>&lt;/pre>&lt;p>It was around this time that I began to really get into Python: it was simple, something I could really wrap my brain around, and it was powerful. I stumbled into cython and wrote OpenGL bindings - this time with more appreciation for memory management.&lt;/p>
&lt;h2 id="panda3d-disneys-pythonc-game-engine">Panda3D: Disney&amp;rsquo;s Python/C++ game engine&lt;/h2>
&lt;p>Panda3D was the game engine Disney used to create Toontown Online and Pirates of the Caribbean Online:&lt;/p>
&lt;img class="color" src="https://devlog.hexops.org/img/2021/increasing-my-contribution-to-zig-to-200-a-month/img4.png">
&lt;img class="color" src="https://devlog.hexops.org/img/2021/increasing-my-contribution-to-zig-to-200-a-month/img5.png">
&lt;p>It was written in C++, with automatic binding generation for Python. In fact, many portions of the engine were written in &lt;em>just&lt;/em> Python and not usable from C++ at all. &lt;a href="https://www.panda3d.org">They revamped their website recently, so I guess it&amp;rsquo;s still around.&lt;/a>&lt;/p>
&lt;h2 id="the-panda3d-game-engine-opened-new-doors-for-me">The Panda3D game engine opened new doors for me&lt;/h2>
&lt;p>Discovering Panda3D opened new doors for me. At around 16-17 years old now, I was able to really get my first real taste of game development: I could write games in this &amp;ndash; in Python &amp;ndash; and &lt;em>they wouldn&amp;rsquo;t crash in ways that I couldn&amp;rsquo;t understand.&lt;/em>&lt;/p>
&lt;p>Pretty soon, I had actual games in the works. I was starting to learn about why draw order matters - and how I had no understanding of mip-mapping:&lt;/p>
&lt;img class="color" src="https://devlog.hexops.org/img/2021/increasing-my-contribution-to-zig-to-200-a-month/img6.png">
&lt;img class="color" src="https://devlog.hexops.org/img/2021/increasing-my-contribution-to-zig-to-200-a-month/img7.png">
&lt;h2 id="i-began-to-prevail">I began to prevail&lt;/h2>
&lt;p>At this point I had several, actually working games - I was proud of what I was working on, had multiplayer functionality hooked up to a MySQL database even.&lt;/p>
&lt;img class="color" src="https://devlog.hexops.org/img/2021/increasing-my-contribution-to-zig-to-200-a-month/img1.png">
&lt;h2 id="but-my-limited-knowledge-hit-me-again">But my limited knowledge hit me again&lt;/h2>
&lt;p>For my game, I wanted nothing more than for my friends to be able to chat with me using a chat box. The problem was, Panda3D&amp;rsquo;s Python GUI library, DirectGUI, was just too slow at rendering text.&lt;/p>
&lt;p>I tried everything I could, and even got to the point where I was asking on the forums if it was possible to draw a TextNode with multi-threading:&lt;/p>
&lt;blockquote>
&lt;p>Calls to TextNode.generate() are very expensive.&lt;/p>
&lt;p>Is there a way for Panda to run all TextNode.generate() calls in a seperate thread? I’ve attempted doing it on my own using direct.stdpy.threading.Thread, only it causes dead locks, I would guess this is to my own lack of knowledge.&lt;/p>
&lt;p>could anyone help me?&lt;/p>
&lt;/blockquote>
&lt;p>I didn&amp;rsquo;t get a response. I couldn&amp;rsquo;t solve the issue. &amp;ldquo;I can&amp;rsquo;t add a chat box to my games&amp;rdquo; became a problem &lt;em>I could not solve.&lt;/em>&lt;/p>
&lt;h2 id="learning-c">Learning C++&lt;/h2>
&lt;p>I was at a point where I had rewritten most of Panda3D&amp;rsquo;s UI components myself in Python (mind you, theirs &lt;em>are&lt;/em> written in Python - you cannot use them from C++. I don&amp;rsquo;t know why I did this.)&lt;/p>
&lt;p>But I still needed a way to render text. I needed a way to make &lt;code>TextNode.generate()&lt;/code> faster. Little did I know at the time that it was generating geometry from freetype and creating one draw call per text drawn - which is super slow, and did not help my naive usage of its API!&lt;/p>
&lt;p>The harsh reality was that I didn&amp;rsquo;t have anybody to teach me. I spent months trying to learn C++, but it is a beast (and &amp;ldquo;Disney game engine C++&amp;rdquo; is, of course, a flavor of C++ not found in books.) It wasn&amp;rsquo;t something I could handle as a 16-year-old kid without any real knowledge of low level languages.&lt;/p>
&lt;p>In trying to learn C++, something became painfully clear to me:&lt;/p>
&lt;blockquote>
&lt;p>&lt;em>Having part of my application written in Python and part of it written in C++, two very different languages, was only great until I realized I &lt;em>had&lt;/em> to dive into this large C++ code base and had no knowledge of it.&lt;/em>&lt;/p>
&lt;/blockquote>
&lt;p>I gave up.&lt;/p>
&lt;h2 id="learning-go-writing-my-own-game-engine">Learning Go, writing my own game engine&lt;/h2>
&lt;p>When Google announced Go, I heard about it very early on. At this time, they were still advertising it as a low-level systems language, an alternative to C, &lt;em>a better C&lt;/em>. But more forgiving, because it had a garbage collector.&lt;/p>
&lt;p>Coming from a predominantly Python background at the time, this sounded incredible to me: I could write a game engine in this and understand my code &lt;em>end to end&lt;/em> and make sure there is no single piece that I do not understand.&lt;/p>
&lt;p>I spent the next 4 years of my life, almost 100% full-time working on &lt;a href="https://azul3d.org">Azul3D, a game engine in Go&lt;/a> - and spent only minimal time attending online community college on the weekends.&lt;/p>
&lt;p>There was &lt;em>so much&lt;/em> that I learned during this time, about software engineering, game engines, audio, input, math, image and audio codecs, blender plugins, file formats, physics, and working with other people (some cool things &lt;a href="https://github.com/nwidger/nintengo">like a NES emulator came out of that&lt;/a>)&lt;/p>
&lt;p>I learned an &lt;em>immense&lt;/em> amount, but I had nothing to show for it aside from &lt;a href="https://azul3d.org">a funny looking website&lt;/a> and some quite poor screenshots (to the dismay of every person I told.)&lt;/p>
&lt;img class="color" src="https://devlog.hexops.org/img/2021/increasing-my-contribution-to-zig-to-200-a-month/img8.png">
&lt;h2 id="my-game-engine-appeared-on-hacker-news-2014">My game engine appeared on Hacker News (2014)&lt;/h2>
&lt;p>&lt;a href="https://news.ycombinator.com/item?id=8151028">Someone posted it on Hacker News&lt;/a>, which was both exciting but also extremely depressing for me at the time. I took the feedback as statements that what I was doing &lt;em>was wrong&lt;/em>, rather than as feedback about how to improve:&lt;/p>
&lt;blockquote>
&lt;p>The web site looks cool, but it sets off a whole bunch of red flags for me.&lt;/p>
&lt;/blockquote>
&lt;blockquote>
&lt;p>the go programming language is not very suitable for games at all.&lt;/p>
&lt;/blockquote>
&lt;blockquote>
&lt;p>if you truly need a performant graphics engine, it&amp;rsquo;s going to be either C++, C or Rust anyway.&lt;/p>
&lt;/blockquote>
&lt;blockquote>
&lt;p>Azul3D is for programmers and doesn&amp;rsquo;t provide GUI-editors.&lt;/p>
&lt;p>So, you write your levels using a text editor? That&amp;rsquo;s not for programmers, that&amp;rsquo;s for people who hate themselves.&lt;/p>
&lt;/blockquote>
&lt;blockquote>
&lt;p>No screenshots of the game at all?&lt;/p>
&lt;/blockquote>
&lt;blockquote>
&lt;p>Garbage collector FAQ isn&amp;rsquo;t necessarily reassuring, since it seems to say &amp;ldquo;go through the same hoops other GC gaming platforms push you through&amp;rdquo;. Obligatory Rust gaming comment goes here.&lt;/p>
&lt;/blockquote>
&lt;blockquote>
&lt;p>in Rust you have code without GC, but the compiler makes sure that everything is freed.&lt;/p>
&lt;/blockquote>
&lt;p>I learned so much from this interaction:&lt;/p>
&lt;ul>
&lt;li>Being transparent about project status is important.&lt;/li>
&lt;li>I shouldn&amp;rsquo;t have &amp;ldquo;hidden&amp;rdquo; screenshots of the project. I was worried people would judge what the engine is capable of based on bad programmer artwork: instead, they judged it for having none.&lt;/li>
&lt;li>I should&amp;rsquo;ve talked about the interesting parts more:
&lt;ul>
&lt;li>Did you know there is a D* lite pathfinding algorithm that was used in one of the Mars rovers, is super simple, and handles dynamic terrains? Much nicer than A* and other variants.&lt;/li>
&lt;li>What my vision for a game engine deeply integrated with Blender, and developer-first, would look like in practice.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>I frankly shouldn&amp;rsquo;t have cared so much. I thought what I was doing was awesome, and I let others&amp;rsquo; viewpoints affect my own view of my work negatively.&lt;/li>
&lt;/ul>
&lt;h2 id="joining-sourcegraph">Joining Sourcegraph&lt;/h2>
&lt;p>It was around this time that I was basically deciding: &lt;em>what would I do for a living?&lt;/em>&lt;/p>
&lt;p>Luckily, someone in the Go community (whom I&amp;rsquo;d never talked to before) reached out to me and asked &amp;ldquo;hey, what are you doing?&amp;rdquo; - I told them I was in school, and left out the part where I was a college student living with parents, scraping by, and likely going grocery-store-part-time-job seeking soon.&lt;/p>
&lt;img class="color" src="https://devlog.hexops.org/img/2021/bill-thanks.jpg">
&lt;p>I didn&amp;rsquo;t come from a background that would lead me to believe I could make a living programming in Go. To the contrary, my parents often warned me I couldn&amp;rsquo;t and that I would need to go into Cisco network infrastructure instead.&lt;/p>
&lt;p>I was told in blunt terms, I could scrape by doing what I love - or make a killing doing something I hate. My parents were mechanical engineers at aerospace companies (I&amp;rsquo;ll let you guess which path they took.)&lt;/p>
&lt;p>Bill&amp;rsquo;s short ~20 minute conversation with me quite literally changed my life in ways I couldn&amp;rsquo;t have imagined. I often think about where I would be today had he not reached out to me, and I never quite knew how to reach back out and say thank you in a way that was as meaningful to him as what he did was for me.&lt;/p>
&lt;h2 id="six-and-a-half-years-later-im-still-at-sourcegraph">Six and a half years later, I&amp;rsquo;m still at Sourcegraph.&lt;/h2>
&lt;p>I&amp;rsquo;ve learned &lt;em>so much&lt;/em> about startups, being a good engineer, management, business operations, cloud infrastructure, teamwork, communication, and so much more in the last six years I&amp;rsquo;ve spent at Sourcegraph. There are so many stories I have, and so many great people I have had the opportunity to work with because of it.&lt;/p>
&lt;p>We grew from an awkward little startup without a clear product, with a tiny team and an uncertain future - into a sprawling metropolis with massive amounts of happy users, customers, $50m in Series C funding, and a team of over a hundred people all over the world. I have played a key role in that, and continue to this day.&lt;/p>
&lt;p>A passion for making games as a kid turned into a passion for making developer tools all around better. I still have much to do here.&lt;/p>
&lt;h2 id="but-im-still-a-game-developer-at-heart">But I&amp;rsquo;m still a game developer at heart&lt;/h2>
&lt;p>If there&amp;rsquo;s one thing I return to &lt;em>regularly&lt;/em>, &lt;em>consistently&lt;/em>, and &lt;em>frequently&lt;/em> despite working a demanding job at a startup - it&amp;rsquo;s game development. And you&amp;rsquo;re going to hear a lot more about that soon.&lt;/p>
&lt;p>Since March of last year, I have basically been working two jobs: every day after I sign off from work at Sourcegraph, I spend around 8 hours working on game development.&lt;/p>
&lt;p>I am more determined than ever before, and succeed or fail - &lt;em>I will try.&lt;/em>&lt;/p>
&lt;h2 id="c-was-easier-for-me-as-a-beginner-than-c">C was easier for me as a beginner than C++&lt;/h2>
&lt;p>Hacking together games in Raydium&amp;rsquo;s C API taught me that C is hard, but also showed me in retrospect that if I had &lt;em>just a little bit more guidance&lt;/em>, if C was just &lt;em>slightly&lt;/em> easier, if I only knew the tricks of how to debug C programs: I would have been immensely successful in working with it.&lt;/p>
&lt;p>With Panda3D, writing some decent games in its Python API only to later find I needed to dive into this magical box of a complex C++ core made me believe that:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>C++ is less beginner friendly than C.&lt;/strong> One major reason for this is the different C++ dialects: you&amp;rsquo;re not going to understand Panda3D C++, or Unreal C++, by going and reading books about the language or taking a class. They create their own dialects through the language. Today with different C++ versions, even the textbooks and classes you find will be using different dialects.&lt;/li>
&lt;li>There are not good tutorials or explanations online about how game engines work and why. I regularly find that very experienced software engineers and even people who work in Unity or Unreal regularly, simply do not have a decent grasp of how game engines work. &amp;ldquo;What do you mean polygon count is not very important?!&amp;rdquo; are among the most basic questions that arise, with modern game engines abstracting away so many bits that your average developer merely says:&lt;/li>
&lt;/ol>
&lt;blockquote>
&lt;p>&amp;ldquo;Game engines are just magical ultra-complex things I could never even begin to understand! Only the professional AAA studios and god programmers like Jonathan Blow should even try to do that!&amp;rdquo;&lt;/p>
&lt;/blockquote>
&lt;p>I do not subscribe to this belief - and believe that most game developers &lt;em>have been robbed&lt;/em> of the proper end-to-end understanding of game engines they deserve.&lt;/p>
&lt;h2 id="unity-is-the-new-flash">Unity is the new Flash&lt;/h2>
&lt;p>You, dear reader, do not understand &lt;em>just how far the bar for game development has been lowered.&lt;/em>&lt;/p>
&lt;p>&lt;a href="https://www.youtube.com/watch?v=Nj8gt_92c-M">&lt;img class="color" src="https://devlog.hexops.org/img/2021/increasing-my-contribution-to-zig-to-200-a-month/img9.png">&lt;/a>&lt;/p>
&lt;p>Putting together a game in Unity today is so ridiculously easy that it is incredible; the game engine is truly the new Adobe Flash equivalent.&lt;/p>
&lt;p>You could pick up that engine today, and have a silly little game you yourself put together by tomorrow.&lt;/p>
&lt;p>Of course, with Unity, comes large problems for serious game developers:&lt;/p>
&lt;ul>
&lt;li>There are &lt;em>so many&lt;/em> people hacking together Unity games that the quality of the information out there is quite bad.&lt;/li>
&lt;li>The quality of what is on the Unity asset store is quite bad.&lt;/li>
&lt;li>Unity encourages hacking things together to get a quick demo running - and it shows. Game developers hide the fact that they use Unity, because it has such a negative connotation with players that Unity == low quality.&lt;/li>
&lt;/ul>
&lt;h2 id="why-do-we-encourage-building-but-not-understanding">Why do we encourage building, but not understanding?&lt;/h2>
&lt;p>Game engines today are the epitome of &lt;em>large complex code-bases&lt;/em>:&lt;/p>
&lt;ul>
&lt;li>The people and companies working on them value features over quality.&lt;/li>
&lt;li>When there is a major issue, there are few people with an understanding of it to be found.&lt;/li>
&lt;li>Teaching people how to write good software is hard - and that&amp;rsquo;s our customer base (I imagine Unity/Unreal say) - far easier to give them something akin to a scripting language. It&amp;rsquo;s &lt;em>good&lt;/em> even if our users don&amp;rsquo;t understand how all of this works.&lt;/li>
&lt;/ul>
&lt;p>Teaching is hard, but if done right is invaluable. There is a reason NeHe Productions&amp;rsquo; OpenGL tutorials are still revered today: they are incremental, and teach in the form of building blocks on top of what you previously learned.&lt;/p>
&lt;p>There&amp;rsquo;s a reason many AAA studios simply &lt;em>throw out everything&lt;/em> and start from scratch when working on their next title.&lt;/p>
&lt;p>We encourage building new things, but not understanding existing things.&lt;/p>
&lt;h2 id="one-language-to-write-your-game-and-engine-in">One language to write your game and engine in&lt;/h2>
&lt;p>Scripting languages for game engines stem from multiple desires - the most common being some variant of:&lt;/p>
&lt;ul>
&lt;li>C++ is hard, but we need it for performance.&lt;/li>
&lt;li>My level designers can&amp;rsquo;t write C++ code!&lt;/li>
&lt;li>I cannot understand C++, but do know C#/Python/Java/etc.&lt;/li>
&lt;/ul>
&lt;p>A lot of people have a &lt;em>terrible&lt;/em> experience from school where they were taught C or C++, had absolutely no understanding of what was going on - and were told &amp;ldquo;This is programming!&amp;rdquo;&lt;/p>
&lt;p>I believe that in general, writing your game in a different language than the engine (Unity&amp;rsquo;s C#/C++ core model, Panda3D&amp;rsquo;s Python/C++ core model, and yes - perhaps even Unreal&amp;rsquo;s &lt;a href="https://blueprintsfromhell.tumblr.com/">Blueprints&lt;/a>/C++ core model - which I admit is the best of the three) is a net negative.&lt;/p>
&lt;p>&lt;a href="https://blueprintsfromhell.tumblr.com">&lt;img class="color" src="https://devlog.hexops.org/img/2021/increasing-my-contribution-to-zig-to-200-a-month/img10.png">&lt;/a>&lt;/p>
&lt;p>Pictured: The Unreal character controller blueprint for a game called &lt;a href="https://store.steampowered.com/app/1037260/Diacrisis/">Diacrisis&lt;/a>.&lt;/p>
&lt;p>Whether you have good code or bad code, good blueprints or bad blueprints - the truth is that having one part of your application in a completely different language &lt;em>creates a significant barrier to learning.&lt;/em> I believe that is a bad thing, and the long-term costs outweigh the benefits.&lt;/p>
&lt;h2 id="looking-for-the-one-language-to-rule-them-all">Looking for the one language to rule them all&lt;/h2>
&lt;h3 id="could-rust-be-it">Could Rust be it?&lt;/h3>
&lt;p>Initially, I spent a substantial amount of time considering Rust as that language. Its offer of memory safety guarantees is extremely compelling to me.&lt;/p>
&lt;p>I even convinced us to adopt Rust at Sourcegraph in some form, our syntax highlighter is &lt;a href="https://github.com/sourcegraph/syntect_server">a little Rust HTTP server&lt;/a> that was basically write-and-forget. We haven&amp;rsquo;t maintained it at all, and it&amp;rsquo;s held up pretty well for over 5 years.&lt;/p>
&lt;p>But maintaining it has been &lt;em>brutal&lt;/em>. We mostly have Go developers there, and despite a strong desire from many of them to learn Rust really none of them have been able to successfully dive into the codebase and get started.&lt;/p>
&lt;p>Rust&amp;rsquo;s learning curve is &lt;em>steep&lt;/em>. Steeper than C++ in my view, and definitely steeper than C (despite C&amp;rsquo;s many, massive flaws.)&lt;/p>
&lt;p>I spent upwards of 6 months on-and-off trying to become proficient at writing Rust code, and I never really became productive: regularly stumbling across complex issues in downstream dependencies (often used by everyone, but maintained by no one in the rust-lang-nursery.)&lt;/p>
&lt;p>&lt;strong>I &lt;em>love&lt;/em> the idea of Rust. I love what it promises. And I kept going back to it on-and-off for over 6 months &lt;em>because I truly wanted to be able to be productive in it.&lt;/em>&lt;/strong>&lt;/p>
&lt;p>It didn&amp;rsquo;t work. &amp;ldquo;I&amp;rsquo;m just not smart enough to use this language&amp;rdquo; I often thought. And I fear this will be the takeaway of many who hear the promise of the language, only to discover another &amp;ldquo;I took a C++ class in school and it was terrible&amp;rdquo; experience, leading so many more developers to conclude &amp;ldquo;I&amp;rsquo;m not good enough for low-level programming, I should learn JavaScript instead&amp;rdquo;.&lt;/p>
&lt;h3 id="could-the-v-language-be-it">Could the V language be it?&lt;/h3>
&lt;p>&lt;strong>UPDATE:&lt;/strong> The V language author reached out, and it would seem my memory of what happened here was faulty. This was due to a misunderstanding almost 100% on my side, and I mischaracterized the V community here as being less friendly than they were in practice. I am deeply sorry for that.&lt;/p>
&lt;p>I believe my criticisms below about the controversy surrounding the project and the secretive nature &lt;em>when it launched&lt;/em> are still valid, and were ultimately major factors in why I chose to not further consider it.&lt;/p>
&lt;p>At the same time, &lt;strong>I want to point out that V does not look the same as when it launched - and anybody who like me left due to those issues may do well &lt;a href="https://vlang.io/">to reconsider it today&lt;/a> as the project and details surrounding it appear to have changed substantially.&lt;/strong>&lt;/p>
&lt;p>What this section originally said was:&lt;/p>
&lt;blockquote>
&lt;p>When I heard about &lt;a href="https://news.ycombinator.com/item?id=25511073">the V programming language&lt;/a>, it seemed right on the spot.&lt;/p>
&lt;p>I immediately jumped into the community to chat with the author, despite the controversy surrounding it - and tried to get more info about it, how he was thinking of the language, etc.&lt;/p>
&lt;p>I asked if there were plans to support raw multi-line string literals, like Go. I was struck by a firm &amp;ldquo;No. Go doesn&amp;rsquo;t have raw string literals either.&amp;rdquo; - it was the unfriendly community I came across, the controversy surrounding it, and the &lt;em>secretive nature of the project&lt;/em> (&amp;ldquo;I have this, but I&amp;rsquo;m not going to share it yet&amp;rdquo;) that made me lose faith in its promise.&lt;/p>
&lt;p>This wasn&amp;rsquo;t a language whose community I could join and contribute to.&lt;/p>
&lt;/blockquote>
&lt;h3 id="could-i-build-it">Could I build it?&lt;/h3>
&lt;p>When the COVID-19 pandemic first hit, I thought to myself:&lt;/p>
&lt;blockquote>
&lt;p>If Go isn&amp;rsquo;t it, Rust isn&amp;rsquo;t it, the V language isn&amp;rsquo;t it - could I build it? Could I create the &amp;ldquo;better C&amp;rdquo; I am looking for? What would it look like?&lt;/p>
&lt;/blockquote>
&lt;p>4 months later, I had a pretty good picture. I had an early stages compiler for the language in Go using LLVM, and knew what I wanted in a &amp;ldquo;better C&amp;rdquo;. There was a &lt;em>long&lt;/em> road ahead, but I had a picture of it. Until..&lt;/p>
&lt;blockquote>
&lt;p>&lt;em>cat spills coffee on $2800 laptop, frying SSD with ~4 months of uncommitted work on EBNF parser generators&lt;/em> yeah.. no, that&amp;rsquo;s.. that&amp;rsquo;s okay, I wanted to rewrite all of that code. Yeah. This is fine.&lt;/p>
&lt;p>— Emi (@emidoots), May 27, 2020&lt;/p>
&lt;/blockquote>
&lt;p>Obviously, I was an idiot and should&amp;rsquo;ve just &lt;code>git push&lt;/code>d my code - or backed up my laptop - but nonetheless this was a setback.&lt;/p>
&lt;h2 id="discovering-zig">Discovering Zig&lt;/h2>
&lt;p>I continued to look for this mythical &amp;ldquo;better C&amp;rdquo; - and one name that kept arising in my sphere was &lt;a href="https://ziglang.org">Zig&lt;/a>.&lt;/p>
&lt;p>I didn&amp;rsquo;t pay much attention to it, until I shared it with my brother for the 3rd time:&lt;/p>
&lt;img class="color" src="https://devlog.hexops.org/img/2021/ando-thanks.jpg">
&lt;blockquote>
&lt;p>&amp;ldquo;&amp;hellip;I already shared this with you?&amp;rdquo;&lt;/p>
&lt;p>&amp;ldquo;I am really excited about this. It&amp;rsquo;s literally the language I was trying to build before I think&amp;rdquo;&lt;/p>
&lt;/blockquote>
&lt;h3 id="learning-zig">Learning Zig&lt;/h3>
&lt;p>In trying to learn Zig, there were three things that struck me:&lt;/p>
&lt;ul>
&lt;li>I could be productive in Zig right away. Transitioning from Go at work to Zig after-hours every day &lt;em>was easy.&lt;/em>&lt;/li>
&lt;li>The community was so friendly, inviting, and helpful in answering my questions.&lt;/li>
&lt;li>I continuously saw a theme of &amp;ldquo;this is a decentralized community, there is no &amp;lsquo;official&amp;rsquo; thing we&amp;rsquo;ll ever push onto you, we want everyone to contribute and &lt;em>truly be a part of this&lt;/em>&amp;rdquo;&lt;/li>
&lt;/ul>
&lt;p>Zig became the first open-source project I had &lt;em>ever&lt;/em> contributed to financially.&lt;/p>
&lt;blockquote>
&lt;p>And if none of the above convinces you, let me tell you the following: @ziglang is the first language I have felt strongly I should try and contribute to, and the ONLY open source project I have ever donated to. No other has been so compelling&lt;/p>
&lt;p>— Emi (@emidoots), October 23, 2020&lt;/p>
&lt;/blockquote>
&lt;h3 id="working-in-it">Working in it&lt;/h3>
&lt;p>Thus far, I&amp;rsquo;ve worked on two things in Zig:&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://github.com/hexops/xorfilter">an implementation of Xor Filters and Fuse Filters, which are faster and smaller than Bloom and Cuckoo filters and allow for quickly checking if a key is part of a set.&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://devlog.hexops.org/2021/zig-parser-combinators-and-why-theyre-awesome">Zig, Parser Combinators - and Why They&amp;rsquo;re Awesome&lt;/a>&lt;/li>
&lt;/ul>
&lt;p>I continue to work in Zig daily, with no plans to stop - mark my words, this is an amazing language to work in.&lt;/p>
&lt;h3 id="the-community-is-incredible">The community is incredible&lt;/h3>
&lt;p>Over time, I watched and read more content from the Zig developers. It&amp;rsquo;s been beautiful to see:&lt;/p>
&lt;ul>
&lt;li>Them constantly, proactively advocate against zealotry of the language.&lt;/li>
&lt;li>Them constantly advocate for new members of the community to actually help others.&lt;/li>
&lt;/ul>
&lt;p>Not only that, but I began to notice the Zig foundation actually &lt;em>directly paying open source developers through donations&lt;/em>:&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://www.reddit.com/r/Zig/comments/fvfguq/please_welcome_vexu_to_the_core_zig_team/">Please welcome Vexu to the core Zig team&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://www.reddit.com/r/Zig/comments/j2u1ww/please_welcome_jakub_konka_to_the_core_zig_team/">Please welcome Jakub Konka to the Core Zig Team&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://www.reddit.com/r/Zig/comments/ixvjsf/please_welcome_alex_nask_to_the_core_zig_team/">Please welcome Alex Nask to the Core Zig Team&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://www.reddit.com/r/Zig/comments/mgluix/please_welcome_frank_denis_to_the_core_zig_team/">Please welcome Frank Denis to the Core Zig Team&lt;/a>&lt;/li>
&lt;/ul>
&lt;blockquote>
&lt;p>ZSF is a small organization and makes efficient use of monetary resources. The plan is to keep it that way, but we do want to turn our unpaid volunteers into paid maintainers to help merge pull requests and make swifter progress towards 1.0. The whole point of ZSF being non-profit is to benefit people. &lt;strong>We’re trying to get open source maintainers paid for their time.&lt;/strong>&lt;/p>
&lt;/blockquote>
&lt;p>(from &lt;a href="https://ziglang.org/zsf">https://ziglang.org/zsf&lt;/a>)&lt;/p>
&lt;p>This is such a beautiful thing to see happening, and I hope that other open source communities take lessons from Zig here. Execution is so important, and so far the Zig community&amp;rsquo;s execution has been incredible.&lt;/p>
&lt;h3 id="my-commitment-to-zig">My commitment to Zig&lt;/h3>
&lt;p>For me, Zig ticks all the boxes of a programming language that could fundamentally upend the way that video games are built for the better.&lt;/p>
&lt;p>I want to see it succeed - and make it succeed at exactly that. Today, I raise my monthly contribution &lt;a href="https://github.com/sponsors/ziglang">on GitHub sponsors&lt;/a> to $200/mo. I would encourage anyone reading this to go and find ways to contribute (financially or not) to a vision you believe in.&lt;/p>
&lt;p>In addition to the above, I am committed to building the following in Zig:&lt;/p>
&lt;ul>
&lt;li>A game engine for the future&lt;/li>
&lt;li>Better developer tools (not just for game developers)&lt;/li>
&lt;li>Several real video games, which I believe can be competitive with what AAA studios offer today.&lt;/li>
&lt;/ul>
&lt;p>Thanks for reading my journey, and I hope you&amp;rsquo;ll consider following it in the future.&lt;/p></description></item><item><title>Zig, Parser Combinators - and Why They're Awesome</title><link>https://devlog.hexops.org/2021/zig-parser-combinators-and-why-theyre-awesome/</link><pubDate>Wed, 10 Mar 2021 00:00:00 +0000</pubDate><guid>https://devlog.hexops.com/2021/zig-parser-combinators-and-why-theyre-awesome/</guid><description>&lt;p>In this article we will be exploring what &lt;a href="https://en.wikipedia.org/wiki/Parser_combinator">parser combinators&lt;/a> are, what &lt;em>runtime parser generation&lt;/em> is - why they&amp;rsquo;re useful, and then walking through a &lt;a href="https://ziglang.org">Zig&lt;/a> implementation of them.&lt;/p>
&lt;ul>
&lt;li>&lt;a href="#what-are-parser-combinators">What are parser combinators?&lt;/a>&lt;/li>
&lt;li>&lt;a href="#why-are-parser-combinators-useful">Why are parser combinators useful?&lt;/a>&lt;/li>
&lt;li>&lt;a href="#going-deeper-runtime-parser-generation">Going deeper: &lt;em>runtime parser generation&lt;/em>&lt;/a>&lt;/li>
&lt;li>&lt;a href="#a-note-about-traditional-regex-engines">A note about traditional regex engines&lt;/a>&lt;/li>
&lt;li>&lt;a href="#implementing-the-parser-interface">Implementing the Parser interface&lt;/a>
&lt;ul>
&lt;li>&lt;a href="#compile-time-vs-run-time">Compile-time vs. run-time&lt;/a>&lt;/li>
&lt;li>&lt;a href="#the-parser-interface">The parser interface&lt;/a>&lt;/li>
&lt;li>&lt;a href="#zig-generics-are-provided-via-type-parameters">Zig generics are provided via type parameters&lt;/a>&lt;/li>
&lt;li>&lt;a href="#zig-runtime-interfaces">Zig runtime interfaces&lt;/a>&lt;/li>
&lt;li>&lt;a href="#type-parameters">Type parameters&lt;/a>&lt;/li>
&lt;li>&lt;a href="#errors-the-parser-interface-can-produce">Errors the Parser interface can produce&lt;/a>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;a href="#our-first-parser">Our first Parser&lt;/a>
&lt;ul>
&lt;li>&lt;a href="#what-actually-is-a-reader">What actually is a &amp;ldquo;Reader&amp;rdquo;?&lt;/a>&lt;/li>
&lt;li>&lt;a href="#a-parser-that-parses-a-literal-string">A Parser that parses a literal string&lt;/a>&lt;/li>
&lt;li>&lt;a href="#passing-parameters-to-a-parser-implementation">Passing parameters to a parser implementation&lt;/a>&lt;/li>
&lt;li>&lt;a href="#understanding-zigs-wildconfusing-fieldparentptr">Understanding Zig&amp;rsquo;s wild/confusing &lt;code>@fieldParentPtr&lt;/code>&lt;/a>&lt;/li>
&lt;li>&lt;a href="#implementing-the-rest-of-parse">Implementing the rest of &lt;code>parse&lt;/code>&lt;/a>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;a href="#our-first-parser-combinator">Our first &lt;em>parser combinator&lt;/em>&lt;/a>&lt;/li>
&lt;li>&lt;a href="#using-our-oneof-parser-combinator">Using our OneOf parser combinator&lt;/a>&lt;/li>
&lt;li>&lt;a href="#runtime-parser-generation">Runtime parser generation&lt;/a>&lt;/li>
&lt;li>&lt;a href="#closing-thoughts">Closing thoughts&lt;/a>&lt;/li>
&lt;/ul>
&lt;h2 id="what-are-parser-combinators">What are parser combinators?&lt;/h2>
&lt;p>A parser parses some text to produce a result:&lt;/p>
&lt;img class="color-auto" src="https://devlog.hexops.org/img/2021/zig-parser-combinators-and-why-theyre-awesome/img1.png">
&lt;p>A &lt;a href="https://en.wikipedia.org/wiki/Parser_combinator">parser combinator&lt;/a> is a &lt;a href="https://en.wikipedia.org/wiki/Higher-order_function">higher-order function&lt;/a> which &lt;em>takes parsers as input&lt;/em> and &lt;em>produces a new parser&lt;/em> as output:&lt;/p>
&lt;img class="color-auto" src="https://devlog.hexops.org/img/2021/zig-parser-combinators-and-why-theyre-awesome/img2.png">
&lt;h2 id="why-are-parser-combinators-useful">Why are parser combinators useful?&lt;/h2>
&lt;p>Let&amp;rsquo;s say we want to parse the syntax which describes a regular expression: &lt;code>a[bc].*abc&lt;/code>&lt;/p>
&lt;p>We can define some &lt;em>parsers&lt;/em> to help us parse this syntax (e.g. into tokens or AST nodes):&lt;/p>
&lt;img class="color-auto" src="https://devlog.hexops.org/img/2021/zig-parser-combinators-and-why-theyre-awesome/img3.png">
&lt;p>Suppose that for &lt;code>a[bc].*abc&lt;/code>:&lt;/p>
&lt;ul>
&lt;li>&lt;code>RegexLiteralParser&lt;/code> can parse &lt;code>a&lt;/code>, &lt;code>b&lt;/code>, and &lt;code>c&lt;/code>, but not &lt;code>abc&lt;/code> (the string.)&lt;/li>
&lt;li>&lt;code>RegexRangeOpenParser&lt;/code> can parse &lt;code>[&lt;/code>.&lt;/li>
&lt;li>&lt;code>RegexRangeCloseParser&lt;/code> can parse &lt;code>]&lt;/code>&lt;/li>
&lt;li>&lt;code>RegexAnyParser&lt;/code> can parse the &lt;code>.&lt;/code> &amp;ldquo;any character&amp;rdquo; syntax.&lt;/li>
&lt;li>&lt;code>RegexRepetitionParser&lt;/code> can parse the &lt;code>*&lt;/code> repetition operator.&lt;/li>
&lt;/ul>
&lt;p>Now that we have these &lt;em>parsers&lt;/em>, we can define &lt;em>parser combinators&lt;/em> to help us parse the full regular expression. First, we need something to parse a string &lt;code>abc&lt;/code> which we can define as:&lt;/p>
&lt;img class="color-auto" src="https://devlog.hexops.org/img/2021/zig-parser-combinators-and-why-theyre-awesome/img4.png">
&lt;p>What is &lt;code>OneOrMore&lt;/code>, though? That&amp;rsquo;s our first parser combinator!&lt;/p>
&lt;p>It takes a single parser as input (in this case, &lt;code>RegexLiteralParser&lt;/code>) and uses it to parse the input one or more times. If it succeeded once, the parser combinator succeeded. Otherwise, it failed to parse anything.&lt;/p>
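&lt;p>As a rough sketch of the idea (not the implementation this article builds later), here is how &lt;code>OneOrMore&lt;/code> might look in Zig if a parser were simply a function that returns how many bytes it consumed, or &lt;code>null&lt;/code> on failure. That &lt;code>ParseFn&lt;/code> signature and the lowercase-letter rule in &lt;code>regexLiteral&lt;/code> are assumptions made for illustration, and the code assumes a recent Zig release:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import("std");

// Hypothetical parser shape for this sketch: given the remaining input,
// return the number of bytes consumed, or null if nothing could be parsed.
const ParseFn = fn ([]const u8) ?usize;

// Stand-in for RegexLiteralParser: accepts one lowercase ASCII letter.
fn regexLiteral(input: []const u8) ?usize {
    if (input.len == 0) return null;
    switch (input[0]) {
        'a'...'z' => return 1,
        else => return null,
    }
}

// OneOrMore takes a parser and produces a new parser which applies it
// repeatedly, succeeding if it matched at least once.
fn OneOrMore(comptime parser: ParseFn) ParseFn {
    return struct {
        fn parse(input: []const u8) ?usize {
            var consumed: usize = 0;
            while (parser(input[consumed..])) |n| {
                consumed += n;
            }
            return if (consumed == 0) null else consumed;
        }
    }.parse;
}

// Stand-in for RegexStringLiteralParser.
const regexStringLiteral = OneOrMore(regexLiteral);

test "OneOrMore parses a run of literals" {
    try std.testing.expect(regexStringLiteral("abc").? == 3);
    try std.testing.expect(regexStringLiteral("[bc]") == null);
}
&lt;/code>&lt;/pre>&lt;/div>
&lt;p>Passing the parser as a &lt;code>comptime&lt;/code> parameter keeps this sketch short, but it means the combination is fixed at compile time, which is a different trade-off from the runtime interface explored later in this article.&lt;/p>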
&lt;p>Now if we want to parse the &lt;code>[bc]&lt;/code> part of our regex, let&amp;rsquo;s say it can only contain a literal like &lt;code>bc&lt;/code> (of course, real regex allows far more than this) we can e.g. reuse our new &lt;code>RegexStringLiteralParser&lt;/code>:&lt;/p>
&lt;img class="color-auto" src="https://devlog.hexops.org/img/2021/zig-parser-combinators-and-why-theyre-awesome/img5.png">
&lt;p>In this case, &lt;code>Sequence&lt;/code> is a parser combinator which takes multiple parsers and tries to parse them one-after-the-other in order, requiring all to succeed or failing otherwise.&lt;/p>
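&lt;p>To make that concrete, here is a minimal sketch of &lt;code>Sequence&lt;/code> under the same simplifying assumption that a parser is a plain function returning the number of bytes consumed or &lt;code>null&lt;/code>. The helpers &lt;code>Char&lt;/code> and &lt;code>lowercaseRun&lt;/code> are hypothetical stand-ins for the range-open, range-close, and string-literal parsers, and the code assumes a recent Zig release:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import("std");

// Hypothetical parser shape for this sketch: bytes consumed, or null.
const ParseFn = fn ([]const u8) ?usize;

// Hypothetical helper: a parser that accepts exactly one given byte.
fn Char(comptime c: u8) ParseFn {
    return struct {
        fn parse(input: []const u8) ?usize {
            if (input.len == 0 or input[0] != c) return null;
            return 1;
        }
    }.parse;
}

// Hypothetical helper: accepts one or more lowercase ASCII letters.
fn lowercaseRun(input: []const u8) ?usize {
    var n: usize = 0;
    for (input) |byte| {
        switch (byte) {
            'a'...'z' => n += 1,
            else => break,
        }
    }
    return if (n == 0) null else n;
}

// Sequence takes a tuple of parsers and produces a parser which runs them
// one after the other, failing as soon as any one of them fails.
fn Sequence(comptime parsers: anytype) ParseFn {
    return struct {
        fn parse(input: []const u8) ?usize {
            var consumed: usize = 0;
            inline for (parsers) |p| {
                const n = p(input[consumed..]) orelse return null;
                consumed += n;
            }
            return consumed;
        }
    }.parse;
}

// Parses the bracketed part of our example regex, e.g. [bc]
const regexRange = Sequence(.{ Char('['), lowercaseRun, Char(']') });

test "Sequence requires every parser to succeed" {
    try std.testing.expect(regexRange("[bc].*abc").? == 4);
    try std.testing.expect(regexRange("[bc") == null);
}
&lt;/code>&lt;/pre>&lt;/div>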
&lt;p>Building upon this basic idea, we can use parser combinators to build a full regex syntax parser:&lt;/p>
&lt;img class="color-auto" src="https://devlog.hexops.org/img/2021/zig-parser-combinators-and-why-theyre-awesome/img6.png">
&lt;h2 id="going-deeper-_runtime-parser-generation_">Going deeper: &lt;em>runtime parser generation&lt;/em>&lt;/h2>
&lt;p>From before, our &lt;em>parser combinator&lt;/em> &lt;code>RegexSyntaxParser&lt;/code> is built out of multiple parsers (&lt;code>Regex...Parser&lt;/code>) and ultimately produces an AST describing the syntax for a given regex.&lt;/p>
&lt;p>We can use the same combinatorial principle here to introduce a new &lt;em>parser generator&lt;/em> called &lt;code>RegexParser&lt;/code> which uses &lt;code>RegexSyntaxParser&lt;/code> to create a &lt;em>brand new parser at runtime&lt;/em> that is capable of parsing the actual semantics the regex describes - forming a full regex engine:&lt;/p>
&lt;img class="color-auto" src="https://devlog.hexops.org/img/2021/zig-parser-combinators-and-why-theyre-awesome/img7.png">
&lt;h2 id="a-note-about-traditional-regex-engines">A note about traditional regex engines&lt;/h2>
&lt;p>&lt;small>&lt;em>Revised Mar 10, 2021&lt;/em> to clarify a misunderstanding I had about the difference between DFA and NFA regex engines. Thanks &lt;a href="https://news.ycombinator.com/item?id=26419048">@burntsushi&lt;/a> for helping me to learn!&lt;/small>&lt;/p>
&lt;p>Production grade regex engines are either &lt;em>finite automata based&lt;/em> or &lt;em>backtracking based&lt;/em>, and are described in great detail in &lt;a href="https://swtch.com/~rsc/regexp/regexp1.html">Russ Cox&amp;rsquo;s article here&lt;/a> and &lt;a href="https://swtch.com/~rsc/regexp/regexp2.html">his second article here&lt;/a> covering the virtual-machine approach commonly used in regex engines.&lt;/p>
&lt;p>It&amp;rsquo;s worth noting that combinatorial parsing and generating parsers at runtime is very much an &lt;em>uncommon&lt;/em> method of implementing a regular expression engine. This is &lt;em>somewhat&lt;/em> close to what &lt;a href="https://comby.dev">Comby&lt;/a> does in practice, although we use a runtime parser generator instead of parser combinators.&lt;/p>
&lt;p>One could argue this makes what we&amp;rsquo;re parsing not strictly &lt;em>regular expressions&lt;/em>, although as Larry Wall (author of the Perl programming language) &lt;a href="https://raku.org/archive/doc/design/apo/A05.html">writes&lt;/a>, neither are the modern &amp;ldquo;regexp&amp;rdquo; pattern matchers you are likely used to:&lt;/p>
&lt;blockquote>
&lt;p>&amp;ldquo;Regular expressions&amp;rdquo; […] are only marginally related to real regular expressions. Nevertheless, the term has grown with the capabilities of our pattern matching engines, so I&amp;rsquo;m not going to try to fight linguistic necessity here. I will, however, generally call them &amp;ldquo;regexes&amp;rdquo; (or &amp;ldquo;regexen&amp;rdquo;, when I&amp;rsquo;m in an Anglo-Saxon mood).&lt;/p>
&lt;/blockquote>
&lt;h2 id="implementing-the-parser-interface">Implementing the Parser interface&lt;/h2>
&lt;p>Parser combinators &lt;em>tend&lt;/em> to be written in higher-level languages with much fancier type-systems such as Haskell and OCaml, which lend themselves well to higher-order functions like parser combinators.&lt;/p>
&lt;p>We&amp;rsquo;ll be implementing this in &lt;a href="https://ziglang.org">Zig&lt;/a>, which is a new low-level language aiming to be a better C.&lt;/p>
&lt;h3 id="compile-time-vs-run-time">Compile-time vs. run-time&lt;/h3>
&lt;p>Zig has very cool &lt;a href="https://ziglang.org/documentation/master/#comptime">compile-time code execution semantics&lt;/a> which help provide its generics. We&amp;rsquo;ll be exploring these a bit, but since we want to ultimately &lt;em>build parser generators at runtime&lt;/em> (in order to execute a regexp) what we&amp;rsquo;ll be looking at is mostly &lt;em>runtime parser interfaces&lt;/em> rather than &lt;em>compile-time parser interfaces&lt;/em> (which are very much possible!)&lt;/p>
&lt;p>Since we&amp;rsquo;ll be dealing with heap allocations, our parser will not be able to run at comptime for now. Once &lt;a href="https://github.com/ziglang/zig/issues/1291">Zig gets comptime heap allocations&lt;/a> this should be possible and opens up interesting new opportunities.&lt;/p>
&lt;h3 id="the-parser-interface">The parser interface&lt;/h3>
&lt;p>We need an interface in Zig which describes a &lt;em>parser&lt;/em> as we previously mentioned:&lt;/p>
&lt;img class="color-auto" src="https://devlog.hexops.org/img/2021/zig-parser-combinators-and-why-theyre-awesome/img1.png">
&lt;p>Here it is - there&amp;rsquo;s a lot to unpack here so we&amp;rsquo;ll walk through it step-by-step:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Parser&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kr">comptime&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Value&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">comptime&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Reader&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Self&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">@This&lt;/span>&lt;span class="p">();&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">_parse&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">self&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Self&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">src&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Reader&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">callconv&lt;/span>&lt;span class="p">(.&lt;/span>&lt;span class="n">Inline&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Error&lt;/span>&lt;span class="o">!?&lt;/span>&lt;span class="n">Value&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">parse&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">self&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Self&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">src&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Reader&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">callconv&lt;/span>&lt;span class="p">(.&lt;/span>&lt;span class="n">Inline&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Error&lt;/span>&lt;span class="o">!?&lt;/span>&lt;span class="n">Value&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">self&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">_parse&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">self&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">src&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="zig-generics-are-provided-via-type-parameters">Zig generics are provided via type parameters&lt;/h3>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Parser&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kr">comptime&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Value&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">comptime&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Reader&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This is a Zig function which takes two arbitrary &lt;code>type&lt;/code> arguments at &lt;code>comptime&lt;/code>, named &lt;code>Value&lt;/code> and &lt;code>Reader&lt;/code>. Uppercase is used to denote the name of a type in Zig. These are:&lt;/p>
&lt;ul>
&lt;li>&lt;code>Value&lt;/code> will be the type of the actual value that the parser will produce (e.g. a string of matched text, or an AST node.)&lt;/li>
&lt;li>&lt;code>Reader&lt;/code> will be the type of the actual source of the raw text to parse (we&amp;rsquo;ll cover this more later.)&lt;/li>
&lt;/ul>
&lt;p>The function itself &lt;em>returns a new type&lt;/em>.&lt;/p>
&lt;p>What we&amp;rsquo;re seeing here is the key way in which &lt;a href="https://ziglang.org/documentation/master/#Generic-Data-Structures">Zig approaches generic data structures&lt;/a>: you merely pass around types as parameters - as if they were values - and you write functions which take types as parameters and return types as values. Some examples of valid calls to this function are:&lt;/p>
&lt;ul>
&lt;li>&lt;code>Parser(u8, []u8)&lt;/code> where &lt;code>u8&lt;/code> is an unsigned 8-bit integer and &lt;code>[]u8&lt;/code> is a slice of unsigned 8-bit integers.&lt;/li>
&lt;li>&lt;code>Parser([]const u8, @TypeOf(reader))&lt;/code> where &lt;code>[]const u8&lt;/code> describes a slice of UTF-8 text (a string) and &lt;code>reader&lt;/code> is some reader value, such as one backed by &lt;code>std.io.fixedBufferStream(&amp;quot;foobar&amp;quot;)&lt;/code>.&lt;/li>
&lt;/ul>
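&lt;p>As a minimal, self-contained sketch of the same idea, here is a function that takes a type and returns a new struct type. The &lt;code>Pair&lt;/code> name and its fields are hypothetical and are not part of the parser in this article:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import(&amp;#34;std&amp;#34;);

// Hypothetical example type: `Pair` is not part of the parser in this article.
// A function that takes a type at comptime and returns a new struct type.
pub fn Pair(comptime T: type) type {
    return struct {
        first: T,
        second: T,
    };
}

test &amp;#34;instantiate a generic type&amp;#34; {
    // Calling Pair(u8) at compile time yields a concrete struct type.
    const PairOfBytes = Pair(u8);
    const p = PairOfBytes{ .first = 1, .second = 2 };
    std.debug.assert(p.first == 1);
    std.debug.assert(p.second == 2);
    // Different type arguments produce different types.
    std.debug.assert(Pair(u8) != Pair(u16));
}
&lt;/code>&lt;/pre>&lt;/div>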
&lt;h3 id="zig-runtime-interfaces">Zig runtime interfaces&lt;/h3>
&lt;p>Now, since we&amp;rsquo;re trying to define an interface whose actual implementation can be swapped out &lt;em>at runtime&lt;/em> - what we need is pretty simple:&lt;/p>
&lt;ul>
&lt;li>A &lt;code>struct&lt;/code> type which has the methods we want every implementation to provide.&lt;/li>
&lt;li>Those methods to &lt;em>call function pointers&lt;/em> which are defined as &lt;em>fields&lt;/em> of our struct.&lt;/li>
&lt;/ul>
&lt;p>Basically, if someone wants to implement our interface they just need to create a new instance of &lt;code>Parser&lt;/code> and populate the fields (callbacks) so their implementation is called when the interface is used.&lt;/p>
&lt;p>This is the same pattern used by the Zig &lt;a href="https://sourcegraph.com/github.com/ziglang/zig/-/blob/lib/std/mem/Allocator.zig">&lt;code>std.mem.Allocator&lt;/code> interface&lt;/a>.&lt;/p>
&lt;p>In our case here, the returned struct has a method that consumers of the interface would invoke called &lt;code>parse&lt;/code> - and the function pointer field that implementors will set to get a callback is the &lt;code>_parse&lt;/code> field:&lt;/p>
&lt;img class="color-auto" src="https://devlog.hexops.org/img/2021/zig-parser-combinators-and-why-theyre-awesome/img8.png">
&lt;h3 id="type-parameters">Type parameters&lt;/h3>
&lt;p>Let&amp;rsquo;s look at some of the data types going around here:&lt;/p>
&lt;img class="color-auto" src="https://devlog.hexops.org/img/2021/zig-parser-combinators-and-why-theyre-awesome/img9.png">
&lt;p>A few other notes:&lt;/p>
&lt;ul>
&lt;li>&lt;code>Error!?Value&lt;/code> just says that the function can return an &lt;code>Error&lt;/code>, no value (&lt;code>null&lt;/code>), or a &lt;code>Value&lt;/code>. See Zig&amp;rsquo;s &lt;a href="https://ziglang.org/documentation/master/#Error-Union-Type">error union types&lt;/a> and &lt;a href="https://ziglang.org/documentation/master/#Optionals">optional types&lt;/a>.&lt;/li>
&lt;li>&lt;code>callconv(.Inline)&lt;/code> is just telling the compiler to inline the function call - since our function isn&amp;rsquo;t doing a ton.&lt;/li>
&lt;/ul>
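&lt;p>To make the three possible outcomes of an &lt;code>Error!?Value&lt;/code> return type concrete, here is a small sketch. The &lt;code>firstDigit&lt;/code> helper and this one-member &lt;code>Error&lt;/code> set are made up for illustration and are not part of the parser:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import(&amp;#34;std&amp;#34;);

const Error = error{EndOfStream};

// Hypothetical helper for illustration only: returns an error on empty input,
// null when the first byte is not a digit, and the digit byte otherwise.
fn firstDigit(input: []const u8) Error!?u8 {
    if (input.len == 0) return error.EndOfStream;
    return switch (input[0]) {
        &amp;#39;0&amp;#39;...&amp;#39;9&amp;#39; => input[0],
        else => null,
    };
}

test &amp;#34;error, null, or value&amp;#34; {
    // `try` unwraps the error union; what remains is the optional `?u8`.
    const digit = try firstDigit(&amp;#34;7up&amp;#34;);
    std.debug.assert(digit.? == &amp;#39;7&amp;#39;);

    const none = try firstDigit(&amp;#34;abc&amp;#34;);
    std.debug.assert(none == null);

    // An error can be inspected with `if ... else |err|`.
    if (firstDigit(&amp;#34;&amp;#34;)) |_| {
        unreachable;
    } else |err| {
        std.debug.assert(err == error.EndOfStream);
    }
}
&lt;/code>&lt;/pre>&lt;/div>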
&lt;h3 id="errors-the-parser-interface-can-produce">Errors the Parser interface can produce&lt;/h3>
&lt;p>Our error type might start out looking something like this:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Error&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">error&lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">EndOfStream&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">||&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">mem&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">Allocator&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">Error&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;code>error{...}&lt;/code> describes &lt;a href="https://ziglang.org/documentation/master/#Error-Set-Type">a set of potential errors&lt;/a> and &lt;code>|| std.mem.Allocator.Error&lt;/code> merely says to &lt;em>merge&lt;/em> the allocator type&amp;rsquo;s error set with ours - so our potential set of errors includes &lt;em>ours and theirs&lt;/em>.&lt;/p>
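&lt;p>A small sketch of what merging means in practice: a function declared with the merged set may return errors from either side. The &lt;code>ParseError&lt;/code> name and the &lt;code>fail&lt;/code> function are hypothetical; &lt;code>std.mem.Allocator.Error&lt;/code> is the standard library error set used above:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import(&amp;#34;std&amp;#34;);

// Hypothetical names for illustration only.
const ParseError = error{EndOfStream};
const Error = ParseError || std.mem.Allocator.Error;

// May return an error from either of the two merged sets.
fn fail(out_of_memory: bool) Error!void {
    if (out_of_memory) return error.OutOfMemory;
    return error.EndOfStream;
}

test &amp;#34;merged error set contains both&amp;#34; {
    if (fail(true)) |_| {
        unreachable;
    } else |err| {
        std.debug.assert(err == error.OutOfMemory);
    }
    if (fail(false)) |_| {
        unreachable;
    } else |err| {
        std.debug.assert(err == error.EndOfStream);
    }
}
&lt;/code>&lt;/pre>&lt;/div>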
&lt;p>As we start performing different operations within parsers, the error set will grow to describe more potential sources of errors:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Error&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">error&lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">EndOfStream&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">Utf8InvalidStartByte&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">||&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">fs&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">File&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">ReadError&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="o">||&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">fs&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">File&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">SeekError&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="o">||&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">mem&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">Allocator&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">Error&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Zig can often &lt;a href="https://ziglang.org/documentation/master/#Inferred-Error-Sets">infer error sets&lt;/a> but only in some contexts today.&lt;/p>
&lt;h2 id="our-first-parser">Our first Parser&lt;/h2>
&lt;p>All we need to do in order to implement a &lt;code>Parser&lt;/code> is provide the &lt;code>_parse&lt;/code> method, and define its return &lt;code>Value&lt;/code> type and &lt;code>Reader&lt;/code> input type:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">parser&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Parser&lt;/span>&lt;span class="p">([]&lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">@TypeOf&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">reader&lt;/span>&lt;span class="p">))&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">_parse&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">myParse&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>In the above, the type &lt;code>T&lt;/code> in &lt;code>const parser: T&lt;/code> is denoting the type of the constant named &lt;code>parser&lt;/code> - in this case it&amp;rsquo;ll be the type returned by &lt;code>Parser([]u8, @TypeOf(reader))&lt;/code>. And this:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="n">something&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">_parse&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">myParse&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>is the Zig syntax for populating a struct. We&amp;rsquo;re setting the &lt;code>_parse&lt;/code> field to &lt;code>myParse&lt;/code>. Zig can infer the type of the struct if you write a &lt;code>.{}&lt;/code> instead of &lt;code>T{}&lt;/code> - which saves us from repeating the verbose call to the &lt;code>Parser()&lt;/code> function.&lt;/p>
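&lt;p>For comparison, here are the two spellings side by side on an unrelated struct. &lt;code>Options&lt;/code> and its fields are hypothetical and exist only for this sketch:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import(&amp;#34;std&amp;#34;);

// Hypothetical struct for illustration only.
const Options = struct {
    verbose: bool = false,
    depth: u32,
};

test &amp;#34;anonymous struct literal takes its type from the destination&amp;#34; {
    // Explicit type on the literal.
    const a = Options{ .depth = 3 };
    // Type inferred from the declared type of the constant.
    const b: Options = .{ .depth = 3 };
    std.debug.assert(a.depth == b.depth);
    std.debug.assert(a.verbose == b.verbose);
}
&lt;/code>&lt;/pre>&lt;/div>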
&lt;h3 id="what-actually-is-a-reader">What actually is a &amp;ldquo;Reader&amp;rdquo;?&lt;/h3>
&lt;p>Up to this point, we&amp;rsquo;ve just talked about &lt;code>Reader&lt;/code> as being &lt;em>any type&lt;/em>.&lt;/p>
&lt;p>Similar to our &lt;code>Parser&lt;/code> interface, the Zig standard library &lt;a href="https://sourcegraph.com/github.com/ziglang/zig@f2b96782ecdc9e2f8740eb7d294203b2a585ea52/-/blob/lib/std/io/reader.zig#L13-20">provides a &lt;code>std.io.Reader&lt;/code> interface&lt;/a> and there are &lt;a href="https://sourcegraph.com/search?q=repo:%5Egithub%5C.com/ziglang/zig%24+file:%5Elib/std/+fn+reader%28&amp;amp;patternType=literal">many implementors of it&lt;/a> including:&lt;/p>
&lt;ul>
&lt;li>&lt;code>std.fs.File&lt;/code>&lt;/li>
&lt;li>&lt;code>std.io.fixedBufferStream(&amp;quot;foobar&amp;quot;)&lt;/code>&lt;/li>
&lt;li>&lt;code>std.net.Stream&lt;/code> (network sockets)&lt;/li>
&lt;/ul>
&lt;p>However, in contrast to our &lt;code>Parser&lt;/code> type which invokes &lt;em>function pointers&lt;/em> at runtime, the &lt;code>std.io.Reader&lt;/code> interface is a &lt;em>compile time type&lt;/em> - meaning calls to the underlying implementation do not involve a pointer dereference.&lt;/p>
&lt;p>Today, Zig is in early stages (version 0.7) and does not have anything like an interface or trait type (although &lt;a href="https://github.com/ziglang/zig/issues/1268">it seems likely this will be improved in the future&lt;/a>.)&lt;/p>
&lt;p>This means that, for now, we cannot simply define our function as accepting &lt;em>only&lt;/em> an &lt;code>std.io.Reader&lt;/code> interface - instead we must declare that we accept &lt;em>any type&lt;/em> which we&amp;rsquo;ll call &lt;code>Reader&lt;/code>, write our code &lt;em>as if it is an &lt;code>std.io.Reader&lt;/code>&lt;/em> - and the compiler will just barf if anybody passes something in that &lt;em>isn&amp;rsquo;t&lt;/em> an &lt;code>std.io.Reader&lt;/code>. This can sometimes lead to confusing compiler error messages (&amp;ldquo;there&amp;rsquo;s an error in the standard library code? Ah, no, I just needed to pass a &lt;code>.reader()&lt;/code>!&amp;rdquo;).&lt;/p>
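&lt;p>A rough sketch of this &amp;ldquo;accept any type and use it as if it were a reader&amp;rdquo; approach, using a made-up stand-in instead of a real &lt;code>std.io.Reader&lt;/code>. The &lt;code>OneByte&lt;/code> type and the &lt;code>firstByte&lt;/code> function are hypothetical; the only assumption the function makes is that its argument has a &lt;code>readByte&lt;/code> method:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import(&amp;#34;std&amp;#34;);

// Hypothetical stand-in for a reader: any type with a `readByte` method works.
const OneByte = struct {
    byte: u8,

    pub fn readByte(self: OneByte) error{EndOfStream}!u8 {
        return self.byte;
    }
};

// `Reader` may be any type. The body is written as if it has `readByte`;
// instantiating this function with a type that lacks it fails to compile.
fn firstByte(comptime Reader: type, src: Reader) !u8 {
    return src.readByte();
}

test &amp;#34;duck typing at compile time&amp;#34; {
    const src = OneByte{ .byte = &amp;#39;z&amp;#39; };
    const b = try firstByte(@TypeOf(src), src);
    std.debug.assert(b == &amp;#39;z&amp;#39;);
    // firstByte(u8, 1) would be rejected by the compiler: u8 has no readByte.
}
&lt;/code>&lt;/pre>&lt;/div>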
&lt;h3 id="a-parser-that-parses-a-literal-string">A Parser that parses a literal string&lt;/h3>
&lt;p>If we want a &lt;code>Parser&lt;/code> interface implementation that parses a specific string literal, one way to do that is to also make that a generic function which accepts &lt;em>any&lt;/em> reader type (so we&amp;rsquo;re not restricted to e.g. just file inputs):&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Literal&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kr">comptime&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Reader&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// TODO
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This is pretty good - but we need some way to have the type we return &lt;em>implement&lt;/em> the &lt;code>Parser&lt;/code> interface we defined. The way to do this is by defining a field in our struct:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Literal&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kr">comptime&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Reader&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">parser&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Parser&lt;/span>&lt;span class="p">([]&lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Reader&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">_parse&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">parse&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">},&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Now a consumer can write the following to get a literal string parser:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">parser&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Literal&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nb">@TypeOf&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">reader&lt;/span>&lt;span class="p">)).&lt;/span>&lt;span class="n">parser&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="passing-parameters-to-a-parser-implementation">Passing parameters to a parser implementation&lt;/h3>
&lt;p>If we want our &lt;code>Literal&lt;/code> parser to know which literal string to look for, we need some way to pass that string in.&lt;/p>
&lt;p>In the case of merely passing it a string, we &lt;em>could&lt;/em> adjust the signature so that this is possible:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">parser&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Literal&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s">&amp;#34;some string&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">@TypeOf&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">reader&lt;/span>&lt;span class="p">)).&lt;/span>&lt;span class="n">parser&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>However, we&amp;rsquo;ll define ours using an &lt;code>init&lt;/code> method which is more common in Zig data structures:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Literal&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kr">comptime&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Reader&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">parser&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Parser&lt;/span>&lt;span class="p">([]&lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Reader&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">_parse&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">parse&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">},&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">want&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">[]&lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Self&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">@This&lt;/span>&lt;span class="p">();&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// The `want` string must stay alive for as long as the parser will be used.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">init&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">want&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">[]&lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Self&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Self&lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">want&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">want&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>In this case, &lt;code>want&lt;/code> is the string literal we want to match - and &lt;code>[]const u8&lt;/code> is Zig&amp;rsquo;s string type. It describes a slice of immutable (non-modifiable) encoded UTF-8 bytes.&lt;/p>
&lt;p>Unlike a C string, a &lt;code>[]const u8&lt;/code> slice is &lt;em>a pointer to the string in memory and its length&lt;/em> - so we don&amp;rsquo;t have to pass around the length parameter separately or use a null-terminated string. In Zig, the two most common ways to represent a string are:&lt;/p>
&lt;ul>
&lt;li>&lt;code>[]const u8&lt;/code> (unmodifiable string, most common)&lt;/li>
&lt;li>&lt;code>[]u8&lt;/code> (modifiable string)&lt;/li>
&lt;/ul>
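&lt;p>A short sketch of both slice types in use. The variable names are arbitrary and nothing here is specific to the parser:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">const std = @import(&amp;#34;std&amp;#34;);

test &amp;#34;a slice carries a pointer and a length&amp;#34; {
    // String literals coerce to `[]const u8`; the length travels with the slice.
    const text: []const u8 = &amp;#34;foobar&amp;#34;;
    std.debug.assert(text.len == 6);
    std.debug.assert(text[0] == &amp;#39;f&amp;#39;);
    // Slicing a slice yields another slice over the same memory.
    std.debug.assert(std.mem.eql(u8, text[3..], &amp;#34;bar&amp;#34;));

    // Slicing a mutable array gives `[]u8`, whose bytes can be modified.
    var buf = [_]u8{ &amp;#39;f&amp;#39;, &amp;#39;o&amp;#39;, &amp;#39;o&amp;#39; };
    const editable: []u8 = buf[0..];
    editable[0] = &amp;#39;b&amp;#39;;
    std.debug.assert(std.mem.eql(u8, editable, &amp;#34;boo&amp;#34;));
}
&lt;/code>&lt;/pre>&lt;/div>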
&lt;h3 id="understanding-zigs-wildconfusing-fieldparentptr">Understanding Zig&amp;rsquo;s wild/confusing &lt;code>@fieldParentPtr&lt;/code>&lt;/h3>
&lt;p>We&amp;rsquo;re finally ready to actually have our &lt;code>Literal&lt;/code> parser &lt;em>parse&lt;/em> something! We just need to implement our &lt;code>parse&lt;/code> method:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Literal&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kr">comptime&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Reader&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">parser&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Parser&lt;/span>&lt;span class="p">([]&lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Reader&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">_parse&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">parse&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">},&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">want&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">[]&lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Self&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">@This&lt;/span>&lt;span class="p">();&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">parse&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">parser&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Parser&lt;/span>&lt;span class="p">([]&lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Reader&lt;/span>&lt;span class="p">),&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">src&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Reader&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">callconv&lt;/span>&lt;span class="p">(.&lt;/span>&lt;span class="n">Inline&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Error&lt;/span>&lt;span class="o">!?&lt;/span>&lt;span class="p">[]&lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">self&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">@fieldParentPtr&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Self&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;parser&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">parser&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>But wait a minute! For the &lt;code>._parse = parse,&lt;/code> assignment to work, the first argument to &lt;code>parse&lt;/code> needs to be the &lt;code>self&lt;/code> parameter of a &lt;code>Parser([]u8, Reader)&lt;/code> - so how does &lt;em>our&lt;/em> &lt;code>parse&lt;/code> implementation get access to the &lt;code>want&lt;/code> field of our struct?&lt;/p>
&lt;p>This is where some Zig magic comes in: an obscure builtin function we can use inside of our &lt;code>parse&lt;/code> method:&lt;/p>
&lt;pre tabindex="0">&lt;code>const self = @fieldParentPtr(Self, &amp;#34;parser&amp;#34;, parser);
&lt;/code>&lt;/pre>&lt;p>To understand this, let&amp;rsquo;s first look at what these parameters refer to:&lt;/p>
&lt;img class="color-auto" src="https://devlog.hexops.org/img/2021/zig-parser-combinators-and-why-theyre-awesome/img10.png">
&lt;p>We can see from the Zig documentation that this function operates as follows:&lt;/p>
&lt;blockquote>
&lt;p>Given a pointer to a field, returns the base pointer of a struct.&lt;/p>
&lt;/blockquote>
&lt;p>So in our case:&lt;/p>
&lt;ul>
&lt;li>&lt;code>Self&lt;/code> is the &amp;ldquo;parent struct&amp;rdquo; we&amp;rsquo;re trying to acquire a reference to (our type)&lt;/li>
&lt;li>&lt;code>&amp;quot;parser&amp;quot;&lt;/code> is the name of our struct&amp;rsquo;s field.&lt;/li>
&lt;li>&lt;code>parser&lt;/code> is the &lt;em>pointer to our &lt;code>parser&lt;/code> struct field&lt;/em>.&lt;/li>
&lt;/ul>
&lt;p>Hopefully you can start to see the link here: &lt;code>parser&lt;/code> is a pointer to &lt;em>our struct field&lt;/em>, so Zig has a little helper &lt;code>@fieldParentPtr&lt;/code> which can rely on that fact to give us &lt;em>our struct&lt;/em> given a pointer to &lt;em>our struct field&lt;/em>.&lt;/p>
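&lt;p>To see the builtin in isolation, here is a minimal sketch that is separate from our parser code. The &lt;code>Inner&lt;/code>, &lt;code>Outer&lt;/code>, and &lt;code>labelOf&lt;/code> names are made up for illustration, and it assumes the same three-argument form of &lt;code>@fieldParentPtr&lt;/code> used throughout this article (newer Zig releases changed the builtin&amp;rsquo;s signature, so adjust accordingly):&lt;/p>
&lt;pre tabindex="0">&lt;code>const std = @import(&amp;#34;std&amp;#34;);

// Hypothetical types, for illustration only.
const Inner = struct { id: u32 };

const Outer = struct {
    inner: Inner,
    label: []const u8,
};

// Given only a pointer to the `inner` field, recover the enclosing `Outer`.
fn labelOf(inner: *Inner) []const u8 {
    const outer = @fieldParentPtr(Outer, &amp;#34;inner&amp;#34;, inner);
    return outer.label;
}

test &amp;#34;recover the parent struct from a field pointer&amp;#34; {
    var outer = Outer{ .inner = .{ .id = 1 }, .label = &amp;#34;hello&amp;#34; };
    try std.testing.expectEqualStrings(&amp;#34;hello&amp;#34;, labelOf(&amp;amp;outer.inner));
}
&lt;/code>&lt;/pre>&lt;p>Note that this only works if the pointer really does point at the named field of an &lt;code>Outer&lt;/code>; handing it a pointer to a free-standing &lt;code>Inner&lt;/code> would compute an address that does not belong to any &lt;code>Outer&lt;/code>.&lt;/p>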
&lt;h3 id="implementing-the-rest-of-parse">Implementing the rest of &lt;code>parse&lt;/code>&lt;/h3>
&lt;p>Our full &lt;code>parse&lt;/code> method will look like this:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="c1">// If a value is returned, it is up to the caller to free it.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">parse&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">parser&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Parser&lt;/span>&lt;span class="p">([]&lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Reader&lt;/span>&lt;span class="p">),&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">src&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Reader&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">callconv&lt;/span>&lt;span class="p">(.&lt;/span>&lt;span class="n">Inline&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Error&lt;/span>&lt;span class="o">!?&lt;/span>&lt;span class="p">[]&lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">self&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">@fieldParentPtr&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Self&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;parser&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">parser&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">buf&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">alloc&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">self&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">want&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">len&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">errdefer&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">free&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">buf&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">read&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">src&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">reader&lt;/span>&lt;span class="p">().&lt;/span>&lt;span class="n">readAll&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">buf&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">if&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">read&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">&amp;lt;&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">self&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">want&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">len&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">or&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">!&lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">mem&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">eql&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">buf&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">self&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">want&lt;/span>&lt;span class="p">))&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">src&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">seekableStream&lt;/span>&lt;span class="p">().&lt;/span>&lt;span class="n">seekBy&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="nb">@intCast&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kt">i64&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">read&lt;/span>&lt;span class="p">));&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">free&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">buf&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// parsing failed
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">null&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">buf&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>There are a few notable things here:&lt;/p>
&lt;ul>
&lt;li>We&amp;rsquo;re trying to return a string from our &lt;code>parse&lt;/code> function, i.e. the value it emits is a string (instead of an AST node).&lt;/li>
&lt;li>The &lt;code>want&lt;/code> string we &lt;em>received&lt;/em> in our &lt;code>init&lt;/code> method is only guaranteed to be valid for as long as &lt;code>parse&lt;/code> may still be called. We&amp;rsquo;ve decided on a contract that all of our &lt;code>Parser&lt;/code> implementations will either not hold onto memory given by others - or if they do, only do so until &lt;code>parse&lt;/code> returns. Hence, we need to allocate a new string in our method rather than returning &lt;code>want&lt;/code> itself.&lt;/li>
&lt;li>Normally we could rely solely on &lt;code>defer&lt;/code> (&amp;ldquo;run at end of function&amp;rdquo;) or &lt;code>errdefer&lt;/code> (&amp;ldquo;run if an error is returned&amp;rdquo;), but since we&amp;rsquo;ve chosen to reserve the &lt;em>optional none&lt;/em> value &lt;code>null&lt;/code> to mean &amp;ldquo;we didn&amp;rsquo;t parse anything&amp;rdquo;, we need to free manually before we &lt;code>return null;&lt;/code>. A &lt;code>nulldefer&lt;/code> and &lt;code>somedefer&lt;/code> could be nice, maybe?&lt;/li>
&lt;/ul>
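&lt;p>The cleanup rule from the last point can be sketched on its own, outside of any parser. This is only an illustration: &lt;code>copyIfEqual&lt;/code> and &lt;code>error.EmptyWant&lt;/code> are hypothetical names, and the code assumes the &lt;code>*Allocator&lt;/code> pointer convention used in this article (newer Zig releases pass &lt;code>Allocator&lt;/code> by value instead):&lt;/p>
&lt;pre tabindex="0">&lt;code>const std = @import(&amp;#34;std&amp;#34;);
const Allocator = std.mem.Allocator;

// Hypothetical helper, not part of our parser: copies `input` and returns the
// copy only if it equals `want`. The caller frees the returned slice, if any.
fn copyIfEqual(allocator: *Allocator, input: []const u8, want: []const u8) !?[]u8 {
    const buf = try allocator.dupe(u8, input);
    errdefer allocator.free(buf); // runs only when an error is returned below

    if (want.len == 0) return error.EmptyWant; // error path: errdefer frees buf
    if (!std.mem.eql(u8, buf, want)) {
        allocator.free(buf); // null is not an error, so we free by hand
        return null;
    }
    return buf; // success path: ownership passes to the caller
}

test &amp;#34;error, null, and success paths&amp;#34; {
    // The testing allocator reports any allocation that is never freed.
    const allocator = std.testing.allocator;
    try std.testing.expectError(error.EmptyWant, copyIfEqual(allocator, &amp;#34;dog&amp;#34;, &amp;#34;&amp;#34;));
    try std.testing.expect((try copyIfEqual(allocator, &amp;#34;cat&amp;#34;, &amp;#34;dog&amp;#34;)) == null);
    const got = (try copyIfEqual(allocator, &amp;#34;dog&amp;#34;, &amp;#34;dog&amp;#34;)).?;
    defer allocator.free(got);
    try std.testing.expectEqualStrings(&amp;#34;dog&amp;#34;, got);
}
&lt;/code>&lt;/pre>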
&lt;p>Putting it all together, you&amp;rsquo;ll get something like this: &lt;a href="https://gist.github.com/emidoots/8f098a13177b4bc008a7741505819f90">GitHub gist&lt;/a>.&lt;/p>
&lt;h2 id="our-first-_parser-combinator_">Our first &lt;em>parser combinator&lt;/em>&lt;/h2>
&lt;p>To demonstrate how a &lt;em>parser combinator&lt;/em> would be implemented, we&amp;rsquo;ll try implementing the &lt;code>OneOf&lt;/code> operator. It will take any number of &lt;em>parsers&lt;/em> as input and run them in order until one succeeds, producing no value if none do.&lt;/p>
&lt;p>Let&amp;rsquo;s start by writing out the basic structure of our function:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">OneOf&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kr">comptime&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Value&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">comptime&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Reader&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">parser&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Parser&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Value&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Reader&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">_parse&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">parse&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">},&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>You&amp;rsquo;ll notice here that in contrast to our &lt;code>Literal&lt;/code> &lt;em>parser&lt;/em> function from earlier, this function takes an additional &lt;code>comptime Value: type&lt;/code> parameter. This is because we want it to work with any existing &lt;code>Parser&lt;/code> implementation, regardless of what type of value it produces.&lt;/p>
&lt;p>We can start to fill in the type by adding our &lt;code>init&lt;/code> method:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">OneOf&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kr">comptime&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Value&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">comptime&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Reader&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">parser&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Parser&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Value&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Reader&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">.{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">_parse&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">parse&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">},&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">parsers&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">[]&lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Parser&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Value&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Reader&lt;/span>&lt;span class="p">),&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Self&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">@This&lt;/span>&lt;span class="p">();&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// `parsers` slice must stay alive for as long as the parser will be
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// used.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">init&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">parsers&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">[]&lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Parser&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Value&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Reader&lt;/span>&lt;span class="p">))&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Self&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Self&lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">parsers&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">parsers&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>As you can see here, we&amp;rsquo;re going to simply take in a list of pointers to parsers. They&amp;rsquo;ll all need to have the same return &lt;code>Value&lt;/code> as was specified in the call to &lt;code>OneOf&lt;/code>.&lt;/p>
&lt;p>One reason for this is that &lt;a href="https://github.com/ziglang/zig/issues/447">Zig does not support &lt;em>return type inference&lt;/em>&lt;/a>. You can have a function which takes &lt;code>anytype&lt;/code> as a parameter, but it cannot return an &lt;code>anytype&lt;/code>. This just means we need to have a generic function (in this case, &lt;code>OneOf&lt;/code>) which accepts a type parameter and then use that &lt;code>Value&lt;/code> type later. In a language like Haskell or OCaml, this restriction would not apply.&lt;/p>
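&lt;p>As a small sketch of that restriction, unrelated to our parser (the &lt;code>sizeOfValue&lt;/code> and &lt;code>firstOr&lt;/code> helpers are hypothetical): a parameter may be &lt;code>anytype&lt;/code>, while a generic return type has to be passed in explicitly as a &lt;code>comptime&lt;/code> type parameter, just as &lt;code>OneOf&lt;/code> does with &lt;code>Value&lt;/code>.&lt;/p>
&lt;pre tabindex="0">&lt;code>const std = @import(&amp;#34;std&amp;#34;);

// `anytype` is accepted for a parameter...
fn sizeOfValue(value: anytype) usize {
    return @sizeOf(@TypeOf(value));
}

// ...but the return type must be spelled out, so generic code takes the type
// as a comptime parameter and uses it in the signature.
fn firstOr(comptime T: type, items: []const T, fallback: T) T {
    return if (items.len &amp;gt; 0) items[0] else fallback;
}

test &amp;#34;explicit type parameter instead of an inferred return type&amp;#34; {
    try std.testing.expectEqual(@as(usize, 4), sizeOfValue(@as(u32, 7)));
    try std.testing.expectEqual(@as(u8, 9), firstOr(u8, &amp;amp;[_]u8{}, 9));
    try std.testing.expectEqual(@as(u8, 3), firstOr(u8, &amp;amp;[_]u8{ 3, 4 }, 9));
}
&lt;/code>&lt;/pre>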
&lt;p>Finally, we can implement our &lt;code>parse&lt;/code> method:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">pub&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">OneOf&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kr">comptime&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Value&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kr">comptime&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Reader&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kt">type&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">struct&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="c1">// Caller is responsible for freeing the value, if any.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">fn&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">parse&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">parser&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Parser&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Value&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Reader&lt;/span>&lt;span class="p">),&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">src&lt;/span>&lt;span class="o">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="n">Reader&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">callconv&lt;/span>&lt;span class="p">(.&lt;/span>&lt;span class="n">Inline&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Error&lt;/span>&lt;span class="o">!?&lt;/span>&lt;span class="n">Value&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">self&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">@fieldParentPtr&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Self&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;parser&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">parser&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">for&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">self&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">parsers&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">one_of_parser&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="kr">const&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">result&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">one_of_parser&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">parse&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">src&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">if&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">result&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">!=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">null&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">result&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">return&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">null&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">};&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>There are a few things to unpack here:&lt;/p>
&lt;ul>
&lt;li>&lt;code>try one_of_parser.parse(allocator, src);&lt;/code> means that if parsing with &lt;code>one_of_parser&lt;/code> returns an &lt;em>error&lt;/em>, our function returns that error immediately and does not continue attempting to parse with the other parsers.&lt;/li>
&lt;li>&lt;code>if (result != null) {&lt;/code> is how you check that an Optional type in Zig holds a value, i.e. that it is not &amp;ldquo;None&amp;rdquo;. I find this pretty interesting: it&amp;rsquo;s not a null pointer, it&amp;rsquo;s actually an optional &amp;ldquo;none&amp;rdquo; value - but it is called &lt;code>null&lt;/code>. I&amp;rsquo;m not sure why, but can imagine this making the language friendlier to people unfamiliar with optional types.&lt;/li>
&lt;/ul>
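&lt;p>For readers new to Zig optionals, here is a small sketch of the two common ways to test one. The &lt;code>half&lt;/code> function is a made-up example, not part of our parser: comparing against &lt;code>null&lt;/code> only tells you whether a value is present, while the &lt;code>if (...) |value|&lt;/code> form also unwraps it.&lt;/p>
&lt;pre tabindex="0">&lt;code>const std = @import(&amp;#34;std&amp;#34;);

// Hypothetical function: returns null (&amp;#34;none&amp;#34;) for odd inputs.
fn half(n: u32) ?u32 {
    if (n % 2 != 0) return null;
    return n / 2;
}

test &amp;#34;checking and unwrapping an optional&amp;#34; {
    // Presence check only, as in our `OneOf` loop.
    try std.testing.expect(half(10) != null);
    try std.testing.expect(half(3) == null);

    // Presence check plus unwrapping the payload in one step.
    if (half(10)) |value| {
        try std.testing.expectEqual(@as(u32, 5), value);
    } else {
        return error.UnexpectedNull;
    }
}
&lt;/code>&lt;/pre>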
&lt;h2 id="using-our-oneof-parser-combinator">Using our OneOf parser combinator&lt;/h2>
&lt;p>Now for the cool part: we get to put both our &lt;code>Literal&lt;/code> parser and &lt;code>OneOf&lt;/code> parser combinator to use to &lt;em>build a new parser&lt;/em>!&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="c1">// Define our parser.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">one_of&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">OneOf&lt;/span>&lt;span class="p">([]&lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">@TypeOf&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">reader&lt;/span>&lt;span class="p">)).&lt;/span>&lt;span class="n">init&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="o">&amp;amp;&lt;/span>&lt;span class="p">.{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="o">&amp;amp;&lt;/span>&lt;span class="n">Literal&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nb">@TypeOf&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">reader&lt;/span>&lt;span class="p">)).&lt;/span>&lt;span class="n">init&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s">&amp;#34;dog&amp;#34;&lt;/span>&lt;span class="p">).&lt;/span>&lt;span class="n">parser&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="o">&amp;amp;&lt;/span>&lt;span class="n">Literal&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nb">@TypeOf&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">reader&lt;/span>&lt;span class="p">)).&lt;/span>&lt;span class="n">init&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s">&amp;#34;sheep&amp;#34;&lt;/span>&lt;span class="p">).&lt;/span>&lt;span class="n">parser&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="o">&amp;amp;&lt;/span>&lt;span class="n">Literal&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nb">@TypeOf&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">reader&lt;/span>&lt;span class="p">)).&lt;/span>&lt;span class="n">init&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s">&amp;#34;cat&amp;#34;&lt;/span>&lt;span class="p">).&lt;/span>&lt;span class="n">parser&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">});&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The above will parse one of &lt;code>&amp;quot;dog&amp;quot;&lt;/code>, &lt;code>&amp;quot;sheep&amp;quot;&lt;/code>, or &lt;code>&amp;quot;cat&amp;quot;&lt;/code> from the input reader.&lt;/p>
&lt;p>We&amp;rsquo;re passing &lt;code>@TypeOf(reader)&lt;/code> frequently above which makes the code a bit more cryptic than needed, and it would be possible to introduce a &lt;code>OneOfLiteral&lt;/code> helper which makes the above instead read:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="c1">// Define our parser.
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">one_of&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">OneOfLiteral&lt;/span>&lt;span class="p">([]&lt;/span>&lt;span class="kt">u8&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">@TypeOf&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">reader&lt;/span>&lt;span class="p">)).&lt;/span>&lt;span class="n">init&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="o">&amp;amp;&lt;/span>&lt;span class="p">.{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;dog&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;cat&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="s">&amp;#34;sheep&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">});&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>One thing to unpack here is this syntax for passing an array to &lt;code>init&lt;/code>: &lt;code>&amp;amp;.{...}&lt;/code>:&lt;/p>
&lt;ul>
&lt;li>The function takes a parameter &lt;code>parsers: []*Parser(Value, Reader)&lt;/code>&lt;/li>
&lt;li>&lt;code>.{...}&lt;/code> would give us &lt;em>a fixed size array&lt;/em> &lt;code>[3]*Parser(Value, Reader)&lt;/code>&lt;/li>
&lt;li>&lt;code>&amp;amp;.{...}&lt;/code> gives us a pointer to that array, which Zig coerces to &lt;em>a slice&lt;/em> &lt;code>[]*Parser(Value, Reader)&lt;/code>.&lt;/li>
&lt;/ul>
&lt;p>Since our list is known at compile time, we don&amp;rsquo;t have to allocate or free memory for the slice. If our list was dynamic, we would need to do so.&lt;/p>
&lt;p>Finally, we can actually use our parser above:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-zig" data-lang="zig">&lt;span class="line">&lt;span class="cl">&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">p&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">&amp;amp;&lt;/span>&lt;span class="n">one_of&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">parser&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="kr">var&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">result&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">try&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">p&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">parse&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">&amp;amp;&lt;/span>&lt;span class="n">reader&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="n">std&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">testing&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">expectEqualStrings&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s">&amp;#34;cat&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">result&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="o">?&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="k">if&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">result&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="n">r&lt;/span>&lt;span class="o">|&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">allocator&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">free&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">r&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">}&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="runtime-parser-generation">Runtime parser generation&lt;/h2>
&lt;p>You might be wondering how we would go from the &lt;code>Literal&lt;/code> &lt;em>parser&lt;/em> and &lt;code>OneOf&lt;/code> &lt;em>parser combinator&lt;/em> to actually &lt;em>generating a parser at runtime that can parse the semantics defined in a regexp string&lt;/em>.&lt;/p>
&lt;p>Since our &lt;code>Parser&lt;/code> interface is a runtime interface (you can swap out the implementation at runtime) and since our parser combinator &lt;code>OneOf&lt;/code> operates using that interface (only the return value must be known at compile time, it could be a generic AST node) it means that we can easily dynamically create slices of &lt;code>[]*Parser(...)&lt;/code> at runtime based on the result of a parser combinator we have built - like our &amp;ldquo;dog, cat, sheep&amp;rdquo; parser from earlier.&lt;/p>
&lt;p>The challenge left for you as a reader is to:&lt;/p>
&lt;ul>
&lt;li>Write &lt;em>parsers&lt;/em> like our &lt;code>Literal&lt;/code> parser that can parse the components of our regexp &lt;code>a[bc].*abc&lt;/code>:
&lt;ul>
&lt;li>&lt;code>RegexLiteralParser&lt;/code> can parse &lt;code>a&lt;/code>, &lt;code>b&lt;/code>, and &lt;code>c&lt;/code>, but not &lt;code>abc&lt;/code> (the string.)&lt;/li>
&lt;li>&lt;code>RegexRangeOpenParser&lt;/code> can parse &lt;code>[&lt;/code>.&lt;/li>
&lt;li>&lt;code>RegexRangeCloseParser&lt;/code> can parse &lt;code>]&lt;/code>&lt;/li>
&lt;li>&lt;code>RegexAnyParser&lt;/code> can parse the &lt;code>.&lt;/code> &amp;ldquo;any character&amp;rdquo; syntax.&lt;/li>
&lt;li>&lt;code>RegexRepetitionParser&lt;/code> can parse the &lt;code>*&lt;/code> repetition operator.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Write a &lt;em>parser combinator&lt;/em> like our &lt;code>OneOf&lt;/code> parser combinator, except have it parse a &lt;code>Sequence&lt;/code> of parsers.&lt;/li>
&lt;li>Use our &lt;code>Sequence&lt;/code> parser combinator and &lt;code>RegexLiteralParser&lt;/code> to build a &lt;code>RegexStringLiteralParser&lt;/code> - similar to how we built our &amp;ldquo;dog, cat, sheep&amp;rdquo; parser.&lt;/li>
&lt;li>Write a &lt;em>new kind of function&lt;/em> called a &lt;em>runtime parser generator&lt;/em> named &lt;code>RegexParser&lt;/code> which will be super familiar:
&lt;ul>
&lt;li>Take in a &lt;em>parser combinator&lt;/em> called &lt;code>RegexSyntaxParser&lt;/code> which can turn your regexp syntax into some intermediary like an AST.&lt;/li>
&lt;li>Have your function &lt;em>use parser combinators like OneOf, Sequence, etc.&lt;/em> to build a brand new parser at runtime based on that intermediary AST.&lt;/li>
&lt;li>Return that new parser which parses the actual semantics described by the input regexp!&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="closing-thoughts">Closing thoughts&lt;/h2>
&lt;p>I am sorry for not giving you a full (or even partial) regex engine :) I am still exploring this and it is a large undertaking; this blog post would be far too long if it were included.&lt;/p>
&lt;p>You can find a copy of the final code with &lt;em>parsers&lt;/em> and &lt;em>parser combinators&lt;/em> &lt;a href="https://gist.github.com/emidoots/db2dd2c49aa038e23b654120e70c9b00">here&lt;/a>. Just &lt;code>zig init-exe&lt;/code> and plop them into your &lt;code>src/&lt;/code> directory.&lt;/p>
&lt;p>You may also want to check out &lt;a href="https://github.com/Hejsil/mecha">Mecha&lt;/a>, a parser combinator library for Zig.&lt;/p>
&lt;p>If anything was unclear or confusing, I&amp;rsquo;m happy to help: shoot me an email &lt;a href="mailto:stephen@hexops.com">stephen@hexops.com&lt;/a> or leave a comment on Hacker News / Reddit and I&amp;rsquo;ll follow up.&lt;/p></description></item><item><title>Postgres regex search over 10,000 GitHub repositories (using only a Macbook)</title><link>https://devlog.hexops.org/2021/postgres-regex-search-over-10000-github-repositories/</link><pubDate>Wed, 17 Feb 2021 00:00:00 +0000</pubDate><guid>https://devlog.hexops.com/2021/postgres-regex-search-over-10000-github-repositories/</guid><description>&lt;p>In this article, we share empirical measurements from our experiments in using Postgres to index and search over 10,000 top GitHub repositories using &lt;code>pg_trgm&lt;/code> on only a Macbook.&lt;/p>
&lt;p>This is a follow-up to &lt;a href="https://devlog.hexops.org/2021/postgres-trigram-search-learnings">&amp;ldquo;Postgres Trigram search learnings&amp;rdquo;&lt;/a>, in which we shared several learnings and beliefs about trying to use Postgres Trigram indexes as an alternative to Google&amp;rsquo;s &lt;a href="https://github.com/google/zoekt">Zoekt&lt;/a> (&amp;ldquo;Fast trigram based code search&amp;rdquo;).&lt;/p>
&lt;p>We share our results, as well as &lt;a href="https://github.com/hexops/pgtrgm_emperical_measurements">the exact steps we performed, scripts, and lists of the top 20,000 repositories by stars/language on GitHub&lt;/a> so you can reproduce the results yourself should you desire.&lt;/p>
&lt;h2 id="tldr">TL;DR&lt;/h2>
&lt;p>&lt;strong>This article is extensive and more akin to a research paper than a blog post.&lt;/strong> If you&amp;rsquo;re interested in our conclusions, see &lt;a href="#conclusions">conclusions&lt;/a> instead.&lt;/p>
&lt;h2 id="goals">Goals&lt;/h2>
&lt;p>We wanted to get empirical measurements for how suitable Postgres is in providing regexp search over documents, e.g. as an alternative to Google&amp;rsquo;s &lt;a href="https://github.com/google/zoekt">Zoekt&lt;/a> (&amp;ldquo;Fast trigram based code search&amp;rdquo;). Specifically:&lt;/p>
&lt;ul>
&lt;li>How many repositories can we index on just a 2019 Macbook Pro?&lt;/li>
&lt;li>How fast are different regexp searches over the corpus?&lt;/li>
&lt;li>What Postgres 13 configuration gives best results?&lt;/li>
&lt;li>What other operational effects need consideration if seriously attempting to use Postgres as the backend for a regexp search engine?&lt;/li>
&lt;li>What is the best database schema to use?&lt;/li>
&lt;/ul>
&lt;h2 id="hardware">Hardware&lt;/h2>
&lt;p>We ran all tests on a 2019 Macbook Pro with:&lt;/p>
&lt;ul>
&lt;li>2.3 GHz 8-Core Intel Core i9&lt;/li>
&lt;li>16 GB 2667 MHz DDR4&lt;/li>
&lt;/ul>
&lt;p>During test execution, few other Mac applications were in use such that effectively all CPU/memory was available to Postgres.&lt;/p>
&lt;h2 id="corpus">Corpus&lt;/h2>
&lt;p>We scraped &lt;a href="https://github.com/hexops/pgtrgm_emperical_measurements/tree/main/top_repos">lists of the top 1,000 repositories from the GitHub search API&lt;/a> ranked by stars for each of the following languages (~20.5k repositories in total):&lt;/p>
&lt;ul>
&lt;li>C++, C#, CSS, Go, HTML, Java, JavaScript, MatLab, ObjC, Perl, PHP, Python, Ruby, Rust, Shell, Solidity, Swift, TypeScript, VB .NET, and Zig.&lt;/li>
&lt;/ul>
&lt;p>Cloning all ~20.5k repositories in parallel took ~14 hours with a fast ~100 Mbps connection to GitHub&amp;rsquo;s servers.&lt;/p>
&lt;h3 id="dataset-reduction">Dataset reduction&lt;/h3>
&lt;p>We found the amount of disk space required by &lt;code>git clone --depth 1&lt;/code> on these repositories to be a sizable ~412G for just 12,148 repositories - and so we put in place several processes to further reduce the dataset size by about 66%:&lt;/p>
&lt;ul>
&lt;li>Removing &lt;code>.git&lt;/code> directories resulted in a 30% reduction (412G -&amp;gt; 290G, for 12,148 repositories)&lt;/li>
&lt;li>Removing files &amp;gt; 1 MiB resulted in another 51% reduction (290G -&amp;gt; 142G, for 12,148 repositories - note GitHub does not index files &amp;gt; 384 KiB in their search engine)&lt;/li>
&lt;/ul>
&lt;h2 id="database-insertion">Database insertion&lt;/h2>
&lt;p>We &lt;a href="https://github.com/hexops/pgtrgm_emperical_measurements/blob/main/cmd/corpusindex/main.go">concurrently inserted&lt;/a> the entire corpus into Postgres, with the following DB schema:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sql" data-lang="sql">&lt;span class="line">&lt;span class="cl">&lt;span class="k">CREATE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">EXTENSION&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">IF&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">NOT&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">EXISTS&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">pg_trgm&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="k">CREATE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">TABLE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">IF&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">NOT&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">EXISTS&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">files&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">id&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">bigserial&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">PRIMARY&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">KEY&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">contents&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">text&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">NOT&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">NULL&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">filepath&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">text&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">NOT&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">NULL&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>In total, this took around 8 hours to complete and Postgres&amp;rsquo;s entire on-disk utilization was 101G.&lt;/p>
&lt;h2 id="creating-the-trigram-index">Creating the Trigram index&lt;/h2>
&lt;p>We tried three separate times to index the dataset using the following GIN Trigram index:&lt;/p>
&lt;pre tabindex="0">&lt;code>CREATE INDEX IF NOT EXISTS files_contents_trgm_idx ON files USING GIN (contents gin_trgm_ops);
&lt;/code>&lt;/pre>&lt;ul>
&lt;li>&lt;strong>In the first attempt, we hit an OOM after 11 hours and 34 minutes.&lt;/strong> This was due to a rapid spike in memory usage at the very end of indexing. We used a &lt;a href="https://github.com/hexops/pgtrgm_emperical_measurements#configuration-attempt-1-indexing-failure-oom">fairly aggressive&lt;/a> Postgres configuration with a very large max WAL size, so it was not entirely unexpected.&lt;/li>
&lt;li>&lt;strong>In the second attempt, we ran out of SSD disk space after ~27 hours&lt;/strong>. Notable is that the disk space largely grew towards the end of indexing, similar to when we faced an OOM - it was not a gradual increase over time. For this attempt, we used the excellent &lt;a href="https://pgtune.leopard.in.ua/#/">pgtune&lt;/a> tool to reduce our first Postgres configuration as follows:&lt;/li>
&lt;/ul>
&lt;pre tabindex="0">&lt;code>shared_buffers = 4GB → 2560MB
effective_cache_size = 12GB → 7680MB
maintenance_work_mem = 16GB → 1280MB
default_statistics_target = 100 → 500
work_mem = 5242kB → 16MB
min_wal_size = 50GB → 4GB
max_wal_size = 4GB → 16GB
max_parallel_workers_per_gather = 8 → 4
max_parallel_maintenance_workers = 8 → 4
&lt;/code>&lt;/pre>&lt;ul>
&lt;li>&lt;strong>In our third and final attempt, we cut the dataset in half and indexing succeeded after 22 hours.&lt;/strong> Specifically, we deleted half of the files in the database (from 19,441,820 files / 178 GiB of data to 9,720,910 files / 82 GiB of data.) The Postgres configuration used was the same as in attempt 2.&lt;/li>
&lt;/ul>
&lt;h2 id="indexing-performance-memory-usage">Indexing performance: Memory usage&lt;/h2>
&lt;p>In our first attempt, we see the reported &lt;code>docker stats&lt;/code> memory usage of the container grow up to 12 GiB (chart shows MiB of memory used over time):&lt;/p>
&lt;img width="981" alt="image" src="https://devlog.hexops.org/img/2021/postgres-regex-search-over-10000-github-repositories/img1.png">
&lt;p>In our second and third attempts, we see far less memory usage (~1.6 GiB consistently):&lt;/p>
&lt;img width="980" alt="image" src="https://devlog.hexops.org/img/2021/postgres-regex-search-over-10000-github-repositories/img2.png">
&lt;img width="980" alt="image" src="https://devlog.hexops.org/img/2021/postgres-regex-search-over-10000-github-repositories/img3.png">
&lt;h2 id="indexing-performance-cpu-usage">Indexing performance: CPU usage&lt;/h2>
&lt;p>Postgres&amp;rsquo; Trigram indexing appears to be mostly single-threaded (at least when indexing &lt;em>a single table&lt;/em>, we test multiple tables later.)&lt;/p>
&lt;p>In our first attempt, CPU usage for the container did not rise above 156% (one and a half virtual CPU cores):&lt;/p>
&lt;img width="982" alt="image" src="https://devlog.hexops.org/img/2021/postgres-regex-search-over-10000-github-repositories/img4.png">
&lt;p>Our second attempt was around 150-200% CPU usage on average:&lt;/p>
&lt;img width="980" alt="image" src="https://devlog.hexops.org/img/2021/postgres-regex-search-over-10000-github-repositories/img5.png">
&lt;p>Our third attempt similarly saw an average of 150-200%, but with a brief spike towards the end to ~350% CPU:&lt;/p>
&lt;img width="980" alt="image" src="https://devlog.hexops.org/img/2021/postgres-regex-search-over-10000-github-repositories/img6.png">
&lt;h2 id="indexing-performance-disk-io">Indexing performance: Disk IO&lt;/h2>
&lt;p>Disk reads/writes during indexing averaged ~250 MB/s for both reads (blue) and writes (red). Native in-software tests show the same Macbook able to achieve read/write speeds of ~860 MB/s with &amp;lt;5% effect on CPU utilization.&lt;/p>
&lt;p>&lt;small>Addition made Feb 20, 2021:&lt;/small> We ran tests using native Postgres as well (instead of in Docker with a bind mount) and found better indexing and query performance, more on this below.&lt;/p>
&lt;img width="599" alt="image" src="https://devlog.hexops.org/img/2021/postgres-regex-search-over-10000-github-repositories/img7.png">
&lt;h2 id="indexing-performance-disk-space">Indexing performance: Disk space&lt;/h2>
&lt;p>The database contains 9,720,910 files totalling 82.07 GiB:&lt;/p>
&lt;pre tabindex="0">&lt;code>postgres=# select count(filepath) from files;
count
---------
9720910
(1 row)
postgres=# select SUM(octet_length(contents)) from files;
sum
-------------
88123563320
(1 row)
&lt;/code>&lt;/pre>&lt;p>&lt;strong>Before indexing&lt;/strong>, we find that all of Postgres is consuming 54G:&lt;/p>
&lt;pre tabindex="0">&lt;code>$ du -sh .postgres/
54G .postgres/
&lt;/code>&lt;/pre>&lt;p>After &lt;code>CREATE INDEX&lt;/code>, Postgres uses:&lt;/p>
&lt;pre tabindex="0">&lt;code>$ du -sh .postgres/
73G .postgres/
&lt;/code>&lt;/pre>&lt;p>Thus, the index size for 82 GiB of text is 19 GiB (or 23% of the data size.)&lt;/p>
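&lt;p>The arithmetic behind those figures is 73G - 54G = 19G of index, and 19 / 82 is roughly 0.23. As a cross-check, the sizes can also be read from Postgres itself rather than inferred from &lt;code>du&lt;/code>. The following is only a minimal sketch and was not part of the measurements reported here; it assumes the &lt;code>files&lt;/code> table and &lt;code>files_contents_trgm_idx&lt;/code> index names used earlier, and its numbers may differ somewhat from &lt;code>du&lt;/code> since the data directory also holds WAL and other files:&lt;/p>
&lt;pre tabindex="0">&lt;code>-- Sketch only: compare table size (including TOAST) with the trigram index size.
SELECT
    pg_size_pretty(pg_table_size('files'))                      AS table_size,
    pg_size_pretty(pg_relation_size('files_contents_trgm_idx')) AS index_size;
&lt;/code>&lt;/pre>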
&lt;h2 id="database-startup-times">Database startup times&lt;/h2>
&lt;p>From an operational standpoint, it is worth noting that if Postgres is starting clean (i.e. previous shutdown was graceful) then startup time is almost instantaneous: it begins accepting connections immediately and loads the index as needed.&lt;/p>
&lt;p>However, if Postgres experienced a non-graceful termination during e.g. startup, it can take a hefty ~10 minutes with this dataset to start as it goes through an automated recovery process.&lt;/p>
&lt;h2 id="queries-executed">Queries executed&lt;/h2>
&lt;p>In total, we executed 19,936 search queries against the index. We chose queries which we expect to give reasonably varying amounts of coverage over the trigram index (that is, queries whose trigrams are more or less likely to occur in many files):&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Regexp query&lt;/th>
&lt;th>Matching # files in entire dataset&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;code>var&lt;/code>&lt;/td>
&lt;td>unknown (2m+ suspected)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>error&lt;/code>&lt;/td>
&lt;td>1,479,452&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>123456789&lt;/code>&lt;/td>
&lt;td>59,841&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>fmt\.Error&lt;/code>&lt;/td>
&lt;td>127,895&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>fmt\.Println&lt;/code>&lt;/td>
&lt;td>22,876&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>bytes.Buffer&lt;/code>&lt;/td>
&lt;td>34,554&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>fmt\.Print.*&lt;/code>&lt;/td>
&lt;td>37,319&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>ac8ac5d63b66b83b90ce41a2d4061635&lt;/code>&lt;/td>
&lt;td>0&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>d97f1d3ff91543[e-f]49.8b07517548877&lt;/code>&lt;/td>
&lt;td>0&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;details>
&lt;summary>Detailed breakdown&lt;/summary>
&lt;div markdown="1">
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Query&lt;/th>
&lt;th>Result Limit&lt;/th>
&lt;th>Times executed&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;code>var&lt;/code>&lt;/td>
&lt;td>10&lt;/td>
&lt;td>1000&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>var&lt;/code>&lt;/td>
&lt;td>100&lt;/td>
&lt;td>1000&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>var&lt;/code>&lt;/td>
&lt;td>1000&lt;/td>
&lt;td>100&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>var&lt;/code>&lt;/td>
&lt;td>unlimited&lt;/td>
&lt;td>4&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>error&lt;/code>&lt;/td>
&lt;td>10&lt;/td>
&lt;td>2000&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>error&lt;/code>&lt;/td>
&lt;td>100&lt;/td>
&lt;td>2000&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>error&lt;/code>&lt;/td>
&lt;td>1000&lt;/td>
&lt;td>200&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>error&lt;/code>&lt;/td>
&lt;td>unlimited&lt;/td>
&lt;td>18&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>123456789&lt;/code>&lt;/td>
&lt;td>10&lt;/td>
&lt;td>1000&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>123456789&lt;/code>&lt;/td>
&lt;td>100&lt;/td>
&lt;td>1000&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>123456789&lt;/code>&lt;/td>
&lt;td>1000&lt;/td>
&lt;td>100&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>123456789&lt;/code>&lt;/td>
&lt;td>unlimited&lt;/td>
&lt;td>2&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>fmt\.Error&lt;/code>&lt;/td>
&lt;td>10&lt;/td>
&lt;td>1000&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>fmt\.Error&lt;/code>&lt;/td>
&lt;td>100&lt;/td>
&lt;td>1000&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>fmt\.Error&lt;/code>&lt;/td>
&lt;td>1000&lt;/td>
&lt;td>100&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>fmt\.Error&lt;/code>&lt;/td>
&lt;td>unlimited&lt;/td>
&lt;td>2&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>fmt\.Println&lt;/code>&lt;/td>
&lt;td>10&lt;/td>
&lt;td>1000&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>fmt\.Println&lt;/code>&lt;/td>
&lt;td>100&lt;/td>
&lt;td>1000&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>fmt\.Println&lt;/code>&lt;/td>
&lt;td>1000&lt;/td>
&lt;td>100&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>fmt\.Println&lt;/code>&lt;/td>
&lt;td>unlimited&lt;/td>
&lt;td>2&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>bytes.Buffer&lt;/code>&lt;/td>
&lt;td>10&lt;/td>
&lt;td>4&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>bytes.Buffer&lt;/code>&lt;/td>
&lt;td>100&lt;/td>
&lt;td>4&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>bytes.Buffer&lt;/code>&lt;/td>
&lt;td>1000&lt;/td>
&lt;td>4&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>bytes.Buffer&lt;/code>&lt;/td>
&lt;td>unlimited&lt;/td>
&lt;td>2&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>fmt\.Print.*&lt;/code>&lt;/td>
&lt;td>10&lt;/td>
&lt;td>1000&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>fmt\.Print.*&lt;/code>&lt;/td>
&lt;td>100&lt;/td>
&lt;td>1000&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>fmt\.Print.*&lt;/code>&lt;/td>
&lt;td>1000&lt;/td>
&lt;td>100&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>fmt\.Print.*&lt;/code>&lt;/td>
&lt;td>unlimited&lt;/td>
&lt;td>2&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>ac8ac5d63b66b83b90ce41a2d4061635&lt;/code>&lt;/td>
&lt;td>10&lt;/td>
&lt;td>1000&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>ac8ac5d63b66b83b90ce41a2d4061635&lt;/code>&lt;/td>
&lt;td>100&lt;/td>
&lt;td>1000&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>ac8ac5d63b66b83b90ce41a2d4061635&lt;/code>&lt;/td>
&lt;td>1000&lt;/td>
&lt;td>100&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>ac8ac5d63b66b83b90ce41a2d4061635&lt;/code>&lt;/td>
&lt;td>unlimited&lt;/td>
&lt;td>2&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>d97f1d3ff91543[e-f]49.8b07517548877&lt;/code>&lt;/td>
&lt;td>10&lt;/td>
&lt;td>1000&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>d97f1d3ff91543[e-f]49.8b07517548877&lt;/code>&lt;/td>
&lt;td>100&lt;/td>
&lt;td>1000&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>d97f1d3ff91543[e-f]49.8b07517548877&lt;/code>&lt;/td>
&lt;td>1000&lt;/td>
&lt;td>100&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>d97f1d3ff91543[e-f]49.8b07517548877&lt;/code>&lt;/td>
&lt;td>unlimited&lt;/td>
&lt;td>2&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;/div>
&lt;/details>
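&lt;p>For illustration, each entry above corresponds to a regular expression matched against the &lt;code>contents&lt;/code> column with a result limit applied. The statement below is a hedged sketch of the general shape of such a query, assuming the &lt;code>~&lt;/code> regexp match operator (which &lt;code>pg_trgm&lt;/code> GIN indexes can accelerate) and the schema shown earlier; the exact steps and scripts are in the repository linked at the top of this article:&lt;/p>
&lt;pre tabindex="0">&lt;code>-- Sketch only: regexp search over file contents with a result limit of 10.
SELECT filepath
FROM files
WHERE contents ~ 'fmt\.Error'
LIMIT 10;
&lt;/code>&lt;/pre>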
&lt;h2 id="query-performance">Query performance&lt;/h2>
&lt;p>In total, we executed 19,936 search queries against the database (sequentially, not in parallel) which completed in the following times:&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Time bucket&lt;/th>
&lt;th>Percentage of queries&lt;/th>
&lt;th>Number of queries&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>Under 50ms&lt;/td>
&lt;td>30%&lt;/td>
&lt;td>5,933&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Under 250ms&lt;/td>
&lt;td>41%&lt;/td>
&lt;td>8,088&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Under 500ms&lt;/td>
&lt;td>52%&lt;/td>
&lt;td>10,275&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Under 750ms&lt;/td>
&lt;td>63%&lt;/td>
&lt;td>12,473&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Under 1s&lt;/td>
&lt;td>68%&lt;/td>
&lt;td>13,481&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Under 1.5s&lt;/td>
&lt;td>74%&lt;/td>
&lt;td>14,697&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Under 3s&lt;/td>
&lt;td>79%&lt;/td>
&lt;td>15,706&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Under 25s&lt;/td>
&lt;td>79%&lt;/td>
&lt;td>15,708&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Under 30s&lt;/td>
&lt;td>99%&lt;/td>
&lt;td>19,788&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h2 id="query-performance-vs-planning-time">Query performance vs. planning time&lt;/h2>
&lt;p>The following scatter plot shows how 79% of queries executed in under 3s (Y axis, in ms), while Postgres&amp;rsquo;s query planner generally planned them in under 100-250ms (X axis, in ms):&lt;/p>
&lt;img width="1252" alt="image" src="https://devlog.hexops.org/img/2021/postgres-regex-search-over-10000-github-repositories/img8.png">
&lt;p>If we expand the view to include all queries, we start to get a picture of just how far out the remaining 21% of queries are (note that the small block of dots in the bottom left represents the same diagram shown above):&lt;/p>
&lt;img width="1250" alt="image" src="https://devlog.hexops.org/img/2021/postgres-regex-search-over-10000-github-repositories/img9.png">
&lt;h2 id="query-time-vs-cpu--memory-usage">Query time vs. CPU &amp;amp; Memory usage&lt;/h2>
&lt;p>The following image shows:&lt;/p>
&lt;ul>
&lt;li>(top) Query time in milliseconds&lt;/li>
&lt;li>(middle) CPU usage percentage (e.g. 801% refers to 8 out of 16 virtual CPU cores being consumed)&lt;/li>
&lt;li>(bottom) Memory usage in MiB.&lt;/li>
&lt;/ul>
&lt;img width="1255" alt="image" src="https://devlog.hexops.org/img/2021/postgres-regex-search-over-10000-github-repositories/img10.png">
&lt;p>Notable insights from this are:&lt;/p>
&lt;ul>
&lt;li>The large increase in resource usage towards the end is when we began executing queries with no &lt;code>LIMIT&lt;/code>.&lt;/li>
&lt;li>CPU usage does not exceed 138%, until the spike at the end.&lt;/li>
&lt;li>Memory usage does not exceed 42 MiB, until the spike at the end.&lt;/li>
&lt;/ul>
&lt;p>We suspect &lt;code>pg_trgm&lt;/code> is single-threaded within the scope of a single table, but with &lt;a href="https://www.postgresql.org/docs/10/ddl-partitioning.html">table data partitioning&lt;/a> (or splitting data into multiple tables with subsets of the data), we suspect better parallelism could be achieved.&lt;/p>
&lt;h2 id="investigating-slow-queries">Investigating slow queries&lt;/h2>
&lt;p>If we plot the number of index rechecks (X axis) vs. execution time (Y axis), we can clearly see one of the most significant aspects of slow queries is that they have many more index rechecks:&lt;/p>
&lt;img width="1036" alt="image" src="https://devlog.hexops.org/img/2021/postgres-regex-search-over-10000-github-repositories/img11.png">
&lt;p>And if we look at &lt;a href="https://github.com/hexops/pgtrgm_emperical_measurements/blob/main/query_logs/query-run-3.log#L3-L24">the &lt;code>EXPLAIN ANALYZE&lt;/code> output for one of these queries&lt;/a> we can also confirm &lt;code>Parallel Bitmap Heap Scan&lt;/code> is slow due to &lt;code>Rows Removed by Index Recheck&lt;/code>.&lt;/p>
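&lt;p>As a rough sketch of how to check this for a query of your own, the statement below runs a regex search under &lt;code>EXPLAIN (ANALYZE, BUFFERS)&lt;/code>. The &lt;code>files&lt;/code> table and &lt;code>contents&lt;/code> column match the statements shown later in this article; the search pattern is a hypothetical placeholder:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sql" data-lang="sql">-- Hypothetical pattern; substitute one of your own slow queries.
EXPLAIN (ANALYZE, BUFFERS)
SELECT id
FROM files
WHERE contents ~ 'TODO|FIXME'
LIMIT 10;
&lt;/code>&lt;/pre>&lt;/div>
&lt;p>In the output, compare &lt;code>Rows Removed by Index Recheck&lt;/code> on the bitmap heap scan node with the number of rows actually returned. A large gap typically means the trigram index matched many candidate rows that the regex then rejected.&lt;/p>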
&lt;h2 id="table-splitting">Table splitting&lt;/h2>
&lt;p>Splitting up the search index into multiple smaller tables seems like an obvious approach to getting &lt;code>pg_trgm&lt;/code> to use multiple CPU cores. We tried this by taking the same exact data set and splitting it into 200 tables, and found numerous benefits:&lt;/p>
&lt;h3 id="benefit-1-incremental-indexing">Benefit 1: Incremental indexing&lt;/h3>
&lt;p>If indexing fails after 11-27 hours, as happened to us twice with the non-splitting approach, not all progress is lost.&lt;/p>
&lt;h3 id="benefit-2-parallel-indexing">Benefit 2: Parallel indexing&lt;/h3>
&lt;p>Unlike our first non-splitting approach, which showed we were only able to utilize 1.5-2 virtual CPU cores, with multiple tables we are able to utilize 8-9 virtual CPU cores:&lt;/p>
&lt;img width="1143" alt="image" src="https://devlog.hexops.org/img/2021/postgres-regex-search-over-10000-github-repositories/img12.png">
&lt;h3 id="benefit-3-indexing-is-84-faster">Benefit 3: Indexing is 84% faster&lt;/h3>
&lt;p>Unlike our first attempt which took 22 hours in total, parallel indexing completed in only 3h27m.&lt;/p>
&lt;h3 id="benefit-4-indexing-uses-69-less-memory">Benefit 4: Indexing uses 69% less memory&lt;/h3>
&lt;p>With non-splitting we saw peak memory usage up to 12 GiB. With the same exact Postgres configuration, we were able to index with only 3.7 GiB peak memory usage:&lt;/p>
&lt;img width="1140" alt="image" src="https://devlog.hexops.org/img/2021/postgres-regex-search-over-10000-github-repositories/img13.png">
&lt;h2 id="benefit-4-parallel-querying">Benefit 5: Parallel querying&lt;/h2>
&lt;p>Previously, we saw CPU utilization of only 138% (about 1.4 virtual CPU cores). With table splitting, we see CPU utilization during queries of 1600% (16 virtual CPU cores), showing that we are doing work fully in parallel:&lt;/p>
&lt;img width="1144" alt="image" src="https://devlog.hexops.org/img/2021/postgres-regex-search-over-10000-github-repositories/img14.png">
&lt;p>Memory usage rose as well, averaging ~380 MiB compared to only ~42 MiB before:&lt;/p>
&lt;img width="1143" alt="image" src="https://devlog.hexops.org/img/2021/postgres-regex-search-over-10000-github-repositories/img15.png">
&lt;h2 id="benefit-5-query-performance">Benefit 6: Query performance&lt;/h2>
&lt;p>We reran the same exact set of search queries, but a smaller number of times overall (350 queries, instead of 19.9k - which we found to still be a representative enough sample.)&lt;/p>
&lt;p>As we can see below, table splitting in general made heavier queries roughly 2-3x faster: queries that previously took 20-30s now take only 7-15s thanks to parallel querying (top chart is before, bottom is after, both in milliseconds):&lt;/p>
&lt;img width="1143" alt="image" src="https://devlog.hexops.org/img/2021/postgres-regex-search-over-10000-github-repositories/img16.png">
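&lt;p>One way to search the split tables in a single statement is a &lt;code>UNION ALL&lt;/code>. This is only a minimal sketch and not necessarily the query shape used in the benchmark: the table names follow the &lt;code>files_000&lt;/code>, &lt;code>files_001&lt;/code>, &amp;hellip; naming shown later in this article, and the pattern is a hypothetical placeholder:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sql" data-lang="sql">SELECT id FROM files_000 WHERE contents ~ 'TODO|FIXME'
UNION ALL
SELECT id FROM files_001 WHERE contents ~ 'TODO|FIXME'
UNION ALL
SELECT id FROM files_002 WHERE contents ~ 'TODO|FIXME'
-- ... one branch per remaining table ...
LIMIT 10;
&lt;/code>&lt;/pre>&lt;/div>
&lt;p>The &lt;code>LIMIT&lt;/code> applies to the combined result. An alternative is to issue one query per table from separate connections and merge the results in the application.&lt;/p>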
&lt;p>We also grouped queries based on the &lt;code>LIMIT&lt;/code> specified in the query and placed them into time buckets (&amp;ldquo;how many queries completed in under 50ms?&amp;rdquo;) - comparing the two shows that less complex queries and/or queries for fewer results were negatively affected slightly, while larger queries were helped substantially:&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Change (positive is good)&lt;/th>
&lt;th>Results limit&lt;/th>
&lt;th>Bucket&lt;/th>
&lt;th>&lt;strong>Queries in bucket before&lt;/strong>&lt;/th>
&lt;th>&lt;strong>Queries in bucket after&lt;/strong>&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>-33%&lt;/td>
&lt;td>10&lt;/td>
&lt;td>&amp;lt;50ms&lt;/td>
&lt;td>33%&lt;/td>
&lt;td>0%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>+13%&lt;/td>
&lt;td>10&lt;/td>
&lt;td>&amp;lt;250ms&lt;/td>
&lt;td>44%&lt;/td>
&lt;td>57%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>+23%&lt;/td>
&lt;td>10&lt;/td>
&lt;td>&amp;lt;1s&lt;/td>
&lt;td>77%&lt;/td>
&lt;td>100%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>-29%&lt;/td>
&lt;td>100&lt;/td>
&lt;td>&amp;lt;100ms&lt;/td>
&lt;td>29%&lt;/td>
&lt;td>0%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>+20%&lt;/td>
&lt;td>100&lt;/td>
&lt;td>&amp;lt;500ms&lt;/td>
&lt;td>50%&lt;/td>
&lt;td>70%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>+19%&lt;/td>
&lt;td>100&lt;/td>
&lt;td>&amp;lt;10s&lt;/td>
&lt;td>80%&lt;/td>
&lt;td>99%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>-12%&lt;/td>
&lt;td>1000&lt;/td>
&lt;td>&amp;lt;250ms&lt;/td>
&lt;td>12%&lt;/td>
&lt;td>0%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>-13%&lt;/td>
&lt;td>1000&lt;/td>
&lt;td>&amp;lt;2.5s&lt;/td>
&lt;td>77%&lt;/td>
&lt;td>64%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>+23%&lt;/td>
&lt;td>1000&lt;/td>
&lt;td>&amp;lt;20s&lt;/td>
&lt;td>77%&lt;/td>
&lt;td>100%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>+4%&lt;/td>
&lt;td>none&lt;/td>
&lt;td>&amp;lt;20s&lt;/td>
&lt;td>0%&lt;/td>
&lt;td>4%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>+18%&lt;/td>
&lt;td>none&lt;/td>
&lt;td>&amp;lt;60s&lt;/td>
&lt;td>0%&lt;/td>
&lt;td>18%&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>Detailed comparisons are available below for those interested:&lt;/p>
&lt;details>
&lt;summary>Queries with &lt;code>LIMIT 10&lt;/code>&lt;/summary>
&lt;div markdown="1">
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Time bucket&lt;/th>
&lt;th>Percentage of queries (before)&lt;/th>
&lt;th>Percentage of queries (after splitting)&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>50ms&lt;/td>
&lt;td>33.00% (2999 of 9004)&lt;/td>
&lt;td>0% (0 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>100ms&lt;/td>
&lt;td>33.00% (2999 of 9004)&lt;/td>
&lt;td>1.00% (1 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>250ms&lt;/td>
&lt;td>44.00% (3999 of 9004)&lt;/td>
&lt;td>57.00% (57 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>500ms&lt;/td>
&lt;td>55.00% (4999 of 9004)&lt;/td>
&lt;td>79.00% (79 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>1000ms&lt;/td>
&lt;td>77.00% (6998 of 9004)&lt;/td>
&lt;td>80.00% (80 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>2500ms&lt;/td>
&lt;td>77.00% (7003 of 9004)&lt;/td>
&lt;td>80.00% (80 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>5000ms&lt;/td>
&lt;td>77.00% (7004 of 9004)&lt;/td>
&lt;td>80.00% (80 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>10000ms&lt;/td>
&lt;td>77.00% (7004 of 9004)&lt;/td>
&lt;td>100.00% (100 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>20000ms&lt;/td>
&lt;td>77.00% (7004 of 9004)&lt;/td>
&lt;td>100.00% (100 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>30000ms&lt;/td>
&lt;td>99.00% (8985 of 9004)&lt;/td>
&lt;td>100.00% (100 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>40000ms&lt;/td>
&lt;td>99.00% (9003 of 9004)&lt;/td>
&lt;td>100.00% (100 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>50000ms&lt;/td>
&lt;td>100.00% (9004 of 9004)&lt;/td>
&lt;td>100.00% (100 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>60000ms&lt;/td>
&lt;td>100.00% (9004 of 9004)&lt;/td>
&lt;td>100.00% (100 of 100)&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;/div>
&lt;/details>
&lt;details>
&lt;summary>Queries with &lt;code>LIMIT 100&lt;/code>&lt;/summary>
&lt;div markdown="1">
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Time bucket&lt;/th>
&lt;th>Percentage of queries (before)&lt;/th>
&lt;th>Percentage of queries (after splitting)&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>50ms&lt;/td>
&lt;td>29.00% (2934 of 10000)&lt;/td>
&lt;td>0% (0 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>100ms&lt;/td>
&lt;td>29.00% (2978 of 10000)&lt;/td>
&lt;td>0% (0 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>250ms&lt;/td>
&lt;td>39.00% (3975 of 10000)&lt;/td>
&lt;td>31.00% (31 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>500ms&lt;/td>
&lt;td>50.00% (5000 of 10000)&lt;/td>
&lt;td>70.00% (70 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>1000ms&lt;/td>
&lt;td>59.00% (5984 of 10000)&lt;/td>
&lt;td>79.00% (79 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>2500ms&lt;/td>
&lt;td>79.00% (7996 of 10000)&lt;/td>
&lt;td>80.00% (80 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>5000ms&lt;/td>
&lt;td>80.00% (8000 of 10000)&lt;/td>
&lt;td>80.00% (80 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>10000ms&lt;/td>
&lt;td>80.00% (8000 of 10000)&lt;/td>
&lt;td>99.00% (99 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>20000ms&lt;/td>
&lt;td>80.00% (8000 of 10000)&lt;/td>
&lt;td>100.00% (100 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>30000ms&lt;/td>
&lt;td>99.00% (9999 of 10000)&lt;/td>
&lt;td>100.00% (100 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>40000ms&lt;/td>
&lt;td>100.00% (10000 of 10000)&lt;/td>
&lt;td>100.00% (100 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>50000ms&lt;/td>
&lt;td>100.00% (10000 of 10000)&lt;/td>
&lt;td>100.00% (100 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>60000ms&lt;/td>
&lt;td>100.00% (10000 of 10000)&lt;/td>
&lt;td>100.00% (100 of 100)&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;/div>
&lt;/details>
&lt;details>
&lt;summary>Queries with &lt;code>LIMIT 1000&lt;/code>&lt;/summary>
&lt;div markdown="1">
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Time bucket&lt;/th>
&lt;th>Percentage of queries (before)&lt;/th>
&lt;th>Percentage of queries (after splitting)&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>50ms&lt;/td>
&lt;td>0% (0 of 904)&lt;/td>
&lt;td>0% (0 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>100ms&lt;/td>
&lt;td>0% (1 of 904)&lt;/td>
&lt;td>0% (0 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>250ms&lt;/td>
&lt;td>12.00% (114 of 904)&lt;/td>
&lt;td>0% (0 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>500ms&lt;/td>
&lt;td>30.00% (276 of 904)&lt;/td>
&lt;td>21.00% (21 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>1000ms&lt;/td>
&lt;td>55.00% (499 of 904)&lt;/td>
&lt;td>41.00% (41 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>2500ms&lt;/td>
&lt;td>77.00% (700 of 904)&lt;/td>
&lt;td>64.00% (64 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>5000ms&lt;/td>
&lt;td>77.00% (704 of 904)&lt;/td>
&lt;td>77.00% (77 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>10000ms&lt;/td>
&lt;td>77.00% (704 of 904)&lt;/td>
&lt;td>98.00% (98 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>20000ms&lt;/td>
&lt;td>77.00% (704 of 904)&lt;/td>
&lt;td>100.00% (100 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>30000ms&lt;/td>
&lt;td>88.00% (804 of 904)&lt;/td>
&lt;td>100.00% (100 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>40000ms&lt;/td>
&lt;td>99.00% (901 of 904)&lt;/td>
&lt;td>100.00% (100 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>50000ms&lt;/td>
&lt;td>99.00% (903 of 904)&lt;/td>
&lt;td>100.00% (100 of 100)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>60000ms&lt;/td>
&lt;td>100.00% (904 of 904)&lt;/td>
&lt;td>100.00% (100 of 100)&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;/div>
&lt;/details>
&lt;details>
&lt;summary>Queries with no &lt;code>LIMIT&lt;/code>&lt;/summary>
&lt;div markdown="1">
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Time bucket&lt;/th>
&lt;th>Percentage of queries (before)&lt;/th>
&lt;th>Percentage of queries (after splitting)&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>50ms&lt;/td>
&lt;td>0% (0 of 28)&lt;/td>
&lt;td>0% (0 of 50)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>100ms&lt;/td>
&lt;td>0% (0 of 28)&lt;/td>
&lt;td>0% (0 of 50)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>250ms&lt;/td>
&lt;td>0% (0 of 28)&lt;/td>
&lt;td>0% (0 of 50)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>500ms&lt;/td>
&lt;td>0% (0 of 28)&lt;/td>
&lt;td>0% (0 of 50)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>1000ms&lt;/td>
&lt;td>0% (0 of 28)&lt;/td>
&lt;td>0% (0 of 50)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>2500ms&lt;/td>
&lt;td>0% (0 of 28)&lt;/td>
&lt;td>0% (0 of 50)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>5000ms&lt;/td>
&lt;td>0% (0 of 28)&lt;/td>
&lt;td>0% (0 of 50)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>10000ms&lt;/td>
&lt;td>0% (0 of 28)&lt;/td>
&lt;td>0% (0 of 50)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>20000ms&lt;/td>
&lt;td>0% (0 of 28)&lt;/td>
&lt;td>4.00% (2 of 50)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>30000ms&lt;/td>
&lt;td>0% (0 of 28)&lt;/td>
&lt;td>16.00% (8 of 50)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>40000ms&lt;/td>
&lt;td>0% (0 of 28)&lt;/td>
&lt;td>16.00% (8 of 50)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>50000ms&lt;/td>
&lt;td>0% (0 of 28)&lt;/td>
&lt;td>18.00% (9 of 50)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>60000ms&lt;/td>
&lt;td>0% (0 of 28)&lt;/td>
&lt;td>18.00% (9 of 50)&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;/div>
&lt;/details>
&lt;h2 id="postgres-in-docker-vs-native-postgres">Postgres-in-Docker vs. native Postgres&lt;/h2>
&lt;p>&lt;small>Addition made Feb 20, 2021&lt;/small>&lt;/p>
&lt;p>In our original article we did not clarify the performance impact of running Postgres inside of Docker with a volume bind mount. Thorsten Ball raised this to us as a potential source of IO performance differences.&lt;/p>
&lt;p>We ran all tests above with Postgres in Docker, using a volume bind mount (the osxfs driver, not the experimental FUSE gRPC driver.)&lt;/p>
&lt;p>We additionally ran the same table-splitting benchmarks on a native Postgres server (&lt;a href="https://github.com/hexops/pgtrgm_emperical_measurements#native-postgres-tests">reproduction steps here&lt;/a>) and found the following key changes:&lt;/p>
&lt;h3 id="cpu-usage--memory-usage-approximately-the-same">CPU usage &amp;amp; memory usage: approximately the same&lt;/h3>
&lt;p>CPU and memory usage was approximately the same as in our Docker Postgres tests.&lt;/p>
&lt;p>We anticipated this would be the case, as the MacBook does have VT-x virtualization enabled (the default on all i7/i9 MacBooks, and confirmed through &lt;code>sysctl kern.hv_support&lt;/code>).&lt;/p>
&lt;h3 id="indexing-speed-was-88-faster">Indexing speed was ~88% faster&lt;/h3>
&lt;p>Running the statements to split up the large table into multiple smaller ones, i.e.:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sql" data-lang="sql">&lt;span class="line">&lt;span class="cl">&lt;span class="k">CREATE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">TABLE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">files_000&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">AS&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">SELECT&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">FROM&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">files&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">WHERE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">id&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">&amp;gt;&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">0&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">AND&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">id&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">&amp;lt;=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">50000&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="k">CREATE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">TABLE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">files_001&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">AS&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">SELECT&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">FROM&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">files&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">WHERE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">id&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">&amp;gt;&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">50000&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">AND&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">id&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">&amp;lt;=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">100000&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Was much faster in native Postgres, taking about 2-8s for each table instead of 20-40s previously, and taking only 15m in total instead of 2h before.&lt;/p>
&lt;p>Parallel creation of the Trigram indexes using e.g.:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sql" data-lang="sql">&lt;span class="line">&lt;span class="cl">&lt;span class="k">CREATE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">INDEX&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">IF&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">NOT&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">EXISTS&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">files_000_contents_trgm_idx&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">ON&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">files_000&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">USING&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">GIN&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">contents&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">gin_trgm_ops&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Was also much faster, taking only 23m compared to ~3h with Docker.&lt;/p>
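&lt;p>Writing out hundreds of such statements by hand is tedious. As a minimal sketch, and not the script used for the measurements above, a PL/pgSQL &lt;code>DO&lt;/code> block can generate them. The count of 200 tables comes from this article; the range of 50,000 ids per table is an assumption extrapolated from the two statements shown:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sql" data-lang="sql">DO $$
DECLARE
  suffix text;
BEGIN
  -- Assumption: 200 tables of 50,000 ids each; adjust to your data.
  FOR i IN 0..199 LOOP
    suffix := lpad(i::text, 3, '0');  -- 0 becomes 000, 1 becomes 001, ...

    -- Copy one id range of the large table into its own smaller table.
    EXECUTE format(
      'CREATE TABLE IF NOT EXISTS %I AS SELECT * FROM files WHERE id &amp;gt; %s AND id &amp;lt;= %s',
      'files_' || suffix, i * 50000, (i + 1) * 50000);

    -- Build the trigram GIN index on the smaller table.
    EXECUTE format(
      'CREATE INDEX IF NOT EXISTS %I ON %I USING GIN (contents gin_trgm_ops)',
      'files_' || suffix || '_contents_trgm_idx', 'files_' || suffix);
  END LOOP;
END
$$;
&lt;/code>&lt;/pre>&lt;/div>
&lt;p>Note that a &lt;code>DO&lt;/code> block runs serially inside a single transaction. To build indexes in parallel, the &lt;code>CREATE INDEX&lt;/code> statements need to be issued from separate connections instead.&lt;/p>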
&lt;h3 id="query-performance-is-12-99-faster-depending-on-query">Query performance is 12-99% faster, depending on query&lt;/h3>
&lt;p>We re-ran the same 350 queries as in our earlier table-splitting benchmark, and found the following substantial improvements:&lt;/p>
&lt;ol>
&lt;li>Queries that were previously very slow saw a ~12% improvement. This is likely due to IO operations needed when interfacing with the 200 separate tables.&lt;/li>
&lt;li>Queries that were previously in the middle-ground saw meager ~5% improvements.&lt;/li>
&lt;li>Queries that were previously fairly fast (likely searching over only one or two tables before returning) saw substantial 16-99% improvements.&lt;/li>
&lt;/ol>
&lt;details>
&lt;summary>Exhaustive comparison details (negative change is good)&lt;/summary>
&lt;div markdown="1">
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Change&lt;/th>
&lt;th>Time bucket&lt;/th>
&lt;th>Queries under bucket &lt;strong>before&lt;/strong>&lt;/th>
&lt;th>Queries under bucket &lt;strong>after&lt;/strong>&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>0%&lt;/td>
&lt;td>500s&lt;/td>
&lt;td>350 of 350&lt;/td>
&lt;td>350 of 350&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>-12%&lt;/td>
&lt;td>100s&lt;/td>
&lt;td>309 of 350&lt;/td>
&lt;td>350 of 350&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>-12%&lt;/td>
&lt;td>50s&lt;/td>
&lt;td>309 of 350&lt;/td>
&lt;td>350 of 350&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>-12%&lt;/td>
&lt;td>40s&lt;/td>
&lt;td>308 of 350&lt;/td>
&lt;td>350 of 350&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>-12%&lt;/td>
&lt;td>30s&lt;/td>
&lt;td>308 of 350&lt;/td>
&lt;td>349 of 350&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>-7%&lt;/td>
&lt;td>25s&lt;/td>
&lt;td>307 of 350&lt;/td>
&lt;td>330 of 350&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>-8%&lt;/td>
&lt;td>20s&lt;/td>
&lt;td>302 of 350&lt;/td>
&lt;td>330 of 350&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>-5%&lt;/td>
&lt;td>10s&lt;/td>
&lt;td>297 of 350&lt;/td>
&lt;td>311 of 350&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>-26%&lt;/td>
&lt;td>5s&lt;/td>
&lt;td>237 of 350&lt;/td>
&lt;td>319 of 350&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>-7%&lt;/td>
&lt;td>2500ms&lt;/td>
&lt;td>224 of 350&lt;/td>
&lt;td>240 of 350&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>-9%&lt;/td>
&lt;td>2000ms&lt;/td>
&lt;td>219 of 350&lt;/td>
&lt;td>240 of 350&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>-9%&lt;/td>
&lt;td>1500ms&lt;/td>
&lt;td>219 of 350&lt;/td>
&lt;td>240 of 350&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>-16%&lt;/td>
&lt;td>1000ms&lt;/td>
&lt;td>200 of 350&lt;/td>
&lt;td>237 of 350&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>-14%&lt;/td>
&lt;td>750ms&lt;/td>
&lt;td>190 of 350&lt;/td>
&lt;td>221 of 350&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>-23%&lt;/td>
&lt;td>500ms&lt;/td>
&lt;td>170 of 350&lt;/td>
&lt;td>220 of 350&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>-59%&lt;/td>
&lt;td>250ms&lt;/td>
&lt;td>88 of 350&lt;/td>
&lt;td>217 of 350&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>-99%&lt;/td>
&lt;td>100ms&lt;/td>
&lt;td>1 of 350&lt;/td>
&lt;td>168 of 350&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>-99%&lt;/td>
&lt;td>50ms&lt;/td>
&lt;td>1 of 350&lt;/td>
&lt;td>168 of 350&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;/div>
&lt;/details>
&lt;h2 id="conclusions">Conclusions&lt;/h2>
&lt;p>We think the following learnings are most important:&lt;/p>
&lt;ul>
&lt;li>&lt;code>.git&lt;/code> directories, even with &lt;code>--depth=1&lt;/code> clones, account for 30% of a repository&amp;rsquo;s size on disk (at least in the top 10,000 GitHub repositories.)&lt;/li>
&lt;li>Files &amp;gt; 1 MiB (often binaries) account for another 51% of the data size on disk of repositories.&lt;/li>
&lt;li>On only a MacBook Pro, it is possible to get Postgres Trigram regex search over 10,000 repositories to run most reasonable queries in under 5s - and certainly much faster with more hardware.&lt;/li>
&lt;li>&lt;code>pg_trgm&lt;/code> performs single-threaded indexing and querying, unless you split your data up into multiple tables.&lt;/li>
&lt;li>By default, a Postgres &lt;code>text&lt;/code> column will be compressed by Postgres on disk - resulting in a 23% reduction in size (with the files we inserted.)&lt;/li>
&lt;li>&lt;code>pg_trgm&lt;/code> GIN indexes take around 26% the size of your data on disk. So if indexing 1 GiB of raw text, expect Postgres to store that text in ~827 MiB plus 279 MiB for the GIN trigram index.&lt;/li>
&lt;li>Splitting your data into multiple tables if using &lt;code>pg_trgm&lt;/code> is an obvious win, as it allows for parallel indexing, which can be the difference between 4h vs 22h. It also reduces the risk of an indexing failure after 22h due to e.g. lack of memory, and uses much less peak memory overall.&lt;/li>
&lt;li>Docker bind mounts (not volumes) are quite slow outside of Linux host environments (there are many other articles on this subject.)&lt;/li>
&lt;/ul>
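&lt;p>To compare the compression and index size figures above against your own data, Postgres&amp;rsquo;s built-in size functions can be used. A minimal sketch, assuming a single table named &lt;code>files&lt;/code> as in the statements earlier in this article:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sql" data-lang="sql">SELECT
  -- Table data excluding indexes, but including TOAST storage,
  -- which is where large compressed text values live.
  pg_size_pretty(pg_table_size('files'))   AS table_size,
  -- Total size of all indexes attached to the table.
  pg_size_pretty(pg_indexes_size('files')) AS index_size;
&lt;/code>&lt;/pre>&lt;/div>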
&lt;p>If you are looking for fast regexp or code search today, consider:&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://sourcegraph.com">Sourcegraph&lt;/a> (disclaimer: the author works here, but this article is not endorsed or affiliated in any way)&lt;/li>
&lt;li>&lt;a href="https://github.com/google/zoekt">Zoekt&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://github.com/BurntSushi/ripgrep">Ripgrep&lt;/a>&lt;/li>
&lt;/ul>
&lt;p>Follow this devlog for updates as we continue investigating faster ways to do regexp &amp;amp; ngram search at large scales.&lt;/p></description></item><item><title>Postgres Trigram search learnings</title><link>https://devlog.hexops.org/2021/postgres-trigram-search-learnings/</link><pubDate>Tue, 26 Jan 2021 00:00:00 +0000</pubDate><guid>https://devlog.hexops.com/2021/postgres-trigram-search-learnings/</guid><description>&lt;p>In this article I talk about learnings I have from trying to use pg_trgm as the backend for a search engine, Tridex, which aims to be a competitor to Google&amp;rsquo;s &lt;a href="https://github.com/google/zoekt">Zoekt&lt;/a> (&amp;ldquo;Fast trigram based code search&amp;rdquo;)&lt;/p>
&lt;h2 id="background">Background&lt;/h2>
&lt;p>I work @ &lt;a href="https://sourcegraph.com">Sourcegraph&lt;/a>, which provides code search, code intelligence, and other developer tooling. (If you&amp;rsquo;re one of my Sourcegraph co-workers, hey! Hexops is the name of my GitHub organization for after-hours experiments and what I hope will one day in the distant future become a successful game development company.)&lt;/p>
&lt;p>Over the past ~8 months I have been exploring the intersection between game development and my work at Sourcegraph, and finding interesting overlap between the two. I have been working on a competitor to Google&amp;rsquo;s &lt;a href="https://github.com/google/zoekt">zoekt&lt;/a> (&amp;ldquo;Fast trigram based code search&amp;rdquo;), which is one of the search backends used by Sourcegraph with a few goals in mind:&lt;/p>
&lt;ol>
&lt;li>Produce a search backend I can use for Hexops, to provide search functionality for a user-voice type platform (think GitHub issues, but with duplicate issue removal and upvote/downvote capability), and other social-network type features in games I hope to one day create. i.e. not just code search&lt;/li>
&lt;li>Learn more about search in hopes of being able to bring some of those learnings to Sourcegraph (I really want our commit/diff search to be indexed, and wish we could index more things in general.) It would also be cool to solve the ominous and difficult &lt;a href="https://github.com/google/zoekt/issues/86">Zoekt memory usage problem&lt;/a> we have had at Sourcegraph for as long as I can remember.&lt;/li>
&lt;li>Provide a foundation for more bleeding-edge &amp;ldquo;here&amp;rsquo;s a whacky idea&amp;rdquo; type experiments that are cool surrounding developer tools, but that are not necessarily guaranteed wins / anything I could reasonably pitch elsewhere.&lt;/li>
&lt;/ol>
&lt;h2 id="indexing-every-character-is-different-than-regular-fts">Indexing every character is different than regular FTS&lt;/h2>
&lt;p>First, it&amp;rsquo;s important to note that my usage of pg_trgm is not the same as typical FTS (Full Text Search) usage of pg_trgm. My usage (and the use case of code search) cares about every character being indexed, and about being able to do regex searches. This is different from traditional FTS, where only words matter.&lt;/p>
&lt;h2 id="pg_trgm-indexes-apply-to-all-data-in-that-column">pg_trgm indexes apply to all data in that column&lt;/h2>
&lt;p>This sounds obvious, but in practice has interesting/weird implications when trying to build a code search engine like Zoekt.&lt;/p>
&lt;p>For example, Zoekt builds indexes of repository code in chunks (from what I understand) and then concurrently, in an unordered fashion, searches those repository code chunks (inverted trigram indexes). This plays a key role in the strategy of pagination that Zoekt implements: you can search over those chunks and give up searching further chunks after you&amp;rsquo;ve found enough results.&lt;/p>
&lt;p>With pg_trgm, a naive approach would be to have a &lt;code>file_contents&lt;/code> column with a pg_trgm GIN index over it, and put every file from every repository into that column. But that index would apply to &lt;em>all&lt;/em> file contents across every repository, so when you want to LIMIT search over that column you are searching over one giant trigram GIN index instead of many smaller ones. It&amp;rsquo;s faster if your aim is to search the entire index, but if your aim is &amp;ldquo;get enough results and then get out&amp;rdquo; it&amp;rsquo;s much slower, because you have to deal with the entire index instead of multiple smaller indexes. But, I have not tested this empirically - so take this statement with a grain or two of salt. It&amp;rsquo;s possible I am wrong here.&lt;/p>
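&lt;p>For concreteness, here is a minimal sketch of what that naive approach might look like. Only the &lt;code>file_contents&lt;/code> column name comes from the description above; the table name, the other columns, the index name and the search pattern are hypothetical:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sql" data-lang="sql">CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Hypothetical schema: one row per file, across all repositories.
CREATE TABLE files (
  id            bigserial PRIMARY KEY,
  repository    text NOT NULL,
  file_path     text NOT NULL,
  file_contents text NOT NULL
);

-- A single trigram GIN index covering every file of every repository.
CREATE INDEX files_file_contents_trgm_idx
  ON files USING GIN (file_contents gin_trgm_ops);

-- Regex search that only needs the first few matches.
SELECT repository, file_path
FROM files
WHERE file_contents ~ 'TODO|FIXME'
LIMIT 10;
&lt;/code>&lt;/pre>&lt;/div>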
&lt;p>There is an obvious way to counteract this effect, though: use one pg_trgm GIN index (i.e. a distinct table or column) per repository. You now have a distinct GIN trigram index per repository chunk. Of course, when there are thousands of repositories this introduces major complexity in schema management, query execution, etc., and as you might imagine, having thousands of tables/columns is not exactly great either.&lt;/p>
&lt;p>It is possible that Postgres&amp;rsquo; &lt;a href="https://www.postgresql.org/docs/10/ddl-partitioning.html">table data partitioning&lt;/a> could interoperate with pg_trgm nicely to solve this problem, but I didn&amp;rsquo;t explore this in-depth and found no information on the subject. Importantly, you would need to partition tables based on repository (or better, file-size-based chunks.) It is worth exploring this approach more.&lt;/p>
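&lt;p>As a starting point for such an exploration, a declaratively partitioned table with one trigram index per partition might look like the sketch below. This is untested against the problem described here, all names are hypothetical, and it partitions by &lt;code>id&lt;/code> range purely for simplicity rather than by repository or file size:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sql" data-lang="sql">CREATE TABLE files_partitioned (
  id            bigint NOT NULL,
  file_contents text NOT NULL
) PARTITION BY RANGE (id);

-- Range bounds are inclusive at the lower end and exclusive at the upper end.
CREATE TABLE files_p000 PARTITION OF files_partitioned
  FOR VALUES FROM (1) TO (50001);
CREATE TABLE files_p001 PARTITION OF files_partitioned
  FOR VALUES FROM (50001) TO (100001);

-- One GIN trigram index per partition.
CREATE INDEX files_p000_trgm_idx
  ON files_p000 USING GIN (file_contents gin_trgm_ops);
CREATE INDEX files_p001_trgm_idx
  ON files_p001 USING GIN (file_contents gin_trgm_ops);

-- Queries go through the parent table; Postgres scans the partitions.
SELECT id
FROM files_partitioned
WHERE file_contents ~ 'TODO|FIXME'
LIMIT 10;
&lt;/code>&lt;/pre>&lt;/div>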
&lt;h2 id="naive-usage-of-pg_trgm-is-competitive">Naive usage of pg_trgm is competitive!&lt;/h2>
&lt;p>The good news is that even with the naive approach previously described, pg_trgm turns out to be approximately competitive with Zoekt, I assume due to it using a GIN index for trigram matching instead of an inverted index like Zoekt:&lt;/p>
&lt;p>I don&amp;rsquo;t have empirical measurements of this that I can share, so you&amp;rsquo;ll just have to take my word for it, but approximately, on a 2020 MacBook Pro with several thousand source code repositories:&lt;/p>
&lt;ul>
&lt;li>Query time performance is roughly the same for needle-in-the-haystack and haystack-full-of-needles queries over all repositories&amp;rsquo; file contents.&lt;/li>
&lt;li>Pagination can be quite slow towards the tail end of the table, because each subset being fetched requires a full search of the index to find the results at the end of the table. A streaming approach rather than traditional SQL pagination would be ideal.&lt;/li>
&lt;li>On-disk data size is quite small compared to Zoekt&amp;rsquo;s: Postgres trigram GIN indexes appear to be quite small, and its on-by-default data compression works really well with text.&lt;/li>
&lt;li>Postgres uses MUCH less memory than Zoekt. Like several orders of magnitude less.&lt;/li>
&lt;/ul>
&lt;p>Interestingly, however, Postgres uses MUCH less memory. The choice of using an inverted trigram index in Zoekt, &lt;a href="https://github.com/google/zoekt/issues/86">as I understand it&lt;/a>, is also one of the reasons that its memory usage is so large (among other things, like Go being fairly relaxed about returning memory to the OS.) I also suggest reading &lt;a href="https://news.ycombinator.com/item?id=18584294">these Hacker News comments from 2018&lt;/a> and the linked article from Russ Cox about Google Code Search, from which Zoekt was ultimately born.&lt;/p>
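&lt;p>On the pagination point above, one candidate for a streaming approach is a server-side cursor, which fetches results in batches from one running query instead of re-running the search for every page. This is a sketch I have not measured; the table and column names are hypothetical and the pattern is a placeholder:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sql" data-lang="sql">BEGIN;

-- The search is defined once; the cursor remembers its position.
DECLARE search_cursor CURSOR FOR
  SELECT id
  FROM files
  WHERE file_contents ~ 'TODO|FIXME';

FETCH 100 FROM search_cursor;  -- first batch of up to 100 rows
FETCH 100 FROM search_cursor;  -- next batch, resuming where the last one stopped

CLOSE search_cursor;
COMMIT;
&lt;/code>&lt;/pre>&lt;/div>
&lt;p>The trade-off is that the transaction and connection must stay open between fetches, and how much work is saved depends on the query plan Postgres chooses.&lt;/p>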
&lt;h2 id="horizontally-scaling-pg_trgm-is-hard">Horizontally scaling pg_trgm is hard&lt;/h2>
&lt;p>Once your data no longer fits into a single machine / Postgres instance, things get tricky. How do you scale pg_trgm across multiple machines?&lt;/p>
&lt;p>Postgres &lt;a href="https://www.postgresql.org/docs/current/high-availability.html">supports a nauseating amount of complex High Availability deployment options&lt;/a> for scaling horizontally, and ideally for a search engine you would want something like data partitioning where data is split across multiple hosts but also with the possibility of replication across multiple hosts (for the event a host goes down.)&lt;/p>
&lt;p>One of the options it supports is horizontal data partitioning through splitting tables into multiple smaller tables, and then using a foreign data wrapper (postgres_fdw) to execute queries that access all of those tables across the network. This is described in a bit more depth &lt;a href="https://www.highgo.ca/2019/08/08/horizontal-scalability-with-sharding-in-postgresql-where-it-is-going-part-2-of-3/">in this blog post&lt;/a>. This could be a good approach, but I decided not to explore this option further.&lt;/p>
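&lt;p>To give a rough idea of what that option involves, the sketch below exposes a table living on another Postgres host as a local foreign table. I did not explore this approach, so treat it as an illustration only; the host, database, credentials and table names are all hypothetical placeholders:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sql" data-lang="sql">CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- Hypothetical remote shard; host, database and credentials are placeholders.
CREATE SERVER shard_1
  FOREIGN DATA WRAPPER postgres_fdw
  OPTIONS (host 'shard-1.example.internal', port '5432', dbname 'search');

CREATE USER MAPPING FOR CURRENT_USER
  SERVER shard_1
  OPTIONS (user 'search', password 'secret');

-- Local definition that maps onto the files table stored on the shard.
CREATE FOREIGN TABLE files_shard_1 (
  id            bigint NOT NULL,
  file_contents text NOT NULL
)
  SERVER shard_1
  OPTIONS (schema_name 'public', table_name 'files');

-- Queried like any local table; rows are fetched over the network.
SELECT id
FROM files_shard_1
WHERE file_contents ~ 'TODO|FIXME'
LIMIT 10;
&lt;/code>&lt;/pre>&lt;/div>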
&lt;p>Ultimately I decided to go with a multiple-table approach, with each table representing a type of data (e.g. repository code) and performing horizontal sharding and scaling at the application layer outside of the DB entirely. I will explain why I took this approach in the next section.&lt;/p>
&lt;h2 id="deploying-and-tuning-postgres-configuration-is-hard">Deploying and tuning Postgres configuration is hard&lt;/h2>
&lt;p>In stark contrast to Zoekt, which uses a ridiculous amount of memory, with Postgres I was left with a different problem: I could not get it to use all available memory/CPU to perform search queries faster.&lt;/p>
&lt;p>Raising &lt;code>shared_buffers = 128MB&lt;/code> (default &lt;code>32MB&lt;/code>) helped a fair amount, but a similar issue remained. Ultimately I believe the majority of query time is spent not on CPU work, but rather on a combination of RAM lookups / L3 cache misses and IO latency.&lt;/p>
&lt;p>Nonetheless, this introduced a new problem for me: I wanted this search engine to be as simple to deploy as possible, and the idea of making Postgres tuning a requirement did not appeal to me. I have also seen in the field how many people use Amazon RDS, which does not allow for tuning (and is often a several-years-outdated Postgres version anyway).&lt;/p>
&lt;p>With all of this in mind, I ended up deploying Docker containers with my own Postgres binary and configuration built in, managing Postgres on behalf of the user. This ended up being interesting for other reasons I won&amp;rsquo;t get into (think automatic zero-downtime upgrades from Postgres 12 -&amp;gt; 13).&lt;/p>
&lt;h2 id="ultra-large-scales">Ultra large scales&lt;/h2>
&lt;p>Although pg_trgm is competitive with (much better than?) Zoekt, it&amp;rsquo;s still not enough to scale efficiently to a massive scale such as Google&amp;rsquo;s. The index is still relatively large (in the hundreds of MB for thousands of repositories), well outside the bounds of CPU caches, and that makes it kind of slow at truly large scales where the index grows near-linearly.&lt;/p>
&lt;p>Ultimately, once you add in deployment pains, configuration tuning, trigram index splitting, horizontal scaling, etc., it&amp;rsquo;s a lot less like using Postgres to build a search engine and a lot more like using Postgres as a trigram index provider. It&amp;rsquo;s interesting, and it works, but there may be better options.&lt;/p>
&lt;h2 id="future-exploration">Future exploration&lt;/h2>
&lt;p>A more fruitful direction may be to explore effectively the same architecture (i.e. roll-your-own-search-engine), but replacing pg_trgm and Postgres entirely with a custom ngram index built on top of &lt;a href="https://lemire.me/blog/2019/12/19/xor-filters-faster-and-smaller-than-bloom-filters/">xor filters&lt;/a>, a Bloom filter successor that is more L3-cache-friendly.&lt;/p>
&lt;p>I believe with this approach you could achieve scales/performance similar to Google, Bing, etc. while providing full regex search and more. This idea is not completely unfounded: it has been &lt;a href="https://github.com/BurntSushi/ripgrep/issues/1518">suggested for indexing in ripgrep, for example&lt;/a> (although &lt;a href="https://github.com/BurntSushi/ripgrep/issues/1497">it appears they&amp;rsquo;ll be going with an inverted trigram index similar to Zoekt&lt;/a> instead).&lt;/p>
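&lt;p>To make the idea concrete, here is a minimal Python sketch of a per-document trigram filter. It is only an illustration under stated assumptions: it uses a plain Bloom filter as a stand-in for an xor filter (both answer "possibly present" or "definitely absent"), and the names &lt;code>TrigramFilter&lt;/code>, &lt;code>build_filter&lt;/code>, and &lt;code>candidate&lt;/code>, as well as the sizes, are hypothetical.&lt;/p>

```python
import hashlib


def trigrams(text):
    """Return the set of 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


class TrigramFilter:
    """Approximate set of trigrams for one document.

    This is a simple Bloom filter standing in for an xor filter: it has
    no false negatives, but it can return rare false positives.
    """

    def __init__(self, size=4096, hashes=3):
        self.size = size
        self.hashes = hashes
        # One byte per slot rather than bit-packed, to keep the sketch short.
        self.slots = bytearray(size)

    def _positions(self, gram):
        # Derive several slot positions from seeded SHA-256 digests.
        for seed in range(self.hashes):
            digest = hashlib.sha256(f"{seed}:{gram}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, gram):
        for pos in self._positions(gram):
            self.slots[pos] = 1

    def might_contain(self, gram):
        return all(self.slots[pos] for pos in self._positions(gram))


def build_filter(document):
    """Index every trigram of a document into a fresh filter."""
    doc_filter = TrigramFilter()
    for gram in trigrams(document):
        doc_filter.add(gram)
    return doc_filter


def candidate(doc_filter, literal):
    """False means the document cannot contain literal; True means it may."""
    return all(doc_filter.might_contain(g) for g in trigrams(literal))
```

&lt;p>A search would check &lt;code>candidate&lt;/code> against each document&amp;rsquo;s filter first and only run the real regex over documents that pass, which is the same pruning step a trigram index performs.&lt;/p>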
&lt;h2 id="closing-thoughts">Closing thoughts&lt;/h2>
&lt;p>In Tridex (the search engine I am working on), we&amp;rsquo;re planning on exploring this avenue by replacing Postgres and pg_trgm with a custom trigram index based on xor filters, and will likely write it in Zig. I only realized the opportunity here in a late-night conversation with a coworker who has an affinity for Bloom filters, so perhaps I am misguided and this will bear no fruit.&lt;/p>
&lt;p>Follow this devlog for updates.&lt;/p></description></item></channel></rss>