<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by Konstantin Tarkus on Medium]]></title>
        <description><![CDATA[Stories by Konstantin Tarkus on Medium]]></description>
        <link>https://medium.com/@koistya?source=rss-692b968dbc82------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/1*UAEzd2AY18TtTCjcOmc09w.jpeg</url>
            <title>Stories by Konstantin Tarkus on Medium</title>
            <link>https://medium.com/@koistya?source=rss-692b968dbc82------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Mon, 06 Apr 2026 09:22:35 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@koistya/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[From Prototype to Production: Change Control for AI Decisions]]></title>
            <link>https://medium.com/@koistya/from-prototype-to-production-change-control-for-ai-decisions-d1e0fc773e4d?source=rss-692b968dbc82------2</link>
            <guid isPermaLink="false">https://medium.com/p/d1e0fc773e4d</guid>
            <category><![CDATA[programming]]></category>
            <category><![CDATA[developer-tools]]></category>
            <category><![CDATA[artificial-intelligence]]></category>
            <category><![CDATA[machine-learning]]></category>
            <category><![CDATA[software-engineering]]></category>
            <dc:creator><![CDATA[Konstantin Tarkus]]></dc:creator>
            <pubDate>Mon, 09 Feb 2026 12:51:17 GMT</pubDate>
            <atom:updated>2026-02-09T12:51:17.087Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Pu4BsjUHQYyXIT0kR3_FJA.png" /></figure><p>You update a prompt. You test it on five examples. Looks good. You ship.</p><p>Three days later, a support ticket: a field that used to contain &quot;$4.2M in Q3 revenue&quot; now says &quot;strong revenue growth&quot;. Another ticket: a claim that listed a specific hire date is gone entirely. Your prompt change improved 95% of cases and silently broke 3%. The inputs that regressed weren&#39;t in your five test cases. They never are.</p><p>This is the gap between prototype and production. Prototypes need to work on a few examples. Production systems need to prove that nothing broke.</p><h3>Change control for AI decisions</h3><p>We’ve solved this problem before. Not for AI, but for code.</p><p><a href="https://medium.com/media/7fca0108b1cc960fbb6c78b58d96b4f9/href">https://medium.com/media/7fca0108b1cc960fbb6c78b58d96b4f9/href</a></p><p>The difference: with code, the diff exists automatically. With AI, it doesn’t exist unless you explicitly compute it. You wouldn’t merge a PR without reviewing the diff. Why would you ship a prompt change without seeing what decisions it would alter?</p><p>Verist applies this workflow to AI decisions. Define a step, capture baselines, change your prompt, recompute, and get a diff showing exactly what changed before you deploy.</p><h3>Try it: catch a prompt regression in 60 seconds</h3><pre>npm install verist @verist/llm zod openai</pre><p>verist is the kernel (steps, replay, diff). @verist/llm adds LLM provider adapters.</p><h3>Define a step</h3><p>A step wraps an LLM call with typed input and output. 
defineExtractionStep handles the boilerplate of calling the model, parsing the response, and validating against a Zod schema.</p><pre>import { defineExtractionStep, createOpenAI } from &quot;@verist/llm&quot;;<br>import { run, unwrap, recompute, formatDiff } from &quot;verist&quot;;<br>import OpenAI from &quot;openai&quot;;<br>import { z } from &quot;zod&quot;;<br><br>const ClaimsSchema = z.object({<br>  claims: z.array(z.string()),<br>});<br><br>const extractClaims = defineExtractionStep({<br>  name: &quot;extract-claims&quot;,<br>  input: z.object({ text: z.string() }),<br>  output: ClaimsSchema,<br>  request: (input) =&gt; ({<br>    model: &quot;gpt-4o-mini&quot;,<br>    temperature: 0,<br>    messages: [<br>      {<br>        role: &quot;system&quot;,<br>        content: `Extract specific, verifiable claims from the text.<br>Each claim must contain a concrete number, name, or date.<br>Return raw JSON only, no markdown: { &quot;claims&quot;: [&quot;claim1&quot;, ...] }`,<br>      },<br>      { role: &quot;user&quot;, content: input.text },<br>    ],<br>    responseFormat: &quot;json&quot;,<br>  }),<br>});</pre><p>This is a self-contained definition. No global registry, no side effects at definition time. The step declares what goes in, what comes out, and how to call the model.</p><h3>Run and capture a baseline</h3><pre>const adapters = { llm: createOpenAI({ client: new OpenAI() }) };<br><br>const text = `Acme Corp reported $4.2M in Q3 revenue, up 18% year-over-year.<br>CEO Jane Park announced 3 new enterprise clients and plans to expand<br>the engineering team from 45 to 60 people by March 2025.`;<br><br>const baseline = unwrap(await run(extractClaims, { text }, { adapters }));</pre><p>run() executes the step and captures the result as an artifact. unwrap() extracts the value from the Result type (Verist uses errors-as-values, not exceptions). 
The baseline now holds the output along with the artifacts needed to recompute later.</p><pre>Baseline: 4 claims<br>  - Acme Corp reported $4.2M in Q3 revenue<br>  - Revenue up 18% year-over-year<br>  - CEO Jane Park announced 3 new enterprise clients<br>  - Plans to expand engineering team from 45 to 60 by March 2025</pre><h3>Recompute with a new prompt</h3><p>Now change the prompt. Maybe you want something more concise, so you switch to a summarization prompt:</p><pre>const vagueStep = defineExtractionStep({<br>  name: &quot;extract-claims&quot;,<br>  input: z.object({ text: z.string() }),<br>  output: ClaimsSchema,<br>  request: (input) =&gt; ({<br>    model: &quot;gpt-4o-mini&quot;,<br>    temperature: 0,<br>    messages: [<br>      {<br>        role: &quot;system&quot;,<br>        content: `Summarize the key points from the text. Be concise.<br>Return raw JSON only, no markdown: { &quot;claims&quot;: [&quot;point1&quot;, ...] }`,<br>      },<br>      { role: &quot;user&quot;, content: input.text },<br>    ],<br>    responseFormat: &quot;json&quot;,<br>  }),<br>});<br><br>// Replay the new step logic against the baseline&#39;s input<br>const result = unwrap(await recompute(baseline, vagueStep, { adapters }));<br><br>console.log(formatDiff(result.outputDiff));</pre><p>recompute() runs the new step definition against the same input from the baseline, without re-running your application code. It returns a diff comparing the old output to the new one.</p><pre>  claims[0]: &quot;Acme Corp reported $4.2M in Q3 revenue&quot;<br>          -&gt; &quot;Acme Corp had strong Q3 revenue growth&quot;<br>- claims[3]: &quot;Plans to expand engineering team from 45 to 60 by March 2025&quot;</pre><p>That’s the regression. The vague prompt lost specificity in claims[0] and dropped claims[3] entirely. You caught it before shipping. No customer tickets. No silent regressions.</p><h3>Schema violations: catching what logs miss</h3><p>Back to the opening scenario. 
You changed a prompt, tested on five examples, and shipped. Three percent of cases broke. But what kind of breakage?</p><p>Sometimes the model doesn’t just change a value. It returns something structurally wrong: a string where you expected a number, a missing required field, an array that’s suddenly empty. Logs would show the raw response. Verist validates the output against your Zod schema on every recompute and surfaces violations explicitly.</p><pre>const result = unwrap(await recompute(baseline, updatedStep, { adapters }));<br><br>if (result.schemaViolations.length &gt; 0) {<br>  for (const v of result.schemaViolations) {<br>    console.log(`${v.path.join(&quot;.&quot;)}: ${v.kind} (${v.message})`);<br>  }<br>}<br>// claims.0: type (Expected string, received number)</pre><p>Each violation has a path, a kind (&quot;missing&quot;, &quot;type&quot;, &quot;refinement&quot;, or &quot;other&quot;), and a message. In CI, schema violations always fail the build, even with --no-fail-on-diff. Value changes are debatable. Structural breakage is not.</p><h3>Try it without API keys</h3><p>You don’t need OpenAI credentials to see the workflow in action. verist init scaffolds a deterministic step using regex extraction:</p><pre>npx verist init<br>npx verist capture --step parse-contact --input &quot;verist/inputs/*.json&quot;<br>npx verist test --step parse-contact</pre><p>This creates a step, captures baselines from sample inputs, and runs a regression test. No LLM calls, no API keys. Once you’re ready to see LLM diffs, add your key and run the full example:</p><pre>OPENAI_API_KEY=sk-... npx tsx examples/prompt-diff/quickstart.ts</pre><h3>Human corrections that survive</h3><p>In production, humans correct AI mistakes. A reviewer fixes a misclassified claim. A support agent overrides an extracted value. These corrections are expensive to make and easy to lose. 
Recompute with a new prompt and most systems wipe them out.</p><p>Verist separates the two concerns with a three-layer state model:</p><pre>computed  +  overlay  =  effective<br>(AI)        (human)      (what the app sees)</pre><ul><li><strong>Computed</strong>: AI-derived values, rewritten on every recompute</li><li><strong>Overlay</strong>: Human corrections, never touched by automation</li><li><strong>Effective</strong>: Shallow merge where overlay wins:<br>{ ...computed, ...overlay }</li></ul><pre>import { effectiveState } from &quot;@verist/storage&quot;;<br><br>// AI extracted: { amount: &quot;$4.2M&quot;, currency: &quot;USD&quot; }<br>// Human corrected currency to EUR<br><br>const state = {<br>  computed: { amount: &quot;$4.2M&quot;, currency: &quot;USD&quot; },<br>  overlay: { currency: &quot;EUR&quot; }, // human correction<br>};<br><br>const effective = effectiveState(state);<br>// -&gt; { amount: &quot;$4.2M&quot;, currency: &quot;EUR&quot; }</pre><p>When you recompute with a new prompt, the computed layer updates. The overlay stays. If the new prompt extracts { amount: &quot;$4.2M&quot;, currency: &quot;GBP&quot; }, the effective state is still { amount: &quot;$4.2M&quot;, currency: &quot;EUR&quot; } because the human correction takes precedence.</p><p>This isn’t a merge strategy you have to build. It’s a primitive in the storage layer.</p><h3>CI: block regressions before merge</h3><p>Once you have baselines captured, add a regression gate to your CI pipeline:</p><pre># .github/workflows/verist.yml<br>name: Verist regression check<br>on: [push, pull_request]<br><br>jobs:<br>  verist-test:<br>    runs-on: ubuntu-latest<br>    steps:<br>      - uses: actions/checkout@v4<br>      - uses: actions/setup-node@v4<br>        with:<br>          node-version: &quot;22&quot;<br>      - run: npm install<br>      - run: npx verist test --step extract-claims</pre><p>verist test recomputes every baseline for the step and exits with code 1 if anything changed. 
Schema violations always fail. Value changes fail by default but can be relaxed with --no-fail-on-diff for steps where some drift is acceptable.</p><p>For PR comments with a summary table:</p><pre>- run: npx verist test --step extract-claims --format markdown &gt; verist-report.md<br>  continue-on-error: true<br>- uses: marocchino/sticky-pull-request-comment@v2<br>  with:<br>    path: verist-report.md</pre><p>The JSON format (--format json) gives you machine-readable output with counts for passed, changed, schemaViolations, and failed, so you can build custom thresholds or notifications.</p><h3>What this is not</h3><ul><li><strong>Not an agent framework.</strong> No autonomous loops, no memory, no tool calling. Verist is the layer underneath that makes decisions reviewable.</li><li><strong>Not observability.</strong> Logs tell you what happened. Verist tells you what <em>would</em> change before you ship.</li><li><strong>Not a hosted platform.</strong> It’s a library. Your code, your infrastructure, your database.</li><li><strong>Not evals.</strong> Eval frameworks score outputs against a ground truth you label. Verist doesn’t require ground truth. 
It diffs old output against new output so you can review the delta.</li></ul><h3>When this fits</h3><ul><li><strong>Prompt iteration</strong> — You’re tuning prompts and need to know what breaks across your full input set, not just three cherry-picked examples.</li><li><strong>Model upgrades</strong> — You’re switching from GPT-4 to Claude or upgrading to a new version and want to quantify the impact before deploying.</li><li><strong>Safe recompute</strong> — You need to reprocess historical data with new logic without overwriting the human corrections your team has made.</li><li><strong>Decision audit</strong> — Regulated or high-stakes domains where you need to reproduce and explain any past AI decision.</li></ul><p>If your AI outputs are consumed by downstream code and a silent change would cause a bug, a bad customer experience, or a compliance issue, you need change control.</p><p>If you’ve ever hesitated to change a prompt because you couldn’t predict the blast radius, that’s what Verist solves.</p><h3>Try it</h3><pre>npm install verist @verist/cli zod<br>npx verist init</pre><p><a href="https://verist.dev">https://verist.dev</a></p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Designing Fair Token Bucket Policies for Real-Time Apps]]></title>
            <link>https://levelup.gitconnected.com/designing-fair-token-bucket-policies-for-real-time-apps-289b00eb4435?source=rss-692b968dbc82------2</link>
            <guid isPermaLink="false">https://medium.com/p/289b00eb4435</guid>
            <category><![CDATA[web-development]]></category>
            <category><![CDATA[software-engineering]]></category>
            <category><![CDATA[software-development]]></category>
            <category><![CDATA[programming]]></category>
            <category><![CDATA[software-architecture]]></category>
            <dc:creator><![CDATA[Konstantin Tarkus]]></dc:creator>
            <pubDate>Tue, 04 Nov 2025 23:47:31 GMT</pubDate>
            <atom:updated>2025-11-04T23:47:31.136Z</atom:updated>
            <content:encoded><![CDATA[<h4>Sizing burst capacity and refill rates for chat, gaming, and streaming without starving legitimate users or enabling abuse.</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*HDfR8GWRfK-_wXVHCN4WTA.jpeg" /></figure><h3>The Problem With One-Size-Fits-All Rate Limiting</h3><p>You launch your multiplayer game. The collision detection runs tight — good. Then a speedrunner discovers an exploit: spam move commands fast enough and the server’s position updates lag behind optimistic rendering. Your rate limiter catches them at 30 messages, which feels fair. But then legitimate players in heated moments hit the same limit after 20 rapid inputs during a firefight.</p><p>You dial the limit down to 15 to catch abuse earlier. Now everyone complains about input lag.</p><p>Sound familiar?</p><p>The problem isn’t that rate limiting is wrong. It’s essential for protecting backend resources and ensuring fair access. The problem is that <strong>most policies treat all traffic the same</strong>, missing the fact that different applications have radically different traffic signatures. A chat app needs to tolerate sudden bursts when users paste code blocks. A game needs predictable, high-frequency streams. A video stream needs one massive initial burst, then steady flow.</p><p>The token bucket algorithm is elegant enough to handle all three — but only if you size it correctly for your workload. This post shows you how to reason about capacity and refill rate, not as magic numbers, but as levers that directly model your application’s behavior. You’ll learn to distinguish between legitimate user behavior and abuse, then size your limits accordingly.</p><h3>How Token Buckets Work: A Quick Mental Model</h3><p>Token buckets are simple: tokens refill at a constant rate (r), you consume tokens per message, and requests queue when empty. 
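</p><p>A minimal sketch makes the mechanics concrete (illustrative only, not any particular library’s implementation):</p>

```typescript
// Minimal token bucket: tokens refill lazily based on elapsed time,
// and a request is allowed only if enough tokens remain.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private capacity: number,     // B_max: maximum burst size
    private refillPerSec: number, // r: sustained rate
    now: number = Date.now(),
  ) {
    this.tokens = capacity; // start full
    this.lastRefill = now;
  }

  tryConsume(cost = 1, now: number = Date.now()): boolean {
    // Lazy refill: credit tokens for the time elapsed since the last call.
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefill = now;
    if (this.tokens < cost) return false; // over the limit: reject (or queue)
    this.tokens -= cost;
    return true;
  }
}
```

<p>Passing now explicitly keeps the sketch deterministic under test; a production limiter also needs shared, atomic state across servers.</p><p>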
The key parameters are:</p><ul><li><strong>Refill rate</strong> (r): tokens per second — your sustained message rate</li><li><strong>Burst capacity</strong> (B_max): maximum tokens — how many messages you can send instantly</li></ul><p>The magic: bursts are allowed without punishing inactive users. This is why token buckets beat leaky buckets for real-time apps.</p><h3>Capacity vs. Refill Rate: The Fundamental Trade-Off</h3><p>These two parameters work together. Sized independently, they create terrible user experiences or dangerous security holes.</p><h4>Capacity: Burst Tolerance</h4><p><strong>What it controls</strong>: How many tokens a user can consume in a single burst.</p><p><strong>If too low</strong>: Users feel input lag. A gamer can’t respond quickly. A chatter can’t paste a code block without hitting the limit.</p><p><strong>If too high</strong>: Attackers get a long window to flood before detection kicks in.</p><p><strong>Example comparison</strong> (chat app):</p><ul><li>Capacity 10, Rate 2/sec → allows 10 instant messages, then wait 5s. Feels slow.</li><li>Capacity 100, Rate 2/sec → allows 100 instant messages, then wait 50s. Easy attack vector.</li><li>Capacity 30, Rate 10/sec → allows 30 instant messages, then wait 3s. Good balance.</li></ul><h4>Refill Rate: Sustained Throughput</h4><p><strong>What it controls</strong>: The long-term average message rate.</p><p><strong>If too low</strong>: Legitimate users feel starved. App feels sluggish.</p><p><strong>If too high</strong>: Abusers run rampant and can saturate the system.</p><p><strong>Example comparison</strong> (gaming):</p><ul><li>Rate 10/sec, Capacity 20 → 10 messages/sec sustained, burst of 20. Too slow for 60 Hz game.</li><li>Rate 100/sec, Capacity 20 → 100 messages/sec sustained, but only 20-token burst. Drops packets on network jitter.</li><li>Rate 60/sec, Capacity 120 → 60 messages/sec sustained, 2-second burst window. 
Matches 60 Hz tick rate perfectly.</li></ul><h4>The Pairing Principle</h4><p>Set capacity and refill rate together, not independently. A useful formula:</p><pre>recovery_time = capacity / refill_rate</pre><p>If you want users to recover from a full burst in 3 seconds, size it as:</p><ul><li>refill_rate = 20 tokens/sec implies capacity = 60 tokens</li><li>refill_rate = 10 tokens/sec implies capacity = 30 tokens</li></ul><p>Both allow 3-second recovery, but the first sustains higher throughput while the second is more restrictive. Your choice depends on your app’s typical load and abuse patterns.</p><pre>| Use Case        | Capacity | Rate/sec | Sustained     | Recovery | Outcome                                      |<br>| --------------- | -------- | -------- | ------------- | -------- | -------------------------------------------- |<br>| Too Strict      | 10       | 2        | 2 msg/sec     | 5s       | Users churn from input lag                   |<br>| Chat (Balanced) | 100-200  | 1-2      | 1-2 msg/sec   | 50-200s  | Handles history load without false positives |<br>| Gaming (30 Hz)  | 10-15    | 35-40    | 35-40 msg/sec | &lt;1s      | Matches game tick rate plus jitter margin    |<br>| Too Loose       | 500      | 100      | 100 msg/sec   | 5s       | Abusers flood the system                     |</pre><h3>Domain-Specific Policies</h3><p>Real-time apps have distinct traffic signatures. 
Understanding yours is the key to sizing limits that work.</p><h4>Chat Applications</h4><p><strong>Traffic signature</strong>:</p><ul><li>Users type at 0.6–1.1 words per second (roughly one message every 10–20 seconds under normal typing)</li><li>Bursts occur: pasting code snippets, rapid-fire group conversation, media uploads</li><li>False positives hurt engagement; users churn if rate limiting feels excessive</li><li>Initial state: loading message history, fetching member lists, synchronizing presence</li></ul><p><strong>Recommended policy</strong>:</p><pre>Capacity: 100-200 messages<br>Rate: 1-2 messages/sec<br>Recovery time: 50-200 seconds</pre><p><strong>Rationale</strong>:</p><ul><li>Refill rate (1–2/sec) covers normal conversation: typical typing pace plus presence updates and typing indicators</li><li>Capacity (100–200) handles the most common legitimate burst: loading message history on channel entry or pasting code snippets</li><li>If a user loads 100 recent messages on entering a channel, they need burst capacity of at least 100</li><li>Recovery time of 50–200 seconds is acceptable — users expect a pause after heavy activity like bulk uploads or rapid pasting</li><li>Real test: User sends 50 fast messages (code paste or rapid conversation), 50–150 tokens remain, wait 25–150s, back to full bucket</li></ul><p><strong>What breaks this</strong>:</p><ul><li>Capacity 10: Users copying code blocks get blocked constantly</li><li>Rate 100/sec: Coordinated spam floods rooms before moderation</li><li>Rate 0.5/sec: Users feel starved in group conversations</li><li>No monitoring: You won’t know when the policy is too strict</li></ul><p><strong>Observability</strong>: Alert if more than 5% of users hit rate limits daily. 
This signals either a too-strict policy or a spam spike.</p><h4>Multiplayer Gaming</h4><p><strong>Traffic signature</strong>:</p><ul><li>Players send input commands at the <strong>game tick rate</strong> (e.g., 60 Hz = 60 messages/sec per player)</li><li>One limit too low makes the game unplayable; one too high enables desynchronization attacks</li><li>Different message types have different costs: position updates vs. chat messages</li><li>Network jitter causes packet batching; the bucket must absorb temporary spikes</li></ul><p><strong>Recommended policy</strong> (for a 30 Hz server):</p><pre>Per-player input (position, rotation):<br>  Capacity: 10-15 tokens (quick action &quot;combos&quot;)<br>  Rate: 35-40 tokens/sec (30 Hz tick rate + 20% safety margin)<br><br>Per-player chat (separate bucket):<br>  Capacity: 20 messages<br>  Rate: 5 messages/sec</pre><p><strong>Rationale</strong>:</p><ul><li>Refill rate must match or exceed the server’s tick rate, plus a 20% safety margin for network jitter</li><li>For a 30 Hz server, 35–40 tokens/sec ensures players can send and receive at the game’s natural frequency</li><li>Capacity (10–15 tokens) is small because gaming traffic is a continuous stream, not bursty — large bursts often signal abuse (spam scripts)</li><li>Separate buckets prevent chat spam from blocking critical movement commands</li><li>Scenario: Player executes a quick combo (5 actions). Bucket has tokens. Player spams chat. 
Chat bucket empties, but movement input still flows through</li></ul><p><strong>What breaks this</strong>:</p><ul><li>Single bucket for all messages: Attacker spams chat, blocks position updates, player appears frozen</li><li>Capacity 20: Temporary packet loss makes the game unplayable</li><li>Capacity 500+: Malicious clients send fake position updates, world desynchronizes</li><li>Rate 20/sec: Input lag on 60 Hz server feels terrible</li></ul><h4>Streaming (Live Video)</h4><p><strong>Traffic signature</strong>:</p><ul><li>Two-phase model: massive initial buffer-fill (0–10 seconds), then steady-state flow at video bitrate</li><li>Phase 1 (buffer-fill): High-speed transfer to fill client buffer (e.g., 10 seconds of content upfront)</li><li>Phase 2 (steady-state): Downloads at bitrate matched to playback speed (with minor pauses to prevent buffer overflow)</li><li>Control plane: Viewers send heartbeats, seek requests, quality changes (low volume); Streamers send metadata updates</li></ul><p><strong>The Critical Buffer-Fill Calculation</strong>:</p><p>Streaming policies must explicitly handle the initial massive burst. 
This is the most commonly missed piece.</p><pre># Example: 1080p stream at 6 Mbps, target 10-second buffer<br><br>Video bitrate:       750 KB/sec (6 Mbps = 750 kilobytes/second)<br>Buffer duration:     10 seconds<br>Initial burst size:  750 KB/sec × 10 sec = 7,500 KB = 7.5 MB<br><br>This is the minimum burst capacity your policy must allow.</pre><p><strong>Recommended policy</strong> (per-viewer):</p><pre>Initial buffer-fill (first connection or seek):<br>  Capacity: 7.5 MB (for 10s of 6 Mbps content)<br>  Rate: 750 KB/sec (6 Mbps)<br>  Recovery time: ~10 seconds<br><br>Steady-state playback (after buffer full):<br>  Capacity: Bitrate × 2-3 seconds (e.g., 1.5-2.25 MB for 6 Mbps stream)<br>  Rate: Video bitrate (e.g., 750 KB/sec)<br><br>Control plane (heartbeats, seeks, quality changes):<br>  Capacity: 10 actions<br>  Rate: 2 actions/sec (separate bucket from data transfer)</pre><p><strong>Rationale</strong>:</p><ul><li>Burst capacity must be sized to the actual initial buffer-fill requirement, not guessed</li><li>For a 6 Mbps stream with 10-second target buffer, capacity must be <strong>at least 7.5 MB</strong></li><li>Steady-state capacity slightly larger than refill rate prevents underflow during minor network jitter</li><li>Separate control plane prevents heartbeat timeouts or seek delays from being starved by data transfer</li><li>Recovery time for initial burst is acceptable (users expect a few seconds of buffering on start)</li></ul><p><strong>Calculating for Your Bitrate</strong>:</p><p>First, convert Mbps to KB/sec (divide by 8): 6 Mbps → 750 KB/sec. 
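</p><p>To make that arithmetic harder to get wrong, the whole calculation fits in a tiny helper (a hypothetical function, shown for illustration):</p>

```typescript
// Burst capacity needed to fill a client buffer at stream start.
// Decimal units: 1 Mbps = 1000 kilobits/sec, so KB/sec = Mbps * 1000 / 8.
function bufferFillKB(bitrateMbps: number, bufferSeconds: number): number {
  const kbPerSec = (bitrateMbps * 1000) / 8; // Mbps -> KB/sec
  return kbPerSec * bufferSeconds;           // KB needed for the initial burst
}

bufferFillKB(6, 10); // 7500 KB = 7.5 MB for a 6 Mbps stream with a 10s buffer
```

<p>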
Then multiply by buffer duration.</p><pre>| Bitrate         | KB/sec | 5s Buffer | 10s Buffer | 15s Buffer |<br>| --------------- | ------ | --------- | ---------- | ---------- |<br>| 2.5 Mbps (480p) | 312.5  | 1.56 MB   | 3.125 MB   | 4.7 MB     |<br>| 5 Mbps (720p)   | 625    | 3.13 MB   | 6.25 MB    | 9.4 MB     |<br>| 8 Mbps (1080p)  | 1,000  | 5 MB      | 10 MB      | 15 MB      |</pre><p><strong>What breaks this</strong>:</p><ul><li>Using a generic 100-token capacity: Can’t handle even 5 seconds of buffer-fill at real-world bitrates</li><li>Mixing data and control in one bucket: Heartbeats timeout while buffering, users see “connection lost”</li><li>Heartbeat rate 0.2/sec: Users seeking every few seconds hit the control limit</li><li>Not accounting for buffer-fill: Users experience long startup delays on first play or after seeking</li></ul><h3>Layering Limits: Per-User, Per-Route, Cost-Based</h3><p>A single global limit is a blunt instrument. Production systems need multiple layers of protection.</p><h4>The Pyramid of Protection</h4><pre>Global limit (all users, all messages)<br>    ↓<br>Per-IP / Per-Device limit (catch botnets)<br>    ↓<br>Per-User Per-Type limit (fairness between users)<br>    ↓<br>Per-Route Cost-Based limit (protect expensive operations)</pre><p><strong>Example</strong>: <strong>Chat room with 10,000 users</strong>:</p><pre>Global: 100,000 messages/sec<br>Per-IP: 100 messages/sec (catch credential stuffing)<br>Per-User: 20/sec per message type<br>Per-Route: TEXT=1 token, ADMIN_COMMAND=10 tokens</pre><h4>Per-User Per-Type (Most Common)</h4><p>One bucket per (user, message type) pair. 
Chat messages have a separate quota from presence updates.</p><pre>// Per-user per-type (in-memory adapter shown; swap for Redis/Durable Objects in production)<br>import { rateLimit, keyPerUserPerType } from &quot;@ws-kit/middleware&quot;;<br>import { memoryRateLimiter } from &quot;@ws-kit/adapters/memory&quot;;<br><br>const limiter = rateLimit({<br>  limiter: memoryRateLimiter({<br>    capacity: 30,<br>    tokensPerSecond: 10,<br>  }),<br>  key: keyPerUserPerType,<br>});<br><br>router.use(limiter);<br><br>router.on(ChatSendMessage, (ctx) =&gt; {<br>  ctx.send(ChatSentMessage, { message: ctx.payload.text });<br>});<br><br>router.on(PresenceUpdateMessage, (ctx) =&gt; {<br>  ctx.publish(&quot;presence&quot;, PresenceChangedMessage, {<br>    userId: ctx.ws.data?.userId,<br>  });<br>});</pre><p><strong>Why it works</strong>:</p><ul><li>Prevents one chatty user from monopolizing bandwidth and starving others</li><li>Different message types can have different tolerances based on their impact</li><li>Fair: each user gets an independent quota per message type</li><li>Simple to reason about and tune based on real-world usage</li></ul><h4>Cost-Based Limiting (Advanced)</h4><p>Different operations consume different amounts of resources. GitHub and Shopify use this for GraphQL APIs — it’s equally powerful for WebSockets.</p><pre>// Single unified rate limiter with custom cost per operation<br>import { rateLimit } from &quot;@ws-kit/middleware&quot;;<br>import { memoryRateLimiter } from &quot;@ws-kit/adapters/memory&quot;;<br><br>const limiter = rateLimit({<br>  limiter: memoryRateLimiter({<br>    capacity: 200,<br>    tokensPerSecond: 50,<br>  }),<br>  key: (ctx) =&gt; {<br>    const user = ctx.ws.data?.userId ?? 
&quot;anon&quot;;<br>    return `user:${user}`;<br>  },<br>  cost: (ctx) =&gt; {<br>    // All costs must be positive integers<br>    if (ctx.type === &quot;ChatSend&quot;) return 1;<br>    if (ctx.type === &quot;FileUpload&quot;) return 20;<br>    if (ctx.type === &quot;AdminBan&quot;) return 10;<br>    if (ctx.type === &quot;HistorySearch&quot;) return 15;<br>    return 1; // default<br>  },<br>});<br><br>router.use(limiter);</pre><p><strong>Why it works</strong>:</p><ul><li>Expensive operations (database queries, external APIs) cost more tokens</li><li>Cheap operations (presence updates, heartbeats) cost less, allowing higher frequency</li><li>Single shared bucket prevents any one operation type from monopolizing quota</li><li>Users have flexibility: spend budget on many cheap operations or a few expensive ones</li><li>Scales better than managing dozens of independent buckets</li></ul><p><strong>Trade-off</strong>: Costs must align with actual resource consumption. Misjudged costs will either starve legitimate users or enable abuse.</p><h4>Choosing a Rate Limiter Adapter</h4><p>Examples above use memoryRateLimiter. For production, choose your adapter based on deployment:</p><pre>| Adapter         | Best For                | Latency |<br>| --------------- | ----------------------- | ------- |<br>| In-Memory       | Single server, dev/test | &lt;1ms    |<br>| Redis           | Distributed fleets      | 2-5ms   |<br>| Durable Objects | Edge/global             | 10-50ms |</pre><p>Swap the adapter and the semantics remain identical:</p><pre>import { redisRateLimiter } from &quot;@ws-kit/adapters/redis&quot;;<br><br>const limiter = rateLimit({<br>  limiter: redisRateLimiter(redis, { capacity: 30, tokensPerSecond: 10 }),<br>  key: keyPerUserPerType,<br>});</pre><p><strong>Note on imports</strong>: Each adapter is available via a subpath export. Use @ws-kit/adapters/memory, @ws-kit/adapters/redis, or @ws-kit/adapters/cloudflare-do to import only the adapter you need. 
Importing from @ws-kit/adapters directly requires explicit adapter selection via the platform-specific factories.</p><h4>How to Calculate Costs</h4><p>Assigning costs requires understanding what each operation actually costs your backend:</p><pre>// Example cost calculation based on database operations<br>// All costs must be positive integers (no decimals, no zero/negative values)<br>const operationCosts = {<br>  // Simple, in-memory operations (cost = 1)<br>  PresenceUpdate: 1, // Just update local state<br>  TypingIndicator: 1, // Ephemeral, instant delivery<br><br>  // Database reads (cost = 2-5 depending on complexity)<br>  MessageGet: 3, // Single document read<br>  UserProfile: 2, // Cache-friendly read<br><br>  // Database writes (cost = 5-10 including indexing)<br>  MessageSend: 5, // Write + index update<br>  MessageEdit: 5, // Update + audit log<br><br>  // Expensive operations (cost = 20-50 for CPU-intensive work)<br>  MessageSearch: 20, // Full-text search across millions<br>  HistoryExport: 50, // Generates file, might send email<br>  AnalyticsQuery: 30, // Aggregates data across time range<br>};</pre><p>Start with these simple ratios:</p><ul><li>In-memory operations: 1 token</li><li>Database reads: 2–5 tokens (depends on complexity)</li><li>Database writes: 5–10 tokens (include indexing, replication)</li><li>External API calls: 10–20 tokens (includes latency uncertainty)</li><li>Aggregations/searches: 20–50 tokens (CPU-intensive)</li></ul><p>Then validate by measuring actual P95 latencies. If message.search takes 100ms and message.send takes 10ms, but both cost 5 tokens, you&#39;re underweighting search. Increase its cost to 50 tokens.</p><h3>Common Mistakes and Red Flags</h3><p>The most impactful rate-limiting failures in production boil down to three design errors. These are the battle scars from real systems.</p><h4>1. 
Capacity Way Higher Than Refill Rate (Most Common)</h4><p><strong>The Failure Mode</strong>: This is why most rate-limiting deployments fail in their first week.</p><p>❌ Bad: Capacity 1000, Rate 10/sec = 100-second burst window.</p><p>An attacker (or botnet) exploits the massive window: they send 1000 requests in 10 seconds while your monitoring system is still sleeping. The damage is done before typical monitoring thresholds are breached and alerts fire — by then, the database is already melting.</p><p><strong>Real-world impact</strong>: In early 2023, a major gaming platform experienced a DDoS that succeeded not because the attacks were fast, but because the burst window was so large that coordinated spam flooded the system before rate-limit signals propagated to edge nodes.</p><p>✅ Fix: Use the formula capacity ≈ refill_rate * desired_recovery_time. A 10/sec refill rate with a 3-second recovery window means capacity ≈ 30. An attacker can&#39;t do meaningful damage in 3 seconds.</p><h4>2. Confusing Per-User and Global Limits (Second Most Common)</h4><p><strong>The Failure Mode</strong>: Limits work independently instead of layered, creating security gaps or fairness problems.</p><p>❌ Bad approach 1: Per-user limit only → a botnet of many accounts, each staying within its personal quota, saturates the server’s total capacity in aggregate.</p><p>❌ Bad approach 2: Global limit only → one misbehaving user or network spike blocks all other users. Innocent mobile users in high-latency regions get starved.</p><p><strong>Real-world impact</strong>: A real-time collaboration platform launched with per-user limits but no global cap. During a product launch event, 50,000 concurrent users each stayed harmlessly under their personal limit… except in aggregate they exceeded the database’s actual throughput. The platform went down not from abuse, but from scale.</p><p>✅ Fix: Implement the pyramid structure.
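</p><p>To make the pyramid concrete, here is a self-contained toy sketch of the layered check — two hand-rolled buckets with illustrative numbers, standing in for the global and per-user middleware a real adapter would provide:</p>

```typescript
// Toy token bucket; a production deployment would use the rateLimit
// middleware with a real adapter instead of hand-rolled state.
class Bucket {
  private tokens: number;
  constructor(private capacity: number, private refillPerSec: number) {
    this.tokens = capacity;
  }
  refill(elapsedSec: number): void {
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
  }
  tryConsume(cost = 1): boolean {
    if (this.tokens < cost) return false;
    this.tokens -= cost;
    return true;
  }
}

// Pyramid: the global bucket guards aggregate backend capacity,
// then each user's bucket enforces per-user fairness.
const globalBucket = new Bucket(5000, 1000); // illustrative aggregate numbers
const perUser = new Map<string, Bucket>();

function allow(userId: string): boolean {
  if (!globalBucket.tryConsume()) return false; // coordinated load trips this first
  let bucket = perUser.get(userId);
  if (!bucket) perUser.set(userId, (bucket = new Bucket(30, 10)));
  return bucket.tryConsume();
}
```

<p>(A production limiter would also refund the global token when the per-user check rejects; the sketch omits that for brevity.)</p><p>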
Global limits upstream catch coordinated attacks by restricting aggregate capacity across all users and IPs — even if every single user stays within their quota, the sum cannot overwhelm the database. Per-user limits downstream ensure fairness. Both must exist.</p><h4>3. Not Accounting for Network Jitter (Subtle but Frequent)</h4><p><strong>The Failure Mode</strong>: Policies work perfectly in the lab but fail mysteriously in production due to real-world network behavior.</p><p>❌ Bad: Capacity 5, Rate 10/sec. Looks reasonable on paper: 0.5-second recovery.</p><p>On a flaky mobile network, packet loss causes TCP to retransmit and batch messages. The client bursts 5 messages in rapid succession. Bucket emptied. Every message after that waits on the refill. It feels like input lag.</p><p>At scale, 10% of users experience this on Tuesday afternoons when cellular networks are congested. Support tickets flood in. You revert the rollout.</p><p>✅ Fix: Size capacity to hold a few seconds’ worth of refill — enough to absorb a burst lasting that long. For a 10/sec rate, use capacity 30–50 (representing 3–5 seconds of refill), not 5. This absorbs temporary network spikes without being a security hole (recovery is still fast: 3–5 seconds).</p><h4>Other Considerations</h4><p>If building cost-based limits (advanced), ensure operation costs match actual resource consumption — misjudged weights starve expensive queries or enable cheap-operation spam. Cost assignment is iterative: start with educated guesses (database reads cost more than presence updates), then continuously validate against actual latencies and user behavior, adjusting weights every few weeks. For services where a connection’s identity changes mid-session (anonymous to authenticated), tie buckets to stable identifiers (session or device ID, not just user ID) to avoid losing tokens on login.
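</p><p>As a sketch, a key function keyed to a device identifier (field names illustrative, not any library’s API) keeps the bucket stable across the login transition:</p>

```typescript
// Derive the rate-limit bucket key from the most stable identifier
// available, so logging in doesn't swap the client onto a fresh bucket.
type ConnData = { userId?: string; deviceId?: string; ip: string };

function stableKey(data: ConnData): string {
  if (data.deviceId) return `device:${data.deviceId}`; // survives login
  if (data.userId) return `user:${data.userId}`;
  return `ip:${data.ip}`; // last resort for anonymous clients
}
```

<p>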
Finally, always layer limits — global upstream, per-operation downstream — so bugs in one handler don’t cascade.</p><h3>Testing and Tuning Your Policy</h3><p>Choosing initial values is only half the battle. The other half is validating them against real-world usage patterns.</p><h4>Load Testing Strategy</h4><p>Before deploying to production, validate your policy with synthetic load:</p><ol><li><strong>Baseline test</strong>: Run legitimate usage at 10x typical load. Verify that normal users don’t hit limits.</li><li><strong>Burst test</strong>: Simulate network jitter by batching messages. Ensure capacity absorbs temporary spikes without false positives.</li><li><strong>Abuse test</strong>: Run coordinated spam from multiple IPs/users. Verify that global and per-IP limits catch coordinated attacks before they saturate the system.</li><li><strong>Edge case test</strong>: Run mixed workloads (some users light, some heavy). Verify that fair distribution works as expected.</li></ol><h4>Monitoring During Rollout</h4><p>When deploying to production, ramp up gradually:</p><ul><li><strong>Week 1</strong>: Deploy to 5% of users. Monitor metrics hourly. Look for unexpected spikes in rejected requests.</li><li><strong>Week 2</strong>: Expand to 25% of users. Watch for patterns (time of day? geographic? user type?).</li><li><strong>Week 3–4</strong>: Full rollout with continued monitoring.</li></ul><p>Track these metrics:</p><ul><li><strong>Rate limit hit rate</strong> (by message type, user tier): Should be &lt;1% for legitimate traffic</li><li><strong>Histogram of tokens remaining at rejection</strong>: If users always have 0 tokens, you’re too strict. If they have plenty, you’re too loose.</li><li><strong>Time since last refill</strong>: How long do users wait after hitting a limit? Should match your recovery_time.</li><li><strong>P95 latency of rate limit checks</strong>: Keep &lt;1ms. 
Slow checks block your event loop.</li></ul><h4>Tuning Based on Real Data</h4><p>After 2–4 weeks of data, adjust:</p><ul><li><strong>If &lt;0.1% hit the limit</strong>: Your policy is too loose. Abuse may slip through unnoticed until users complain about the resulting spam.</li><li><strong>If 0.1–1% hit the limit</strong>: Good zone. Some legitimate power users hit it, but most don’t.</li><li><strong>If 1–5% hit the limit</strong>: Getting strict. Watch user support tickets for complaints about input lag or sluggish feel.</li><li><strong>If &gt;5% hit the limit</strong>: Too strict. You’re harming legitimate users.</li></ul><p>Also consider:</p><ul><li><strong>Seasonal patterns</strong>: Games may need stricter limits during tournaments, chat apps during product launches.</li><li><strong>User cohorts</strong>: Free users might have stricter limits than paid. Mobile users might have more generous limits due to network variance.</li><li><strong>Abuse trends</strong>: If a particular message type is being attacked, tighten its limit without affecting others.</li></ul><h3>Real-World Case Study: RoomChat</h3><p><strong>Setup</strong>: Collaborative room editor with real-time cursor positions and code editing. 10,000 concurrent users, averaging 2–3 Mbps egress during peak hours. Launch week: everything seemed fine. Week two: support tickets spiked.</p><p><strong>Initial policy</strong> (launched naively):</p><pre>Per-user per-type: capacity 200, rate 100/sec</pre><p><strong>Why this choice</strong>: Backend could handle ~100k msg/sec aggregate. With 10k users, averaging 10 msg/sec each felt reasonable. The team picked 100/sec as a “burst allowance” without much thought — it came from dividing remaining capacity by active users at a single snapshot.</p><p><strong>Week 1–2</strong>: <strong>The Discovery Phase</strong></p><p>Support tickets: “Rate limit?
I just pasted a code snippet” and “My cursor keeps disappearing.”</p><p>First instinct: check whether it’s users on weird network conditions or everyone being blocked equally.</p><pre>Oct 15 — rate_limit_hit_total: 2.3% of active users hitting limits<br>Oct 16 — pulled per-type hit rates: TEXT_MESSAGE 12%, CODE_PASTE 31%, CURSOR_POSITION 89%<br>Oct 17 — chart of hits by time of day: spiky, not uniform (but no clear pattern yet)</pre><p>Initial hypothesis: “Bucket’s too small, users are naturally bursty.” Increased capacity to 500 and shipped it midweek, fingers crossed.</p><p><strong>Week 2–3</strong>: <strong>The False Start</strong></p><p>Capacity 500 helped… sort of. Some metrics improved dramatically, others barely moved:</p><pre>CURSOR_POSITION hits: 89% → 47% (huge win, network jitter absorbed)<br>TEXT_MESSAGE hits: 12% → 9% (minor improvement)<br>CODE_PASTE hits: 31% → 19% (only 12-point drop, still high)</pre><p>But in the support channel: “I pasted a 50-line snippet and got rate limited.” Still happening at 19% — not acceptable.</p><p>Plus, during testing, one engineer noticed: rapidly paste 5 code blocks, and the server logs don’t show 5 evenly spaced CODE_PASTE messages arriving. They show them landing in one tight clump, milliseconds apart. TCP batching on 4G was real.</p><p><strong>Root cause discovery</strong> (messy, took longer than expected):</p><p>One engineer pulled wire timings from production — looking at actual message arrival patterns. First observation: code pastes weren’t evenly spaced. A user would send one paste, and the server would receive it as 2–3 back-to-back chunks within 50ms. The token bucket counted each chunk as a separate message.</p><p>Hypothesis: “Network batching is compressing things.” But how much?</p><p>Checked message sizes: text messages averaged 80 bytes, code pastes 2–4 KB. On a congested 4G network, the TCP stack groups multiple frames together.
A single logical “paste code block” operation becomes 3–4 physical packets arriving in rapid succession.</p><p>More investigation: ran a load test with intentional packet loss. At 5% loss, CODE_PASTE hit rate jumped from 19% to 34%. At 15% loss (simulating congested networks), it hit 41%.</p><p>Additionally discovered during an unrelated audit: 10k users × 100/sec = 1M msg/sec theoretical capacity. Reality check against database: the backend maxed out around 150k/sec sustained load. We were giving users a budget that the infrastructure couldn&#39;t actually handle.</p><p><strong>Week 3–4</strong>: <strong>Iterating (with setbacks)</strong></p><p>Strategy 1: Separate buckets by message type, uniform token cost.</p><pre>TEXT_MESSAGE: capacity 50, rate 20/sec<br>CODE_PASTE: capacity 80, rate 8/sec<br>CURSOR_POSITION: capacity 300, rate 100/sec</pre><p>Deployed. After 2 days, CODE_PASTE hits were down to 15% — progress but still unacceptable. Cursor felt smooth. Text felt fine. But code paste users were still complaining.</p><p>Late-week realization (and this was annoying): the issue wasn’t rate — it was <em>capacity</em>. Users could hit their CODE_PASTE bucket almost immediately during packet batching. Once empty, they had to wait 10 seconds for recovery. That felt broken.</p><p>Strategy 2: Different approach. Maybe code pastes cost more than text messages because they’re actually more expensive on the backend.</p><p>Measured actual server resource consumption:</p><ul><li>TEXT_MESSAGE: ~0.5ms processing</li><li>CODE_PASTE: 2.5–4ms processing (syntax highlighting, diff calculation, conflict detection)</li><li>CURSOR_POSITION: ~0.1ms processing</li></ul><p>Cost weighting: code paste takes 5–8x longer. 
So cost them more tokens.</p><p><strong>Final tuned policy</strong> (end of week 4):</p><pre>Global: 80,000 msg/sec (150k/sec max minus headroom, with some buffer for spikes)<br><br>Per-user per-type:<br>  TEXT_MESSAGE: capacity 50, rate 20/sec (1 token each)<br>  CODE_PASTE: capacity 100, rate 3/sec (5 tokens each; accounts for actual backend overhead)<br>  CURSOR_POSITION: capacity 500, rate 100/sec (1 token each; its cheapness shows up in the far higher rate, since costs must stay positive integers)</pre><p><strong>Rollout</strong> (week 5, then week 6):</p><p>Friday deploy: Canary to 5% of traffic. Monitored hourly over the weekend — hit rate hovered around 1.6–1.9%. Not perfect, but better than 2.3%.</p><p>Monday morning: Expanded to 30% (skipped the “25%” step; ops team wanted to move faster). Hit rate climbed to 2.1% during peak hours. Huh. Global limit + network spike during Monday morning activity. Pulled the release back to 5% after 4 hours.</p><p>Mid-week investigation: realized the global 80k limit was too tight for genuine peaks. Bumped to 95k. Re-deployed to 30%. Steadier at 1.3–1.7%.</p><p>Friday: Full rollout to 100%. First week: 1.4%, 1.8%, 1.1%, 1.9%, 1.5% day-to-day. Not the clean 0.8–1.2% range. More like 1–2%, with occasional spikes to 2.1% during product launch days or when heavy European users come online.</p><p>Current state (2 months later): still hovering around 1.3% average, with weeks ranging 0.9–2.2% depending on user activity patterns.</p><p><strong>What we learned from actual data</strong>:</p><ol><li><strong>Initial assumptions were confidently wrong</strong>. Dividing backend capacity by user count sounds logical; it’s still wrong. Real traffic doesn’t distribute evenly. One 10,000-user instance isn’t the same as ten 1,000-user instances.</li><li><strong>Network batching isn’t a theoretical problem</strong>. It shows up in production the moment users are on congested networks. Staging with 500 synthetic users showed 0.2% hit rate. Real-world 10k users on 4G networks: 2.3%.
The difference is TCP’s Nagle algorithm and network variance, not your code.</li><li><strong>Metrics lie until you understand them</strong>. “CODE_PASTE 31% hit rate” sounds bad. But investigation revealed: CODE_PASTE messages were 25x larger than TEXT, and they triggered 5–8x more backend work. A hit rate of 31% on expensive operations might be more acceptable than 12% on cheap ones. You need to measure resource cost, not just message count.</li><li><strong>Rollouts reveal what testing missed</strong>. The Friday canary revealed the baseline. Monday’s 2.1% spike (hitting the global limit) told us the 80k value was optimistic. Without real traffic, you can’t tune effectively.</li><li><strong>Tuning isn’t a one-time event</strong>. Two months in, we’re still between 0.9%-2.2% depending on the week. That’s not failure; that’s normal. The policy absorbs user behavior variation and network conditions without catastrophic failures. We adjust the global limit once every 2–3 weeks if we see consistent drift, but the per-user per-type strategy handles most variance automatically.</li></ol><p><strong>Current monitoring setup</strong>:</p><ul><li>Alert if &gt;3% of users hit rate limits in an hour (sudden policy drift or attack)</li><li>Track global limit rejection rate separately; if &gt;0.5% of requests hit it, check for DDoS or planned load spike</li><li>Weekly histogram of per-type hit rates; CODE_PASTE at 1–2% is expected, TEXT at &gt;1% means investigation</li><li>Correlate with customer support tickets; if code-heavy teams complain about “lags,” the policy might be too strict</li></ul><p><strong>Lessons learned</strong> (revised after production experience):</p><ol><li><strong>Assumptions require validation</strong>. The textbook formula (capacity × refill_rate = recovery_time) is useful for thinking but meaningless without real traffic data. Always A/B test in canary before rolling out.</li><li><strong>Cost-based limiting works, but requires measurement</strong>. 
You can’t eyeball token costs. Measure actual backend latency per operation. CODE_PASTE costing 5 tokens wasn’t a guess; it came from production profiling.</li><li><strong>Network jitter is your biggest wildcard</strong>. All your lab testing happens on stable networks. Production users on 4G, airplane WiFi, and congested corporate networks behave differently. Add 20–30% headroom to burst capacity; it’s not a waste, it’s insurance.</li><li><strong>Global limits are the safety net you hope never activates</strong>. During normal operation, they shouldn’t be hit. When they are (coordinated spam, load spike), that’s the only thing standing between your backend and meltdown. Don’t skimp on them.</li><li><strong>Tuning is iterative forever</strong>. There’s no “final policy.” Seasonal patterns (back-to-school, holidays, product launches) mean you’ll adjust limits 4–6 times per year. Build monitoring that makes adjustments quick and low-risk.</li></ol><h3>Implementation Checklist</h3><ul><li>Analyze your application’s traffic signature (chat? gaming? streaming?)</li><li>Define capacity and rate for each message type or operation</li><li>Document reasoning for chosen numbers (recovery time, user behavior, abuse scenarios)</li><li>Implement observability: log when limits are hit, by whom, how often</li><li>Test with synthetic load: simulate network jitter, bursts, and coordinated spam</li><li>Monitor in production for 2 weeks before declaring success</li><li>Review every 3 months: are users complaining? Is abuse rising? 
Adjust as needed.</li><li>Set up alerts for anomalies (sudden spike in limit hits, unusual patterns)</li><li>Use separate buckets for different message types (or cost-weight a single bucket)</li></ul><p><strong>Key metrics to track</strong>:</p><pre>rate_limit_exceeded_total (counter)<br>  Break down by message type and user cohort<br><br>rate_limit_bucket_tokens (gauge)<br>  Distribution of remaining tokens at time of rejection<br><br>rate_limit_check_latency (histogram)<br>  Cost of checking limits (target: &lt;1ms)</pre><h3>Putting It Into Practice</h3><p>Token bucket rate limiting is deceptively simple: add tokens, consume tokens, reject when empty. But designing fair policies that protect your backend without starving users requires understanding the trade-offs between capacity, refill rate, buckets, and costs.</p><h4>The Three-Phase Deployment Strategy</h4><p><strong>Phase 1: Design (1–2 weeks)</strong></p><ul><li>Analyze your app’s traffic signature using real production logs</li><li>Calculate baseline rates from P95 user behavior</li><li>Choose strategy: single bucket, per-user per-type, or cost-based</li><li>Build a simple rate limit simulator to test policies without deploying</li></ul><p><strong>Phase 2: Testing (1 week)</strong></p><ul><li>Run load tests at 5x and 10x typical peak load</li><li>Simulate network jitter and packet batching</li><li>Run abuse scenarios: coordinated spam, individual floods, mixed workloads</li><li>Document what settings work and what breaks</li></ul><p><strong>Phase 3: Gradual Rollout (4 weeks)</strong></p><ul><li>Week 1: Deploy to 5% of users (or one region)</li><li>Week 2: Expand to 25% as you build confidence</li><li>Week 3–4: Roll out to 100% with continued monitoring</li><li>Keep tuning decisions lightweight — you can adjust rates without redeploying</li></ul><h4>Common Implementation Patterns</h4><p>(Examples use memoryRateLimiter — swap for redisRateLimiter or durableObjectRateLimiter per 
deployment.)</p><p><strong>Pattern 1: Per-User Limits Only</strong></p><p>Simple, good for early-stage apps. Protects against individual users monopolizing resources but doesn’t prevent coordinated attacks.</p><pre>// Per-user rate limit<br>import { rateLimit } from &quot;@ws-kit/middleware&quot;;<br>import { memoryRateLimiter } from &quot;@ws-kit/adapters/memory&quot;;<br><br>const limiter = rateLimit({<br>  limiter: memoryRateLimiter({<br>    capacity: 100,<br>    tokensPerSecond: 20,<br>  }),<br>  key: (ctx) =&gt; {<br>    const user = ctx.ws.data?.userId ?? &quot;anon&quot;;<br>    return `user:${user}`;<br>  },<br>});<br><br>router.use(limiter);</pre><p><strong>Pattern 2: Tiered by Message Type</strong></p><p>Better for apps with mixed message costs. Text chat gets higher limits than expensive operations.</p><pre>// Define different limits per operation type<br>import { rateLimit } from &quot;@ws-kit/middleware&quot;;<br>import { memoryRateLimiter } from &quot;@ws-kit/adapters/memory&quot;;<br><br>const chatLimiter = rateLimit({<br>  limiter: memoryRateLimiter({ capacity: 50, tokensPerSecond: 20 }),<br>  key: (ctx) =&gt; `user:${ctx.ws.data?.userId ?? &quot;anon&quot;}`,<br>});<br><br>const uploadLimiter = rateLimit({<br>  limiter: memoryRateLimiter({ capacity: 5, tokensPerSecond: 1 }),<br>  key: (ctx) =&gt; `user:${ctx.ws.data?.userId ?? &quot;anon&quot;}`,<br>});<br><br>const searchLimiter = rateLimit({<br>  limiter: memoryRateLimiter({ capacity: 10, tokensPerSecond: 2 }),<br>  key: (ctx) =&gt; `user:${ctx.ws.data?.userId ?? &quot;anon&quot;}`,<br>});<br><br>// Register each limiter with its specific message type<br>router.use(ChatSendMessage, chatLimiter);<br>router.use(FileUploadMessage, uploadLimiter);<br>router.use(SearchHistoryMessage, searchLimiter);</pre><p><strong>Pattern 3: Cost-Based (Most Sophisticated)</strong></p><p>Single bucket per user, costs scale with operation impact. 
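</p><p>The cost-based limiter shown earlier in this article is the real version of this pattern; as a self-contained toy (illustrative costs, refill omitted), the core mechanic is a single bucket with weighted debits:</p>

```typescript
// One bucket per user; each operation debits a weight proportional
// to its measured backend cost. Costs below are illustrative.
const COSTS: Record<string, number> = {
  ChatSend: 1,
  FileUpload: 20,
  HistorySearch: 15,
};

const CAPACITY = 30;
const remainingByUser = new Map<string, number>();

function consume(userId: string, op: string): boolean {
  const cost = COSTS[op] ?? 1; // unknown operations default to cheap
  const remaining = remainingByUser.get(userId) ?? CAPACITY;
  if (remaining < cost) return false; // expensive ops are rejected first
  remainingByUser.set(userId, remaining - cost);
  return true;
}
```

<p>In the library, the cost callback passed to rateLimit plays the role of COSTS here.</p><p>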
Best for mature apps where you’ve measured actual costs.</p><p><strong>Selecting Your Pattern</strong>:</p><ul><li><strong>Early stage</strong>: Start with Pattern 1 (per-user only)</li><li><strong>Multiple message types</strong>: Graduate to Pattern 2 (per-type limits)</li><li><strong>Mature, complex API</strong>: Pattern 3 (cost-based) provides the most control</li></ul><h4>Quick Start</h4><ol><li>Analyze your app’s traffic signature (chat? gaming? streaming?)</li><li>Pick initial capacity and rate from the domain-specific recommendations above</li><li>Choose your strategy: simple per-user, per-type, or cost-based</li><li>Deploy to a small cohort and monitor for 2 weeks</li><li>Tune based on real-world feedback: Are users complaining? Is abuse rising?</li></ol><p>The case study shows that one generic policy rarely survives contact with production. But by reasoning about capacity and refill rate as levers that model your workload, you move from reactive firefighting to proactive, confident tuning.</p><p>Fair rate limiting isn’t about saying “no” more often. It’s about saying “yes” predictably, protecting the system when necessary, and building a platform where both legitimate users and your backend infrastructure can thrive. 
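</p><p>As a closing sanity check, the sizing rule used throughout — capacity ≈ refill_rate × desired recovery time — takes only a few lines to encode:</p>

```typescript
// How long an emptied bucket needs to refill completely.
function recoverySeconds(capacity: number, tokensPerSecond: number): number {
  return capacity / tokensPerSecond;
}

// Derive capacity from a target recovery window instead of guessing.
function capacityFor(tokensPerSecond: number, targetRecoverySec: number): number {
  return Math.round(tokensPerSecond * targetRecoverySec);
}
```

<p>capacityFor(10, 3) reproduces the capacity-30 policy recommended earlier, while recoverySeconds(1000, 10) exposes the 100-second burst window from Mistake #1.</p><p>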
When done right, users won’t notice the limit is there — they’ll just experience a fast, fair, and stable service.</p><h3>Further Reading</h3><ul><li>Token bucket algorithm (<a href="https://en.wikipedia.org/wiki/Token_bucket">Wikipedia</a>)</li><li>Rate limiting patterns (<a href="https://discord.com/developers/docs/topics/rate-limits">Discord Developer Docs</a>)</li><li>Cost-based rate limiting (<a href="https://shopify.engineering/rate-limiting-graphql-apis-calculating-query-complexity">Shopify Engineering</a>)</li><li>Rate limiting middleware in WS-Kit (<a href="https://kriasoft.com/ws-kit/guides/rate-limiting">Guide</a>)</li></ul><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=289b00eb4435" width="1" height="1" alt=""><hr><p><a href="https://levelup.gitconnected.com/designing-fair-token-bucket-policies-for-real-time-apps-289b00eb4435">Designing Fair Token Bucket Policies for Real-Time Apps</a> was originally published in <a href="https://levelup.gitconnected.com">Level Up Coding</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Beyond the Lock: Why Fencing Tokens Are Essential]]></title>
            <link>https://levelup.gitconnected.com/beyond-the-lock-why-fencing-tokens-are-essential-5be0857d5a6a?source=rss-692b968dbc82------2</link>
            <guid isPermaLink="false">https://medium.com/p/5be0857d5a6a</guid>
            <category><![CDATA[programming]]></category>
            <category><![CDATA[javascript]]></category>
            <category><![CDATA[nodejs]]></category>
            <category><![CDATA[web-development]]></category>
            <category><![CDATA[typescript]]></category>
            <dc:creator><![CDATA[Konstantin Tarkus]]></dc:creator>
            <pubDate>Mon, 20 Oct 2025 03:35:49 GMT</pubDate>
            <atom:updated>2025-10-20T03:35:49.635Z</atom:updated>
            <content:encoded><![CDATA[<blockquote>A lock isn’t enough. Discover how fencing tokens prevent data corruption from stale locks and “zombie” processes.</blockquote><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*g_74YCdIXoAJTgkzgR8WFQ.png" /></figure><p>Your payment processor just charged a customer twice. Your inventory system thinks you have 47 widgets when there are only 23. Both disasters happened <em>despite</em> using distributed locks. The culprit? Your process held a lock that had already expired, became a “zombie,” and corrupted data while believing it was still protected.</p><p>This isn’t a rare edge case. It’s a fundamental property of distributed systems that most developers don’t discover until production.</p><h3>The Illusion of Safety</h3><p>Distributed locks feel safe. You acquire a lock from Redis or your database, perform your critical work, and release it. The pattern is simple:</p><pre>import { createLock } from &quot;syncguard/redis&quot;;<br>import Redis from &quot;ioredis&quot;;<br><br>const redis = new Redis();<br>const lock = createLock(redis);<br><br>// UNSAFE: No protection against stale locks<br>async function processPayment(orderId: string) {<br>  await lock(<br>    async () =&gt; {<br>      const order = await db.getOrder(orderId);<br>      await paymentGateway.charge(order.amount);<br>      await db.updateOrder(orderId, { status: &quot;paid&quot; });<br>    },<br>    { key: `payment:${orderId}`, ttlMs: 30000 },<br>  );<br>}</pre><p>This code <em>looks</em> correct. You’re using a proper distributed lock with a 30-second timeout. But there’s a critical flaw that becomes visible only under production conditions.</p><p>The problem is subtle: <strong>you’re treating a distributed lock as a binary state</strong> (locked/unlocked), just like an in-process mutex. But a distributed lock isn’t a mutex. 
It’s a <em>lease</em> — a time-bound grant of exclusive access that can expire while you’re still using it.</p><h3>When Locks Lie: The Zombie Process Problem</h3><p>Here’s what actually happens in production:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Y7KiYeSft2Ky84q-DyPjgw.png" /></figure><p><strong>Timeline</strong>:</p><p><strong>T=0s</strong>: Process A acquires the lock with a 30-second TTL and starts processing a payment.</p><p><strong>T=5s</strong>: Process A enters a stop-the-world garbage collection pause. In the JVM, these pauses can last for <em>minutes</em> in pathological cases. But even a 35-second pause is enough to break everything.</p><p><strong>T=30s</strong>: The lock expires in Redis. From the lock service’s perspective, Process A has died.</p><p><strong>T=31s</strong>: Process B successfully acquires the same lock and begins processing the payment.</p><p><strong>T=33s</strong>: Process B charges the customer’s card and updates the database.</p><p><strong>T=40s</strong>: Process A wakes up from its GC pause, completely unaware that 35 seconds have passed. From its perspective, only microseconds elapsed. It believes it still holds the lock.</p><p><strong>T=41s</strong>: Process A charges the customer’s card <em>again</em> and overwrites Process B’s database update.</p><p><strong>Result</strong>: The customer is charged twice. Data corruption. A production incident.</p><p>This isn’t a bug in your lock implementation. This isn’t a bug in Redis. <strong>This is how distributed systems work</strong>. A process can be paused at any moment by:</p><ul><li>Garbage collection (stop-the-world pauses lasting seconds or even minutes)</li><li>OS process preemption (your process gets swapped out)</li><li>Virtual memory page faults (requires slow disk I/O)</li><li>Network delays (requests hang for seconds or minutes)</li></ul><p>The fundamental issue: <strong>only two parties are involved</strong> — the client and the lock manager. 
The client thinks it holds the lock. The lock manager knows the lease expired. But there’s no one to stop the client from proceeding with stale authorization.</p><h3>The Three-Party Protocol: Enter Fencing Tokens</h3><p>The solution requires a mental model shift. We need a third party to validate whether operations should be accepted: the resource itself.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*NgCECmj1YcWD3iiIj-WXog.png" /></figure><p>A <strong>fencing token</strong> is a monotonically increasing number issued by the lock service with every successful lock acquisition. Each time any process acquires the lock for a given resource, the token increases. Process A gets token 33. When the lock expires and Process B acquires it, Process B gets token 34.</p><p>The protocol works like this:</p><ol><li><strong>Client acquires lock and receives a token</strong>: { ok: true, lockId: &quot;...&quot;, fence: &quot;000000000000033&quot; }</li><li><strong>Client includes the token in every write</strong>: All operations to the protected resource must carry the fence token</li><li><strong>Resource checks the token</strong>: Before executing any write, the resource compares the incoming token against the last token it saw</li><li><strong>Resource rejects stale tokens</strong>: If incoming_token &lt;= last_seen_token, reject the write</li><li><strong>Resource accepts and updates</strong>: If incoming_token &gt; last_seen_token, accept the write and store the new token</li></ol><p>Now let’s replay the zombie process scenario with fencing tokens:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Zx-HyURwZhHsV9TDhlDsng.png" /></figure><p><strong>Timeline</strong>:</p><p><strong>T=0s</strong>: Process A acquires the lock and receives token 33, then enters a GC pause.</p><p><strong>T=31s</strong>: Lock expires. 
Process B acquires the lock and receives token 34.</p><p><strong>T=33s</strong>: Process B charges the payment gateway (unfenced operation) and writes to the database with token 34. The database validates 34 &gt; null, accepts the write, and stores 34 as the last-seen token.</p><p><strong>T=40s</strong>: Process A wakes up from its GC pause. It charges the payment gateway again (creating a duplicate charge), then attempts to update the database with its stale token 33.</p><p><strong>T=41s</strong>: The database validates: 33 &gt; 34 is false. <strong>Write rejected</strong>. The database responds with an error.</p><p><strong>Result</strong>: Database integrity preserved — the zombie process cannot corrupt order state. However, the payment gateway was charged twice because it doesn’t participate in fencing. This demonstrates why idempotency keys are needed for external APIs (covered in “When Fencing Isn’t Possible”).</p><p>The key insight: <strong>the resource doesn’t trust the client’s claim of holding the lock</strong>. The resource validates the token against reality. Even a process with a stale view of the world cannot corrupt data.</p><h3>How SyncGuard Implements Fencing Tokens</h3><p><a href="https://kriasoft.com/syncguard/">SyncGuard</a> provides fencing tokens out-of-the-box for all its backends (Redis, PostgreSQL, Firestore). The implementation varies by backend, but the API is consistent.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*41Y5PTcT5BJnouJQGPITGQ.png" /></figure><h3>Backend Implementation</h3><p><strong>Redis</strong>: Uses atomic INCR on a per-key fence counter. 
The increment and lock acquisition happen in a single Lua script for atomicity:</p><pre>-- Within the atomic acquire script<br>local fenceKey = KEYS[3]  -- Per-resource counter key<br>local fence = string.format(&quot;%015d&quot;, redis.call(&#39;INCR&#39;, fenceKey))<br>-- Store fence in lock data and return it</pre><p><strong>PostgreSQL</strong>: Uses a dedicated fence_counters table with database-enforced atomicity. The counter increment happens within the same transaction as lock acquisition.</p><p><strong>Firestore</strong>: Uses Firestore transactions with per-key counter documents. The transaction ensures the counter increment and lock creation are atomic.</p><p>All backends return fence tokens as <strong>15-digit zero-padded strings</strong> (e.g., &quot;000000000000042&quot;):</p><ul><li>Monotonically increasing per resource key</li><li>Lexicographically comparable (use string comparison: fenceA &gt; fenceB)</li><li>Guaranteed unique even across crashes and restarts</li><li>No parsing needed — just compare strings directly</li></ul><h3>Resource-Side Implementation</h3><p>The resource (your database, file system, or API) must actively participate in the fencing protocol. This requires three steps:</p><p><strong>1. Add a fence token column to your data model</strong>:</p><pre>ALTER TABLE orders ADD COLUMN last_fence_token VARCHAR(15);</pre><p><strong>2. Validate fence tokens on every write</strong>:</p><pre>UPDATE orders<br>SET<br>  status = $1,<br>  last_fence_token = $2,<br>  updated_at = NOW()<br>WHERE<br>  order_id = $3<br>  -- CRITICAL: accept only strictly greater tokens (NULL = never fenced yet)<br>  AND (last_fence_token IS NULL OR last_fence_token &lt; $2)<br>RETURNING *;</pre><p><strong>3.
Check the result to detect fenced-out operations</strong>:</p><pre>async function updateOrderWithFencing(<br>  orderId: string,<br>  updates: { status: string },<br>  fence: string,<br>): Promise&lt;boolean&gt; {<br>  const result = await db.query(<br>    `UPDATE orders<br>     SET status = $1, last_fence_token = $2, updated_at = NOW()<br>     WHERE order_id = $3<br>       AND (last_fence_token IS NULL OR last_fence_token &lt; $2)<br>     RETURNING *`,<br>    [updates.status, fence, orderId],<br>  );<br><br>  // If no rows updated, our fence token was stale<br>  return result.rowCount &gt; 0;<br>}</pre><h3>Putting It All Together</h3><p>Here’s the safe pattern with <a href="https://kriasoft.com/syncguard/">SyncGuard</a>:</p><pre>import { createRedisBackend } from &quot;syncguard/redis&quot;;<br>import Redis from &quot;ioredis&quot;;<br><br>const redis = new Redis();<br>const backend = createRedisBackend(redis);<br><br>// SAFE: Database validates fence token before accepting writes<br>async function processPayment(orderId: string) {<br>  await using lock = await backend.acquire({<br>    key: `payment:${orderId}`,<br>    ttlMs: 30000,<br>  });<br><br>  if (!lock.ok) {<br>    throw new Error(&quot;Could not acquire lock&quot;);<br>  }<br><br>  // Extract the fence token - a zero-padded, monotonically increasing string<br>  const { fence } = lock; // e.g., &quot;000000000000042&quot;<br><br>  const order = await db.getOrder(orderId);<br>  await paymentGateway.charge(order.amount);<br><br>  // Database validates: only accepts writes with fence &gt; last_seen_fence<br>  const updated = await db.updateOrderWithFencing(<br>    orderId,<br>    { status: &quot;paid&quot; },<br>    fence,<br>  );<br><br>  if (!updated) {<br>    // Our fence token was stale - another process with a higher token won<br>    // This means our lock expired and we&#39;re a &quot;zombie process&quot;<br>    throw new Error(&quot;Stale lock - operation rejected by resource&quot;);<br>  }<br><br>  // Lock automatically released when exiting &#39;await 
using&#39; block<br>}</pre><p>If Process A pauses during payment processing and its lock expires, Process B will acquire a new lock with a higher fence token. When Process A wakes up and attempts to update the database with its stale fence token, the database rejects the write. The payment is processed exactly once.</p><h3>When Fencing Isn’t Possible</h3><p>Not all systems can participate in the fencing protocol. Third-party REST APIs, legacy systems, or external services may not support custom token validation. In these cases, you have several options:</p><p><strong>Idempotency Keys</strong>: Many payment gateways and external APIs support idempotency keys. Use a unique request ID (like {orderId}-{fence}) to prevent duplicate processing:</p><pre>await paymentGateway.charge({<br>  amount: order.amount,<br>  idempotencyKey: `order-${orderId}-${fence}`,<br>});</pre><p><strong>Optimistic Concurrency Control</strong>: Use version numbers or ETags if the external system supports them. Before updating, check that the version hasn’t changed since you read it.</p><p><strong>Move to a Fencing-Capable Resource</strong>: Use your own database as a proxy. Instead of writing directly to the external API, write to your database with fence token validation, then have a separate process (idempotent worker) sync to the external system.</p><p><strong>Compensating Transactions</strong>: Design operations to be reversible. If you detect a duplicate operation after the fact, have a process to undo it.</p><p>The key principle: <strong>if you can’t validate fence tokens at the resource, you must use another mechanism to ensure idempotency</strong>.</p><h3>When You Need Fencing Tokens</h3><p>Not every lock requires fencing tokens. 
The decision depends on what the lock is protecting.</p><p><strong>You NEED fencing tokens when</strong>:</p><ul><li>The lock is for <em>correctness</em>, not just efficiency</li><li>Failures would cause data corruption, not just duplicate work</li><li>Examples: financial transactions, inventory updates, order state machines, account balance modifications</li></ul><p><strong>You might NOT need fencing tokens when</strong>:</p><ul><li>The lock is for <em>efficiency</em> only (e.g., preventing duplicate cache computations)</li><li>Your system can tolerate occasional duplicates</li><li>Idempotency alone provides sufficient protection</li><li>Operations are commutative (order doesn’t matter)</li></ul><p><strong>Architectural alternatives to consider</strong>:</p><ul><li><strong>Idempotency keys</strong>: For external APIs that support them</li><li><strong>Optimistic concurrency control</strong>: Use version numbers or timestamps</li><li><strong>Event sourcing</strong>: Immutable append-only logs eliminate update conflicts</li><li><strong>CRDTs</strong>: For operations that are naturally commutative</li></ul><p>The rule of thumb: if a duplicate or out-of-order operation would corrupt your data, you need either fencing tokens or an equivalent mechanism.</p><h3>The Bigger Picture: Locks vs Leases</h3><p>The fundamental lesson is a shift in mental model. A distributed lock is not a mutex. It’s a <strong>lease</strong> — a time-bound, probabilistic grant of permission.</p><p><strong>Leases can expire while you’re using them</strong>. This is not a failure mode. This is normal operation in distributed systems. Process pauses, network delays, and clock skew are not bugs to be fixed — they are fundamental properties of the environment.</p><p>Fencing tokens upgrade this probabilistic safety to <strong>deterministic correctness</strong>. Instead of hoping your process doesn’t pause, you build a system where even a paused process cannot cause harm. 
The resource becomes the final arbiter of operation validity.</p><p>This is the essence of defensive programming in distributed systems: <strong>assume your view of the world is stale</strong>. Don’t trust the client’s claim of holding a lock. Validate at the resource level with monotonically increasing tokens.</p><h3>Conclusion</h3><p>If you’re using distributed locks for data correctness, and you’re not using fencing tokens (or an equivalent mechanism), you have a latent data corruption bug. It’s not a matter of “if” but “when.”</p><p>The zombie process problem is real. GC pauses, network delays, and process preemption happen in production. Your distributed lock will expire while your process is paused. Without fencing tokens, that process will wake up and corrupt your data.</p><p>Fencing tokens solve this problem by making the resource an active participant in the safety protocol. The resource doesn’t trust the client’s claim of authorization — it validates every operation against a monotonically increasing token.</p><p>The cost is modest: an extra column in your database, an extra check in your write queries. The benefit is enormous: <strong>deterministic correctness</strong> instead of probabilistic hope.</p><p>Build systems that are safe by design. Use fencing tokens.</p><p><strong><em>SyncGuard</em></strong><em> is a TypeScript library that provides distributed locking with built-in fencing token support for Redis, PostgreSQL, and Firestore. 
Learn more at </em><a href="https://kriasoft.com/syncguard/"><em>kriasoft.com/syncguard/</em></a><em> or check out the source code on </em><a href="https://github.com/kriasoft/syncguard"><em>GitHub</em></a><em>.</em></p><hr><p><a href="https://levelup.gitconnected.com/beyond-the-lock-why-fencing-tokens-are-essential-5be0857d5a6a">Beyond the Lock: Why Fencing Tokens Are Essential</a> was originally published in <a href="https://levelup.gitconnected.com">Level Up Coding</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[GitHub Security Notifications for Discord]]></title>
            <link>https://medium.com/@koistya/github-security-notifications-for-discord-3fd8ca627a97?source=rss-692b968dbc82------2</link>
            <guid isPermaLink="false">https://medium.com/p/3fd8ca627a97</guid>
            <category><![CDATA[discord]]></category>
            <category><![CDATA[devops]]></category>
            <category><![CDATA[software-development]]></category>
            <category><![CDATA[github]]></category>
            <category><![CDATA[security]]></category>
            <dc:creator><![CDATA[Konstantin Tarkus]]></dc:creator>
            <pubDate>Sun, 28 Sep 2025 10:07:01 GMT</pubDate>
            <atom:updated>2025-09-28T10:22:10.978Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Liz5MBsP2iug_z1ixO7ONg.png" /></figure><p>A practical guide for setting up automated security notifications from GitHub repositories to Discord channels.</p><h3>Why Security Notifications Matter</h3><p>Security events in your repositories need immediate attention. This guide helps you configure automated notifications so your team stays informed about:</p><ul><li>Vulnerability discoveries and fixes</li><li>Security feature bypasses or disabled protections</li><li>Unauthorized access changes</li><li>Secret leaks and scanning results</li></ul><h3>Important Security Limitations</h3><p>⚠️ Discord Webhook Security Notice:</p><ul><li>Discord webhooks have no built-in authentication mechanism</li><li>Anyone with the webhook URL can send messages to your channel</li><li>There is no way to verify that messages come from GitHub</li><li>Treat webhook URLs as secrets and never expose them publicly</li></ul><h3>Discord Setup</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*W9b9DoCVS13n5H9NqrISlw.png" /></figure><h3>1. Create a Private Security Channel</h3><pre>Right-click your Discord server → Create Channel<br>Name: #security (or #security-alerts)<br>Type: Text Channel<br>Private Channel: ✅ Enable<br>Permissions: Only security team members</pre><h3>2. Generate Webhook URL</h3><pre>Channel Settings → Integrations → Webhooks → New Webhook<br>Name: GitHub Security<br>Avatar: Upload GitHub logo (optional)<br>Copy Webhook URL → Save for GitHub configuration</pre><h3>GitHub Configuration</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*tuScGsmEz-XzfInLBKVv5w.png" /></figure><h3>1. 
Repository Webhooks</h3><p>Navigate to: Settings → Webhooks → Add webhook</p><pre>Payload URL: [Your Discord webhook URL]/github<br>Content type: application/json<br>Secret: [Optional - validates requests FROM GitHub, but Discord cannot verify signatures]<br>SSL verification: Enable SSL verification</pre><p>⚠️ Important: GitHub secrets only validate that webhooks come FROM GitHub to prevent spoofing. Discord webhooks have NO signature validation capability — Discord accepts any properly formatted request to the webhook URL. For signature validation, use a proxy service between GitHub and Discord.</p><h3>2. Critical Security Events</h3><p>Enable these events for immediate notification:</p><pre>☑️ Code scanning alerts - Code scanning alert created, fixed in branch, or closed<br>☑️ Secret scanning alerts - Secrets scanning alert created, resolved, reopened, validated, or publicly leaked<br>☑️ Secret scanning alert locations - Secrets scanning alert location created<br>☑️ Dependabot alerts - Dependabot alert auto_dismissed, auto_reopened, created, dismissed, reopened, fixed, or reintroduced<br>☑️ Repository vulnerability alerts - (Note: Being deprecated, use Dependabot alerts)<br>☑️ Security and analyses - When security features are enabled/disabled<br>☑️ Repository advisories - Security advisories published for the repo</pre><h3>3. Access Control Events</h3><p>Enable for access monitoring:</p><pre>☑️ Branch protection configurations - All protections enabled/disabled<br>☑️ Branch protection rules - Individual rules created/edited/deleted<br>☑️ Collaborator add, remove, or changed - Team member access changes<br>☑️ Deploy keys - Deployment key additions/removals<br>☑️ Visibility changes - Repository made public/private</pre><h3>4. 
Security Bypass Events</h3><p>Enable for policy compliance:</p><pre>☑️ Dismissal requests for code scanning alerts - Alert dismissal tracking<br>☑️ Dismissal requests for secret scanning alerts - Secret dismissal tracking<br>☑️ Bypass requests for push rulesets - Push rule bypass requests<br>☑️ Bypass requests for secret scanning push protections - Secret push bypass</pre><h3>Best Practices</h3><h3>Channel Management</h3><ul><li>Keep the security channel private and restricted</li><li>Add only security team members and repository maintainers</li><li>Use thread discussions for detailed investigation</li><li>Pin important security policies and contacts</li></ul><h3>Notification Hygiene</h3><ul><li>Start with critical events only, expand gradually</li><li>Review and tune notifications weekly for first month</li><li>Document response procedures for each alert type</li><li>Set up on-call rotation for critical alerts</li></ul><h3>Response Workflows</h3><ul><li>Acknowledge alerts within 15 minutes during business hours (organizational policy)</li><li>Assign owner for each security issue</li><li>Use GitHub issue templates for security incident tracking</li><li>Post resolution summaries back to the channel</li></ul><p>Note: Response times are organizational recommendations based on industry standards for critical security incidents, not technical requirements from GitHub or Discord.</p><h3>Testing Your Setup</h3><ol><li>Configure the webhook in your repository</li><li>Make a simple change (e.g., push a commit or create an issue)</li><li>Check Discord channel for the notification</li><li>Note: The GitHub “Test” button returns a 400 error because GitHub’s test payload format doesn’t match Discord’s expected message schema. 
This is normal — the webhook is working correctly.</li><li>For security events, try creating a test file with a fake API key to trigger secret scanning</li></ol><h3>Event Priority Levels</h3><h3>🚨 Critical (Immediate Response Required)</h3><ul><li>Secret scanning alerts (publicly leaked)</li><li>Code scanning alerts (high/critical severity)</li><li>Repository vulnerability alerts (high/critical CVEs)</li><li>Security features disabled</li></ul><h3>⚠️ High (Response Within Hours)</h3><ul><li>Dependabot alerts (high/critical)</li><li>Branch protection disabled</li><li>Unauthorized collaborator changes</li><li>Deploy key modifications</li></ul><h3>ℹ️ Medium (Monitor and Review)</h3><ul><li>Security bypass requests</li><li>Alert dismissal requests</li><li>Branch protection rule changes</li><li>Completed security scans</li></ul><h3>Advanced Configuration (Optional)</h3><h3>Multiple Repositories</h3><p>For teams managing multiple repositories:</p><ul><li>Use organization-level webhooks for centralized management</li><li>Create separate channels for different severity levels (#security-critical, #security-info)</li><li>Start simple: same webhook for all repos, then customize as needed</li></ul><h3>Simple Enhancements</h3><ul><li>GitHub Actions: Filter events before sending to Discord (example: only notify for public repos)</li><li>Discord Bots: Use bots for two-way communication (acknowledge alerts from Discord)</li><li>Monitoring: Set up a simple daily/weekly summary of security events</li></ul><h3>Troubleshooting</h3><p>Webhook not triggering:</p><ul><li>Verify webhook URL format includes /github suffix (required for GitHub integration)</li><li>Check GitHub webhook delivery logs in Settings → Webhooks → Recent Deliveries</li><li>Ensure Discord channel permissions allow webhook posts</li><li>Note: GitHub’s “test” payload will show a 400 error — this is normal</li></ul><p>Missing notifications:</p><ul><li>Review selected events in GitHub webhook configuration</li><li>Test 
with a simple push event first</li><li>Check Discord server notification settings</li></ul><p>Too many notifications:</p><ul><li>Start with critical events only</li><li>Use Discord thread mode for detailed discussions</li><li>Consider time-based filtering for non-critical events</li><li>Discord rate limit: ~5 requests per 2 seconds per webhook</li></ul><h3>Security Best Practices</h3><h3>Webhook URL Protection</h3><ul><li>Never expose webhook URLs in client-side code or public repositories</li><li>Store webhook URLs in environment variables or secure secret management systems</li><li>Rotate webhook URLs quarterly or immediately if exposed</li><li>Use .gitignore to exclude any files containing webhook URLs</li></ul><h3>Webhook Rotation Procedure</h3><ol><li>Create new webhook in Discord (keep old one active)</li><li>Update GitHub webhook configuration with new URL</li><li>Test new webhook with a commit or issue</li><li>Once confirmed working, delete old Discord webhook</li><li>Document rotation in security log</li></ol><h3>Additional Security Measures</h3><ul><li>Consider using a proxy service between GitHub and Discord for additional validation</li><li>Implement monitoring for unusual webhook activity patterns</li><li>Use Discord bots with proper authentication for sensitive operations</li><li>Restrict channel access to security team members only</li></ul><p>Security Reminder: Discord webhooks cannot authenticate senders. Any service or person with the webhook URL can post messages. This is a fundamental limitation of Discord’s webhook system.</p><p>🚨 Critical: If a webhook URL is ever exposed or leaked, rotate it immediately. Exposed webhook URLs are compromised security credentials.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Building a Localhost OAuth Callback Server in Node.js]]></title>
            <link>https://levelup.gitconnected.com/building-a-localhost-oauth-callback-server-in-node-js-866be0765d44?source=rss-692b968dbc82------2</link>
            <guid isPermaLink="false">https://medium.com/p/866be0765d44</guid>
            <category><![CDATA[automation]]></category>
            <category><![CDATA[oauth]]></category>
            <category><![CDATA[javascript]]></category>
            <category><![CDATA[typescript]]></category>
            <category><![CDATA[nodejs]]></category>
            <dc:creator><![CDATA[Konstantin Tarkus]]></dc:creator>
            <pubDate>Tue, 19 Aug 2025 17:15:14 GMT</pubDate>
            <atom:updated>2025-08-19T17:15:14.593Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*uSzhqFHJLNbYXfmr5BSliw.png" /><figcaption>npm: oauth-callback</figcaption></figure><p>When building CLI tools or desktop applications that integrate with OAuth providers, you face a unique challenge: how do you capture the authorization code when there’s no public-facing server to receive the callback? The answer lies in a clever technique that’s been right under our noses — spinning up a temporary localhost server to catch the OAuth redirect.</p><p>This tutorial walks through building a production-ready OAuth callback server that works across Node.js, Deno, and Bun. We’ll cover everything from the basic HTTP server setup to handling edge cases that trip up most implementations.</p><h3>Understanding the OAuth Callback Flow</h3><p>Before diving into code, let’s clarify what we’re building. In a typical OAuth 2.0 authorization code flow, your application redirects users to an authorization server (like Notion or Linear), where they grant permissions. The authorization server then redirects back to your application with an authorization code.</p><p>For web applications, this redirect goes to a public URL. But for CLI tools and desktop apps, we use a localhost URL — typically http://localhost:3000/callback. The OAuth provider redirects to this local address, and our temporary server captures the authorization code from the query parameters.</p><p>This approach is explicitly blessed by OAuth 2.0 for Native Apps (RFC 8252) and is used by major tools like the GitHub CLI and Google’s OAuth libraries.</p><h3>Setting Up the Basic HTTP Server</h3><p>The first step is creating an HTTP server that can listen on localhost. 
Modern JavaScript runtimes provide different APIs for this, but we can abstract them behind a common interface using Web Standards Request and Response objects.</p><pre>interface CallbackServer {<br>  start(options: ServerOptions): Promise&lt;void&gt;;<br>  waitForCallback(path: string, timeout: number): Promise&lt;CallbackResult&gt;;<br>  stop(): Promise&lt;void&gt;;<br>}<br><br>function createCallbackServer(): CallbackServer {<br>  // Runtime detection<br>  if (typeof Bun !== &quot;undefined&quot;) return new BunCallbackServer();<br>  if (typeof Deno !== &quot;undefined&quot;) return new DenoCallbackServer();<br>  return new NodeCallbackServer();<br>}</pre><p>Each runtime implementation follows the same pattern: create a server, listen for requests, and resolve a promise when the callback arrives. Here’s the Node.js version that bridges between Node’s http module and Web Standards:</p><pre>class NodeCallbackServer implements CallbackServer {<br>  private server?: http.Server;<br>  private callbackPromise?: {<br>    resolve: (result: CallbackResult) =&gt; void;<br>    reject: (error: Error) =&gt; void;<br>  };<br><br>  async start(options: ServerOptions): Promise&lt;void&gt; {<br>    const { createServer } = await import(&quot;node:http&quot;);<br><br>    return new Promise((resolve, reject) =&gt; {<br>      this.server = createServer(async (req, res) =&gt; {<br>        const request = this.nodeToWebRequest(req, options.port);<br>        const response = await this.handleRequest(request);<br><br>        res.writeHead(<br>          response.status,<br>          Object.fromEntries(response.headers.entries()),<br>        );<br>        res.end(await response.text());<br>      });<br><br>      this.server.listen(options.port, options.hostname, resolve);<br>      this.server.on(&quot;error&quot;, reject);<br>    });<br>  }<br><br>  private nodeToWebRequest(req: http.IncomingMessage, port: number): Request {<br>    const url = new URL(req.url!, 
`http://localhost:${port}`);<br>    const headers = new Headers();<br><br>    for (const [key, value] of Object.entries(req.headers)) {<br>      if (typeof value === &quot;string&quot;) {<br>        headers.set(key, value);<br>      }<br>    }<br><br>    return new Request(url.toString(), {<br>      method: req.method,<br>      headers,<br>    });<br>  }<br>}</pre><p>The beauty of this approach is that once we convert to Web Standards, the actual request handling logic is identical across all runtimes.</p><h3>Capturing the OAuth Callback</h3><p>The heart of our server is the callback handler. When the OAuth provider redirects back, we need to extract the authorization code (or error) from the query parameters:</p><pre>private async handleRequest(request: Request): Promise&lt;Response&gt; {<br>  const url = new URL(request.url);<br><br>  if (url.pathname === this.callbackPath) {<br>    const params: CallbackResult = {};<br><br>    // Extract all query parameters<br>    for (const [key, value] of url.searchParams) {<br>      params[key] = value;<br>    }<br><br>    // Resolve the waiting promise<br>    if (this.callbackPromise) {<br>      this.callbackPromise.resolve(params);<br>    }<br><br>    // Return success page to the browser<br>    return new Response(this.generateSuccessHTML(), {<br>      status: 200,<br>      headers: { &quot;Content-Type&quot;: &quot;text/html&quot; }<br>    });<br>  }<br><br>  return new Response(&quot;Not Found&quot;, { status: 404 });<br>}</pre><p>Notice how we capture all query parameters, not just the authorization code. OAuth providers send additional information like state for CSRF protection, and error responses include error and error_description fields. Our implementation preserves everything for maximum flexibility.</p><h3>Handling Timeouts and Cancellation</h3><p>Real-world OAuth flows can fail in numerous ways. Users might close the browser, deny permissions, or simply walk away. 
Our server needs robust timeout and cancellation handling:</p><pre>async waitForCallback(path: string, timeout: number): Promise&lt;CallbackResult&gt; {<br>  this.callbackPath = path;<br><br>  return new Promise((resolve, reject) =&gt; {<br>    let isResolved = false;<br><br>    // Set up timeout<br>    const timer = setTimeout(() =&gt; {<br>      if (!isResolved) {<br>        isResolved = true;<br>        reject(new Error(`OAuth callback timeout after ${timeout}ms`));<br>      }<br>    }, timeout);<br><br>    // Wrap resolve/reject to handle cleanup<br>    const wrappedResolve = (result: CallbackResult) =&gt; {<br>      if (!isResolved) {<br>        isResolved = true;<br>        clearTimeout(timer);<br>        resolve(result);<br>      }<br>    };<br><br>    this.callbackPromise = {<br>      resolve: wrappedResolve,<br>      reject: (error) =&gt; {<br>        if (!isResolved) {<br>          isResolved = true;<br>          clearTimeout(timer);<br>          reject(error);<br>        }<br>      }<br>    };<br>  });<br>}</pre><p>Supporting AbortSignal enables programmatic cancellation, essential for GUI applications where users might close a window mid-flow:</p><pre>if (signal) {<br>  if (signal.aborted) {<br>    throw new Error(&quot;Operation aborted&quot;);<br>  }<br><br>  const abortHandler = () =&gt; {<br>    this.stop();<br>    if (this.callbackPromise) {<br>      this.callbackPromise.reject(new Error(&quot;Operation aborted&quot;));<br>    }<br>  };<br><br>  signal.addEventListener(&quot;abort&quot;, abortHandler);<br>}</pre><h3>Providing User Feedback</h3><p>When users complete the OAuth flow, they see a browser page indicating success or failure. 
Instead of a blank page or cryptic message, provide clear feedback with custom HTML:</p><pre>function generateCallbackHTML(<br>  params: CallbackResult,<br>  templates: Templates,<br>): string {<br>  if (params.error) {<br>    // OAuth error - show error page<br>    return templates.errorHtml<br>      .replace(/{{error}}/g, params.error)<br>      .replace(/{{error_description}}/g, params.error_description || &quot;&quot;);<br>  }<br><br>  // Success - show confirmation<br>  return (<br>    templates.successHtml ||<br>    `<br>    &lt;html&gt;<br>      &lt;body style=&quot;font-family: system-ui; padding: 2rem; text-align: center;&quot;&gt;<br>        &lt;h1&gt;✅ Authorization successful!&lt;/h1&gt;<br>        &lt;p&gt;You can now close this window and return to your terminal.&lt;/p&gt;<br>      &lt;/body&gt;<br>    &lt;/html&gt;<br>  `<br>  );<br>}</pre><p>For production applications, consider adding CSS animations, auto-close functionality, or deep links back to your desktop application.</p><h3>Security Considerations</h3><p>While localhost servers are inherently more secure than public endpoints, several security measures are crucial:</p><p><strong>1. Bind to localhost only:</strong> Never bind to 0.0.0.0 or public interfaces. This prevents network-based attacks:</p><pre>this.server.listen(port, &quot;localhost&quot;); // NOT &quot;0.0.0.0&quot;</pre><p><strong>2. Validate the state parameter:</strong> OAuth’s state parameter prevents CSRF attacks. Generate it before starting the flow and validate it in the callback:</p><pre>import { randomBytes } from &quot;node:crypto&quot;;<br><br>const state = randomBytes(32).toString(&quot;base64url&quot;);<br>const authUrl = `${provider}/authorize?state=${state}&amp;...`;<br><br>// In callback handler<br>if (params.state !== expectedState) {<br>  throw new Error(&quot;State mismatch - possible CSRF attack&quot;);<br>}</pre><p><strong>3. 
Close the server immediately:</strong> Once you receive the callback, shut down the server to minimize the attack surface:</p><pre>const result = await server.waitForCallback(&quot;/callback&quot;, 30000);<br>await server.stop(); // Always cleanup</pre><p><strong>4. Use unpredictable ports when possible:</strong> If your OAuth provider supports dynamic redirect URIs, use random high ports to prevent port-squatting attacks.</p><h3>Putting It All Together</h3><p>Here’s a complete example that ties everything together:</p><pre>import { createCallbackServer } from &quot;./server&quot;;<br>import { spawn } from &quot;child_process&quot;;<br><br>export async function getAuthCode(authUrl: string): Promise&lt;string&gt; {<br>  const server = createCallbackServer();<br><br>  try {<br>    // Start the server<br>    await server.start({<br>      port: 3000,<br>      hostname: &quot;localhost&quot;,<br>      successHtml: &quot;&lt;h1&gt;Success! You can close this window.&lt;/h1&gt;&quot;,<br>      errorHtml: &quot;&lt;h1&gt;Error: {{error_description}}&lt;/h1&gt;&quot;,<br>    });<br><br>    // Open the browser<br>    const opener =<br>      process.platform === &quot;darwin&quot;<br>        ? &quot;open&quot;<br>        : process.platform === &quot;win32&quot;<br>          ? 
&quot;start&quot;<br>          : &quot;xdg-open&quot;;<br><br>    // &quot;start&quot; is a cmd.exe built-in, so on Windows it must run through cmd<br>    if (opener === &quot;start&quot;) {<br>      spawn(&quot;cmd&quot;, [&quot;/c&quot;, &quot;start&quot;, &quot;&quot;, authUrl], { detached: true });<br>    } else {<br>      spawn(opener, [authUrl], { detached: true });<br>    }<br><br>    // Wait for callback<br>    const result = await server.waitForCallback(&quot;/callback&quot;, 30000);<br><br>    if (result.error) {<br>      throw new Error(`OAuth error: ${result.error_description}`);<br>    }<br><br>    return result.code!;<br>  } finally {<br>    // Always cleanup<br>    await server.stop();<br>  }<br>}<br><br>// Usage<br>const code = await getAuthCode(<br>  &quot;https://github.com/login/oauth/authorize?&quot; +<br>    &quot;client_id=xxx&amp;redirect_uri=http://localhost:3000/callback&quot;,<br>);</pre><h3>Best Practices and Next Steps</h3><p>Building a robust OAuth callback server requires attention to detail, but the patterns are consistent across implementations. Key takeaways:</p><ul><li><strong>Use Web Standards APIs</strong> for cross-runtime compatibility</li><li><strong>Handle all error cases</strong> including timeouts and user cancellation</li><li><strong>Provide clear user feedback</strong> with custom success/error pages</li><li><strong>Implement security measures</strong> like state validation and localhost binding</li><li><strong>Clean up resources</strong> by always stopping the server after use</li></ul><p>This localhost callback approach has become the de facto standard for OAuth in CLI tools. Libraries like <a href="https://github.com/kriasoft/oauth-callback">oauth-callback</a> provide production-ready implementations with additional features like automatic browser detection, token persistence, and PKCE support.</p><p>Modern OAuth is moving toward even better solutions like Device Code Flow for headless environments and Dynamic Client Registration for eliminating pre-shared secrets. But for now, the localhost callback server remains the most widely supported and user-friendly approach for bringing OAuth to command-line tools.</p><p>Ready to implement OAuth in your CLI tool? 
Check out the complete <a href="https://github.com/kriasoft/oauth-callback">oauth-callback</a> library for a battle-tested implementation that handles all the edge cases discussed here.</p><p><em>This tutorial is part of a series on modern authentication patterns. Follow </em><a href="https://x.com/koistya"><em>@koistya</em></a><em> for more insights on building secure, user-friendly developer tools.</em></p><hr><p><a href="https://levelup.gitconnected.com/building-a-localhost-oauth-callback-server-in-node-js-866be0765d44">Building a Localhost OAuth Callback Server in Node.js</a> was originally published in <a href="https://levelup.gitconnected.com">Level Up Coding</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Why I Built MCP Client Generator (And Why You Should Care)]]></title>
            <link>https://medium.com/@koistya/why-i-built-mcp-client-generator-and-why-you-should-care-c860193ca902?source=rss-692b968dbc82------2</link>
            <guid isPermaLink="false">https://medium.com/p/c860193ca902</guid>
            <category><![CDATA[mcp-server]]></category>
            <category><![CDATA[ai]]></category>
            <category><![CDATA[open-source]]></category>
            <category><![CDATA[openai]]></category>
            <category><![CDATA[llm]]></category>
            <dc:creator><![CDATA[Konstantin Tarkus]]></dc:creator>
            <pubDate>Wed, 13 Aug 2025 14:11:40 GMT</pubDate>
            <atom:updated>2025-08-13T14:11:40.080Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*z_M1F0tR20P4OybCreDEJw.png" /></figure><blockquote><strong>⚠️ Upfront disclaimer: This is an early prototype exploring new approaches to API integration. It’s not production-ready, but I’m sharing it to gather feedback from developers facing similar challenges.</strong></blockquote><p>Picture this: You’re building with AI coding assistants like Claude or GitHub Copilot, and they’re using MCP tools to interact with your services. But here’s the catch — <strong>inefficient tool usage can dramatically increase your token costs</strong>. A single poorly optimized loop calling APIs could burn through tokens faster than you expect.</p><p>I learned this the hard way. My AI assistant was making individual API calls in a loop, consuming tokens at an alarming rate. That’s when I realized: we need a better way to interact with MCP servers — one that’s both cost-effective and developer-friendly.</p><h3>The Hidden Cost of AI Tool Usage</h3><p>You know the drill — you need to connect to GitHub, Notion, Linear, and multiple other services. Each one has its own SDK quirks, authentication dance, and type definitions that may or may not be up-to-date. But there’s a bigger problem:</p><p><strong>When AI agents use MCP tools inefficiently, your costs can escalate quickly.</strong></p><p>Instead of letting AI assistants make hundreds of individual tool calls, what if we could generate efficient, batched automation scripts? What if we could have type-safe clients that encourage best practices? 
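</p><p>To make the cost point concrete, here’s a toy comparison. The callTool stub below is hypothetical and merely counts invocations, standing in for real MCP tool calls that each consume tokens:</p>

```typescript
// Toy model: every tool invocation an AI agent makes costs tokens for the
// request and the response, so call count is a rough proxy for token cost.
let toolCalls = 0;
function callTool(name: string, args: unknown) {
  toolCalls++; // hypothetical stub; a real call would hit an MCP server
  return { ok: true, name, args };
}

const items = Array.from({ length: 100 }, (_, i) => ({ id: i }));

// Naive pattern: the agent loops, issuing one tool call per item.
for (const item of items) callTool("update_issue", item);
const naiveCalls = toolCalls; // 100 invocations

// Batched pattern: a generated script sends everything in a single call.
toolCalls = 0;
callTool("update_issues_batch", { items });
const batchedCalls = toolCalls; // 1 invocation

console.log({ naiveCalls, batchedCalls });
```

<p>One hundred invocations versus one; the payload still travels, but you pay the per-call protocol overhead only once.</p><p>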
What if authentication could just… work?</p><p>That’s when I discovered the potential of the Model Context Protocol (MCP) and decided to build something to solve this problem.</p><h3>The Integration Challenge Many Developers Face</h3><p>Whether you’re a solo developer automating your workflow or part of a startup building integrations, you’ve likely encountered this scenario:</p><ul><li>Create GitHub issues from Notion pages</li><li>Sync tasks with Linear for bug tracking</li><li>Move data between multiple platforms</li><li>Do it all with proper TypeScript support</li></ul><p>The traditional approach? Install multiple SDKs, manage different authentication flows, deal with various error handling patterns, and hope everything stays in sync. Oh, and don’t forget to create OAuth applications for each service — complete with client IDs and secrets that you’ll need to manage securely.</p><p>The boilerplate accumulates quickly.</p><h3>Enter MCP Client Generator: A Prototype</h3><p>Here’s what I’m experimenting with:</p><pre>import { notion, github, linear } from &quot;./lib/mcp-client&quot;;<br><br>// One command will generate all of this with full type safety<br>const page = await notion.createPage({<br>  title: &quot;Bug Report: Login Issues&quot;,<br>  content: &quot;Users are reporting authentication failures...&quot;,<br>});<br>const issue = await github.createIssue({<br>  title: page.title,<br>  body: page.content,<br>});<br>await linear.createIssue({<br>  title: `Bug: ${page.title}`,<br>  description: `GitHub Issue: ${issue.html_url}`,<br>  teamId: &quot;engineering&quot;,<br>});</pre><p>That’s the goal. 
Three services, fully typed, with a single configuration file:</p><pre>{<br>  &quot;mcpServers&quot;: {<br>    &quot;notion&quot;: {<br>      &quot;type&quot;: &quot;http&quot;,<br>      &quot;url&quot;: &quot;https://mcp.notion.com/mcp&quot;<br>    },<br>    &quot;github&quot;: {<br>      &quot;type&quot;: &quot;http&quot;,<br>      &quot;url&quot;: &quot;https://api.githubcopilot.com/mcp/&quot;<br>    },<br>    &quot;linear&quot;: {<br>      &quot;type&quot;: &quot;sse&quot;,<br>      &quot;url&quot;: &quot;https://mcp.linear.app/sse&quot;<br>    }<br>  }<br>}</pre><h3>The Authentication Reality</h3><p>Here’s what I’m exploring: <strong>simplified OAuth setup where possible</strong>.</p><p>Traditional approach:</p><ol><li>Go to GitHub’s developer settings</li><li>Create a new OAuth app</li><li>Configure redirect URIs</li><li>Copy client ID and secret</li><li>Repeat for each service</li><li>Manage secrets securely</li><li>Handle token refresh logic</li><li>Deal with different OAuth flows per service</li></ol><p><strong>The idealized MCP approach</strong> (when fully implemented):</p><ol><li>Run npx mcp-client-gen</li><li>Configure your MCP server endpoints</li><li>Authenticate through browser redirect</li><li>Tokens stored locally for reuse</li></ol><p><strong>Reality check</strong>: This relies on RFC 7591 Dynamic Client Registration, which:</p><ul><li><strong>Isn’t universally supported</strong> — Many services don’t allow dynamic registration</li><li><strong>Has security implications</strong> — Some orgs prohibit dynamic client creation</li><li><strong>Still requires browser auth</strong> — You’ll be redirected to approve access</li><li><strong>Needs secure token storage</strong> — We handle refresh, but you manage security</li></ul><p>For services without dynamic registration, you’ll still need to create OAuth apps manually and provide credentials in your config.</p><h3>The Potential for Cost Optimization</h3><p><strong>The Efficiency Problem</strong>: When 
using AI assistants with APIs, inefficient patterns can increase costs. For example, making 100 individual API calls instead of batched requests means more tokens for tool invocations, responses, and error handling.</p><p><strong>The Proposed Solution</strong>: Generate automation scripts using MCP Client Generator that could:</p><ul><li>Batch operations efficiently</li><li>Use proper pagination</li><li>Cache responses when appropriate</li><li>Provide type safety to prevent errors that trigger retries</li></ul><p>While I don’t have exact metrics yet (the tool is still in development), the potential for cost reduction is significant based on the difference between individual vs batched API operations.</p><h3>Potential Use Cases</h3><p><strong>Cross-Platform Data Sync</strong>: Scripts that keep GitHub issues and Notion project pages in sync, automatically updating status changes across platforms.</p><p><strong>Automated Reporting</strong>: Weekly reports pulling data from GitHub (commits, PRs), Jira (completed tickets), and Slack (team activity), then creating summaries in Notion.</p><p><strong>Incident Response</strong>: When monitoring systems detect issues, automatically create Slack threads, open GitHub issues, and update status pages with proper context.</p><p><strong>Content Pipeline</strong>: Write technical content in one platform and cross-post to Dev.to, LinkedIn, and company blogs with platform-specific formatting.</p><h3>Current State &amp; Roadmap</h3><h3>Actually Implemented (You can test these today)</h3><ul><li>✅ CLI interface and configuration parsing</li><li>✅ Interactive prompts with smart defaults</li><li>✅ Basic scaffolding for client generation</li><li>✅ Multiple config format support (.mcp.json, .cursor/, .vscode/)</li></ul><h3>Not Yet Working (Under Development)</h3><ul><li>❌ MCP server introspection (cannot connect to servers yet)</li><li>❌ OAuth authentication (no auth flow implemented)</li><li>❌ Type generation from live servers (uses 
mocks currently)</li><li>❌ Error handling and retry logic</li><li>❌ Streaming support</li></ul><p><strong>Reality check</strong>: The core generation pipeline exists, but it doesn’t actually connect to MCP servers yet. Think of it as a foundation waiting for the protocol implementation.</p><h3>Prerequisites (Current)</h3><ul><li>MCP servers must support HTTP transport</li><li>Servers need proper schema exposure</li><li>Node.js 18+ or Bun runtime</li></ul><h3>Performance Considerations</h3><ul><li>Initial connection overhead (estimated ~200ms)</li><li>Type generation at build time (not runtime)</li><li>Tree-shakable output for optimal bundle size</li></ul><h3>Addressing Common Questions</h3><h3>“Why not just use existing SDKs?”</h3><p>This is a valid point. Traditional SDKs are mature and well-tested. However, they present challenges when you need:</p><ul><li>Unified patterns across multiple services</li><li>Consistent authentication handling</li><li>Type safety that stays in sync with API changes</li><li>Integration with AI coding assistants</li></ul><p>MCP Client Generator aims to complement, not replace, existing SDKs. 
It’s specifically designed for scenarios requiring unified access to multiple MCP-enabled services.</p><h3>When to stick with traditional SDKs</h3><ul><li><strong>Production systems</strong> requiring battle-tested reliability</li><li><strong>Complex operations</strong> that need SDK-specific optimizations</li><li><strong>Single service integration</strong> where you don’t need cross-platform consistency</li><li><strong>Teams with existing SDK expertise</strong> and established patterns</li></ul><h3>When this tool might help (once complete)</h3><ul><li><strong>Prototyping</strong> multi-service integrations quickly</li><li><strong>AI-assisted development</strong> where consistent patterns reduce token usage</li><li><strong>Small teams</strong> managing many integrations without dedicated expertise per SDK</li><li><strong>Exploratory projects</strong> testing MCP capabilities</li></ul><h3>“What’s the learning curve?”</h3><p>Fair question. While MCP itself is new, the generated clients use familiar JavaScript/TypeScript patterns. If you can use an SDK, you can use the generated clients. The main learning curve is understanding MCP configuration, which we’re working to simplify with better documentation and examples.</p><h3>“What about existing integration platforms?”</h3><p>Tools like Zapier, n8n, and Make excel at no-code workflows. 
MCP Client Generator targets a different need:</p><ul><li>Type safety in your code</li><li>Custom business logic</li><li>Direct API access without middleware</li><li>Integration with your existing codebase</li><li>Cost-effective high-volume operations</li></ul><h3>“Is dynamic client registration secure?”</h3><p>RFC 7591 Dynamic Client Registration is an OAuth 2.0 extension standard with both benefits and risks:</p><p><strong>Security benefits</strong>:</p><ul><li>Clients register with limited scopes</li><li>Short-lived, refreshable tokens</li><li>No hardcoded secrets in code</li><li>Unique registration per application</li></ul><p><strong>Security considerations</strong>:</p><ul><li><strong>Token storage</strong>: Generated clients store tokens locally — secure your development environment</li><li><strong>Scope creep</strong>: Dynamic registration might request broader permissions than needed</li><li><strong>Audit trails</strong>: Harder to track dynamically created clients in your OAuth provider</li><li><strong>Organizational policies</strong>: Many enterprises prohibit dynamic registration</li></ul><p><strong>Best practice</strong>: Use dynamic registration for development/prototyping. For production, consider traditional OAuth apps with proper secret management (environment variables, secret stores, etc.).</p><p>Always review security implications for your specific use case and comply with your organization’s security policies.</p><h3>“What if MCP doesn’t achieve widespread adoption?”</h3><p>A legitimate concern. 
MCP is emerging technology currently supported by:</p><ul><li><strong>Claude Desktop</strong> — Anthropic’s desktop app with MCP support</li><li><strong>Cline (formerly Claude Dev)</strong> — VS Code extension using MCP</li><li><strong>Continue.dev</strong> — Open-source AI code assistant</li><li><strong>Official MCP servers</strong> — <a href="https://github.com/modelcontextprotocol/servers">GitHub repo</a> includes filesystem, GitHub, GitLab, Slack, and Google Drive implementations</li></ul><p>The ecosystem is small but growing. Even if adoption remains limited, the code generation patterns have value beyond MCP. The project could adapt to other protocols if needed.</p><h3>“How do you handle API changes and maintenance?”</h3><p><strong>The maintenance challenge is real.</strong> Here’s the proposed approach:</p><ul><li><strong>Regeneration workflow</strong>: Run mcp-client-gen again to update types when APIs change</li><li><strong>Version pinning</strong>: Lock to specific MCP server versions in your config</li><li><strong>Git diff review</strong>: Generated code changes are reviewable like any dependency update</li><li><strong>Fallback strategy</strong>: Keep previous generated versions if servers break compatibility</li></ul><p><strong>Reality check</strong>: This adds a build step and requires you to manage regeneration. It’s a tradeoff between automation and control.</p><h3>Why I’m Sharing This Prototype Now</h3><p>AI-assisted development is changing how we build integrations, but the tooling hasn’t caught up. 
I’m sharing this early prototype to:</p><ol><li><strong>Validate the problem</strong>: Do others face similar multi-service integration challenges?</li><li><strong>Gather feedback</strong>: What would make this actually useful?</li><li><strong>Find collaborators</strong>: Who wants to help build this?</li></ol><p>This isn’t a launch — it’s an invitation to experiment together.</p><h3>What’s Next</h3><p>Immediate priorities:</p><ul><li>Complete MCP protocol implementation</li><li>Robust server introspection</li><li>Comprehensive error handling</li><li>Streaming support for real-time APIs</li><li>Plugin system for custom transformations</li></ul><p>But I’m most interested in community input on priorities.</p><h3>How You Can Help</h3><h3>For Developers</h3><ul><li><strong>Test the proof of concept</strong>: Try npx mcp-client-gen and share feedback</li><li><strong>Contribute to development</strong>:</li><li>MCP server introspection</li><li>Authentication flows</li><li>Error handling patterns</li><li>Test coverage</li></ul><h3>For Potential Users</h3><ul><li><strong>Share your use cases</strong>: What MCP servers do you need?</li><li><strong>Provide feedback</strong>: What would make this useful for your workflow?</li><li><strong>Help with documentation</strong>: Explain concepts to newcomers</li></ul><h3>For MCP Server Implementers</h3><ul><li><strong>Feedback on approach</strong>: What patterns work best?</li><li><strong>Schema standards</strong>: How should servers expose capabilities?</li></ul><h3>Get Started</h3><pre># Try the interactive mode<br>npx mcp-client-gen</pre><pre># Or quick mode with defaults<br>npx mcp-client-gen -y</pre><ul><li><strong>NPM</strong>: <a href="https://www.npmjs.com/package/mcp-client-gen">npmjs.com/package/mcp-client-gen</a></li><li><strong>GitHub</strong>: <a href="https://github.com/kriasoft/mcp-client-gen">github.com/kriasoft/mcp-client-gen</a></li><li><strong>Issues</strong>: <a 
href="https://github.com/kriasoft/mcp-client-gen/issues">Report bugs or request features</a></li><li><strong>Discussions</strong>: <a href="https://github.com/kriasoft/mcp-client-gen/discussions">Join the conversation</a></li></ul><h3>The Vision</h3><p>I believe we’re at an inflection point in API integration, especially with AI-assisted development. While traditional SDKs remain valuable, there’s room for new approaches that better serve modern development workflows.</p><p>The future I’m building toward:</p><ul><li>Configuration-driven service connections</li><li>Type safety as a default</li><li>Abstracted authentication complexity</li><li>Efficient clients for both AI and human developers</li></ul><p>This is an early-stage project with ambitious goals. If you’re interested in shaping how developers interact with MCP services, I’d love your input and collaboration.</p><p><em>The MCP Client Generator is MIT licensed and available on </em><a href="https://github.com/kriasoft/mcp-client-gen"><em>GitHub</em></a><em>. If this project helps you build amazing integrations, consider </em><a href="https://github.com/sponsors/koistya"><em>sponsoring the development</em></a><em> to support continued progress.</em></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=c860193ca902" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Zero-Wait PR Previews: The Pre-Configured Slots Pattern]]></title>
            <link>https://levelup.gitconnected.com/zero-wait-pr-previews-the-pre-configured-slots-pattern-72d711bdd70d?source=rss-692b968dbc82------2</link>
            <guid isPermaLink="false">https://medium.com/p/72d711bdd70d</guid>
            <category><![CDATA[github-actions]]></category>
            <category><![CDATA[developer]]></category>
            <category><![CDATA[javascript]]></category>
            <category><![CDATA[devops]]></category>
            <category><![CDATA[programming]]></category>
            <dc:creator><![CDATA[Konstantin Tarkus]]></dc:creator>
            <pubDate>Fri, 25 Jul 2025 14:18:17 GMT</pubDate>
            <atom:updated>2025-07-25T16:26:01.107Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*rCTyD6ysm3fHT1KBW1MuBA.jpeg" /></figure><p>Ever waited for <strong>PR preview environments</strong> to spin up? Yeah, me too. Here’s a pattern that changed the game for our team: pre-configured <strong>deployment slots</strong> with deterministic routing.</p><h3>The Problem</h3><p>Traditional PR preview workflows go something like this:</p><p>1. Open PR<br> 2. CI/CD provisions a new environment<br> 3. Wait… ⏳<br> 4. Deploy code<br> 5. Wait some more… ⏳<br> 6. Finally get your preview URL</p><p>The provisioning step is the killer. Whether you’re using Kubernetes namespaces, cloud functions, or edge workers, creating resources takes time.</p><h3><strong>The Solution: Pre-Configured Slots</strong></h3><p>What if we flipped the script? Instead of creating environments on-demand, we pre-configure a fixed set of deployment slots:</p><pre>tokyo    🔗 https://tokyo.example.com<br>paris    🔗 https://paris.example.com<br>london   🔗 https://london.example.com<br>berlin   🔗 https://berlin.example.com<br>sydney   🔗 https://sydney.example.com<br>madrid   🔗 https://madrid.example.com<br>moscow   🔗 https://moscow.example.com<br>cairo    🔗 https://cairo.example.com<br>dubai    🔗 https://dubai.example.com<br>rome     🔗 https://rome.example.com</pre><p>Then use a deterministic hash to map PR numbers to slots:</p><pre>- uses: kriasoft/pr-codename@v1<br>  id: pr<br>  <br>- run: wrangler deploy --env ${{ steps.pr.outputs.codename }}</pre><p><strong>PR #1234</strong> always maps to <strong>tokyo</strong>. <strong>PR #1235</strong> always maps to <strong>paris</strong>. No provisioning, no waiting.</p><h3><strong>How It Works</strong></h3><p>The magic happens in three parts:</p><h4><strong>1. Pre-Configure Your Slots</strong></h4><p>First, set up your deployment slots. 
Here’s a Cloudflare Workers example:</p><pre># wrangler.toml<br>[env.tokyo]<br>name = &quot;preview-tokyo&quot;<br>route = &quot;tokyo.example.com/*&quot;<br><br>[env.paris]<br>name = &quot;preview-paris&quot;<br>route = &quot;paris.example.com/*&quot;<br><br>[env.london]<br>name = &quot;preview-london&quot;<br>route = &quot;london.example.com/*&quot;<br><br># ... repeat for all slots</pre><h4><strong>2. Deterministic Mapping</strong></h4><p>The <a href="https://github.com/marketplace/actions/pr-codename">PR Codename Action</a> uses a simple hash function to consistently map PR numbers to slot names:</p><pre>const words = [&quot;tokyo&quot;, &quot;paris&quot;, &quot;london&quot;, &quot;berlin&quot;, /* ... */];<br>const index = prNumber % words.length;<br>return words[index];</pre><p>The above is just an example; in reality it uses the <a href="https://github.com/kriasoft/codenames/blob/main/docs/adr/1-hash-function.md"><strong>FNV-1a hashing algorithm</strong></a>.</p><h4><strong>3. Deploy to the Slot</strong></h4><p>Your GitHub Action workflow becomes dead simple:</p><pre>name: Deploy PR Preview<br><br>on:<br>  pull_request:<br>    types: [opened, synchronize]<br><br>jobs:<br>  deploy:<br>    runs-on: ubuntu-latest<br>    steps:<br>      - uses: actions/checkout@v4<br>      <br>      - uses: kriasoft/pr-codename@v1<br>        id: pr<br>        <br>      - name: Deploy to slot<br>        run: |<br>          npm ci<br>          npm run build<br>          wrangler deploy --env ${{ steps.pr.outputs.codename }}<br>          <br>      - name: Comment PR<br>        uses: actions/github-script@v7<br>        with:<br>          script: |<br>            github.rest.issues.createComment({<br>              issue_number: context.issue.number,<br>              owner: context.repo.owner,<br>              repo: context.repo.repo,<br>              body: &#39;🚀 Preview deployed to https://${{ steps.pr.outputs.codename }}.example.com&#39;<br>            })</pre><h3><strong>The 
Benefits</strong></h3><p>This pattern isn’t just a neat trick; it fundamentally changes the rhythm of your development cycle.</p><p><strong>🚀 Zero-Wait Deploys<br></strong>The biggest win. By eliminating the on-demand provisioning step, deployments start <em>immediately</em>. What used to be a 2–3 minute coffee break is now a 30-second task. Your developers stay in the flow, and your pipeline gets a whole lot faster.</p><p><strong>🔗 URLs You Can Actually Share<br></strong>Forget long, ugly, auto-generated URLs. With this pattern, <strong>PR #1234</strong> always maps to <strong>https://tokyo.example.com</strong>. This URL is:</p><ul><li><strong>Memorable:</strong> You can actually remember it.</li><li><strong>Shareable:</strong> Perfect for dropping in a Slack channel, a Linear ticket, or even saying out loud during a Zoom call. No more “Hey, can you find that preview link for me?”</li><li><strong>Bookmarkable:</strong> QA testers and product managers can bookmark slots for features they’re tracking.</li></ul><p><strong>💰 No More Cloud Bill Surprises<br></strong>Dynamic environments are notorious for leaving behind orphaned resources that quietly drain your budget. With a fixed number of slots, your infrastructure costs become predictable. You know exactly what’s running, and you never have to hunt down forgotten preview apps again.</p><p><strong>🧹 Cleanup? What Cleanup?<br></strong>When a PR is merged or closed, there’s no complex teardown script to run. The slot simply becomes available for the next PR. You can even have a workflow that automatically deploys the <strong>main</strong> branch to the slot to keep it fresh. It’s a self-cleaning system.</p><h3><strong>Real-World Considerations</strong></h3><h4><strong>How Many Slots?</strong></h4><p>We’ve found 10–15 slots work well for most teams. 
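</p><p>If you want to sanity-check a slot count against your own PR volume, a few lines of simulation go a long way (FNV-1a as in the action; hashing the PR number’s decimal string is my assumption for illustration):</p>

```typescript
// Deterministic PR → slot mapping sketch. The action uses FNV-1a hashing;
// hashing the PR number's decimal string is an assumption for illustration.
const slots = [
  "tokyo", "paris", "london", "berlin", "sydney",
  "madrid", "moscow", "cairo", "dubai", "rome",
];

// 32-bit FNV-1a over a string's UTF-16 code units.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by FNV prime, keep uint32
  }
  return hash;
}

const slotFor = (pr: number) => slots[fnv1a(String(pr)) % slots.length];

// Simulate 50 open PRs across 10 slots and count the load per slot.
const load = new Map<string, number>();
for (let pr = 1; pr <= 50; pr++) {
  const slot = slotFor(pr);
  load.set(slot, (load.get(slot) ?? 0) + 1);
}
console.log(Object.fromEntries(load)); // how many PRs land on each slot
```

<p>Run it with different slot counts to see where collisions start to bite; the mapping is stable, so the same PR number always lands on the same slot.</p><p>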
The math:</p><ul><li>10 slots + 50 open PRs = each slot serves ~5 PRs</li><li>Only the latest deployment to each slot is accessible</li><li>Most teams only actively review a handful of PRs at once</li></ul><h4><strong>Collision Handling</strong></h4><p>Yes, PRs can map to the same slot. PR #1 and PR #11 both map to the same environment with 10 slots. <strong>This means newer deployments overwrite older ones</strong> — so if you’re reviewing PR #1 and someone pushes PR #11, your preview disappears.</p><p>In practice, this works for many teams because:</p><ol><li>Developers typically work on recent PRs</li><li>Old PR previews naturally expire</li><li>You can always trigger a redeploy to refresh</li></ol><p><strong>When slots don’t work well:</strong> Large teams, high PR velocity, or when multiple people need to review the same PR simultaneously.</p><h4>Database &amp; Stateful Services</h4><p>The biggest challenge with any preview environment is handling databases and stateful services. With slots, you have a few options:</p><ul><li><strong>Shared database:</strong> Fast and cheap, but schema migrations from one PR can break others</li><li><strong>Database per slot:</strong> Better isolation, but requires seeding data for each slot</li><li><strong>Database branching services:</strong> Tools like Neon offer instant database branches (premium option)</li></ul><p>For simple stateless apps, this isn’t an issue. 
For complex apps with databases, it’s the main implementation challenge.</p><h4><strong>Security Notes</strong></h4><ul><li>Use environment-specific secrets for each slot</li><li>Consider adding basic auth to preview domains</li><li>Implement automatic cleanup for stale deployments</li></ul><h4><strong>Beyond Basic Previews</strong></h4><p>This pattern unlocks some cool possibilities:</p><p><strong>Persistent Test Environments</strong>: QA can bookmark specific slots for testing.<br><strong>A/B Testing</strong>: Map feature flags to slots for instant switching.<br><strong>Geographic Testing</strong>: Actually deploy slots to different regions.</p><h3><strong>Try It Yourself</strong></h3><p>Getting started is pretty straightforward:</p><ol><li>Install the action:</li></ol><pre>- uses: kriasoft/pr-codename@v1<br>  id: pr</pre><p>2. Use the codename in your deploy:</p><pre>deploy --env ${{ steps.pr.outputs.codename }}</pre><p>3. Enjoy instant PR previews 🚀</p><p>The full source is on <a href="https://github.com/kriasoft/pr-codename">GitHub</a> if you want to customize the word list or hashing algorithm.</p><h3>Slots vs On-Demand: Quick Comparison</h3><p>Before you dive in, it’s worth understanding how the pre-configured slots pattern stacks up against the traditional on-demand ephemeral environments. 
While this post focuses on slots, knowing the trade-offs helps you make the right choice for your team.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Lz7NlXxxal02pqgPpic0xw.png" /><figcaption>Pre-Configured Deployment Slots vs On-Demand Ephemeral Deployments</figcaption></figure><p><strong>Best for Slots:</strong> Small teams, simple apps, tight budgets</p><p><strong>Best for On-Demand:</strong> Growing teams, complex apps, quality-focused</p><p><em>Nothing prevents you from mixing both approaches — use slots for rapid prototyping and on-demand for critical features.</em></p><h3><strong>Wrapping Up</strong></h3><p>Sometimes the best optimization is avoiding work altogether. By pre-configuring deployment slots and using deterministic routing, we eliminated the biggest bottleneck in our PR workflow.</p><p>Give it a shot and let me know how it works for your team. Happy deploying!</p><p><em>What patterns have you used for PR previews? Drop a comment below 👇 always curious to hear different approaches!</em></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=72d711bdd70d" width="1" height="1" alt=""><hr><p><a href="https://levelup.gitconnected.com/zero-wait-pr-previews-the-pre-configured-slots-pattern-72d711bdd70d">Zero-Wait PR Previews: The Pre-Configured Slots Pattern</a> was originally published in <a href="https://levelup.gitconnected.com">Level Up Coding</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Building Type-Safe WebSocket Applications with Bun and Zod]]></title>
            <link>https://levelup.gitconnected.com/building-type-safe-websocket-applications-with-bun-and-zod-f0aef259a53e?source=rss-692b968dbc82------2</link>
            <guid isPermaLink="false">https://medium.com/p/f0aef259a53e</guid>
            <category><![CDATA[web-development]]></category>
            <category><![CDATA[programming]]></category>
            <category><![CDATA[websocket]]></category>
            <category><![CDATA[typescript]]></category>
            <category><![CDATA[javascript]]></category>
            <dc:creator><![CDATA[Konstantin Tarkus]]></dc:creator>
            <pubDate>Sat, 03 May 2025 23:42:28 GMT</pubDate>
            <atom:updated>2025-11-06T20:58:07.966Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/950/1*tjyOr2QX-kbACvvJNx_mcw.png" /></figure><h3>Introduction</h3><p>In the ever-evolving world of web development, real-time interactions have become less of a luxury and more of an expectation. Whether you’re building a chat application, a collaborative document editor, or a multiplayer game, the need for bidirectional communication is clear, and WebSockets are often the technology of choice.</p><p>But let’s be honest: working with WebSockets can sometimes feel like trying to organize a party where guests randomly shout things at each other across the room. Messages fly back and forth, payload structures are inconsistent, and before you know it, your elegant application architecture looks more like a tangled ball of holiday lights that you’ve promised yourself you’ll sort out “next year.”</p><h4>Enter Bun and Zod</h4><p><a href="https://bun.sh/">Bun</a> has been making waves as a fast JavaScript runtime with built-in WebSocket support that’s both performant and easy to work with. Its native WebSocket implementation (based on uWebSockets) outperforms many alternatives, making it an excellent foundation for real-time applications.</p><p>Meanwhile, <a href="https://zod.dev/">Zod</a> has revolutionized runtime type validation in the JavaScript ecosystem. It provides a way to define schemas that guarantee the shape and type of your data, catching errors before they wreak havoc in your application.</p><h4>The Challenge of WebSocket Communication</h4><p>When building applications with WebSockets, several challenges typically arise:</p><ol><li><strong>Type safety across the wire</strong>: Unlike HTTP requests with well-defined endpoints and schemas, WebSocket messages can be a wild west of untyped JSON.</li><li><strong>Message routing complexity</strong>: As your application grows, so does the variety of messages you need to handle. 
Without a structured system, this often results in sprawling switch statements or complex conditionals.</li><li><strong>Error handling</strong>: When a message doesn’t match your expectations, how do you gracefully handle it and provide meaningful feedback?</li><li><strong>Connection lifecycle management</strong>: Who’s connected? What rooms are they in? How do you manage authentication state across a persistent connection?</li></ol><h4>Introducing WS-Kit</h4><p>To address these challenges, we’ve created <a href="https://kriasoft.com/ws-kit/"><strong>WS-Kit</strong></a> — a type-safe WebSocket router for Bun and other platforms. It combines pluggable validators (Zod, Valibot, custom) with Bun’s WebSocket implementation to create a structured, maintainable approach to real-time messaging.</p><p>At its core, <strong>WS-Kit</strong> gives you:</p><ul><li>A way to define message types with Zod schemas</li><li>A router that automatically validates incoming messages against these schemas</li><li>Handlers that receive only properly typed message payloads</li><li>Built-in support for broadcasting and room-based communication</li><li>Clean error handling patterns</li></ul><p>Instead of wrestling with raw WebSocket messages, you can think in terms of typed routes, similar to how you’d structure a REST API. This approach brings clarity and maintainability to what would otherwise be chaotic message passing.</p><p>In this tutorial, we’ll build a real-time application from the ground up using Bun and WS-Kit. We’ll start with the basics of WebSocket communication in Bun, then gradually introduce type safety with Zod, and finally implement more advanced patterns like authentication and room-based messaging.</p><p>By the end, you’ll have a solid foundation for building robust, type-safe real-time applications that can scale with your needs. 
No more digging through message payloads with console.log at 2 AM, wondering why your users are seeing gibberish on their screens instead of the latest game state.</p><p>So grab your favorite beverage, fire up your code editor, and let’s bring some order to the WebSocket chaos. Your future self — the one who has to maintain this code six months from now — will thank you.</p><h3>Part 1: WebSockets Fundamentals in Bun</h3><h4>What Are WebSockets and Why Use Them?</h4><p>Remember the days of polling a server every few seconds to check for updates? Like repeatedly asking “Are we there yet?” on a road trip, except the server is the increasingly annoyed parent. That’s the world WebSockets were designed to rescue us from.</p><p>Unlike traditional HTTP connections that follow a request-response pattern, WebSockets establish a persistent, two-way communication channel between clients and servers. Once established, both sides can send messages to each other at any time without the overhead of creating new connections. This makes WebSockets perfect for:</p><ul><li>Real-time chat applications</li><li>Live dashboards and data visualizations</li><li>Multiplayer games</li><li>Collaborative editing tools</li><li>Notification systems</li><li>Stock tickers and sports scores</li></ul><p>In essence, anywhere you need low-latency, bidirectional communication, WebSockets are your friend.</p><h4>Bun’s Native WebSocket Implementation</h4><p><a href="https://bun.sh/">Bun</a> comes with a blazing-fast, native WebSocket implementation built right in. No need to reach for additional packages like ws or socket.io (though they&#39;re excellent tools in their own right). 
Bun&#39;s implementation is:</p><ul><li><strong>Fast</strong>: Built on top of Bun’s optimized JavaScript runtime</li><li><strong>Memory-efficient</strong>: Uses less memory than Node.js alternatives</li><li><strong>Standards-compliant</strong>: Follows the WebSocket protocol (RFC 6455)</li><li><strong>Feature-rich</strong>: Includes built-in support for the PubSub pattern</li></ul><p>This native implementation means you can start building real-time applications immediately without any external dependencies for the WebSocket functionality itself.</p><h4>Setting Up a Basic WebSocket Server in Bun</h4><p>Let’s create a simple WebSocket echo server to demonstrate how easy it is to get started with Bun. Create a new file called server.ts:</p><pre>import { serve } from &quot;bun&quot;;<br><br>serve({<br>  port: 3000,<br><br>  fetch(req, server) {<br>    // Extract URL from the request<br>    const url = new URL(req.url);<br><br>    // Handle WebSocket upgrade requests<br>    if (url.pathname === &quot;/ws&quot;) {<br>      // Upgrade HTTP request to WebSocket connection<br>      const success = server.upgrade(req);<br><br>      // Return a fallback response if upgrade fails<br>      if (!success) {<br>        return new Response(&quot;WebSocket upgrade failed&quot;, { status: 400 });<br>      }<br><br>      // The connection is handled by the websocket handlers<br>      return undefined;<br>    }<br><br>    // Handle regular HTTP requests<br>    return new Response(<br>      &quot;Hello from Bun! Try connecting to /ws with a WebSocket client.&quot;,<br>    );<br>  },<br><br>  // Define what happens when a WebSocket connects<br>  websocket: {<br>    // Called when a WebSocket connection is established<br>    open(ws) {<br>      console.log(&quot;WebSocket connection opened&quot;);<br>      ws.send(<br>        &quot;Welcome to the echo server! 
Send me a message and I&#39;ll send it right back.&quot;,<br>      );<br>    },<br><br>    // Called when a message is received<br>    message(ws, message) {<br>      console.log(`Received: ${message}`);<br>      // Echo the message back<br>      ws.send(`You said: ${message}`);<br>    },<br><br>    // Called when the connection closes<br>    close(ws, code, reason) {<br>      console.log(`WebSocket closed with code ${code} and reason: ${reason}`);<br>    },<br><br>    // Called when there&#39;s an error<br>    error(ws, error) {<br>      console.error(`WebSocket error: ${error}`);<br>    },<br>  },<br>});<br><br>console.log(&quot;WebSocket echo server listening on ws://localhost:3000/ws&quot;);</pre><p>To run this example:</p><pre>bun run server.ts</pre><h3>Connecting from a Browser Client</h3><p>Now let’s create a simple HTML client to connect to our WebSocket server:</p><pre>&lt;!DOCTYPE html&gt;<br>&lt;html lang=&quot;en&quot;&gt;<br>  &lt;head&gt;<br>    &lt;meta charset=&quot;UTF-8&quot; /&gt;<br>    &lt;title&gt;WebSocket Test&lt;/title&gt;<br>    &lt;style&gt;<br>      body {<br>        font-family: system-ui, sans-serif;<br>        max-width: 800px;<br>        margin: 0 auto;<br>        padding: 20px;<br>      }<br>      #messages {<br>        height: 300px;<br>        border: 1px solid #ccc;<br>        margin-bottom: 10px;<br>        padding: 10px;<br>        overflow-y: auto;<br>      }<br>      #messageForm {<br>        display: flex;<br>        gap: 10px;<br>      }<br>      #messageInput {<br>        flex-grow: 1;<br>        padding: 8px;<br>      }<br>    &lt;/style&gt;<br>  &lt;/head&gt;<br>  &lt;body&gt;<br>    &lt;h1&gt;Bun WebSocket Echo Test&lt;/h1&gt;<br>    &lt;div id=&quot;status&quot;&gt;Disconnected&lt;/div&gt;<br>    &lt;div id=&quot;messages&quot;&gt;&lt;/div&gt;<br>    &lt;form id=&quot;messageForm&quot;&gt;<br>      &lt;input type=&quot;text&quot; id=&quot;messageInput&quot; placeholder=&quot;Type a message...&quot; /&gt;<br>      
&lt;button type=&quot;submit&quot;&gt;Send&lt;/button&gt;<br>    &lt;/form&gt;<br><br>    &lt;script&gt;<br>      const statusEl = document.getElementById(&quot;status&quot;);<br>      const messagesEl = document.getElementById(&quot;messages&quot;);<br>      const messageFormEl = document.getElementById(&quot;messageForm&quot;);<br>      const messageInputEl = document.getElementById(&quot;messageInput&quot;);<br><br>      // Create a WebSocket connection<br>      const socket = new WebSocket(&quot;ws://localhost:3000/ws&quot;);<br><br>      // Connection opened<br>      socket.addEventListener(&quot;open&quot;, (event) =&gt; {<br>        statusEl.textContent = &quot;Connected&quot;;<br>        statusEl.style.color = &quot;green&quot;;<br>        addMessage(&quot;System&quot;, &quot;Connected to server&quot;);<br>      });<br><br>      // Listen for messages<br>      socket.addEventListener(&quot;message&quot;, (event) =&gt; {<br>        addMessage(&quot;Server&quot;, event.data);<br>      });<br><br>      // Connection closed<br>      socket.addEventListener(&quot;close&quot;, (event) =&gt; {<br>        statusEl.textContent = &quot;Disconnected&quot;;<br>        statusEl.style.color = &quot;red&quot;;<br>        addMessage(&quot;System&quot;, `Disconnected: Code ${event.code}`);<br>      });<br><br>      // Connection error<br>      socket.addEventListener(&quot;error&quot;, (event) =&gt; {<br>        statusEl.textContent = &quot;Error&quot;;<br>        statusEl.style.color = &quot;red&quot;;<br>        addMessage(&quot;System&quot;, &quot;Connection error&quot;);<br>        console.error(&quot;WebSocket error:&quot;, event);<br>      });<br><br>      // Send message<br>      messageFormEl.addEventListener(&quot;submit&quot;, (e) =&gt; {<br>        e.preventDefault();<br>        const message = messageInputEl.value;<br>        if (message &amp;&amp; socket.readyState === WebSocket.OPEN) {<br>          socket.send(message);<br>          addMessage(&quot;You&quot;, 
message);<br>          messageInputEl.value = &quot;&quot;;<br>        }<br>      });<br><br>      // Helper to add message to the UI<br>      function addMessage(sender, content) {<br>        const messageEl = document.createElement(&quot;div&quot;);<br>        const senderEl = document.createElement(&quot;strong&quot;);<br>        senderEl.textContent = `${sender}: `;<br>        // Append content as plain text so echoed message data can&#39;t inject HTML<br>        messageEl.append(senderEl, content);<br>        messagesEl.appendChild(messageEl);<br>        messagesEl.scrollTop = messagesEl.scrollHeight;<br>      }<br>    &lt;/script&gt;<br>  &lt;/body&gt;<br>&lt;/html&gt;</pre><p>Open this HTML file in a browser, and you should be able to send messages to your Bun WebSocket server and see the echoed responses.</p><h3>Understanding the WebSocket Lifecycle</h3><p>WebSockets follow a specific lifecycle:</p><ol><li><strong>Connection</strong> — The client initiates a handshake by sending an HTTP request with an Upgrade: websocket header. If the server accepts, it responds with a 101 Switching Protocols status.</li><li><strong>Open</strong> — After a successful handshake, the WebSocket connection is established and the open event fires.</li><li><strong>Message Exchange</strong> — Both client and server can send messages at any time.</li><li><strong>Closing</strong> — Either side can initiate closing the connection with a close code and reason.</li><li><strong>Closed</strong> — The connection is terminated. No more messages can be sent.</li></ol><h3>The Challenge of Raw WebSocket Messages</h3><p>While our echo server is simple, real applications quickly become more complex. As soon as you start building a non-trivial application, you’ll encounter challenges:</p><ol><li><strong>Message Format</strong>: Should you use JSON? Binary? 
Some custom format?</li><li><strong>Message Types</strong>: How do you distinguish between different kinds of messages?</li><li><strong>Routing Logic</strong>: How do you direct messages to the appropriate handlers?</li><li><strong>Error Handling</strong>: What happens when a message isn’t formatted correctly?</li></ol><p>Let’s upgrade our example to handle JSON messages with a type field:</p><pre>import { serve } from &quot;bun&quot;;<br><br>type ChatMessage = {<br>  type: string;<br>  content?: any;<br>};<br><br>serve({<br>  port: 3000,<br><br>  fetch(req, server) {<br>    const url = new URL(req.url);<br>    if (url.pathname === &quot;/ws&quot;) {<br>      const success = server.upgrade(req);<br>      return success<br>        ? undefined<br>        : new Response(&quot;WebSocket upgrade failed&quot;, { status: 400 });<br>    }<br>    return new Response(&quot;Hello from Bun!&quot;);<br>  },<br><br>  websocket: {<br>    open(ws) {<br>      console.log(&quot;Connection opened&quot;);<br>    },<br><br>    message(ws, data) {<br>      try {<br>        // Parse the incoming message<br>        const message = JSON.parse(data as string) as ChatMessage;<br><br>        // Handle different message types<br>        switch (message.type) {<br>          case &quot;CHAT&quot;:<br>            console.log(`Chat message: ${message.content.text}`);<br>            // Echo back with a timestamp<br>            ws.send(<br>              JSON.stringify({<br>                type: &quot;CHAT_ECHO&quot;,<br>                content: {<br>                  original: message.content.text,<br>                  timestamp: new Date().toISOString(),<br>                },<br>              }),<br>            );<br>            break;<br><br>          case &quot;PING&quot;:<br>            ws.send(<br>              JSON.stringify({<br>                type: &quot;PONG&quot;,<br>                content: { timestamp: new Date().toISOString() },<br>              }),<br>            );<br>            
break;<br><br>          default:<br>            ws.send(<br>              JSON.stringify({<br>                type: &quot;ERROR&quot;,<br>                content: { message: `Unknown message type: ${message.type}` },<br>              }),<br>            );<br>            break;<br>        }<br>      } catch (error) {<br>        console.error(&quot;Error processing message:&quot;, error);<br>        ws.send(<br>          JSON.stringify({<br>            type: &quot;ERROR&quot;,<br>            content: { message: &quot;Could not parse message&quot; },<br>          }),<br>        );<br>      }<br>    },<br><br>    close(ws, code, reason) {<br>      console.log(`Connection closed: ${code} ${reason}`);<br>    },<br>  },<br>});<br><br>console.log(&quot;Improved WebSocket server running on ws://localhost:3000/ws&quot;);</pre><h4>The Problem with This Approach</h4><p>Even in this simple example, we’re already seeing issues:</p><ol><li><strong>Type Safety</strong>: The as ChatMessage cast doesn&#39;t guarantee the message actually has the right structure.</li><li><strong>Error Prone</strong>: It’s easy to typo a message type or forget a field.</li><li><strong>Scaling Issues</strong>: As you add more message types, the switch statement becomes unwieldy.</li><li><strong>Maintenance Burden</strong>: There’s no centralized definition of message structures.</li></ol><p>This is where <strong>WS-Kit</strong> comes in, providing a structured approach to handling WebSocket messages with pluggable validators. It turns our messy switch statement into clear, type-safe routes with validation baked in.</p><p>In the next section, we’ll explore how to solve these problems using <a href="https://kriasoft.com/ws-kit/">WS-Kit</a> with Zod schemas for type-safety.</p><h3>Part 2: The Type-Safety Challenge</h3><h4>The Wild West of WebSocket Messages</h4><p>If you’ve been following along, you now have a basic WebSocket server running in Bun. 
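</p><p>Before dissecting the challenges one by one, a minimal sketch shows how fragile string-keyed routing is. (The route function below is our own illustration, not code from the server above.) A single typo in the type string silently falls through to the default branch, and the compiler has nothing to say about it:</p>

```typescript
// Sketch: string-keyed switch routing. The string literals are the whole
// "protocol", so any typo in them becomes a silent runtime mismatch.
type Incoming = { type: string; content?: unknown };

function route(raw: string): string {
  const message = JSON.parse(raw) as Incoming;
  switch (message.type) {
    case "CHAT":
      return "handled chat";
    case "PING":
      return "pong";
    default:
      return `unknown type: ${message.type}`;
  }
}

console.log(route('{"type":"CHAT"}')); // handled chat
console.log(route('{"type":"Chat"}')); // unknown type: Chat — silent drift
```

<p>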
Messages are flying back and forth, connections are being established and closed — everything seems great! But then you try to build something real, and suddenly it feels like you’re trying to herd cats in the dark. With a blindfold on. While riding a unicycle.</p><p>The challenge with WebSocket communication is that, unlike REST APIs with their well-defined endpoints and request/response structures, WebSockets are essentially a continuous stream of messages. There’s no built-in mechanism to ensure that:</p><ol><li>Messages have the right structure</li><li>Required fields are present</li><li>Values have the correct types</li><li>Handlers receive only messages they’re designed to process</li></ol><p>This is where many WebSocket applications start to crumble under their own complexity. Let’s explore the key challenges in detail.</p><h4>The “What Did I Just Receive?” Problem</h4><p>Take a look at this common WebSocket message handler pattern:</p><pre>ws.addEventListener(&quot;message&quot;, (event) =&gt; {<br>  const data = JSON.parse(event.data);<br><br>  if (data.type === &quot;chat_message&quot;) {<br>    // Is data.content defined? Is it a string? Who knows!<br>    chatSystem.processMessage(data.content);<br>  } else if (data.type === &quot;user_joined&quot;) {<br>    // Does data.userId exist? 
Is it a number or string?<br>    notifyUserJoined(data.userId);<br>  } else if (data.type === &quot;typing_indicator&quot;) {<br>    // Is data.isTyping a boolean or something else?<br>    updateTypingStatus(data.userId, data.isTyping);<br>  }<br>  // And so on...<br>});</pre><p>This approach has several issues:</p><ol><li><strong>No guarantee of structure</strong>: Just because data.type is &#39;chat_message&#39; doesn&#39;t mean data.content exists.</li><li><strong>Type coercion traps</strong>: JavaScript’s loose typing means data.isTyping could be the string &quot;false&quot; instead of the boolean false.</li><li><strong>Typo landmines</strong>: Mistype &#39;chat_message&#39; as &#39;chat_mesage&#39; and your handler won&#39;t trigger.</li><li><strong>Implicit dependencies</strong>: It’s not clear what fields each message type requires.</li></ol><h4>The Evolution of Error Messages</h4><p>As your application grows, so does the sophistication (or desperation) of your error handling:</p><p><strong>Stage 1: Blissful Ignorance</strong></p><pre>ws.addEventListener(&#39;message&#39;, (event) =&gt; {<br>  const data = JSON.parse(event.data);<br>  processChatMessage(data.room, data.text); // What could go wrong?<br>});</pre><p><strong>Stage 2: The console.log Debugging Phase</strong></p><pre>ws.addEventListener(&#39;message&#39;, (event) =&gt; {<br>  const data = JSON.parse(event.data);<br>  console.log(&quot;Received:&quot;, data); // Let me see what I&#39;m dealing with<br>  if (data.room &amp;&amp; data.text) {<br>    processChatMessage(data.room, data.text);<br>  }<br>});</pre><p><strong>Stage 3: Trust Issues</strong></p><pre>ws.addEventListener(&#39;message&#39;, (event) =&gt; {<br>  try {<br>    const data = JSON.parse(event.data);<br>    if (!data || typeof data !== &#39;object&#39;) {<br>      throw new Error(&#39;Invalid message format&#39;);<br>    }<br>    <br>    if (!data.type || typeof data.type !== &#39;string&#39;) {<br>      throw new Error(&#39;Missing 
or invalid type field&#39;);<br>    }<br>    <br>    if (data.type === &#39;chat_message&#39;) {<br>      if (!data.room || typeof data.room !== &#39;string&#39;) {<br>        throw new Error(&#39;Missing or invalid room field&#39;);<br>      }<br>      <br>      if (!data.text || typeof data.text !== &#39;string&#39;) {<br>        throw new Error(&#39;Missing or invalid text field&#39;);<br>      }<br>      <br>      processChatMessage(data.room, data.text);<br>    }<br>    // And so on for EVERY message type...<br>  } catch (error) {<br>    console.error(&quot;Error processing message:&quot;, error);<br>    ws.send(JSON.stringify({<br>      type: &#39;error&#39;,<br>      message: error.message<br>    }));<br>  }<br>});</pre><p>By Stage 3, a third of your codebase is dedicated to validation, and you’re seriously considering a career change to something less frustrating… like herding actual cats.</p><h4>The TypeScript Mirage</h4><p>“But wait,” you might say, “I’m using TypeScript! I’ve defined interfaces for all my message types!”</p><pre>interface BaseChatMessage {<br>  type: string;<br>}<br><br>interface ChatMessage extends BaseChatMessage {<br>  type: &#39;chat_message&#39;;<br>  room: string;<br>  text: string;<br>}<br><br>interface UserJoinedMessage extends BaseChatMessage {<br>  type: &#39;user_joined&#39;;<br>  userId: string;<br>  username: string;<br>}<br><br>// More message types...<br><br>type AllMessageTypes = ChatMessage | UserJoinedMessage | /* ... 
*/;<br><br>ws.addEventListener(&#39;message&#39;, (event) =&gt; {<br>  const data = JSON.parse(event.data) as AllMessageTypes; // The infamous &quot;trust me&quot; cast<br><br>  switch (data.type) {<br>    case &#39;chat_message&#39;:<br>      // TypeScript now thinks data is ChatMessage<br>      processChatMessage(data.room, data.text);<br>      break;<br>    case &#39;user_joined&#39;:<br>      // TypeScript now thinks data is UserJoinedMessage<br>      notifyUserJoined(data.userId, data.username);<br>      break;<br>  }<br>});</pre><p>This looks better! TypeScript gives you nice autocomplete and seems to understand your message structure. But there’s an illusion at play here: that as AllMessageTypes cast is basically you telling TypeScript, &quot;Trust me, this JSON is properly formatted.&quot; But at runtime, all those lovely types disappear, and you&#39;re back to the Wild West.</p><p>What if someone sends this?</p><pre>{<br>  &quot;type&quot;: &quot;chat_message&quot;,<br>  &quot;rum&quot;: &quot;general&quot;,  // Typo: &quot;rum&quot; instead of &quot;room&quot;<br>  &quot;text&quot;: &quot;Hello world!&quot;<br>}</pre><p>TypeScript won’t save you. Your code will try to process data.room, which is undefined, potentially causing errors downstream.</p><h4>The Runtime Validation Gap</h4><p>The core issue is the gap between compile-time types (what TypeScript checks) and runtime values (what actually arrives over the wire). 
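</p><p>Here is the gap in miniature (the type guard below is our own illustrative stand-in for what a validation library automates): the as cast satisfies the compiler, while only the hand-written guard actually inspects the value at runtime:</p>

```typescript
// The cast type-checks, but nothing verifies the JSON at runtime.
type ChatMessage = { type: "chat_message"; room: string; text: string };

const raw = '{"type":"chat_message","rum":"general","text":"hi"}'; // note "rum"

const msg = JSON.parse(raw) as ChatMessage; // compiles, proves nothing
console.log(msg.room); // undefined at runtime, despite the declared type

// A runtime type guard is the only thing that actually checks the value
function isChatMessage(value: unknown): value is ChatMessage {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    v.type === "chat_message" &&
    typeof v.room === "string" &&
    typeof v.text === "string"
  );
}

console.log(isChatMessage(JSON.parse(raw))); // false: the typo is caught
```

<p>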
This is where validation libraries like Zod come in.</p><p>Zod lets you define schemas that serve as both TypeScript types AND runtime validators:</p><pre>import { z } from &quot;zod&quot;;<br><br>// Define message schemas<br>const ChatMessageSchema = z.object({<br>  type: z.literal(&quot;chat_message&quot;),<br>  room: z.string(),<br>  text: z.string(),<br>});<br><br>const UserJoinedSchema = z.object({<br>  type: z.literal(&quot;user_joined&quot;),<br>  userId: z.string(),<br>  username: z.string(),<br>});<br><br>// Infer TypeScript types from schemas<br>type ChatMessage = z.infer&lt;typeof ChatMessageSchema&gt;;<br>type UserJoinedMessage = z.infer&lt;typeof UserJoinedSchema&gt;;<br><br>// Use in handler<br>ws.addEventListener(&quot;message&quot;, (event) =&gt; {<br>  const data = JSON.parse(event.data);<br><br>  try {<br>    if (data.type === &quot;chat_message&quot;) {<br>      const validatedData = ChatMessageSchema.parse(data);<br>      processChatMessage(validatedData.room, validatedData.text);<br>    } else if (data.type === &quot;user_joined&quot;) {<br>      const validatedData = UserJoinedSchema.parse(data);<br>      notifyUserJoined(validatedData.userId, validatedData.username);<br>    }<br>  } catch (error) {<br>    console.error(&quot;Validation error:&quot;, error);<br>    // Send error back to client<br>  }<br>});</pre><p>This is much more robust! Now if someone sends a malformed message, Zod will catch it and provide detailed error information.</p><h4>The Routing Challenge</h4><p>But we still have another problem: as your application grows, this giant message handler becomes unmaintainable. You need a way to:</p><ol><li>Define message types and their validation schemas in one place</li><li>Route incoming messages to the appropriate handlers</li><li>Handle error cases consistently</li><li>Provide type safety throughout the process</li></ol><p>This is where <strong>WS-Kit</strong> comes in. 
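</p><p>To see why a library earns its keep, consider a hand-rolled registry that pairs each message type with a validator and a handler. (This sketch uses plain validator functions for brevity; in practice you would plug in Zod schemas. None of these names are WS-Kit's API.)</p>

```typescript
// A minimal message registry: each entry pairs a runtime validator with a
// handler, which is exactly the plumbing a router library absorbs.
interface Route {
  validate: (data: Record<string, unknown>) => boolean;
  handle: (data: Record<string, unknown>) => string;
}

const routes: Record<string, Route> = {
  chat_message: {
    validate: (d) => typeof d.room === "string" && typeof d.text === "string",
    handle: (d) => `(${d.room}) ${d.text}`,
  },
  user_joined: {
    validate: (d) => typeof d.userId === "string",
    handle: (d) => `joined: ${d.userId}`,
  },
};

function dispatch(raw: string): string {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return "error: invalid JSON";
  }
  if (typeof data !== "object" || data === null) return "error: not an object";
  const message = data as Record<string, unknown>;
  const route = routes[String(message.type)];
  if (!route) return `error: unknown type ${message.type}`;
  if (!route.validate(message)) return "error: payload failed validation";
  return route.handle(message);
}
```

<p>Every line of this dispatch plumbing, plus typed payloads and consistent error replies, is what a router library takes off your hands.</p><p>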
It combines message type definition, validation, and routing into a clean, type-safe API.</p><h3>Enter WS-Kit</h3><p><strong>WS-Kit</strong> is designed to solve these challenges by providing:</p><ol><li>A way to define message types with Zod schemas</li><li>Automatic validation of incoming messages</li><li>Routing to type-specific handlers</li><li>Clean error handling patterns</li></ol><p>All code examples in this guide use createRouter imported from @ws-kit/zod, which automatically configures the router with Zod validation. This is the recommended way to set up WS-Kit.</p><p>Instead of a giant switch statement or if/else chain, you can write code like this:</p><pre>import { z, message, createRouter } from &quot;@ws-kit/zod&quot;;<br>import { serve } from &quot;@ws-kit/bun&quot;;<br><br>// Define message types with schemas<br>const ChatMessage = message(&quot;CHAT_MESSAGE&quot;, {<br>  room: z.string(),<br>  text: z.string(),<br>});<br><br>const JoinRoom = message(&quot;JOIN_ROOM&quot;, {<br>  room: z.string(),<br>});<br><br>// Create router<br>const router = createRouter();<br><br>// Define handlers for each message type<br>router.on(ChatMessage, (ctx) =&gt; {<br>  // ctx.payload is fully typed and validated!<br>  const { room, text } = ctx.payload;<br><br>  // Do something with the message<br>  console.log(`Message in ${room}: ${text}`);<br><br>  // Send response<br>  ctx.send(ChatMessage, { room, text: &quot;Echo: &quot; + text });<br>});<br><br>router.on(JoinRoom, (ctx) =&gt; {<br>  const { room } = ctx.payload;<br>  ctx.subscribe(room); // Subscribe to room using Bun&#39;s built-in PubSub<br>  console.log(`Client joined room: ${room}`);<br>});<br><br>// Start server with router<br>serve(router, {<br>  port: 3000,<br>});</pre><p>With this approach:</p><ol><li>Message schemas are defined clearly in one place</li><li>Incoming messages are automatically validated</li><li>Handlers only receive messages they’re supposed to handle</li><li>TypeScript provides full 
type safety at every step</li><li>Invalid messages generate helpful error responses</li></ol><h4>The Benefits of Type-Safe WebSockets</h4><p>Using a typed approach with validation provides several key benefits:</p><ol><li><strong>Robust error handling</strong>: Catch malformed messages early with detailed error information</li><li><strong>Self-documenting code</strong>: Your message schemas serve as documentation for your protocol</li><li><strong>IDE support</strong>: Get autocomplete and type checking as you work with messages</li><li><strong>Safer refactoring</strong>: Change message structures with confidence, as TypeScript will find usages</li><li><strong>Clearer mental model</strong>: Discrete message types make the system easier to understand</li></ol><h4>From Chaos to Order</h4><p>With a type-safe approach using <strong>WS-Kit</strong> and Zod, we’ve moved from the Wild West of WebSocket messages to a structured, maintainable system. No more casting and hoping for the best. No more giant switch statements. No more manual validation code.</p><p>In the next section, we’ll dive deeper into <strong>WS-Kit</strong> and explore how it can be used to build a complete real-time chat application with authentication, rooms, and more.</p><h3>Part 3: Introducing WS-Kit</h3><h4>The Missing Piece in WebSocket Development</h4><p>In the previous sections, we explored WebSockets in Bun and the challenges of maintaining type safety in a real-time messaging environment. Now it’s time to introduce the solution to our WebSocket woes: <strong>WS-Kit</strong>.</p><p>Think of <strong>WS-Kit</strong> as that friend who always keeps their kitchen organized — the one who has separate containers for different types of pasta and labels everything. Maybe a bit obsessive, but you’re secretly grateful when you need to find the rigatoni at 2 AM. 
That’s what <strong>WS-Kit</strong> does for your WebSocket messages: it keeps everything organized, labeled, and exactly where it should be.</p><h4>What is WS-Kit?</h4><p><a href="https://github.com/kriasoft/ws-kit"><strong>WS-Kit</strong></a> is a lightweight, type-safe WebSocket router for Bun and other platforms. It provides a structured way to handle WebSocket connections and route messages to different handlers based on message types, all with full TypeScript support and pluggable validator integration (Zod, Valibot, custom).</p><p>Instead of building your own message routing system from scratch (and let’s be honest, the first version would probably be a giant switch statement), <strong>WS-Kit</strong> gives you a battle-tested solution that’s ready to use.</p><h3>Core Philosophy</h3><p>The core philosophy behind <strong>WS-Kit</strong> is simple:</p><ol><li><strong>Pluggable, not prescriptive</strong>: Work with any validator (Zod, Valibot, custom) and any platform (Bun, Cloudflare, custom adapters)</li><li><strong>Type safety everywhere</strong>: From message definition to handler execution</li><li><strong>Runtime validation</strong>: Catch errors before they cause problems</li><li><strong>Clean separation</strong>: Organize handlers by message type</li><li><strong>Minimal overhead</strong>: Keep things fast and lightweight</li></ol><h4>Key Features in Detail</h4><p>Let’s dig into the key features that make <strong>WS-Kit</strong> stand out:</p><h4>Type-Safe Messaging with Zod Schemas</h4><p>At the heart of <strong>WS-Kit</strong> is the message function. 
This function allows you to define message types with their associated payloads using Zod schemas:</p><pre>import { z, message } from &quot;@ws-kit/zod&quot;;<br><br>// Define a message type for joining a chat room<br>export const JoinRoom = message(&quot;JOIN_ROOM&quot;, {<br>  roomId: z.string(),<br>});<br><br>// Define a message for sending a chat message<br>export const SendMessage = message(&quot;SEND_MESSAGE&quot;, {<br>  roomId: z.string(),<br>  message: z.string().min(1).max(500), // Add constraints<br>  attachments: z<br>    .array(<br>      z.object({<br>        type: z.enum([&quot;image&quot;, &quot;file&quot;]),<br>        url: z.string().url(),<br>      }),<br>    )<br>    .optional(),<br>});</pre><p>The magic here is twofold:</p><ol><li><strong>TypeScript Types</strong>: The message function automatically generates TypeScript types that you can use throughout your codebase</li><li><strong>Runtime Validation</strong>: When a message arrives, it’s automatically validated against the schema before your handler is called</li></ol><p>This means you can confidently access ctx.payload.roomId in your handler, knowing it&#39;s a string that passed validation. 
No more defensive coding with if (typeof data.roomId === &#39;string&#39;) checks everywhere!</p><h4>Intuitive Routing System</h4><p>With <strong>WS-Kit</strong>, you define handlers for specific message types:</p><pre>import { z, createRouter } from &quot;@ws-kit/zod&quot;;<br>import { JoinRoom, SendMessage } from &quot;./schemas&quot;;<br><br>const router = createRouter();<br><br>// Handle JOIN_ROOM messages<br>router.on(JoinRoom, (ctx) =&gt; {<br>  const { roomId } = ctx.payload; // Fully typed and validated!<br>  console.log(`Client wants to join room: ${roomId}`);<br><br>  // Join the room using Bun&#39;s built-in PubSub<br>  ctx.subscribe(roomId);<br><br>  // Send confirmation<br>  ctx.send(JoinRoom, { roomId }); // Type-checked!<br>});<br><br>// Handle SEND_MESSAGE messages<br>router.on(SendMessage, (ctx) =&gt; {<br>  const { roomId, message, attachments } = ctx.payload;<br>  console.log(`New message in ${roomId}: ${message}`);<br><br>  // No need to check if attachments exists - type system handles it<br>  const hasAttachments = attachments &amp;&amp; attachments.length &gt; 0;<br><br>  // Broadcast to room (using Bun&#39;s built-in PubSub)<br>  // More on this in the broadcast section<br>});</pre><p>Each handler receives a context object with:</p><ul><li>ws: The WebSocket connection</li><li>payload: The validated message payload (fully typed!)</li><li>meta: Additional metadata about the message</li><li>send(): A helper method for sending responses</li></ul><p>If a message arrives with an unknown type or fails validation, it’s automatically rejected with an appropriate error message — no need to write that boilerplate yourself.</p><h3>Leveraging Bun’s Native WebSocket Performance</h3><p><strong>WS-Kit</strong> is designed to be a thin layer on top of Bun’s already-fast WebSocket implementation. 
It doesn’t reinvent the wheel — it just adds guardrails to keep you on the road.</p><p>The library adds minimal overhead to message processing, focusing on routing and validation while letting Bun handle the heavy lifting of WebSocket connections, frame parsing, and PubSub functionality.</p><h3>Flexible Integration</h3><p>One of the strengths of <strong>WS-Kit</strong> is how easily it integrates with different server setups:</p><pre>import { createRouter, z } from &quot;@ws-kit/zod&quot;;<br>import { serve } from &quot;@ws-kit/bun&quot;;<br><br>// WebSocket router<br>const router = createRouter();<br><br>// Define your message handlers here<br>router.on(YourMessage, (ctx) =&gt; {<br>  // Handle message<br>});<br><br>// High-level serve() with auto-configuration<br>serve(router, {<br>  port: 3000,<br>});<br><br>// Or for advanced setups, use Hono or any HTTP framework:<br>import { Hono } from &quot;hono&quot;;<br>import { createBunHandler } from &quot;@ws-kit/bun&quot;;<br><br>const app = new Hono();<br>app.get(&quot;/&quot;, (c) =&gt; c.text(&quot;Welcome to Hono!&quot;));<br><br>const wsHandler = createBunHandler(router);<br><br>Bun.serve({<br>  port: 3000,<br>  fetch(req, server) {<br>    if (new URL(req.url).pathname === &quot;/ws&quot;) {<br>      return wsHandler(req, server);<br>    }<br>    return app.fetch(req);<br>  },<br>  websocket: router.websocket,<br>});</pre><p>The library is framework-agnostic — it works standalone, with Hono, Elysia, or any other HTTP framework you prefer.</p><h4>Connection Lifecycle Management</h4><p><strong>WS-Kit</strong> provides handlers for the entire WebSocket lifecycle:</p><pre>// Handle new connections<br>router.onOpen((ctx) =&gt; {<br>  console.log(`New client connected: ${ctx.ws.data.clientId}`);<br><br>  // Send welcome message<br>  ctx.send(Welcome, { message: &quot;Welcome to the server!&quot; });<br>});<br><br>// Handle message types (as seen earlier)<br>router.on(JoinRoom, (ctx) =&gt; {<br>  /* ... 
*/<br>});<br><br>// Handle disconnections<br>router.onClose((ctx) =&gt; {<br>  console.log(`Client disconnected: ${ctx.ws.data.clientId}`);<br>  console.log(`Close code: ${ctx.code}`);<br>  console.log(`Close reason: ${ctx.reason}`);<br><br>  // Clean up any resources<br>  if (ctx.ws.data.roomId) {<br>    leaveRoom(ctx.ws.data.roomId, ctx.ws.data.clientId);<br>  }<br>});</pre><p>Each handler has access to the WebSocket connection’s metadata through ctx.ws.data, allowing you to store and retrieve session information.</p><h3>Authentication and Security</h3><p>Security is a critical concern in WebSocket applications. <strong>WS-Kit</strong> provides a clean way to handle authentication during the WebSocket upgrade process:</p><pre>import { z, createRouter } from &quot;@ws-kit/zod&quot;;<br>import { serve } from &quot;@ws-kit/bun&quot;;<br>import { verifyToken } from &quot;./auth&quot;; // Your authentication logic<br><br>type AppData = {<br>  userId?: string;<br>  userRole?: string;<br>};<br><br>// Create router with type for connection metadata<br>const router = createRouter&lt;AppData&gt;();<br><br>// Your message handlers<br>router.on(SomeMessage, (ctx) =&gt; {<br>  // ctx.ws.data.userId is available here<br>});<br><br>// Start server with authentication<br>serve(router, {<br>  port: 3000,<br>  async authenticate(req) {<br>    // Extract and verify authentication token<br>    const authHeader = req.headers.get(&quot;Authorization&quot;);<br>    const token = authHeader?.split(&quot;Bearer &quot;)[1];<br><br>    // Optional: Reject connection if no token<br>    if (!token) {<br>      return undefined; // Rejects the connection<br>    }<br><br>    // Verify token and get user info<br>    const user = await verifyToken(token);<br><br>    // Return user data to be attached to ws.data<br>    return {<br>      userId: user?.id,<br>      userRole: user?.role,<br>    };<br>  },<br>});</pre><p>By authenticating during the upgrade process, you ensure that only authorized 
users can establish WebSocket connections. The user data is then available in all your handlers via ctx.ws.data.</p><h3>Broadcasting and Room Management</h3><p>WebSocket applications often need to broadcast messages to multiple clients. <strong>WS-Kit</strong> complements Bun’s built-in PubSub functionality with schema validation:</p><pre>import { z, createRouter } from &quot;@ws-kit/zod&quot;;<br>import { ChatMessage, JoinRoom, UserJoined } from &quot;./schemas&quot;;<br><br>const router = createRouter();<br><br>router.on(ChatMessage, (ctx) =&gt; {<br>  const { roomId, message } = ctx.payload;<br>  const userId = ctx.ws.data.userId;<br><br>  // Broadcast the message to everyone in the room<br>  ctx.publish(roomId, ChatMessage, {<br>    roomId,<br>    userId,<br>    message,<br>    timestamp: Date.now(),<br>  });<br>});<br><br>router.on(JoinRoom, (ctx) =&gt; {<br>  const { roomId } = ctx.payload;<br>  const userId = ctx.ws.data.userId;<br><br>  // Subscribe to the room<br>  ctx.subscribe(roomId);<br>  ctx.ws.data.roomId = roomId;<br><br>  // Notify others<br>  ctx.publish(roomId, UserJoined, {<br>    roomId,<br>    userId,<br>    timestamp: Date.now(),<br>  });<br>});</pre><p>The ctx.publish() helper ensures that broadcast messages are validated against their schemas before being sent, providing the same type safety for broadcasts that you get with direct messaging.</p><h4>Error Handling</h4><p>Robust error handling is crucial for WebSocket applications. 
<strong>WS-Kit</strong> includes a standardized error system with error codes aligned with gRPC:</p><pre>import { z, createRouter } from &quot;@ws-kit/zod&quot;;<br>import { JoinRoom } from &quot;./schemas&quot;;<br><br>const router = createRouter();<br><br>router.on(JoinRoom, (ctx) =&gt; {<br>  const { roomId } = ctx.payload;<br><br>  // Check if room exists (checkRoomExists is your own lookup logic)<br>  const roomExists = checkRoomExists(roomId);<br><br>  if (!roomExists) {<br>    // Send typed error response<br>    ctx.error(&quot;NOT_FOUND&quot;, `Room ${roomId} does not exist`, {<br>      roomId, // Additional debug info<br>    });<br>    return;<br>  }<br><br>  // Continue with normal flow...<br>});</pre><p>The library includes predefined error codes (UNAUTHENTICATED, PERMISSION_DENIED, INVALID_ARGUMENT, NOT_FOUND, RESOURCE_EXHAUSTED, etc.) for common scenarios, ensuring consistent error reporting.</p><h4>Modular Route Organization</h4><p>As your application grows, you can organize routes into separate modules:</p><pre>// chat.ts<br>import { z, createRouter } from &quot;@ws-kit/zod&quot;;<br>import { ChatMessage, JoinRoom } from &quot;./schemas&quot;;<br><br>// Create a router instance<br>export const chatRouter = createRouter();<br><br>// Add message handlers<br>chatRouter.on(ChatMessage, (ctx) =&gt; {<br>  /* ... */<br>});<br>chatRouter.on(JoinRoom, (ctx) =&gt; {<br>  /* ... 
*/<br>});<br><br>// main.ts<br>import { z, createRouter } from &quot;@ws-kit/zod&quot;;<br>import { serve } from &quot;@ws-kit/bun&quot;;<br>import { chatRouter } from &quot;./chat&quot;;<br>import { userRouter } from &quot;./user&quot;;<br><br>const router = createRouter();<br><br>// Add modular routers<br>router.merge(chatRouter);<br>router.merge(userRouter);<br><br>// Start server<br>serve(router, { port: 3000 });</pre><p>This keeps your codebase organized and makes it easier to collaborate with team members.</p><h4>Why Choose WS-Kit?</h4><p>With so many WebSocket solutions out there, why choose <strong>WS-Kit</strong>?</p><ol><li><strong>Platform-agnostic</strong>: Pluggable adapters for Bun, Cloudflare, and custom platforms</li><li><strong>Validator-agnostic</strong>: Works with Zod, Valibot, or your own validation library</li><li><strong>TypeScript-first</strong>: Designed with type safety as a core principle</li><li><strong>Runtime validation</strong>: Catch errors before they cause problems</li><li><strong>Lightweight</strong>: Minimal overhead, just the features you need</li><li><strong>Progressive</strong>: Start simple and scale as needed</li></ol><p>It’s the Goldilocks of WebSocket libraries: not too heavy, not too bare-bones, but just right. Plus, you’re not locked into a single validator or platform.</p><h4>Getting Started with WS-Kit</h4><p>Ready to bring some order to your WebSocket chaos? Let’s get started:</p><pre>bun add @ws-kit/zod @ws-kit/bun zod<br>bun add @types/bun -D  # For TypeScript support</pre><p>In the next section, we’ll put everything together to build a complete real-time chat application using <strong>WS-Kit</strong>, demonstrating how the library makes complex WebSocket applications more manageable.</p><p>Say goodbye to giant switch statements and untyped message payloads. 
With <strong>WS-Kit</strong>, your WebSocket code can be as clean and organized as that friend’s pasta collection — just hopefully without the late-night carbohydrate cravings.</p><h3>Part 4: Building a Real-Time Chat Application</h3><p>Now that we understand the fundamentals of WebSockets in Bun and have been introduced to <strong>WS-Kit</strong>, let’s put everything together to build something practical: a real-time chat application.</p><p>After all, what better way to test our new WebSocket routing superpowers than by creating yet another chat app? Because clearly, what the world needs is one more place for people to share cat memes and debate whether pineapple belongs on pizza (it does, fight me).</p><h4>Project Setup</h4><p>First things first, let’s set up our project. Create a new folder for our chat application and initialize it:</p><pre>mkdir bun-chat-app<br>cd bun-chat-app<br>bun init -y</pre><p>Next, install the dependencies we’ll need:</p><pre>bun add @ws-kit/zod @ws-kit/bun zod<br>bun add @types/bun -D</pre><h4>Step 1: Define Our Message Schemas</h4><p>The heart of our type-safe approach is defining clear message schemas. 
Let’s create a file called schemas.ts to define all the message types our chat application will support:</p><pre>import { z, message } from &quot;@ws-kit/zod&quot;;<br><br>// User authentication<br>export const Authenticate = message(&quot;AUTHENTICATE&quot;, {<br>  token: z.string(),<br>});<br><br>export const AuthSuccess = message(&quot;AUTH_SUCCESS&quot;, {<br>  userId: z.string(),<br>  username: z.string(),<br>});<br><br>// Room management<br>export const JoinRoom = message(&quot;JOIN_ROOM&quot;, {<br>  roomId: z.string(),<br>});<br><br>export const LeaveRoom = message(&quot;LEAVE_ROOM&quot;, {<br>  roomId: z.string(),<br>});<br><br>export const UserJoined = message(&quot;USER_JOINED&quot;, {<br>  roomId: z.string(),<br>  userId: z.string(),<br>  username: z.string(),<br>});<br><br>export const UserLeft = message(&quot;USER_LEFT&quot;, {<br>  roomId: z.string(),<br>  userId: z.string(),<br>  username: z.string(),<br>});<br><br>export const RoomList = message(&quot;ROOM_LIST&quot;, {<br>  rooms: z.array(<br>    z.object({<br>      id: z.string(),<br>      name: z.string(),<br>      userCount: z.number(),<br>    }),<br>  ),<br>});<br><br>// Messaging<br>export const SendMessage = message(&quot;SEND_MESSAGE&quot;, {<br>  roomId: z.string(),<br>  text: z.string().min(1).max(1000),<br>  // Optional attachment<br>  attachment: z<br>    .object({<br>      type: z.enum([&quot;image&quot;, &quot;file&quot;]),<br>      url: z.string().url(),<br>      name: z.string().optional(),<br>    })<br>    .optional(),<br>});<br><br>export const ChatMessage = message(&quot;CHAT_MESSAGE&quot;, {<br>  messageId: z.string(),<br>  roomId: z.string(),<br>  userId: z.string(),<br>  username: z.string(),<br>  text: z.string(),<br>  timestamp: z.number(),<br>  attachment: z<br>    .object({<br>      type: z.enum([&quot;image&quot;, &quot;file&quot;]),<br>      url: z.string().url(),<br>      name: z.string().optional(),<br>    })<br>    .optional(),<br>});<br><br>// Typing 
indicators<br>export const TypingStart = message(&quot;TYPING_START&quot;, {<br>  roomId: z.string(),<br>});<br><br>export const TypingStop = message(&quot;TYPING_STOP&quot;, {<br>  roomId: z.string(),<br>});<br><br>export const UserTyping = message(&quot;USER_TYPING&quot;, {<br>  roomId: z.string(),<br>  userId: z.string(),<br>  username: z.string(),<br>});<br><br>// Connection metadata type<br>export type Meta = {<br>  userId?: string;<br>  username?: string;<br>  currentRoomId?: string;<br>  isAuthenticated?: boolean;<br>};</pre><p>Notice how we’ve organized our messages into logical groups: authentication, room management, messaging, and typing indicators. We’re also using Zod’s validation capabilities to ensure messages have the correct shape and content (like enforcing minimum and maximum message length).</p><h4>Step 2: Setting Up Our Mock User Database</h4><p>For simplicity, we’ll use an in-memory store for users and rooms instead of a real database:</p><pre>import { randomUUID } from &quot;crypto&quot;;<br><br>// User record<br>export type User = {<br>  id: string;<br>  username: string;<br>  token: string;<br>};<br><br>// Room record<br>export type Room = {<br>  id: string;<br>  name: string;<br>  users: Set&lt;string&gt;; // User IDs<br>};<br><br>// In-memory storage<br>const users = new Map&lt;string, User&gt;();<br>const tokens = new Map&lt;string, string&gt;(); // token -&gt; userId<br>const rooms = new Map&lt;string, Room&gt;();<br><br>// Seed with some default rooms<br>rooms.set(&quot;general&quot;, {<br>  id: &quot;general&quot;,<br>  name: &quot;General Chat&quot;,<br>  users: new Set(),<br>});<br><br>rooms.set(&quot;random&quot;, {<br>  id: &quot;random&quot;,<br>  name: &quot;Random Stuff&quot;,<br>  users: new Set(),<br>});<br><br>// User authentication methods<br>export function authenticateUser(token: string): User | null {<br>  const userId = tokens.get(token);<br>  if (!userId) return null;<br><br>  return users.get(userId) || 
null;<br>}<br><br>export function createUser(username: string): User {<br>  const id = randomUUID();<br>  const token = randomUUID();<br><br>  const user: User = { id, username, token };<br>  users.set(id, user);<br>  tokens.set(token, id);<br><br>  return user;<br>}<br><br>// Room methods<br>export function getRooms(): Room[] {<br>  return Array.from(rooms.values());<br>}<br><br>export function getRoom(roomId: string): Room | undefined {<br>  return rooms.get(roomId);<br>}<br><br>export function joinRoom(roomId: string, userId: string): boolean {<br>  const room = rooms.get(roomId);<br>  if (!room) return false;<br><br>  room.users.add(userId);<br>  return true;<br>}<br><br>export function leaveRoom(roomId: string, userId: string): boolean {<br>  const room = rooms.get(roomId);<br>  if (!room) return false;<br><br>  return room.users.delete(userId);<br>}<br><br>export function getUser(userId: string): User | undefined {<br>  return users.get(userId);<br>}</pre><p>This simple store handles user authentication, room management, and keeping track of who’s in which room.</p><h4>Step 3: Implementing Our WebSocket Handlers</h4><p>Now let’s implement handlers for each of our message types. 
Let’s create a file called chat-router.ts:</p><pre>import { z, createRouter } from &quot;@ws-kit/zod&quot;;<br>import { randomUUID } from &quot;crypto&quot;;<br>import * as schema from &quot;./schemas&quot;;<br>import {<br>  authenticateUser,<br>  createUser,<br>  getRoom,<br>  getRooms,<br>  joinRoom,<br>  leaveRoom,<br>  getUser,<br>} from &quot;./data-store&quot;;<br><br>// Create a router with our meta type<br>const router = createRouter&lt;schema.Meta&gt;();<br><br>// Handle new connections<br>router.onOpen((ctx) =&gt; {<br>  // clientId is automatically assigned by ws-kit framework<br>  console.log(`New client connected: ${ctx.ws.data.clientId}`);<br><br>  // Assign a random guest name until authenticated<br>  ctx.assignData({<br>    username: `Guest-${Math.floor(Math.random() * 10000)}`,<br>  });<br><br>  // Send room list to the new client<br>  const rooms = getRooms().map((room) =&gt; ({<br>    id: room.id,<br>    name: room.name,<br>    userCount: room.users.size,<br>  }));<br><br>  ctx.send(schema.RoomList, { rooms });<br>});<br><br>// Handle authentication<br>router.on(schema.Authenticate, (ctx) =&gt; {<br>  const { token } = ctx.payload;<br><br>  // Check if token exists in our store<br>  const user = authenticateUser(token);<br><br>  if (user) {<br>    // Authentication successful<br>    ctx.ws.data.isAuthenticated = true;<br>    ctx.ws.data.userId = user.id;<br>    ctx.ws.data.username = user.username;<br><br>    ctx.send(schema.AuthSuccess, {<br>      userId: user.id,<br>      username: user.username,<br>    });<br><br>    console.log(`User authenticated: ${user.username} (${user.id})`);<br>  } else {<br>    // Create a new user if token doesn&#39;t exist<br>    // In a real app, you&#39;d probably reject invalid tokens<br>    const newUser = createUser(<br>      ctx.ws.data.username || `User-${randomUUID().slice(0, 6)}`,<br>    );<br><br>    ctx.ws.data.isAuthenticated = true;<br>    ctx.ws.data.userId = newUser.id;<br>    ctx.ws.data.username = 
newUser.username;<br><br>    ctx.send(schema.AuthSuccess, {<br>      userId: newUser.id,<br>      username: newUser.username,<br>    });<br><br>    console.log(`New user created: ${newUser.username} (${newUser.id})`);<br>  }<br>});<br><br>// Handle joining a room<br>router.on(schema.JoinRoom, (ctx) =&gt; {<br>  const { roomId } = ctx.payload;<br>  const userId = ctx.ws.data.userId;<br>  const username = ctx.ws.data.username;<br><br>  // Check if user is authenticated<br>  if (!userId || !username) {<br>    ctx.error(&quot;UNAUTHENTICATED&quot;, &quot;You must be authenticated to join a room&quot;);<br>    return;<br>  }<br><br>  // Check if room exists<br>  const room = getRoom(roomId);<br>  if (!room) {<br>    ctx.error(&quot;NOT_FOUND&quot;, `Room ${roomId} does not exist`);<br>    return;<br>  }<br><br>  // If user is already in a room, leave it first<br>  if (ctx.ws.data.currentRoomId) {<br>    leaveRoom(ctx.ws.data.currentRoomId, userId);<br><br>    // Let others know user left the previous room<br>    ctx.publish(ctx.ws.data.currentRoomId, schema.UserLeft, {<br>      roomId: ctx.ws.data.currentRoomId,<br>      userId,<br>      username,<br>    });<br><br>    // Unsubscribe from previous room<br>    ctx.unsubscribe(ctx.ws.data.currentRoomId);<br>  }<br><br>  // Join the new room<br>  joinRoom(roomId, userId);<br>  ctx.ws.data.currentRoomId = roomId;<br><br>  // Subscribe to the room&#39;s messages<br>  ctx.subscribe(roomId);<br><br>  // Confirm to the user they&#39;ve joined<br>  ctx.send(schema.UserJoined, {<br>    roomId,<br>    userId,<br>    username,<br>  });<br><br>  // Let others know a new user joined<br>  ctx.publish(roomId, schema.UserJoined, {<br>    roomId,<br>    userId,<br>    username,<br>  });<br><br>  console.log(`User ${username} (${userId}) joined room: ${roomId}`);<br>});<br><br>// Handle leaving a room<br>router.on(schema.LeaveRoom, (ctx) =&gt; {<br>  const { roomId } = ctx.payload;<br>  const userId = ctx.ws.data.userId;<br>  const 
username = ctx.ws.data.username;<br><br>  if (!userId || !username) {<br>    ctx.error(&quot;UNAUTHENTICATED&quot;, &quot;You must be authenticated to leave a room&quot;);<br>    return;<br>  }<br><br>  // Check if user is in the room<br>  if (ctx.ws.data.currentRoomId !== roomId) {<br>    ctx.error(&quot;INVALID_ARGUMENT&quot;, &quot;You are not in this room&quot;);<br>    return;<br>  }<br><br>  // Leave the room<br>  leaveRoom(roomId, userId);<br>  ctx.ws.data.currentRoomId = undefined;<br><br>  // Unsubscribe from room<br>  ctx.unsubscribe(roomId);<br><br>  // Let others know user left<br>  ctx.publish(roomId, schema.UserLeft, {<br>    roomId,<br>    userId,<br>    username,<br>  });<br><br>  console.log(`User ${username} (${userId}) left room: ${roomId}`);<br>});<br><br>// Handle sending messages<br>router.on(schema.SendMessage, (ctx) =&gt; {<br>  const { roomId, text, attachment } = ctx.payload;<br>  const userId = ctx.ws.data.userId;<br>  const username = ctx.ws.data.username;<br><br>  if (!userId || !username) {<br>    ctx.error(&quot;UNAUTHENTICATED&quot;, &quot;You must be authenticated to send messages&quot;);<br>    return;<br>  }<br><br>  // Check if room exists<br>  if (!getRoom(roomId)) {<br>    ctx.error(&quot;NOT_FOUND&quot;, `Room ${roomId} does not exist`);<br>    return;<br>  }<br><br>  // Check if user is in the room they&#39;re trying to message<br>  if (ctx.ws.data.currentRoomId !== roomId) {<br>    ctx.error(<br>      &quot;PERMISSION_DENIED&quot;,<br>      &quot;You must join the room before sending messages&quot;,<br>    );<br>    return;<br>  }<br><br>  // Create a message object with ID and timestamp<br>  const messageId = randomUUID();<br>  const timestamp = Date.now();<br><br>  const chatMessage = {<br>    messageId,<br>    roomId,<br>    userId,<br>    username,<br>    text,<br>    timestamp,<br>    attachment,<br>  };<br><br>  // Broadcast the message to everyone in the room, including sender<br>  ctx.publish(roomId, 
schema.ChatMessage, chatMessage);<br><br>  console.log(<br>    `Message sent to room ${roomId} by ${username}: ${text.substring(0, 20)}${text.length &gt; 20 ? &quot;...&quot; : &quot;&quot;}`,<br>  );<br>});<br><br>// Handle typing indicators<br>router.on(schema.TypingStart, (ctx) =&gt; {<br>  const { roomId } = ctx.payload;<br>  const userId = ctx.ws.data.userId;<br>  const username = ctx.ws.data.username;<br><br>  if (!userId || !username || ctx.ws.data.currentRoomId !== roomId) return;<br><br>  // Broadcast typing indicator to everyone else in the room<br>  ctx.publish(roomId, schema.UserTyping, {<br>    roomId,<br>    userId,<br>    username,<br>  });<br>});<br><br>// Handle connection closure<br>router.onClose((ctx) =&gt; {<br>  const userId = ctx.ws.data.userId;<br>  const username = ctx.ws.data.username;<br>  const roomId = ctx.ws.data.currentRoomId;<br><br>  console.log(<br>    `Client disconnected: ${userId || ctx.ws.data.clientId}, code: ${ctx.code}`,<br>  );<br><br>  // If user was in a room, notify others and clean up<br>  if (userId &amp;&amp; username &amp;&amp; roomId) {<br>    leaveRoom(roomId, userId);<br><br>    // Let others know user left<br>    ctx.publish(roomId, schema.UserLeft, {<br>      roomId,<br>      userId,<br>      username,<br>    });<br>  }<br>});<br><br>export default router;</pre><p>That’s quite a bit of code, but it’s well-organized and each message type has its own dedicated handler. 
The beauty of this approach is that each handler receives a fully typed and validated payload, making it easy to work with the data without worrying about runtime errors.</p><h4>Step 4: Creating the Main Server</h4><p>Now let’s create the main server file that brings everything together, handling WebSocket upgrades on /ws and serving our static frontend from the public folder:</p><pre>// server.ts<br>import { z, createRouter } from &quot;@ws-kit/zod&quot;;<br>import { createBunHandler } from &quot;@ws-kit/bun&quot;;<br>import chatRouter from &quot;./chat-router&quot;;<br>import type { Meta } from &quot;./schemas&quot;;<br><br>// Create the main WebSocket router<br>const router = createRouter&lt;Meta&gt;();<br><br>// Add our chat routes<br>router.merge(chatRouter);<br><br>// Route WebSocket upgrades to WS-Kit, everything else to static files<br>const wsHandler = createBunHandler(router);<br><br>Bun.serve({<br>  port: 3000,<br>  async fetch(req, server) {<br>    const { pathname } = new URL(req.url);<br>    if (pathname === &quot;/ws&quot;) {<br>      return wsHandler(req, server);<br>    }<br><br>    // Serve the frontend from the public folder<br>    const file = Bun.file(`public${pathname === &quot;/&quot; ? &quot;/index.html&quot; : pathname}`);<br>    if (await file.exists()) return new Response(file);<br>    return new Response(&quot;Not Found&quot;, { status: 404 });<br>  },<br>  websocket: router.websocket,<br>});<br><br>console.log(&quot;Chat server running on http://localhost:3000&quot;);<br>console.log(&quot;WebSocket endpoint: ws://localhost:3000/ws&quot;);</pre><h4>Step 5: Creating a Simple Frontend with @ws-kit/client</h4><p>Let’s create a basic chat UI and use the @ws-kit/client SDK for WebSocket communication. This dramatically simplifies the client-side code compared to manual WebSocket handling.</p><p>First, install the client SDK:</p><pre>bun add @ws-kit/client</pre><blockquote>Note: The @ws-kit/client package provides the complete client SDK with full TypeScript support. 
Import from @ws-kit/client/zod for Zod-based validation or @ws-kit/client/valibot for Valibot-based validation, matching your server-side validator choice.</blockquote><p>Create a public folder for our static files:</p><pre>mkdir -p public</pre><p>Create the HTML file:</p><pre>&lt;!-- filepath: public/index.html --&gt;<br>&lt;!DOCTYPE html&gt;<br>&lt;html lang=&quot;en&quot;&gt;<br>  &lt;head&gt;<br>    &lt;meta charset=&quot;UTF-8&quot; /&gt;<br>    &lt;meta name=&quot;viewport&quot; content=&quot;width=device-width, initial-scale=1.0&quot; /&gt;<br>    &lt;title&gt;Bun Chat App&lt;/title&gt;<br>    &lt;link rel=&quot;stylesheet&quot; href=&quot;styles.css&quot; /&gt;<br>  &lt;/head&gt;<br>  &lt;body&gt;<br>    &lt;div class=&quot;app-container&quot;&gt;<br>      &lt;div class=&quot;sidebar&quot;&gt;<br>        &lt;div class=&quot;user-info&quot;&gt;<br>          &lt;span id=&quot;username&quot;&gt;Not logged in&lt;/span&gt;<br>          &lt;button id=&quot;login-btn&quot;&gt;Login&lt;/button&gt;<br>        &lt;/div&gt;<br>        &lt;div id=&quot;connection-status&quot; class=&quot;connection-status&quot;&gt;Disconnected&lt;/div&gt;<br>        &lt;h3&gt;Rooms&lt;/h3&gt;<br>        &lt;ul id=&quot;room-list&quot; class=&quot;room-list&quot;&gt;&lt;/ul&gt;<br>      &lt;/div&gt;<br><br>      &lt;div class=&quot;chat-container&quot;&gt;<br>        &lt;div id=&quot;room-header&quot; class=&quot;room-header&quot;&gt;Select a room&lt;/div&gt;<br><br>        &lt;div id=&quot;messages&quot; class=&quot;messages&quot;&gt;&lt;/div&gt;<br><br>        &lt;div id=&quot;typing-indicator&quot; class=&quot;typing-indicator&quot;&gt;&lt;/div&gt;<br><br>        &lt;form id=&quot;message-form&quot; class=&quot;message-form&quot;&gt;<br>          &lt;input<br>            type=&quot;text&quot;<br>            id=&quot;message-input&quot;<br>            placeholder=&quot;Type a message...&quot;<br>            disabled<br>          /&gt;<br>          &lt;button type=&quot;submit&quot; 
id=&quot;send-btn&quot; disabled&gt;Send&lt;/button&gt;<br>        &lt;/form&gt;<br>      &lt;/div&gt;<br>    &lt;/div&gt;<br><br>    &lt;script src=&quot;app.js&quot; type=&quot;module&quot;&gt;&lt;/script&gt;<br>  &lt;/body&gt;<br>&lt;/html&gt;</pre><p>Add styling (same as before, with one addition for connection status):</p><pre>* {<br>  margin: 0;<br>  padding: 0;<br>  box-sizing: border-box;<br>  font-family:<br>    system-ui,<br>    -apple-system,<br>    BlinkMacSystemFont,<br>    &quot;Segoe UI&quot;,<br>    Roboto,<br>    Oxygen,<br>    Ubuntu,<br>    Cantarell,<br>    &quot;Open Sans&quot;,<br>    &quot;Helvetica Neue&quot;,<br>    sans-serif;<br>}<br><br>body {<br>  background-color: #f5f5f5;<br>}<br><br>.app-container {<br>  display: flex;<br>  height: 100vh;<br>  max-width: 1200px;<br>  margin: 0 auto;<br>  background-color: white;<br>  box-shadow: 0 0 10px rgba(0, 0, 0, 0.1);<br>}<br><br>.sidebar {<br>  width: 250px;<br>  background-color: #f0f0f0;<br>  padding: 20px;<br>  border-right: 1px solid #ddd;<br>}<br><br>.user-info {<br>  display: flex;<br>  justify-content: space-between;<br>  align-items: center;<br>  margin-bottom: 20px;<br>  padding-bottom: 10px;<br>  border-bottom: 1px solid #ddd;<br>}<br><br>.connection-status {<br>  font-size: 0.85em;<br>  padding: 8px;<br>  margin-bottom: 15px;<br>  border-radius: 4px;<br>  text-align: center;<br>  background-color: #ffe6e6;<br>  color: #d32f2f;<br>}<br><br>.connection-status.open {<br>  background-color: #e6ffe6;<br>  color: #388e3c;<br>}<br><br>.connection-status.connecting {<br>  background-color: #fff3e0;<br>  color: #f57c00;<br>}<br><br>.room-list {<br>  list-style: none;<br>}<br><br>.room-item {<br>  padding: 8px 10px;<br>  margin-bottom: 5px;<br>  border-radius: 4px;<br>  cursor: pointer;<br>}<br><br>.room-item:hover {<br>  background-color: #e0e0e0;<br>}<br><br>.room-item.active {<br>  background-color: #2c3e50;<br>  color: white;<br>}<br><br>.chat-container {<br>  flex: 1;<br>  display: 
flex;<br>  flex-direction: column;<br>}<br><br>.room-header {<br>  padding: 15px 20px;<br>  background-color: #2c3e50;<br>  color: white;<br>  font-weight: bold;<br>}<br><br>.messages {<br>  flex: 1;<br>  overflow-y: auto;<br>  padding: 20px;<br>}<br><br>.message {<br>  margin-bottom: 15px;<br>}<br><br>.message .header {<br>  display: flex;<br>  margin-bottom: 5px;<br>}<br><br>.message .username {<br>  font-weight: bold;<br>  margin-right: 10px;<br>}<br><br>.message .time {<br>  color: #999;<br>  font-size: 0.8em;<br>}<br><br>.message .text {<br>  background-color: #f1f1f1;<br>  padding: 10px;<br>  border-radius: 10px;<br>  max-width: 80%;<br>  word-break: break-word;<br>}<br><br>.message.own {<br>  text-align: right;<br>}<br><br>.message.own .text {<br>  background-color: #3498db;<br>  color: white;<br>  margin-left: auto;<br>}<br><br>.message.system {<br>  text-align: center;<br>  font-style: italic;<br>  color: #666;<br>  margin: 10px 0;<br>}<br><br>.typing-indicator {<br>  padding: 5px 20px;<br>  color: #666;<br>  font-style: italic;<br>  min-height: 30px;<br>}<br><br>.message-form {<br>  display: flex;<br>  padding: 10px 20px;<br>  background-color: #f9f9f9;<br>  border-top: 1px solid #ddd;<br>}<br><br>.message-form input {<br>  flex: 1;<br>  padding: 10px;<br>  border: 1px solid #ddd;<br>  border-radius: 4px;<br>  margin-right: 10px;<br>}<br><br>.message-form button {<br>  padding: 10px 15px;<br>  background-color: #3498db;<br>  color: white;<br>  border: none;<br>  border-radius: 4px;<br>  cursor: pointer;<br>}<br><br>.message-form button:disabled {<br>  background-color: #ccc;<br>  cursor: not-allowed;<br>}</pre><p>Now, here’s the client code using @ws-kit/client (much simpler than manual WebSocket handling):</p><pre>// app.js<br>import { wsClient } from &quot;@ws-kit/client/zod&quot;;<br>import {<br>  Authenticate,<br>  AuthSuccess,<br>  JoinRoom,<br>  UserJoined,<br>  UserLeft,<br>  ChatMessage,<br>  SendMessage,<br>  RoomList,<br>  TypingStart,<br>  
UserTyping,<br>} from &quot;./shared/schemas.js&quot;; // shared with the server; bundle app.js (e.g. with bun build) so the browser can resolve this import and @ws-kit/client<br><br>// DOM elements<br>const usernameElement = document.getElementById(&quot;username&quot;);<br>const loginButton = document.getElementById(&quot;login-btn&quot;);<br>const roomList = document.getElementById(&quot;room-list&quot;);<br>const roomHeader = document.getElementById(&quot;room-header&quot;);<br>const messagesContainer = document.getElementById(&quot;messages&quot;);<br>const typingIndicator = document.getElementById(&quot;typing-indicator&quot;);<br>const messageForm = document.getElementById(&quot;message-form&quot;);<br>const messageInput = document.getElementById(&quot;message-input&quot;);<br>const sendButton = document.getElementById(&quot;send-btn&quot;);<br>const connectionStatus = document.getElementById(&quot;connection-status&quot;);<br><br>// App state<br>let currentUser = {<br>  userId: null,<br>  username: null,<br>};<br>let currentRoomId = null;<br>let rooms = [];<br><br>// Create the WebSocket client with auto-reconnection<br>const client = wsClient({<br>  url: `ws://${window.location.host}/ws`,<br>  autoConnect: true,<br>  reconnect: {<br>    enabled: true,<br>    maxAttempts: 5,<br>    initialDelayMs: 300,<br>    maxDelayMs: 10_000,<br>    jitter: &quot;full&quot;,<br>  },<br>  auth: {<br>    getToken: () =&gt; localStorage.getItem(&quot;chatToken&quot;),<br>    attach: &quot;query&quot;,<br>  },<br>});<br><br>// Monitor connection state<br>client.onState((state) =&gt; {<br>  connectionStatus.textContent = state.charAt(0).toUpperCase() + state.slice(1);<br>  connectionStatus.className = `connection-status ${state}`;<br><br>  // Enable/disable input based on connection<br>  const canSend = state === &quot;open&quot; &amp;&amp; currentUser.userId;<br>  messageInput.disabled = !canSend;<br>  sendButton.disabled = !canSend;<br>});<br><br>// Handle room list<br>client.on(RoomList, (msg) =&gt; {<br>  rooms = msg.payload.rooms;<br>  renderRoomList();<br>});<br><br>// Handle 
authentication success<br>client.on(AuthSuccess, (msg) =&gt; {<br>  const { userId, username } = msg.payload;<br>  currentUser.userId = userId;<br>  currentUser.username = username;<br><br>  // Store a random demo token (a real app would persist a server-issued token instead)<br>  localStorage.setItem(&quot;chatToken&quot;, Math.random().toString(36).substring(7));<br><br>  usernameElement.textContent = username;<br>  loginButton.textContent = &quot;Logout&quot;;<br><br>  // Enable message input<br>  messageInput.disabled = false;<br>  sendButton.disabled = false;<br>});<br><br>// Handle user joined<br>client.on(UserJoined, (msg) =&gt; {<br>  if (msg.payload.roomId === currentRoomId) {<br>    const isCurrentUser = msg.payload.userId === currentUser.userId;<br>    const text = isCurrentUser<br>      ? &quot;You joined the room&quot;<br>      : `${msg.payload.username} joined the room`;<br>    addSystemMessage(text);<br>  }<br>});<br><br>// Handle user left<br>client.on(UserLeft, (msg) =&gt; {<br>  if (msg.payload.roomId === currentRoomId) {<br>    const isCurrentUser = msg.payload.userId === currentUser.userId;<br>    const text = isCurrentUser<br>      ? &quot;You left the room&quot;<br>      : `${msg.payload.username} left the room`;<br>    addSystemMessage(text);<br>  }<br>});<br><br>// Handle chat message<br>client.on(ChatMessage, (msg) =&gt; {<br>  if (msg.payload.roomId === currentRoomId) {<br>    const { userId, username, text, timestamp } = msg.payload;<br>    const isOwnMessage = userId === currentUser.userId;<br><br>    const messageEl = document.createElement(&quot;div&quot;);<br>    messageEl.className = `message ${isOwnMessage ? &quot;own&quot; : &quot;&quot;}`;<br>    messageEl.innerHTML = `<br>      &lt;div class=&quot;header&quot;&gt;<br>        &lt;span class=&quot;username&quot;&gt;${isOwnMessage ? 
&quot;You&quot; : username}&lt;/span&gt;<br>        &lt;span class=&quot;time&quot;&gt;${new Date(timestamp).toLocaleTimeString()}&lt;/span&gt;<br>      &lt;/div&gt;<br>      &lt;div class=&quot;text&quot;&gt;${escapeHtml(text)}&lt;/div&gt;<br>    `;<br><br>    messagesContainer.appendChild(messageEl);<br>    scrollToBottom();<br>  }<br>});<br><br>// Handle typing indicator<br>client.on(UserTyping, (msg) =&gt; {<br>  if (<br>    msg.payload.roomId === currentRoomId &amp;&amp;<br>    msg.payload.userId !== currentUser.userId<br>  ) {<br>    typingIndicator.textContent = `${msg.payload.username} is typing...`;<br><br>    setTimeout(() =&gt; {<br>      typingIndicator.textContent = &quot;&quot;;<br>    }, 3000);<br>  }<br>});<br><br>// Error handling<br>client.onError((error, context) =&gt; {<br>  console.error(&quot;WebSocket error:&quot;, error.message, context);<br><br>  if (context.type === &quot;validation&quot;) {<br>    addSystemMessage(&quot;Received invalid message from server&quot;, true);<br>  } else if (context.type === &quot;parse&quot;) {<br>    addSystemMessage(&quot;Failed to parse message&quot;, true);<br>  }<br>});<br><br>// Render room list<br>function renderRoomList() {<br>  roomList.innerHTML = &quot;&quot;;<br>  rooms.forEach((room) =&gt; {<br>    const li = document.createElement(&quot;li&quot;);<br>    li.className = `room-item ${room.id === currentRoomId ? 
&quot;active&quot; : &quot;&quot;}`;<br>    li.textContent = `${room.name} (${room.userCount})`;<br>    li.addEventListener(&quot;click&quot;, () =&gt; joinRoom(room.id));<br>    roomList.appendChild(li);<br>  });<br>}<br><br>// Join a room<br>function joinRoom(roomId) {<br>  if (currentRoomId === roomId) return;<br><br>  messagesContainer.innerHTML = &quot;&quot;;<br>  typingIndicator.textContent = &quot;&quot;;<br>  currentRoomId = roomId;<br><br>  const room = rooms.find((r) =&gt; r.id === roomId);<br>  if (room) {<br>    roomHeader.textContent = room.name;<br>  }<br><br>  client.send(JoinRoom, { roomId });<br>  renderRoomList();<br>}<br><br>// Send a chat message<br>function sendChatMessage(text) {<br>  if (!text.trim() || !currentRoomId) return;<br><br>  const sent = client.send(SendMessage, {<br>    roomId: currentRoomId,<br>    text: text.trim(),<br>  });<br><br>  if (sent) {<br>    messageInput.value = &quot;&quot;;<br>  } else {<br>    addSystemMessage(&quot;Failed to send message&quot;, true);<br>  }<br>}<br><br>// Add system message<br>function addSystemMessage(text, isError = false) {<br>  const messageEl = document.createElement(&quot;div&quot;);<br>  messageEl.className = `message system ${isError ? 
&quot;error&quot; : &quot;&quot;}`;<br>  messageEl.textContent = text;<br>  messagesContainer.appendChild(messageEl);<br>  scrollToBottom();<br>}<br><br>// Utilities<br>function scrollToBottom() {<br>  messagesContainer.scrollTop = messagesContainer.scrollHeight;<br>}<br><br>function escapeHtml(text) {<br>  const div = document.createElement(&quot;div&quot;);<br>  div.textContent = text;<br>  return div.innerHTML;<br>}<br><br>// Event listeners<br>loginButton.addEventListener(&quot;click&quot;, () =&gt; {<br>  if (currentUser.userId) {<br>    // Logout<br>    localStorage.removeItem(&quot;chatToken&quot;);<br>    currentUser = { userId: null, username: null };<br>    usernameElement.textContent = &quot;Not logged in&quot;;<br>    loginButton.textContent = &quot;Login&quot;;<br>    client.close();<br>  } else {<br>    // Login<br>    client.send(Authenticate, { token: &quot;demo-token&quot; });<br>  }<br>});<br><br>messageForm.addEventListener(&quot;submit&quot;, (e) =&gt; {<br>  e.preventDefault();<br>  sendChatMessage(messageInput.value);<br>});<br><br>messageInput.addEventListener(&quot;input&quot;, () =&gt; {<br>  if (currentRoomId &amp;&amp; client.isConnected) {<br>    client.send(TypingStart, { roomId: currentRoomId });<br>  }<br>});<br><br>// Optional: Log connection state changes<br>client.onState((state) =&gt; {<br>  console.log(&quot;Connection state:&quot;, state);<br>});</pre><h4>Step 6: Running the Application</h4><p>With everything in place, let’s run our chat application:</p><pre>bun run server.ts</pre><p>Now open your browser to http://localhost:3000, and you should see the chat interface. 
You can:</p><ol><li>Click the login button to get a random username</li><li>Join one of the two default rooms (General Chat or Random Stuff)</li><li>Send messages and see them appear in real-time</li><li>See typing indicators when other users are typing</li></ol><p>You can even open multiple browser tabs to simulate different users!</p><h4>Extending the Application</h4><p>This chat application is just a starting point. With the robust foundation provided by WS-Kit, you can easily extend it with additional features:</p><ol><li><strong>Direct messaging</strong>: Add a new message schema for private messaging between users</li><li><strong>User profiles</strong>: Store and display more information about users</li><li><strong>Message history</strong>: Add persistence to store chat history</li><li><strong>Room creation</strong>: Allow users to create their own chat rooms</li><li><strong>Rich media</strong>: Improve the attachment support for images, videos, etc.</li><li><strong>Moderation tools</strong>: Add features for admins to moderate chats</li></ol><h4>Conclusion</h4><p>We’ve built a complete real-time chat application using Bun, WebSockets, and <strong>WS-Kit</strong>. The application features:</p><ul><li>Type-safe messaging with Zod schemas</li><li>Room-based chat with join/leave notifications</li><li>User authentication</li><li>Real-time message delivery</li><li>Typing indicators</li><li>Error handling</li></ul><p>Despite the simplicity of our example, it showcases the power of a type-safe approach to WebSocket messaging. By defining our message schemas upfront and using <strong>WS-Kit</strong> to handle validation and routing, we’ve created a codebase that’s easy to understand, extend, and maintain.</p><p>No more giant switch statements. No more type coercion surprises. No more undefined property errors. 
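</p><p>That last point is easy to demonstrate in miniature: because every payload shape lives in a schema, malformed input is rejected before any handler sees it. The vanilla-TypeScript sketch below models the idea (the parseChatPayload guard is illustrative, not a WS-Kit API — in the real app, Zod performs this narrowing for you):</p>

```typescript
// A miniature model of schema-validated dispatch: unknown input is either
// narrowed to a known payload shape or rejected before handlers run.
type ChatPayload = { roomId: string; text: string };

function parseChatPayload(value: unknown): ChatPayload | null {
  if (typeof value !== "object" || value === null) return null;
  const v = value as Record<string, unknown>;
  if (typeof v.roomId !== "string" || typeof v.text !== "string") return null;
  return { roomId: v.roomId, text: v.text };
}

// Handlers only ever see validated payloads, so payload.text cannot be undefined.
function handleChat(payload: ChatPayload): string {
  return `[${payload.roomId}] ${payload.text}`;
}

const raw: unknown = JSON.parse('{"roomId":"general","text":"hi"}');
const payload = parseChatPayload(raw);
const rendered = payload ? handleChat(payload) : "rejected";
```

<p>Handlers written against the narrowed type simply cannot hit an undefined property at runtime. </p><p>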
Just clean, type-safe WebSocket messaging that scales with your application’s needs.</p><p>So the next time someone asks you to build “just a simple chat app” (which, let’s be honest, is never simple), you’ll have the tools you need to build it properly from the start. Your future self — the one who has to maintain this code six months from now while sipping coffee at 2 AM — will thank you.</p><h3>Part 5: Advanced Patterns</h3><h4>Beyond the Basics: Leveling Up Your WebSocket Game</h4><p>Now that we’ve built a functional chat application, let’s dive into some advanced patterns that can take your WebSocket applications from “it works” to “wow, that’s impressive!” After all, anyone can build a chat app — it’s like the “Hello World” of WebSockets — but production-grade applications require more sophisticated techniques.</p><p>Think of these patterns as the difference between knowing how to play “Hot Cross Buns” on the recorder and performing a jazz improvisation. Same instrument, vastly different results. Let’s jazz things up!</p><blockquote><strong>Note on Client vs Server Patterns</strong></blockquote><blockquote>The @ws-kit/client SDK handles many advanced patterns automatically on the client side. 
This section covers:</blockquote><blockquote><strong>Client-side (handled by </strong><strong>@ws-kit/client SDK):</strong><br>✅ Connection pooling and state management (client.state)<br>✅ Automatic reconnection with exponential backoff<br>✅ Request/response correlation (RPC via client.request())<br>✅ Heartbeat monitoring<br>✅ Message queueing while disconnected<br>✅ Centralized error handling (client.onError())</blockquote><blockquote><strong>Server-side patterns (covered in this section)</strong>:<br>- Multi-client connection tracking and registration<br>- Rate limiting per user or connection<br>- Broadcasting to subsets of clients<br>- Advanced pub/sub with selective message delivery<br>- Protocol negotiation and feature detection</blockquote><blockquote>For most applications, the SDK’s built-in features are sufficient. The patterns here address production scenarios requiring custom server-side orchestration.</blockquote><h4>Connection Pools and Client Tracking</h4><p>In production applications, you’ll often need to keep track of connected clients beyond what’s directly available in the WebSocket object. 
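</p><p>At its core, such tracking is plain bookkeeping: a map from user id to that user’s open connection ids. Here is the idea in miniature, with plain data structures and no WebSocket types (PresenceTracker is an illustrative name, not a WS-Kit API):</p>

```typescript
// Minimal presence bookkeeping: userId -> set of connection ids.
// A user counts as "online" while at least one of their connections is open.
class PresenceTracker {
  private userConnections = new Map<string, Set<string>>();

  connect(userId: string, connectionId: string): void {
    let set = this.userConnections.get(userId);
    if (!set) {
      set = new Set();
      this.userConnections.set(userId, set);
    }
    set.add(connectionId);
  }

  disconnect(userId: string, connectionId: string): void {
    const set = this.userConnections.get(userId);
    if (!set) return;
    set.delete(connectionId);
    // Drop the entry entirely once the last connection closes.
    if (set.size === 0) this.userConnections.delete(userId);
  }

  isOnline(userId: string): boolean {
    return this.userConnections.has(userId);
  }
}
```

<p>The connection pool below is this same bookkeeping plus socket references, activity timestamps, and per-connection metadata. </p><p>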
This is especially important for features like:</p><ul><li>Displaying online/offline status</li><li>User activity monitoring</li><li>Rate limiting</li><li>Resource cleanup</li></ul><p>Here’s a robust connection pool implementation using <strong>WS-Kit</strong>:</p><pre>import { z, createRouter } from &quot;@ws-kit/zod&quot;;<br>import type { ServerWebSocket } from &quot;bun&quot;;<br><br>type ClientInfo = {<br>  userId: string;<br>  username: string;<br>  connectedAt: number;<br>  lastActivity: number;<br>  rooms: Set&lt;string&gt;;<br>};<br><br>class ConnectionPool&lt;T&gt; {<br>  private clients = new Map&lt;string, ServerWebSocket&lt;T &amp; ClientInfo&gt;&gt;();<br>  private userConnections = new Map&lt;string, Set&lt;string&gt;&gt;();<br><br>  /**<br>   * Register a new connection<br>   */<br>  add(<br>    clientId: string,<br>    ws: ServerWebSocket&lt;T &amp; ClientInfo&gt;,<br>    userId?: string,<br>  ): void {<br>    // Store connection by client ID<br>    this.clients.set(clientId, ws as ServerWebSocket&lt;T &amp; ClientInfo&gt;);<br><br>    // Track by user ID if available<br>    if (userId) {<br>      if (!this.userConnections.has(userId)) {<br>        this.userConnections.set(userId, new Set());<br>      }<br>      this.userConnections.get(userId)!.add(clientId);<br>    }<br>  }<br><br>  /**<br>   * Update user ID for an existing connection<br>   */<br>  associateWithUser(clientId: string, userId: string): void {<br>    const ws = this.clients.get(clientId);<br>    if (!ws) return;<br><br>    ws.data.userId = userId;<br><br>    if (!this.userConnections.has(userId)) {<br>      this.userConnections.set(userId, new Set());<br>    }<br>    this.userConnections.get(userId)!.add(clientId);<br>  }<br><br>  /**<br>   * Remove a connection<br>   */<br>  remove(clientId: string): void {<br>    const ws = this.clients.get(clientId);<br>    if (!ws) return;<br><br>    const userId = ws.data.userId;<br><br>    // Remove from clients map<br>    
this.clients.delete(clientId);<br><br>    // Clean up user association<br>    if (userId &amp;&amp; this.userConnections.has(userId)) {<br>      const connections = this.userConnections.get(userId)!;<br>      connections.delete(clientId);<br><br>      if (connections.size === 0) {<br>        this.userConnections.delete(userId);<br>      }<br>    }<br>  }<br><br>  /**<br>   * Update activity timestamp<br>   */<br>  updateActivity(clientId: string): void {<br>    const ws = this.clients.get(clientId);<br>    if (ws) {<br>      ws.data.lastActivity = Date.now();<br>    }<br>  }<br><br>  /**<br>   * Check if user is online (has any active connections)<br>   */<br>  isUserOnline(userId: string): boolean {<br>    return (<br>      this.userConnections.has(userId) &amp;&amp;<br>      this.userConnections.get(userId)!.size &gt; 0<br>    );<br>  }<br><br>  /**<br>   * Get all connections for a user<br>   */<br>  getUserConnections(userId: string): ServerWebSocket&lt;T &amp; ClientInfo&gt;[] {<br>    if (!this.userConnections.has(userId)) return [];<br><br>    return Array.from(this.userConnections.get(userId)!)<br>      .map((clientId) =&gt; this.clients.get(clientId))<br>      .filter(Boolean) as ServerWebSocket&lt;T &amp; ClientInfo&gt;[];<br>  }<br><br>  /**<br>   * Get all clients<br>   */<br>  getAllClients(): ServerWebSocket&lt;T &amp; ClientInfo&gt;[] {<br>    return Array.from(this.clients.values());<br>  }<br><br>  /**<br>   * Send message to all connections of a specific user<br>   */<br>  sendToUser&lt;S, P&gt;(userId: string, schema: S, payload: P): void {<br>    const connections = this.getUserConnections(userId);<br><br>    for (const ws of connections) {<br>      // Using the WebSocketRouter&#39;s message format<br>      ws.send(<br>        JSON.stringify({<br>          type: (schema as any).type,<br>          payload,<br>        }),<br>      );<br>    }<br>  }<br>}<br><br>export default ConnectionPool;</pre><p>Now let’s integrate it with our 
router:</p><pre>import { z, createRouter } from &quot;@ws-kit/zod&quot;;<br>import { randomUUID } from &quot;crypto&quot;;<br>import ConnectionPool from &quot;./connection-pool&quot;;<br>import * as schema from &quot;./schemas&quot;;<br>import chatRouter from &quot;./chat-router&quot;;<br><br>type AppData = schema.Meta &amp; {<br>  clientId: string;<br>  connectedAt: number;<br>  lastActivity: number;<br>};<br><br>// Create the WebSocket router<br>const router = createRouter&lt;AppData&gt;();<br><br>// Create connection pool<br>const pool = new ConnectionPool&lt;AppData&gt;();<br><br>// Add connection tracking<br>router.onOpen((ctx) =&gt; {<br>  // Generate a unique ID for this connection<br>  const clientId = randomUUID();<br><br>  // Set initial connection metadata<br>  ctx.ws.data.clientId = clientId;<br>  ctx.ws.data.connectedAt = Date.now();<br>  ctx.ws.data.lastActivity = Date.now();<br><br>  // Add to connection pool<br>  pool.add(clientId, ctx.ws);<br><br>  console.log(`Client connected: ${clientId}`);<br>});<br><br>// Add activity tracking middleware<br>router.use((ctx, next) =&gt; {<br>  // Update last activity timestamp<br>  ctx.ws.data.lastActivity = Date.now();<br>  pool.updateActivity(ctx.ws.data.clientId);<br><br>  // Continue processing<br>  return next();<br>});<br><br>// When user authenticates, associate their connection with their user ID<br>router.on(schema.Authenticate, (ctx) =&gt; {<br>  // Authentication logic...<br><br>  // Associate connection with user<br>  if (ctx.ws.data.userId) {<br>    pool.associateWithUser(ctx.ws.data.clientId, ctx.ws.data.userId);<br>  }<br><br>  // Continue with normal flow...<br>});<br><br>// Handle disconnection<br>router.onClose((ctx) =&gt; {<br>  console.log(`Client disconnected: ${ctx.ws.data.clientId}`);<br>  pool.remove(ctx.ws.data.clientId);<br>});<br><br>// Add our chat routes<br>router.merge(chatRouter);<br><br>// Expose pool to other modules<br>export { pool };</pre><p>With this connection pool, you can 
now easily:</p><ul><li>Send messages to all of a user’s devices (multi-device support)</li><li>Check if users are online</li><li>Implement presence detection</li><li>Monitor connection statistics</li></ul><h4>Rate Limiting and Throttling</h4><p>Nothing ruins a WebSocket service faster than a client that sends messages at the speed of light (or a poorly written client that got stuck in a message loop). Let’s implement a rate limiter middleware:</p><pre>import { z, createRouter } from &quot;@ws-kit/zod&quot;;<br><br>type RateLimitOptions = {<br>  // Maximum messages per window<br>  maxMessages: number;<br><br>  // Time window in milliseconds<br>  windowMs: number;<br><br>  // Optional exception for specific message types<br>  excludeTypes?: string[];<br>};<br><br>type RateLimitData = {<br>  counter: number;<br>  resetAt: number;<br>};<br><br>// Store rate limit data by client ID<br>const limiters = new Map&lt;string, RateLimitData&gt;();<br><br>// Clean up stale rate limit data every 5 minutes<br>setInterval(<br>  () =&gt; {<br>    const now = Date.now();<br>    for (const [clientId, data] of limiters.entries()) {<br>      if (data.resetAt &lt;= now) {<br>        limiters.delete(clientId);<br>      }<br>    }<br>  },<br>  5 * 60 * 1000,<br>);<br><br>// Create rate limiter middleware<br>export function createRateLimiter(options: RateLimitOptions) {<br>  const { maxMessages, windowMs, excludeTypes = [] } = options;<br><br>  return async function rateLimiterMiddleware(ctx, next) {<br>    // Skip rate limiting for excluded message types<br>    if (excludeTypes.includes(ctx.type)) {<br>      await next();<br>      return;<br>    }<br><br>    const clientId = ctx.ws.data.clientId;<br><br>    if (!clientId) {<br>      // Can&#39;t rate limit without client ID<br>      return next();<br>    }<br><br>    const now = Date.now();<br>    let limiter = limiters.get(clientId);<br><br>    // Initialize or reset if window has passed<br>    if (!limiter || limiter.resetAt &lt;= now) 
{<br>      limiter = {<br>        counter: 0,<br>        resetAt: now + windowMs,<br>      };<br>      limiters.set(clientId, limiter);<br>    }<br><br>    // Check if rate limit exceeded<br>    if (limiter.counter &gt;= maxMessages) {<br>      const secondsRemaining = Math.ceil((limiter.resetAt - now) / 1000);<br><br>      // Send error message with proper details<br>      ctx.error(<br>        &quot;RESOURCE_EXHAUSTED&quot;,<br>        &quot;Rate limit exceeded&quot;,<br>        { secondsRemaining, maxMessages, windowMs },<br>        { retryable: true, retryAfterMs: limiter.resetAt - now },<br>      );<br><br>      return; // Stop processing<br>    }<br><br>    // Increment counter and continue<br>    limiter.counter++;<br>    await next();<br>  };<br>}</pre><p>Now let’s apply this middleware to our router:</p><pre>import { z, createRouter } from &quot;@ws-kit/zod&quot;;<br>import { createRateLimiter } from &quot;./rate-limiter&quot;;<br><br>const router = createRouter&lt;AppData&gt;();<br><br>// Apply rate limiting<br>router.use(<br>  createRateLimiter({<br>    maxMessages: 20, // 20 messages<br>    windowMs: 10_000, // per 10 seconds<br>    excludeTypes: [<br>      // Don&#39;t rate limit typing indicators<br>      &quot;TYPING_START&quot;,<br>      &quot;TYPING_STOP&quot;,<br>    ],<br>  }),<br>);<br><br>// Rest of your server setup...</pre><h4>Custom PubSub with Selective Message Delivery</h4><p>While <strong>WS-Kit</strong> provides built-in pub/sub through ctx.publish() and ctx.subscribe(), sometimes you need advanced filtering based on user properties. 
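</p><p>In practice, per-subscriber filtering reduces to a small pure predicate evaluated on each delivery. For role-gated messages, a rank comparison is enough (a sketch; roleAtLeast and the Role union are hypothetical helpers, not part of WS-Kit):</p>

```typescript
// Hypothetical helper: does userRole meet the minimum role a message requires?
type Role = "user" | "moderator" | "admin";

const ROLE_RANK: Record<Role, number> = { user: 0, moderator: 1, admin: 2 };

function roleAtLeast(userRole: Role, minRole: Role): boolean {
  return ROLE_RANK[userRole] >= ROLE_RANK[minRole];
}
```

<p>A subscriber’s filter can then delegate to this predicate instead of hand-rolling role checks inline. </p><p>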
This section shows a custom implementation for scenarios requiring fine-grained control:</p><p><strong>For most applications, WS-Kit’s native pub/sub is sufficient</strong>:</p><pre>// Simple room-based broadcasting with WS-Kit<br>router.on(schema.ChatMessage, (ctx) =&gt; {<br>  const { roomId, text } = ctx.payload;<br><br>  // Publish to all subscribers in room (with validation)<br>  ctx.publish(roomId, schema.ChatMessage, {<br>    roomId,<br>    userId: ctx.ws.data.userId,<br>    username: ctx.ws.data.username,<br>    text,<br>    timestamp: Date.now(),<br>  });<br>});<br><br>router.on(schema.JoinRoom, (ctx) =&gt; {<br>  const { roomId } = ctx.payload;<br>  ctx.subscribe(roomId); // Join room<br>});<br><br>router.on(schema.LeaveRoom, (ctx) =&gt; {<br>  const { roomId } = ctx.payload;<br>  ctx.unsubscribe(roomId); // Leave room<br>});</pre><blockquote><strong>When to Use EnhancedPubSub</strong>: WS-Kit’s native ctx.publish() and ctx.subscribe() are sufficient for most applications, providing simple topic-based broadcasting with automatic message validation. Consider implementing a custom PubSub extension only when you need role-based filtering, metadata-based message delivery, or complex subscriber filtering logic that goes beyond basic topic subscriptions. 
For typical chat applications, room management, and notification systems, stick with the native approach shown above.</blockquote><p><strong>For advanced filtering use cases, here’s a custom PubSub extension</strong>:</p><pre>import type { ServerWebSocket } from &quot;bun&quot;;<br><br>// Define a topic subscriber with filtering options<br>type Subscriber&lt;T&gt; = {<br>  ws: ServerWebSocket&lt;T&gt;;<br>  filter?: (meta: T) =&gt; boolean;<br>};<br><br>class EnhancedPubSub&lt;T&gt; {<br>  private topics = new Map&lt;string, Set&lt;Subscriber&lt;T&gt;&gt;&gt;();<br><br>  /**<br>   * Subscribe a client to a topic with optional filter<br>   */<br>  subscribe(<br>    ws: ServerWebSocket&lt;T&gt;,<br>    topic: string,<br>    filter?: (meta: T) =&gt; boolean,<br>  ): void {<br>    if (!this.topics.has(topic)) {<br>      this.topics.set(topic, new Set());<br>    }<br><br>    this.topics.get(topic)!.add({ ws, filter });<br>  }<br><br>  /**<br>   * Unsubscribe a client from a topic<br>   */<br>  unsubscribe(ws: ServerWebSocket&lt;T&gt;, topic: string): void {<br>    if (!this.topics.has(topic)) return;<br><br>    const subscribers = this.topics.get(topic)!;<br>    const toRemove = Array.from(subscribers).filter((sub) =&gt; sub.ws === ws);<br><br>    for (const sub of toRemove) {<br>      subscribers.delete(sub);<br>    }<br><br>    if (subscribers.size === 0) {<br>      this.topics.delete(topic);<br>    }<br>  }<br><br>  /**<br>   * Unsubscribe a client from all topics<br>   */<br>  unsubscribeAll(ws: ServerWebSocket&lt;T&gt;): void {<br>    for (const topic of Array.from(this.topics.keys())) {<br>      this.unsubscribe(ws, topic);<br>    }<br>  }<br><br>  /**<br>   * Publish a message to all subscribers of a topic<br>   */<br>  publish&lt;S, P&gt;(<br>    sourceSender: ServerWebSocket&lt;T&gt; | null,<br>    topic: string,<br>    schema: S,<br>    payload: P,<br>    skipSender: boolean = true,<br>  ): number 
{<br>    if (!this.topics.has(topic)) return 0;<br><br>    const subscribers = this.topics.get(topic)!;<br>    let sentCount = 0;<br><br>    const data = JSON.stringify({<br>      type: (schema as any).type,<br>      payload,<br>    });<br><br>    for (const { ws, filter } of subscribers) {<br>      // Skip sender if requested<br>      if (skipSender &amp;&amp; ws === sourceSender) continue;<br><br>      // Apply filter against the sender&#39;s metadata, if any<br>      // (the sender stamps fields like messageMinRole before publishing)<br>      if (filter &amp;&amp; sourceSender &amp;&amp; !filter(sourceSender.data)) continue;<br><br>      // Send the serialized message<br>      ws.send(data);<br>      sentCount++;<br>    }<br><br>    return sentCount;<br>  }<br><br>  /**<br>   * Get count of subscribers for a topic<br>   */<br>  subscriberCount(topic: string): number {<br>    return this.topics.has(topic) ? this.topics.get(topic)!.size : 0;<br>  }<br><br>  /**<br>   * Get all topics a client is subscribed to<br>   */<br>  getSubscribedTopics(ws: ServerWebSocket&lt;T&gt;): string[] {<br>    const result: string[] = [];<br><br>    for (const [topic, subscribers] of this.topics.entries()) {<br>      if (Array.from(subscribers).some((sub) =&gt; sub.ws === ws)) {<br>        result.push(topic);<br>      }<br>    }<br><br>    return result;<br>  }<br>}<br><br>export default EnhancedPubSub;</pre><p>Now we can use this advanced PubSub system to implement features like role-gated message delivery:</p><pre>import EnhancedPubSub from &quot;./enhanced-pubsub&quot;;<br>import { z, createRouter } from &quot;@ws-kit/zod&quot;;<br>import * as schema from &quot;./schemas&quot;;<br><br>type AppData = schema.Meta;<br><br>const router = createRouter&lt;AppData&gt;();<br>const pubsub = new EnhancedPubSub&lt;AppData&gt;();<br><br>// Handle room joining with role-based filters<br>router.on(schema.JoinRoom, (ctx) =&gt; {<br>  const { roomId } = ctx.payload;<br>  const userId = ctx.ws.data.userId;<br>  const username = ctx.ws.data.username;<br>  const userRole = ctx.ws.data.userRole || &quot;user&quot;;<br><br>  // Subscribe with filter - only receive 
messages for your role level and below<br>  pubsub.subscribe(ctx.ws, roomId, (clientData) =&gt; {<br>    const messageMinRole = clientData.messageMinRole || &quot;user&quot;;<br><br>    if (messageMinRole === &quot;admin&quot; &amp;&amp; userRole !== &quot;admin&quot;) {<br>      return false; // Filter out admin-only messages<br>    }<br>    if (<br>      messageMinRole === &quot;moderator&quot; &amp;&amp;<br>      userRole !== &quot;admin&quot; &amp;&amp;<br>      userRole !== &quot;moderator&quot;<br>    ) {<br>      return false; // Filter out moderator-only messages<br>    }<br><br>    return true;<br>  });<br><br>  // Let others know user joined<br>  pubsub.publish(ctx.ws, roomId, schema.UserJoined, {<br>    roomId,<br>    userId,<br>    username,<br>  });<br><br>  console.log(`User ${username} (${userId}) joined room: ${roomId}`);<br>});<br><br>// Send message only to admins and moderators<br>router.on(schema.ModAction, (ctx) =&gt; {<br>  const { roomId, action } = ctx.payload;<br><br>  // Only allow moderators and admins to send mod actions<br>  const userRole = ctx.ws.data.userRole;<br>  if (userRole !== &quot;moderator&quot; &amp;&amp; userRole !== &quot;admin&quot;) {<br>    ctx.error(<br>      &quot;PERMISSION_DENIED&quot;,<br>      &quot;You don&#39;t have permission to perform moderator actions&quot;,<br>    );<br>    return;<br>  }<br><br>  // Set minimum role to receive this message<br>  ctx.ws.data.messageMinRole = &quot;moderator&quot;;<br><br>  // Publish to room (only mods/admins will receive it due to filter)<br>  pubsub.publish(ctx.ws, roomId, schema.ModAction, {<br>    roomId,<br>    userId: ctx.ws.data.userId,<br>    username: ctx.ws.data.username,<br>    action,<br>  });<br><br>  // Reset the message minimum role<br>  ctx.ws.data.messageMinRole = &quot;user&quot;;<br>});<br><br>// Clean up subscriptions when user leaves<br>router.onClose((ctx) =&gt; {<br>  pubsub.unsubscribeAll(ctx.ws);<br>});<br><br>export default 
router;</pre><h4>Request/Response Pattern (RPC)</h4><p>Real-time applications often need reliable request/response patterns for operations like fetching data, updating settings, or triggering actions. <strong>WS-Kit</strong> provides built-in RPC support with automatic correlation IDs, timeouts, and type safety.</p><h4>Server-Side RPC Handler</h4><p>Define request and response schemas, then handle with router.rpc():</p><pre>import { z, message, createRouter } from &quot;@ws-kit/zod&quot;;<br><br>// Define request and response schemas<br>const FetchProfile = message(&quot;FETCH_PROFILE&quot;, { userId: z.string() });<br>const ProfileResponse = message(&quot;PROFILE_RESPONSE&quot;, {<br>  id: z.string(),<br>  name: z.string(),<br>  email: z.string(),<br>});<br><br>const router = createRouter&lt;AppData&gt;();<br><br>// Handle RPC request<br>router.rpc(FetchProfile, async (ctx) =&gt; {<br>  const { userId } = ctx.payload;<br><br>  try {<br>    // Fetch user profile from database<br>    const profile = await fetchUserProfileFromDb(userId);<br><br>    if (!profile) {<br>      // Send error response<br>      ctx.error(&quot;NOT_FOUND&quot;, `User ${userId} not found`);<br>      return;<br>    }<br><br>    // Send typed response (automatically correlates with request)<br>    ctx.reply(ProfileResponse, profile);<br>  } catch (error) {<br>    ctx.error(&quot;INTERNAL&quot;, &quot;Failed to fetch profile&quot;);<br>  }<br>});</pre><p><strong>Key RPC features</strong>:</p><ul><li>✅ Automatic correlation ID generation</li><li>✅ Built-in timeout handling</li><li>✅ Full type safety on both request and response</li><li>✅ Structured error responses with gRPC-standard error codes</li></ul><blockquote><strong>Error Codes</strong>: WS-Kit uses gRPC-standard error codes for consistency across your application. 
Common codes include: NOT_FOUND (resource doesn&#39;t exist), PERMISSION_DENIED (insufficient permissions), INVALID_ARGUMENT (malformed request), INTERNAL (server error), RESOURCE_EXHAUSTED (rate limit exceeded), UNAUTHENTICATED (missing or invalid credentials), and UNAVAILABLE (service temporarily down). Use these standard codes in ctx.error() for predictable client-side error handling.</blockquote><h4>Client-Side RPC Call</h4><p>On the client, client.request() handles correlation automatically:</p><pre>// Client code using @ws-kit/client/zod<br>import {<br>  wsClient,<br>  TimeoutError,<br>  ServerError,<br>  ConnectionClosedError,<br>} from &quot;@ws-kit/client/zod&quot;;<br>import { FetchProfile, ProfileResponse } from &quot;./shared/schemas.js&quot;;<br><br>const client = wsClient({ url: &quot;ws://localhost:3000/ws&quot; });<br><br>async function getUserProfile(userId) {<br>  try {<br>    // Send request and wait for typed response<br>    const response = await client.request(<br>      FetchProfile,<br>      { userId },<br>      ProfileResponse,<br>      { timeoutMs: 5000 }, // 5 second timeout<br>    );<br><br>    console.log(&quot;Profile:&quot;, response.payload);<br>    // response.payload is fully typed: { id: string, name: string, email: string }<br><br>    return response.payload;<br>  } catch (error) {<br>    if (error instanceof TimeoutError) {<br>      console.error(`Request timed out after ${error.timeoutMs}ms`);<br>    } else if (error instanceof ServerError) {<br>      console.error(`Server error: ${error.code}`, error.context);<br>    } else if (error instanceof ConnectionClosedError) {<br>      console.error(&quot;Connection closed before reply&quot;);<br>    }<br>    throw error;<br>  }<br>}<br><br>// Usage<br>const profile = await getUserProfile(&quot;user-123&quot;);</pre><p><strong>Client request features</strong>:</p><ul><li>✅ Automatic correlationId generation (UUIDv4)</li><li>✅ Configurable timeout (default: 30 seconds)</li><li>✅ 
AbortSignal support for cancellation</li><li>✅ Typed responses with validation</li><li>✅ Automatic reconnection with queued requests</li></ul><h4>Cancellation with AbortSignal</h4><p>The client SDK supports standard AbortSignal for cancelling in-flight RPC requests. This is useful when users navigate away from a page, close a modal, or when you want to implement request debouncing. Cancelled requests are cleaned up immediately without waiting for timeouts.</p><pre>import { StateError } from &quot;@ws-kit/client/zod&quot;;<br><br>const controller = new AbortController();<br><br>const promise = client.request(<br>  FetchProfile,<br>  { userId: &quot;user-123&quot; },<br>  ProfileResponse,<br>  { signal: controller.signal },<br>);<br><br>// Cancel the request<br>setTimeout(() =&gt; controller.abort(), 2000);<br><br>try {<br>  await promise;<br>} catch (error) {<br>  if (error instanceof StateError &amp;&amp; error.message.includes(&quot;aborted&quot;)) {<br>    console.log(&quot;Request was cancelled by user&quot;);<br>  }<br>}</pre><p>The @ws-kit/client SDK automatically handles correlation, timeouts, and retries, so you don&#39;t need to implement custom request tracking. Just use client.request() as shown above.</p><h4>Connection Health Monitoring with Heartbeats</h4><p>WebSocket connections can silently die or become “zombies” where the TCP connection is technically open but no longer passing messages. <strong>WS-Kit</strong> provides built-in heartbeat support through router configuration:</p><blockquote><strong>Note</strong>: WS-Kit’s heartbeat system operates on two layers: (1) the framework’s automatic WebSocket ping/pong frames for detecting broken connections, and (2) optional application-level custom heartbeat messages for measuring client latency and application responsiveness. 
The example below demonstrates both layers working together.</blockquote><pre>import { z, createRouter, message } from &quot;@ws-kit/zod&quot;;<br>import { serve } from &quot;@ws-kit/bun&quot;;<br>import type { Meta } from &quot;./schemas&quot;;<br><br>// Define custom heartbeat messages for application-level monitoring<br>export const HeartbeatPing = message(&quot;HEARTBEAT_PING&quot;, {<br>  timestamp: z.number(),<br>});<br><br>export const HeartbeatPong = message(&quot;HEARTBEAT_PONG&quot;, {<br>  timestamp: z.number(),<br>  latency: z.number().optional(),<br>});<br><br>// Setup router with built-in heartbeat<br>const router = createRouter&lt;Meta&gt;({<br>  heartbeat: {<br>    intervalMs: 30_000, // Send heartbeat every 30 seconds<br>    timeoutMs: 5_000, // Expect response within 5 seconds<br>    onStaleConnection: (clientId, ws) =&gt; {<br>      console.log(`Stale connection detected: ${clientId}`);<br>      // Connection is automatically closed by framework<br>      // Use this callback for cleanup if needed<br>    },<br>  },<br>});<br><br>// Optional: handle custom heartbeat messages for latency measurement<br>router.on(HeartbeatPing, (ctx) =&gt; {<br>  const { timestamp } = ctx.payload;<br>  const latency = Date.now() - timestamp;<br><br>  ctx.send(HeartbeatPong, {<br>    timestamp,<br>    latency,<br>  });<br>});<br><br>// Setup server with heartbeat enabled<br>serve(router, {<br>  port: 3000,<br>});</pre><p>The @ws-kit/client SDK handles heartbeat monitoring automatically when configured:</p><pre>import { wsClient } from &quot;@ws-kit/client/zod&quot;;<br><br>const client = wsClient({<br>  url: &quot;ws://localhost:3000/ws&quot;,<br>  heartbeat: {<br>    // Optional: SDK can detect stale connections<br>    // Heartbeat is handled transparently via WebSocket ping/pong<br>  },<br>});<br><br>// Monitor connection health via state changes<br>client.onState((state) =&gt; {<br>  if (state === &quot;closed&quot;) {<br>    console.warn(&quot;Connection closed, 
client will auto-reconnect&quot;);<br>  } else if (state === &quot;open&quot;) {<br>    console.log(&quot;Connection healthy and open&quot;);<br>  }<br>});<br><br>// Optional: Measure latency with custom heartbeat messages<br>const HeartbeatPing = message(&quot;HEARTBEAT_PING&quot;, { timestamp: z.number() });<br>const HeartbeatPong = message(&quot;HEARTBEAT_PONG&quot;, { timestamp: z.number() });<br><br>client.on(HeartbeatPong, (msg) =&gt; {<br>  const latency = Date.now() - msg.payload.timestamp;<br>  console.log(`Latency: ${latency}ms`);<br>});<br><br>// Measure latency periodically<br>setInterval(() =&gt; {<br>  if (client.isConnected) {<br>    client.send(HeartbeatPing, { timestamp: Date.now() });<br>  }<br>}, 30_000);</pre><p><strong>Key advantages of WS-Kit’s built-in heartbeat:</strong></p><ul><li>Automatic detection of stale connections</li><li>No manual connection tracking needed</li><li>Configurable intervals and timeouts</li><li>Framework handles connection cleanup</li><li>Can be disabled by omitting heartbeat config</li></ul><h4>Connection Upgrades and Protocol Negotiation</h4><p>In sophisticated applications, you might need to negotiate protocol features or upgrade connections to support different functionality:</p><pre>import { z, createRouter, message } from &quot;@ws-kit/zod&quot;;<br>import type { Meta } from &quot;./schemas&quot;;<br><br>// Define feature flags<br>export enum Feature {<br>  COMPRESSION = &quot;compression&quot;,<br>  ENCRYPTION = &quot;encryption&quot;,<br>  BATCHING = &quot;batching&quot;,<br>  BINARY_MESSAGES = &quot;binary_messages&quot;,<br>}<br><br>// Negotiation message schemas<br>export const ClientCapabilities = message(&quot;CLIENT_CAPABILITIES&quot;, {<br>  protocolVersion: z.string(),<br>  features: z.array(z.nativeEnum(Feature)),<br>  compressionFormats: z.array(z.string()).optional(),<br>});<br><br>export const ServerCapabilities = message(&quot;SERVER_CAPABILITIES&quot;, {<br>  protocolVersion: z.string(),<br>  
supportedFeatures: z.array(z.nativeEnum(Feature)),<br>  enabledFeatures: z.array(z.nativeEnum(Feature)),<br>  compressionFormat: z.string().optional(),<br>});<br><br>// Setup protocol negotiation<br>export function setupProtocolNegotiation(router) {<br>  // Server supported features<br>  const supportedFeatures = [Feature.COMPRESSION, Feature.BATCHING];<br><br>  // Handle client capabilities message<br>  router.on(ClientCapabilities, (ctx) =&gt; {<br>    const { protocolVersion, features, compressionFormats } = ctx.payload;<br><br>    // Check protocol version compatibility<br>    if (!isCompatibleVersion(protocolVersion)) {<br>      ctx.error(<br>        &quot;INVALID_ARGUMENT&quot;,<br>        `Unsupported protocol version: ${protocolVersion}. Server requires 1.x`,<br>      );<br><br>      // Terminate connection - incompatible protocol<br>      setTimeout(<br>        () =&gt; ctx.ws.close(1002, &quot;Incompatible protocol version&quot;),<br>        100,<br>      );<br>      return;<br>    }<br><br>    // Determine which features to enable<br>    const enabledFeatures = supportedFeatures.filter((feature) =&gt;<br>      features.includes(feature),<br>    );<br><br>    // Store enabled features in connection metadata<br>    ctx.ws.data.enabledFeatures = enabledFeatures;<br><br>    // Determine compression format if requested<br>    let compressionFormat: string | undefined;<br><br>    if (<br>      enabledFeatures.includes(Feature.COMPRESSION) &amp;&amp;<br>      compressionFormats &amp;&amp;<br>      compressionFormats.length &gt; 0<br>    ) {<br>      // Choose first supported compression format<br>      if (compressionFormats.includes(&quot;gzip&quot;)) {<br>        compressionFormat = &quot;gzip&quot;;<br>      } else if (compressionFormats.includes(&quot;deflate&quot;)) {<br>        compressionFormat = &quot;deflate&quot;;<br>      }<br><br>      ctx.ws.data.compressionFormat = compressionFormat;<br>    }<br><br>    // Send server capabilities<br>    
ctx.send(ServerCapabilities, {<br>      protocolVersion: &quot;1.0&quot;,<br>      supportedFeatures,<br>      enabledFeatures,<br>      compressionFormat,<br>    });<br><br>    console.log(<br>      `Negotiated protocol with ${ctx.ws.data.clientId}: ${enabledFeatures.join(&quot;, &quot;)}`,<br>    );<br>  });<br>}<br><br>// Check if client version is compatible with server<br>function isCompatibleVersion(clientVersion: string): boolean {<br>  // Simple version check - in real app you&#39;d use semver<br>  return clientVersion.startsWith(&quot;1.&quot;);<br>}</pre><h4>Conclusion: The Power of Advanced Patterns</h4><p>By implementing these advanced patterns, you’ve taken your WebSocket application from a simple message-passing system to a robust, production-ready communication platform. We’ve covered:</p><ol><li><strong>Connection management</strong> with tracking, pooling, and user association</li><li><strong>Rate limiting</strong> to protect against accidental or malicious overload</li><li><strong>Enhanced PubSub</strong> with selective message delivery based on user properties</li><li><strong>Request/response patterns</strong> for reliable communication with acknowledgments</li><li><strong>Connection health monitoring</strong> with heartbeats to detect zombie connections</li><li><strong>Protocol negotiation</strong> for feature detection and progressive enhancement</li></ol><p>Each of these patterns addresses real-world challenges you’ll face when deploying WebSocket applications at scale. The beauty of using <strong>WS-Kit</strong> is that its clean, type-safe foundation makes it easy to layer these advanced patterns on top without creating a tangled mess of code.</p><p>Remember, in the world of WebSockets, the difference between a toy project and a production system isn’t just in the basic functionality — it’s in how gracefully your application handles edge cases, failures, and scale. 
With these patterns in your toolkit, you’re well-equipped to build WebSocket applications that don’t just work in the happy path, but thrive in the chaotic reality of the real world.</p><p>And the next time someone casually suggests “Let’s just add real-time messaging to our app, how hard could it be?”, you can smile knowingly — and then build it right the first time.</p><h3>Wrapping It Up</h3><p>We’ve taken quite a journey together, exploring how to build robust, type-safe WebSocket applications with Bun and <strong>WS-Kit</strong>. From the basics of WebSocket communication to advanced patterns like connection management, authentication, and error handling, we’ve covered the essentials of crafting real-time applications that are both maintainable and scalable.</p><h4>Why WS-Kit Stands Out</h4><p><strong>WS-Kit</strong> represents a modern approach to WebSocket development:</p><ul><li><strong>Platform-agnostic</strong>: Works with Bun, Cloudflare Durable Objects, and custom adapters</li><li><strong>Validator-agnostic</strong>: Choose between Zod, Valibot, or your own validation library</li><li><strong>Production-ready</strong>: Built on lessons learned from years of real-time systems experience</li><li><strong>Actively developed</strong>: Continuously improved based on community feedback</li></ul><p>The library is designed to be the foundation for production applications while remaining simple enough for quick prototypes.</p><h3>Getting Support</h3><p>If you encounter any issues, have questions, or want to contribute to the project, check out the <a href="https://github.com/kriasoft/ws-kit"><strong>WS-Kit</strong></a> repository on GitHub. You can also connect with the community and maintainers on <a href="https://discord.gg/aW29wXyb7w">Discord</a> to share your experiences and get help troubleshooting any problems you might face.</p><h4>Final Thoughts</h4><p>Building real-time applications doesn’t have to be complex or error-prone. 
With the right tools and patterns, you can focus on creating amazing user experiences without getting bogged down in the details of WebSocket message routing or type validation.</p><p>Whether you’re building a simple chat application or a sophisticated collaborative platform, <a href="https://kriasoft.com/ws-kit/"><strong>WS-Kit</strong></a> provides the foundation you need to create reliable, type-safe real-time experiences with confidence.</p><p>Now go forth and build something amazing! And remember, in the fast-moving world of WebSockets, type safety isn’t just a luxury — it’s your best friend.</p><p>Happy coding!</p><hr><p><a href="https://levelup.gitconnected.com/building-type-safe-websocket-applications-with-bun-and-zod-f0aef259a53e">Building Type-Safe WebSocket Applications with Bun and Zod</a> was originally published in <a href="https://levelup.gitconnected.com">Level Up Coding</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Enabling efficient front-end development]]></title>
            <link>https://medium.com/swlh/enabling-efficient-front-end-development-c379ef0e52?source=rss-692b968dbc82------2</link>
            <guid isPermaLink="false">https://medium.com/p/c379ef0e52</guid>
            <category><![CDATA[programming]]></category>
            <category><![CDATA[startup]]></category>
            <category><![CDATA[front-end-development]]></category>
            <category><![CDATA[web-development]]></category>
            <category><![CDATA[devops]]></category>
            <dc:creator><![CDATA[Konstantin Tarkus]]></dc:creator>
            <pubDate>Mon, 17 Jul 2023 16:47:53 GMT</pubDate>
            <atom:updated>2023-07-18T18:30:56.812Z</atom:updated>
            <content:encoded><![CDATA[<h4>The role of web infrastructure engineering teams</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*Fnf1CB418-wioTT7" /><figcaption>Photo by <a href="https://unsplash.com/@anniespratt?utm_source=medium&amp;utm_medium=referral">Annie Spratt</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><h4>Introduction</h4><p>In today’s digital landscape, the importance of robust web infrastructure cannot be overstated. Large companies like Facebook have realized this and established dedicated web infrastructure engineering teams to support their front-end developers. However, for small startups, the cost of maintaining such a team might be prohibitive. In this blog post, we will explore the benefits of having a web infrastructure engineering team and discuss an alternative approach for startups that allows them to focus on application code while ensuring a reliable and up-to-date web infrastructure.</p><h4>The value of a dedicated web infrastructure engineering team</h4><p>Large companies invest in web infrastructure engineering teams for several reasons. These teams act as a bridge between development and operations, specializing in designing, building, and maintaining the underlying infrastructure required for the front-end applications. Here are some key benefits they provide:</p><ol><li><strong>Expertise and specialization</strong>: Web infrastructure engineers have in-depth knowledge of various tools, frameworks, and technologies that power modern web applications. They understand the complexities of scaling, performance optimization, and security. 
Their expertise ensures the infrastructure is designed and implemented in a way that maximizes efficiency and minimizes downtime.</li><li><strong>Continuous integration and deployment (CI/CD)</strong>: Web infrastructure teams establish robust CI/CD workflows that enable seamless deployment of code changes. They automate processes, such as building, testing, and deploying applications, ensuring that developers can focus on writing code rather than worrying about deployment pipelines.</li><li><strong>Performance monitoring and optimization</strong>: These teams proactively monitor the performance of web applications, identifying bottlenecks and optimizing the infrastructure to deliver the best user experience. They leverage tools for performance monitoring, load testing, and analytics to ensure optimal application performance.</li><li><strong>Security and compliance</strong>: Web infrastructure engineers are responsible for implementing robust security measures, protecting user data, and ensuring compliance with industry regulations. They stay updated with the latest security practices and help in mitigating potential risks.</li></ol><h4>Contracting with a part-time infrastructure engineer</h4><p>While small startups may not have the resources to maintain a dedicated web infrastructure engineering team, they can still benefit from infrastructure expertise by contracting with a part-time infrastructure engineer. This approach allows the core development team to focus solely on application code while ensuring a reliable and up-to-date web infrastructure. Here’s why this arrangement can be advantageous:</p><ol><li><strong>Cost-effective solution</strong>: Hiring a full-time web infrastructure engineer can be expensive for startups, especially in the early stages. 
Contracting with a part-time engineer allows businesses to access the necessary expertise without bearing the cost of a full-time employee.</li><li><strong>Focus on core competencies</strong>: By offloading infrastructure responsibilities to a part-time engineer, the core development team can focus on building and enhancing the application. This separation of concerns improves productivity and allows each team member to contribute to their area of expertise.</li><li><strong>Smooth CI/CD workflows</strong>: A part-time infrastructure engineer can establish and maintain robust CI/CD workflows that automate build, test, and deployment processes. This ensures that the application deployment pipeline remains efficient and reliable, freeing up developers’ time.</li><li><strong>Infrastructure maintenance and updates</strong>: Web technologies evolve rapidly, and keeping up with the latest frameworks, libraries, and security updates can be challenging. A part-time infrastructure engineer can ensure that the core web infrastructure remains up-to-date, secure, and performs optimally, relieving the development team from such concerns.</li></ol><h4>How does it work?</h4><p>Finding a skilled web infrastructure engineer for your startup can be easier than you think. Many of these professionals actively contribute to open-source projects on GitHub. Once your team has settled on a tech stack, you can filter GitHub projects using relevant keywords like [React] or [Boilerplate]. Explore the most popular projects and take note of the maintainers. 
Reach out to those who have relevant experience and discuss the possibility of contracting their services.</p><p>By the way, if you’re embarking on a startup idea with a React, Node.js, and Google Cloud stack, I recommend starting with this resource: <a href="https://koistya.gumroad.com/l/react-starter-kit">gumroad.com/l/react-starter-kit</a> (a limited-time offer by me).</p><h4>How much does it cost?</h4><p>The cost of contracting a part-time web infrastructure engineer typically ranges from $300 to $1500 per month, depending on the complexity of the project and allocated working hours. As your company grows, you can always transition to hiring a full-time DevOps or web infrastructure engineer, or even establish a dedicated team to handle your expanding needs.</p><h4>Conclusion</h4><p>While large companies like Facebook have the resources to maintain dedicated web infrastructure engineering teams, startups often face budgetary constraints. By contracting with a part-time infrastructure engineer, small businesses can leverage specialized expertise without the overhead of a full-time team. This arrangement empowers the core development team to focus on application code, knowing that the web infrastructure remains up-to-date and secure, and operates smoothly. By adopting this approach, startups can enhance efficiency, reduce distractions, and pave the way for scalable growth.</p><hr><p><a href="https://medium.com/swlh/enabling-efficient-front-end-development-c379ef0e52">Enabling efficient front-end development</a> was originally published in <a href="https://medium.com/swlh">The Startup</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Using EJS with Vite]]></title>
            <link>https://medium.com/@koistya/using-ejs-with-vite-7502a4f79e44?source=rss-692b968dbc82------2</link>
            <guid isPermaLink="false">https://medium.com/p/7502a4f79e44</guid>
            <category><![CDATA[javascript]]></category>
            <category><![CDATA[typescript]]></category>
            <category><![CDATA[web-development]]></category>
            <category><![CDATA[website]]></category>
            <category><![CDATA[cloudflare]]></category>
            <dc:creator><![CDATA[Konstantin Tarkus]]></dc:creator>
            <pubDate>Thu, 15 Jun 2023 11:05:42 GMT</pubDate>
            <atom:updated>2023-06-15T11:20:35.470Z</atom:updated>
            <content:encoded><![CDATA[<h3>Why use EJS with Vite?</h3><p>Let’s consider an example scenario: you are building a web app that will run at <strong>CDN</strong> edge locations using <strong>Cloudflare Workers</strong>. In this scenario, you may have the following requirements:</p><ul><li>You need to configure the <strong>reverse proxy</strong> for certain third-party websites, such as <strong>Framer</strong>, <strong>Intercom Helpdesk</strong>, etc.</li><li>You should be able to inject <strong>custom HTML/JS snippets</strong> into the pages of these websites.</li><li>The code snippets should function correctly in <strong>different environments</strong>, such as <strong>production</strong> and <strong>test/QA</strong>.</li><li>To optimize the application bundle, it is necessary to <strong>pre-compile</strong> these templates instead of including a template library.</li></ul><p>In such cases, using EJS in combination with Vite can be a beneficial choice.</p><h3>What does it look like?</h3><p>The HTML snippet is conveniently placed into a separate file with HTML/JS syntax highlighting and code completion (views/analytics.ejs):</p><pre>&lt;script async src=&quot;https://www.googletagmanager.com/gtag/js?id=&lt;%- env.GA_MEASUREMENT_ID %&gt;&quot;&gt;&lt;/script&gt;<br>&lt;script&gt;<br>  window.dataLayer = window.dataLayer || [];<br>  function gtag() {<br>    dataLayer.push(arguments);<br>  }<br>  gtag(&quot;js&quot;, new Date());<br>  gtag(&quot;config&quot;, &quot;&lt;%- env.GA_MEASUREMENT_ID %&gt;&quot;);<br>&lt;/script&gt;</pre><p>While the Cloudflare Worker script injects it into an (HTML) landing page loaded from Framer:</p><pre>import { Hono } from &quot;hono&quot;;<br>import analytics from &quot;../views/analytics.ejs&quot;;<br><br>export const app = new Hono&lt;Env&gt;();<br><br>// Serve landing pages, inject Google Analytics<br>app.use(&quot;*&quot;, async ({ req, env }, next) =&gt; {<br>  const url = new URL(req.url);<br><br>  // Skip non-landing 
pages<br>  if (![&quot;/&quot;, &quot;/about&quot;, &quot;/home&quot;].includes(url.pathname)) {<br>    return next();<br>  }<br><br>  const res = await fetch(&quot;https://example.framer.app/&quot;, req.raw);<br><br>  return new HTMLRewriter()<br>    .on(&quot;body&quot;, {<br>      element(el) {<br>        el.onEndTag((tag) =&gt; {<br>          try {<br>            tag.before(analytics(env), { html: true });<br>          } catch (err) {<br>            console.error(err);<br>          }<br>        });<br>      },<br>    })<br>    .transform(res.clone());<br>});</pre><h3>How do I pre-compile EJS templates with Vite?</h3><p>Install ejs and @types/ejs NPM modules as development dependencies (yarn add ejs @types/ejs -D).</p><p>Add the following plugin to your vite.config.ts file:</p><pre>import { compile } from &quot;ejs&quot;;<br>import { readFile } from &quot;node:fs/promises&quot;;<br>import { relative, resolve } from &quot;node:path&quot;;<br>import { defineConfig } from &quot;vite&quot;;<br><br>export default defineConfig({<br>  ...<br>  plugins: [<br>    {<br>      name: &quot;ejs&quot;,<br>      async transform(_, id) {<br>        if (id.endsWith(&quot;.ejs&quot;)) {<br>          const src = await readFile(id, &quot;utf-8&quot;);<br>          const code = compile(src, {<br>            client: true,<br>            strict: true,<br>            localsName: &quot;env&quot;,<br>            views: [resolve(__dirname, &quot;views&quot;)],<br>            filename: relative(__dirname, id),<br>          }).toString();<br>          return `export default ${code}`;<br>        }<br>      },<br>    },<br>  ],<br>});</pre><h3>How to make .ejs imports work with TypeScript?</h3><ol><li>Add **/*.ejs to the list of included files in your tsconfig.json file.</li><li>Add the following type declaration to your global.d.ts file:</li></ol><pre>declare module &quot;*.ejs&quot; {<br>  /**<br>   * Generates HTML markup from an EJS template.<br>   *<br>   * @param locals an object of data 
to be passed into the template.<br>   * @param escape callback used to escape variables<br>   * @param include callback used to include files at runtime with `include()`<br>   * @param rethrow callback used to handle and rethrow errors<br>   *<br>   * @return Return type depends on `Options.async`.<br>   */<br>  const fn: (<br>    locals?: Data,<br>    escape?: EscapeCallback,<br>    include?: IncludeCallback,<br>    rethrow?: RethrowCallback,<br>  ) =&gt; string;<br>  export default fn;<br>}</pre><p>The <a href="https://github.com/kriasoft/relay-starter-kit">kriasoft/relay-starter-kit</a> is a comprehensive full-stack web application project template that comes pre-configured with all the mentioned features (located in the /edge folder).</p><p>If you require any assistance with web infrastructure and DevOps, feel free to contact me on <a href="https://www.codementor.io/@koistya">Codementor</a> or <a href="https://discord.gg/bSsv7XM">Discord</a>. I’m here to help! Happy coding!</p><h3>References</h3><ul><li><a href="https://ejs.co/">https://ejs.co/</a></li><li><a href="https://vitejs.dev/">https://vitejs.dev/</a></li><li><a href="https://developers.cloudflare.com/workers/">https://developers.cloudflare.com/workers/</a></li><li><a href="https://github.com/kriasoft/relay-starter-kit">https://github.com/kriasoft/relay-starter-kit</a></li></ul>]]></content:encoded>
        </item>
    </channel>
</rss>